Running jobs directly on the site WN


Access to a site's shared software area often breaks because of NFS or AFS problems. When this happens, jobs cannot find the applications and fail. As a site admin or T1 LHCb contact, you may want to run an LHCb application directly on the site WNs to check that everything is working. Note that NFS/AFS problems are often seen only when tens or hundreds of jobs access the same software area simultaneously: the load on the NFS servers increases, leading to failures (e.g. stale NFS file handles). Following the steps below may therefore produce a successful job even though the site still has problems under load.
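As a quick first check before running a full application, something like the following could be used on the WN. This is only a sketch: the function name check_sw_area is hypothetical, and a single successful check says nothing about behaviour under load.

```shell
# Hypothetical helper: report whether a directory on the shared software
# area can be reached and listed from this worker node.
check_sw_area() {
    local area="$1"
    if [ ! -d "$area" ]; then
        echo "FAIL: $area is not a directory or cannot be reached"
        return 1
    fi
    # Listing the directory forces a real access to the file system,
    # which is where a stale NFS file handle would show up.
    if ! ls "$area" > /dev/null 2>&1; then
        echo "FAIL: cannot list $area"
        return 1
    fi
    echo "OK: $area is accessible"
    return 0
}

# Example (path is a placeholder, as in the setup script):
#   check_sw_area /path/to/the/shared/software/area
```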

Additionally, NFS may be incorrectly configured on a single machine at a site. In this case, only jobs going to that machine will fail. This shows up as a single node at a site getting through many jobs, all of which fail.
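A minimal sketch of how such a node could be spotted, assuming the job records can be exported to a text file with one job per line in the form "worker-node status". The file format, the status string Failed and the function name failures_per_node are all assumptions for illustration, not part of any LHCb tool.

```shell
# Hypothetical input: one job per line, "<worker-node> <status>", e.g.
#   wn042 Failed
#   wn007 Done
# Prints "<number of failed jobs> <worker-node>", worst node first.
failures_per_node() {
    awk '$2 == "Failed" { n[$1]++ }
         END { for (h in n) print n[h], h }' "$@" | sort -rn
}

# Example:
#   failures_per_node jobs.txt | head
```

A node that stands far above the rest in this list is a candidate for a local NFS misconfiguration.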

Environment setup script

To get the same environment as a job running at the site, log on to the WN and source this script. You will have to modify the paths at the start to point to your shared software area.

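# General setup environment
export HEPSOFT=/path/to/the/shared/software/area
export LHCB_DIR=$HEPSOFT/lhcb-soft
export CMTCONFIG=slc4_ia32_gcc34
# MYSITEROOT is used throughout but is not set by this script;
# it must already be defined in your environment.
export LCG_release_area=$MYSITEROOT/lcg/external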


export EMACSDIR=$MYSITEROOT/lhcb/TOOLS/Tools/Emacs/pro

export CVS_RSH=ssh

# Only when starting an interactive session
if [[ $TERM != "dumb" || $ENVIRONMENT == "BATCH" ]]; then
    source $MYSITEROOT/scripts/
fi

# Use the CERN user name if one is recorded, otherwise the local user name
if [ -f $HOME/.hepix/cern-user-name ]; then
    export cernuser=`cat $HOME/.hepix/cern-user-name`
else
    export cernuser=$USER
fi

export GETPACK_USER=$cernuser

# Make sure the user release area exists
if [ -d $HOME/cmtuser ]; then
    export User_release_area=$HOME/cmtuser
else
    echo "LHCb you don't have a cmtuser directory. I'll create one for you."
    mkdir $HOME/cmtuser
    export User_release_area=$HOME/cmtuser
fi

#Old-style alias definitions
unalias getpack > /dev/null 2>&1
alias getpack="export USER=$cernuser; $MYSITEROOT/scripts/getpack -f ssh"

unalias DaVinciEnv > /dev/null 2>&1
alias DaVinciEnv="source $MYSITEROOT/scripts/ DaVinci"

unalias GangaEnv > /dev/null 2>&1
alias GangaEnv="source /exports/work/physics_ifp_ppe/ganga/install/etc/"

unalias SetupProject > /dev/null 2>&1
alias SetupProject="source $MYSITEROOT/scripts/"

unalias setenvDaVinci > /dev/null 2>&1
alias setenvDaVinci="source $MYSITEROOT/scripts/ DaVinci"

unalias setenvGauss > /dev/null 2>&1
alias setenvGauss="source $MYSITEROOT/scripts/ Gauss"
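
After sourcing the script, it can be worth confirming that the variables it relies on are actually set. The following is a sketch only: the helper name check_env_vars is hypothetical, and the variable names in the example are the ones used in the setup script above.

```shell
# Hypothetical helper: print every named environment variable that is
# unset or empty, and return non-zero if any were found.
check_env_vars() {
    local missing=0
    local name
    for name in "$@"; do
        # printenv only sees exported variables, which is what a job inherits.
        if [ -z "$(printenv "$name")" ]; then
            echo "MISSING: $name"
            missing=1
        fi
    done
    return $missing
}

# Example:
#   check_env_vars HEPSOFT LHCB_DIR CMTCONFIG MYSITEROOT User_release_area
```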

Running the job

Once the environment is correct, you can run the job by doing:


The options file can be obtained from somewhere like /afs/ (for a basic DaVinci job). Look in the Gauss directory for the Gauss options files, etc.

If the application fails to run, this is a strong sign that Grid jobs running at the site will also have problems.
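
To make a failure easy to report, the test run can be wrapped so that its output and exit code are kept. This is a generic sketch: run_and_report is a hypothetical helper, and the command passed to it is whatever you use to run the application.

```shell
# Hypothetical wrapper: run a command, save stdout and stderr to a log
# file, and print a one-line summary with the exit code.
# Usage: run_and_report LOGFILE COMMAND [ARGS...]
run_and_report() {
    local log="$1"
    shift
    local rc=0
    "$@" > "$log" 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "SUCCESS (log: $log)"
    else
        echo "FAILED with exit code $rc (log: $log)"
    fi
    return "$rc"
}
```

The log file can then be searched for file-system errors such as stale NFS file handles.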

-- GreigCowan - 10 Dec 2008

Topic revision: r1 - 2008-12-10 - GreigCowan