The pilot is started by the APF wrapper (https://www.racf.bnl.gov/experiments/usatlas/griddev/AutoPyFactory). You can get a copy of the wrapper from that page.
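For example, one of the wrappers used in the examples further down this page, runpilot3-wrapper.sh, can be downloaded directly (the URL is taken from those examples and may not point to the latest wrapper version):
wget https://raw.githubusercontent.com/ptrlv/adc/master/runpilot3-wrapper.sh -O runpilot3-wrapper.sh
chmod +x runpilot3-wrapper.sh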
Below are several examples of how to run the pilot.
Run pilot with ES
Here is an example of running the Event Service (ES) on lxplus. Before running ES, you may need to define some ES tasks (https://twiki.cern.ch/twiki/bin/view/PanDA/EventServiceOperations#To_Test_The_Site) on one site. Then you can run your own local tasks without affecting production tasks.
cd /tmp
dir=$(mktemp -d)
cd $dir
echo $dir
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupRucioClients
source /cvmfs/atlas.cern.ch/repo/sw/local/setup-yampl.sh
export OSG_GRID=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/current/
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
source $OSG_GRID/setup.sh
export RUCIO_ACCOUNT=$USER
rm -f pilot.tar.gz
# prod dev pilot
#wget http://project-atlas-gmsb.web.cern.ch/project-atlas-gmsb/pilotcode-dev.tar.gz -O pilot.tar.gz
# pilot release candidate(pre-prod pilot)
#wget http://pandaserver.cern.ch:25085/cache/pilot/pilotcode-rc.tar.gz -O pilot.tar.gz
# prod pilot
wget http://pandaserver.cern.ch:25085/cache/pilot/pilotcode-PICARD.tar.gz -O pilot.tar.gz
tar xzf pilot.tar.gz
#python pilot.py -s BNL_PROD -h BNL_PROD-condor -w https://pandaserver.cern.ch -p 25443 -d /usatlas/u/wguan/panda/test -u ptest
#python pilot.py -s IN2P3-LPSC_CLOUD_MCORE -h IN2P3-LPSC_CLOUD_MCORE -w https://pandaserver.cern.ch -p 25443 -d /tmp/wguan/test2
#python pilot.py -s $site -h $queue -w https://pandaserver.cern.ch -p 25443 -d $dir -v AP_VALI
# "-u ptest" is special option to test pilots. When a task is defined with 'prodSourceLabel: ptest', prod pilot will not run it. Then we can use our local pilot with this option to run these tasks.
python pilot.py -s $site -h $queue -w https://pandaserver.cern.ch -p 25443 -d $dir -u ptest
Run pilot without wrapper
Here is an example.
cd /tmp/wguan
export OSG_GRID=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/current/
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
source $OSG_GRID/setup.sh
#export COPYTOOL=gfal-copy
#export COPYTOOLIN=gfal-copy
#export RUCIO_ACCOUNT=wguan
rm -f pilot.tar.gz
wget http://project-atlas-gmsb.web.cern.ch/project-atlas-gmsb/pilotcode-dev.tar.gz -O pilot.tar.gz
tar xzf pilot.tar.gz
#python pilot.py -s NERSC_Edison -h NERSC_Edison -w https://pandaserver.cern.ch -p 25443 -d /scratch2/scratchdirs/wguan/Edison/hpcf/edison02/wguan-pilot-dev-HPC_merge/ -N 2 -Q debug -u ptest
python pilot.py -s NERSC_Edison -h NERSC_Edison -w https://pandaserver.cern.ch -p 25443 -d /global/cscratch1/sd/wguan/pilot/hpcf/cori19/test1 -N 6 -Q debug
Run pilot example
Here is an example of how to start the pilot on NERSC Edison.
#cat /project/projectdirs/atlas/pilot/RunPilot.sh
# if cvmfs
export OSG_GRID=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/current/
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
# NERSC
# export OSG_GRID=/global/project/projectdirs/atlas/pilot/grid_env
# export VO_ATLAS_SW_DIR=/project/projectdirs/atlas
rm -f wrapper-0.9.10.sh
wget http://wguan-wisc.web.cern.ch/wguan-wisc/wrapper-0.9.10_hpc.sh -O wrapper-0.9.10.sh
chmod +x wrapper-0.9.10.sh
# export COPYTOOL=gfal-copy
# export COPYTOOLIN=gfal-copy
./wrapper-0.9.10.sh --wrapperloglevel=debug --wrappergrid=OSG --wrapperwmsqueue=ANALY_WISC_ATLAS --wrapperbatchqueue=ANALY_WISC_ATLAS --wrappervo=ATLAS --wrappertarballurl=http://dev.racf.bnl.gov/dist/wrapper/wrapperplugins-0.9.10.tar.gz --wrapperpilotcodeurl=http://pandaserver.cern.ch:25085/cache/pilot --wrapperpilotcode=pilotcode-PICARD,pilotcode-rc -s ANALY_WISC_ATLAS -q ANALY_WISC_ATLAS -w https://aipanda007.cern.ch -p 25443 -d /tmp/wguan/
In this example, setting the COPYTOOL and COPYTOOLIN environment variables enables staging input files in from the SE and staging output and log files out to the SE. I have only tested the gfal site mover on the HPC Cray system.
The "-d" option (e.g. "-d /scratch2/scratchdirs/wguan/Edison" in the NERSC examples) sets the pilot working directory. You need to change it to a directory that is writable by you.
Run pilot example 2
Here is an example of starting the pilot using runpilot3-wrapper.sh; you can try it on lxplus.
#cat RunPilot.sh
export OSG_GRID=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/current/
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
rm -f runpilot3-wrapper.sh
wget https://raw.githubusercontent.com/ptrlv/adc/master/runpilot3-wrapper.sh -O runpilot3-wrapper.sh
chmod +x runpilot3-wrapper.sh
# define the pilot code URL. If it is not defined, the default pilot code will be used.
# export PILOT_HTTP_SOURCES=http://project-atlas-gmsb.web.cern.ch/project-atlas-gmsb/pilotcode-dev.tar.gz
./runpilot3-wrapper.sh -s CERN-P1 -h CERN-P1-OpenStack -p 25443 -w https://pandaserver.cern.ch
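For instance, to test the pilot release candidate instead of the default pilot code, PILOT_HTTP_SOURCES can point at the pilotcode-rc tarball used earlier on this page (a sketch; adjust the URL to whichever pilot version you want to test):
export PILOT_HTTP_SOURCES=http://pandaserver.cern.ch:25085/cache/pilot/pilotcode-rc.tar.gz
./runpilot3-wrapper.sh -s CERN-P1 -h CERN-P1-OpenStack -p 25443 -w https://pandaserver.cern.ch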
Run pilot on Grid
In ATLAS we use Condor to submit pilots to the Grid. The production APF Condor machines are listed at http://apfmon.lancs.ac.uk. There is another machine, aipanda020, which is used for development (pilot, APF). Here is an example from aipanda020; a short sketch of how to submit this JDL follows after the file listing.
[wguan@aipanda020 RemoteTests]$ cat Logs/gfal-copy/FZK-LCG2-all-prod-CEs/submit.jdl
CodeDir = /data/wguan/Panda/pilot/RemoteTests
LogDir = /data/wguan/Panda/pilot/RemoteTests/Logs/gfal-copy/FZK-LCG2-all-prod-CEs
notify_user = wguan@cern.ch
grid_resource = cream cream-ge-3-kit.gridka.de:8443/ce-cream/services/CREAM2 sge sl6
environment = " COPYTOOL=gfal-copy "
x509userproxy = /tmp/x509up_u23959_EU
executable = $(CodeDir)/wrapper/runpilot3-wrapper.sh
arguments = -s FZK-LCG2 -h FZK-LCG2-all-prod-CEs -p 25443 -w https://pandaserver.cern.ch -j false -k 3500 -u ptest
transfer_input_files = $(CodeDir)/wrapper/runpilot3-wrapper.sh
#globusrsl = (jobtype=single)(queue=paul_test_q)
globusrsl = (jobtype=single)
#+RACF_Group = "dq2test"
universe = grid
output = $(LogDir)/$(Cluster).$(Process).output
error = $(LogDir)/$(Cluster).$(Process).error
log = $(LogDir)/$(Cluster).$(Process).log
stream_output = False
stream_error = False
notification = Error
transfer_executable = True
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
periodic_hold = GlobusResourceUnavailableTime =!= UNDEFINED &&(CurrentTime-GlobusResourceUnavailableTime>30)
periodic_remove = (JobStatus == 5 && (CurrentTime - EnteredCurrentStatus) > 3600) || (JobStatus == 1 && globusstatus =!= 1 && (CurrentTime - EnteredCurrentStatus) > 86400)
+Nonessential = True
copy_to_spool = false
globusrsl = (queue=atlasXL)
globusrsl = (jobtype=single)(queue=atlasXL)
queue 1
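A minimal sketch of submitting and monitoring this job with the standard Condor commands (the submit file path is the one shown in the cat command above; the cluster and process numbers in the log file names are placeholders):
condor_submit Logs/gfal-copy/FZK-LCG2-all-prod-CEs/submit.jdl
condor_q wguan
# after the job finishes, the pilot stdout/stderr are in the files defined by output/error/log in the JDL, e.g.
# Logs/gfal-copy/FZK-LCG2-all-prod-CEs/<cluster>.<process>.output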
--
WenGuan - 2015-02-24