Submitting jobs to the CERN HTCondor pool
This twiki explains how to use the preparelocal CRAB command to send jobs to the CERN HTCondor pool (http://batchdocs.web.cern.ch/batchdocs/index.html).
Preliminary setup
Once you have set up the environment, you need to create the CRAB project directory for your task. If you
- have already submitted the task, you can either simply cd to the project directory created at submission time, or recreate it with the crab remake command (see the sketch after this list);
- have not yet submitted the task, do it now (if you do not want it to run on the grid, kill it as soon as it reaches the SUBMITTED status).
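For example, a minimal crab remake invocation looks like the following (the task name is a made-up placeholder; use the full name printed by CRAB at submission time for your own task):
crab remake --task=180606_150713:yourusername_crab_mytask   # placeholder task name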
Once the CRAB project directory is created and the task has reached the SUBMITTED status, execute:
crab preparelocal --dir=<PROJECTDIR>
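Assuming the default directory layout, a complete sequence might look like this (the project directory name is a placeholder):
crab preparelocal --dir=crab_projects/crab_mytask   # placeholder <PROJECTDIR>
ls crab_projects/crab_mytask/local                  # run_job.sh, CMSRunAnalysis.sh, sandbox.tar.gz, ...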
HTCondor submission
Add
#!/bin/bash
as the first line of the run_job.sh file in the <PROJECTDIR>/local directory.
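One way to do this in place, assuming GNU sed (as available on lxplus), is the following one-liner, which prepends the shebang line:
sed -i '1i #!/bin/bash' <PROJECTDIR>/local/run_job.sh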
Create a "batch" subdirectory in order to keep the HTCondor files separate:
mkdir <PROJECTDIR>/local/batch
cd <PROJECTDIR>/local/batch
and place the following task.jdl example file therein:
Universe = vanilla
Executable = ../run_job.sh
Arguments = $(I)
Log = log/job.$(Cluster).$(Process).log
Output = out/job.$(Cluster).$(Process).out
Error = err/job.$(Cluster).$(Process).err
transfer_input_files = ../CMSRunAnalysis.sh, ../CMSRunAnalysis.tar.gz, ../InputArgs.txt, ../Job.submit, ../cmscp.py, ../gWMS-CMSRunAnalysis.sh, ../input_files.tar.gz, ../run_and_lumis.tar.gz, ../sandbox.tar.gz
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
# Resources request
RequestCpus = 1
RequestMemory = 2000
+JobFlavour = "workday"
# Jobs selection
Queue I from (
1
2
3
4
)
This configuration example will submit only the first 4 jobs of the task.
In order to customise the resource requests (number of CPUs, maximum memory, maximum runtime, etc.) and the job submission, please refer to the CERN Batch Service documentation:
http://batchdocs.web.cern.ch/batchdocs/local/submit.html
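As an illustration, a heavier request block might look like the following sketch (the values are placeholders to adapt; +MaxRuntime is a CERN-specific attribute expressed in seconds and, like +JobFlavour, caps the job runtime):
RequestCpus = 4
RequestMemory = 4000
+MaxRuntime = 28800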
Create the auxiliary directories and submit the jobs with:
mkdir out err log
condor_submit task.jdl
You can check the jobs status with:
condor_q -nobatch
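A couple of related commands that may come in handy (a sketch; <CLUSTERID> is the cluster number reported by condor_submit):
condor_q -hold               # jobs put on hold, together with the hold reason
condor_history <CLUSTERID>   # jobs that have already left the queue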
Grid Proxy
Your jobs may need a grid proxy, e.g. if they use xrootd to access remote files, either intentionally or due to an automatic fallback from a local EOS read. Reading from EOS locally at CERN is controlled via your local username, and no proxy is needed for that.
HTCondor will not automatically make such a proxy available to your jobs (sort of obvious, since you can submit without creating it), so it is up to you to make this possible. There are (at least) three different ways:
1. place your proxy in your home area, rely on the fact that this is mounted on all batch nodes, and define an environment variable in the job to point there
   - pros: recommended by CERN IT. Simple.
   - cons: you cannot use this on a different HTCondor pool where your home is not mounted on the worker nodes.
2. pass your proxy to your job as an input file
   - pros: recommended by CERN IT. Will work no matter where your jobs run (flock to some distant pool?). You do not need to care about your proxy after you have submitted. Will work from any place which offers condor_submit.
   - cons: complicated. No way to renew the proxy once the jobs have been submitted.
3. place your proxy in your home area as in 1., rely on the fact that the local HTCondor scheduler mounts your home disk (currently true at FNAL LPC and CERN), and ask HTCondor to delegate the proxy from there to the running jobs
   - pros: simple. Will work no matter where your jobs run (flock to some distant pool?). The proxy on running jobs will be automatically renewed by HTCondor.
   - cons: may not work at all sites where HTCondor is available.
In cases 1. and 3. you need to run
voms-proxy-init -voms cms
as needed in order to have a valid proxy for as long as your jobs may need it, which is a bit more often than what is required when you submit via CRAB, since the CRAB server has dedicated machinery to renew your proxy. In case 2., whatever you do locally has no effect on running jobs.
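For example, the following requests a proxy with a long VOMS extension and then reports how many seconds of validity are left (a sketch; 192 hours is a commonly used value, adjust as needed):
voms-proxy-init -voms cms -valid 192:00
voms-proxy-info -timeleft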
The first two methods are documented here:
http://batchdocs.web.cern.ch/batchdocs/tutorial/exercise2e_proxy.html
Instructions for the 3rd one (which is very similar to what CRAB does) are below:
1. mkdir -p $HOME/tmp
2. make your grid proxy end up in your home area by adding the following line to your login file:
for .cshrc: setenv X509_USER_PROXY $HOME/tmp/x509up
for .bashrc: export X509_USER_PROXY=$HOME/tmp/x509up
(then log out, log in again and run voms-proxy-init -voms cms)
3. put these lines in your JDL:
x509userproxy = $ENV(X509_USER_PROXY)
use_x509userproxy = True
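As a quick sanity check that the proxy actually lands where HTCondor will look for it, you can run (sketch):
voms-proxy-init -voms cms
voms-proxy-info -path   # should print $HOME/tmp/x509up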
--
MarcoMascheroni - 2018-06-06