Submitting jobs to the CERN HTCondor pool

This twiki explains how to use the preparelocal CRAB command to send jobs to the CERN HTCondor pool (http://batchdocs.web.cern.ch/batchdocs/index.html).

Preliminary setup

Once you have set up the environment, you need to create the CRAB project directory for your task. If you

  • have already submitted the task, you can
    • simply cd to the project directory created at submission time,
    • or re-create it with the crab remake command (see the sketch after this list);
  • have not yet submitted the task, submit it now (if you do not want it to run on the Grid, kill it as soon as it reaches the SUBMITTED status).
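
A minimal sketch of the preliminary setup, assuming a CMSSW working area is already available; the release name CMSSW_X_Y_Z and the placeholder <TASKNAME> (the full task name printed at submission time) are illustrative:

# CMS and CRAB environment (usual CVMFS locations)
source /cvmfs/cms.cern.ch/cmsset_default.sh
cd CMSSW_X_Y_Z/src && cmsenv
source /cvmfs/cms.cern.ch/common/crab-setup.sh

# re-create the project directory of an already-submitted task
crab remake --task=<TASKNAME>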

Once the CRAB project directory is created and the task has reached the SUBMITTED status, execute

crab preparelocal --dir=<PROJECTDIR>
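
This creates a local subdirectory inside the project directory with the job wrapper and the input archives used below. A quick check (the file list reflects the transfer_input_files line of the JDL further down; details may vary with the CRAB version):

ls <PROJECTDIR>/local
# expected, among others: run_job.sh, CMSRunAnalysis.sh, CMSRunAnalysis.tar.gz,
# InputArgs.txt, Job.submit, cmscp.py, gWMS-CMSRunAnalysis.sh,
# input_files.tar.gz, run_and_lumis.tar.gz, sandbox.tar.gz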

HTCondor submission

Add #!/bin/bash as the first line of the run_job.sh file in the <PROJECTDIR>/local directory.
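
For instance (a one-liner sketch using GNU sed; skip it if the shebang is already there):

cd <PROJECTDIR>/local
head -1 run_job.sh                    # check whether a shebang is already there
sed -i '1i #!/bin/bash' run_job.sh    # if not, prepend it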

Create a "batch" subdirectory in order to keep HTCondor files separated:

mkdir <PROJECTDIR>/local/batch
cd <PROJECTDIR>/local/batch

and place the following task.jdl example file therein:

Universe  = vanilla
Executable = ../run_job.sh
Arguments = $(I)
Log = log/job.$(Cluster).$(Process).log
Output = out/job.$(Cluster).$(Process).out
Error = err/job.$(Cluster).$(Process).err
transfer_input_files = ../CMSRunAnalysis.sh, ../CMSRunAnalysis.tar.gz, ../InputArgs.txt, ../Job.submit, ../cmscp.py, ../gWMS-CMSRunAnalysis.sh, ../input_files.tar.gz, ../run_and_lumis.tar.gz, ../sandbox.tar.gz
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

# Resources request
RequestCpus = 1
RequestMemory = 2000
+JobFlavour = "workday"

# Jobs selection
Queue I from (
1
2
3
4
)

This example configuration will submit only the first 4 jobs of the task. To customise the resource requests (number of CPUs, maximum memory, maximum runtime, etc.) and the job selection, please refer to the CERN Batch Service documentation: http://batchdocs.web.cern.ch/batchdocs/local/submit.html
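
As an illustrative variation, to request more memory, a longer job flavour and to run every job of the task instead of only the first four, one could list all job numbers in a file (the name jobids.txt is an arbitrary choice, and this assumes InputArgs.txt contains one line per job) and point the Queue statement at it:

# from <PROJECTDIR>/local/batch: one job id per line, as many as the entries in InputArgs.txt
seq 1 $(wc -l < ../InputArgs.txt) > jobids.txt

and in task.jdl:

RequestMemory = 4000
+JobFlavour = "tomorrow"

Queue I from jobids.txt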

Create the auxiliary directories and submit the jobs with:

mkdir out err log
condor_submit task.jdl

You can check the jobs status with:

condor_q -nobatch
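
Other standard HTCondor commands can be handy as well (generic examples, not specific to CRAB):

condor_q -hold <ClusterId>     # jobs that went on hold, with the hold reason
condor_history <ClusterId>     # jobs that already left the queue
condor_rm <ClusterId>          # remove all jobs of the cluster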

Grid Proxy

Your jobs may need a grid proxy, e.g. if they use xrootd to access remote files, either intentionally or because of the automatic fallback from a local EOS read. Reading from EOS locally at CERN is controlled via your local username, so a proxy is not needed for that.

HTCondor will not automatically make such a proxy available to your jobs (somewhat obvious, since you can submit without creating one), so it is up to you to make this possible. There are (at least) three different ways:

  1. place your proxy in your home area, rely on the fact that this is mounted on all batch nodes, and define an environment variable in the job to point there (see the sketch after this list)
    • pros: recommended by CERN IT. Simple.
    • cons: you cannot use this on a different HTCondor pool where your home directory is not mounted on the worker nodes.
  2. pass your proxy to your job as an input file
    • pros: recommended by CERN IT. Will work no matter where your jobs run (flocking to some distant pool?). You do not need to care about your proxy after you have submitted. Will work from any place which offers condor_submit.
    • cons: complicated. No way to renew the proxy once the jobs have been submitted.
  3. place your proxy in your home area as in 1., rely on the fact that the local HTCondor scheduler mounts your home disk (currently true at FNAL LPC and CERN), and ask HTCondor to delegate the proxy from there to the running jobs
    • pros: simple. Will work no matter where your jobs run (flocking to some distant pool?). The proxy of the running jobs will be automatically renewed by HTCondor.
    • cons: may not work at all sites where HTCondor is available.
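
A minimal sketch of method 1 for the CERN pool, assuming your proxy lives in $HOME/tmp/x509up as in the method-3 instructions below (the exact path is up to you); $ENV(HOME) is expanded at submit time and the home directory is shared with the batch nodes:

# add to task.jdl: tell the job where to find the proxy
environment = "X509_USER_PROXY=$ENV(HOME)/tmp/x509up"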

In cases 1. and 3. you need to run voms-proxy-init -voms cms as needed in order to have a valid proxy for as long as your jobs may need it, which is a bit more often than what is required when you submit via CRAB, since the CRAB server has dedicated machinery to renew your proxy. In case 2. whatever you do locally has no effect on the running jobs.

The first two methods are documented here: http://batchdocs.web.cern.ch/batchdocs/tutorial/exercise2e_proxy.html. Instructions for the third one (which is very similar to what CRAB does) are below:

  1. create a directory for the proxy:
     mkdir -p $HOME/tmp
  2. make your grid proxy end up in your home area by adding the following line to your login file:
     for .cshrc: setenv X509_USER_PROXY $HOME/tmp/x509up
     for .bashrc: export X509_USER_PROXY=$HOME/tmp/x509up
     (then log out, log in again and do voms-proxy-init ....)
  3. put this in your JDL:
     x509userproxy = $ENV(X509_USER_PROXY)
     use_x509userproxy = True
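
After the re-login, a long-lived proxy can be created and checked, for example (the -valid value is illustrative; the actual VOMS extension lifetime is capped by the VO server):

voms-proxy-init -voms cms -valid 192:00
voms-proxy-info -path        # should print $HOME/tmp/x509up
voms-proxy-info -timeleft    # remaining lifetime in seconds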

-- MarcoMascheroni - 2018-06-06
