CRAB commands

The following table provides a list of the currently available CRAB commands (ordered alphabetically) with a short description. The next sections give a more detailed description of the commands and usage examples.

Command        Description
checkusername  Check username extraction from SiteDB.
checkwrite     Check write permission into a site.
getlog         Retrieve the log files from a task.
getoutput      Retrieve the output files from a task.
kill           Kill all jobs in a task.
preparelocal   Prepare a directory and the corresponding scripts to execute a job locally.
proceed        Submit a task that was uploaded with crab submit --dryrun.
purge          Clean up the user's directory on the schedds and in the CRAB cache.
remake         Recreate a CRAB project directory.
report         Get a task final report with the number of analyzed files, events and luminosity sections.
resubmit       Resubmit the failed jobs in a task.
status         Report the states of jobs in a task (and more).
submit         Submit a task.
tasks          List all user tasks known to the server.
uploadlog      Upload the crab log file to the CRAB cache in the server.

To run a CRAB command, one has to type:

crab <command>

One can also get a list of available commands invoking the crab help menu:

crab -h

The screen output is something similar to this:

Usage: crab [options] COMMAND [command-options] [args]

Options:
  --version    show program's version number and exit
  -h, --help   show this help message and exit
  -q, --quiet  don't print any messages to stdout
  -d, --debug  print extra messages to stdout

Valid commands are: 
  checkusername
  checkwrite (chk)
  getlog (log)
  getoutput (output) (out)
  kill
  purge
  remake (rmk)
  report (rep)
  resubmit
  status (st)
  submit (sub)
  tasks
  uploadlog (uplog)
To get single command help run:
  crab command --help|-h

For more information on how to run CRAB-3 please follow this link:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookCRAB3Tutorial

Individual commands also provide a help menu showing the options available for the command. To see the help menu for a specific command, just add the -h option:

crab <command> -h

Specifying the task in CRAB commands

Most CRAB commands act on a task and therefore need the corresponding task name as input. The task name is not passed directly; instead, the user passes the CRAB project directory name in the --dir/-d option. Both relative and full paths can be used. CRAB extracts the task name from the .requestcache file present in the CRAB project directory. Thus, a CRAB command that requires a task name is always run passing the CRAB project directory name:

crab <command> --dir=<CRAB-project-directory>

The .crab3 cache file

Every time such a CRAB command is executed, the name (in full path format) of the CRAB project directory to which the command refers is saved in a file named .crab3, located by default in the user's home directory.

Caching the CRAB project directory name spares the user from specifying it explicitly in consecutive CRAB commands; if the user doesn't specify a CRAB project directory, the cached one is used (this is true for all commands except crab kill). This feature saves some typing, but should be used with care. For example, if a script is executing CRAB commands in this short form and, while the script is running, the user executes another CRAB command for a different task, then this other CRAB project directory name will be cached and the remaining commands in the script will act on the wrong task.

The user can change the location for the .crab3 file by means of the environment variable CRAB3_CACHE_FILE:

export CRAB3_CACHE_FILE=<full-path-to-the-directory-where-to-save-the-.crab3-file>

Since environment variables are specific to the shell session, this feature allows the user to have two different shell sessions, with different locations of the .crab3 files (by means of setting the corresponding CRAB3_CACHE_FILE variables to different directories), and execute in each shell CRAB commands referring to two different tasks without mixing the .crab3 files.

Note: The CRAB3_CACHE_FILE environment variable can only be used to set the location of the .crab3 cache file; the name of the file cannot be changed.

CRAB commands in detail

crab checkusername

The crab checkusername command tries to retrieve the user's CERN (primary account) username from SiteDB. It is equivalent to the so-called standalone mode of the CRAB2 check_HN_name.py script.

Below is the crab checkusername screen output for the case of a successful retrieval of my CERN (primary account) username from SiteDB.

crab checkusername

Retrieving DN from proxy...
DN is: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atanasi/CN=710186/CN=Andres Jorge Tanasijczuk
Retrieving username from SiteDB...
Username is: atanasi

crab checkwrite

The crab checkwrite command can be used by a user to check if he/she has write permission in a given LFN directory path (by default /store/user/<username>/) in a given site. The syntax to be used is:

crab checkwrite --site=<site-name> --lfn=<lfn-path>

This command creates a dummy (non-empty) file crab3checkwrite.<retry-counter>.tmp in the current directory and attempts to copy it into the specified LFN directory path (or into /store/user/<username>/ if the option --lfn is not given, where username is the one retrieved from SiteDB, i.e. the CERN primary account username) in the specified site using the gfal-copy command (or the lcg-cp command if gfal-copy is not available). After that, it deletes the local file crab3checkwrite.<retry-counter>.tmp and, if the remote copy was successful, attempts to delete the remotely copied file using the gfal-rm command (or lcg-del command if gfal-rm is not available). If crab checkwrite succeeds, the user can, in principle (read the note below), write to the site. Below are two examples where I check if I have write permission in T2_US_Nebraska, where I know I do have permission, and in T2_US_Caltech, where I didn't ask and I don't have write permission:

crab checkwrite --site=T2_US_Nebraska

Attempting to copy (dummy) file crab3checkwrite.0.tmp to /store/user/atanasi on site T2_US_Nebraska
Executing command: lcg-cp -v -b -D srmv2 --connect-timeout 180 /afs/cern.ch/work/a/atanasi/CRAB3-tutorial/CMSSW_7_0_5/src/crab3checkwrite.0.tmp 'srm://dcache07.unl.edu:8443/srm/v2/server?SFN=/mnt/hadoop/user/uscms01/pnfs/unl.edu/data4/cms/store/user/atanasi/crab3checkwrite.0.tmp'
Please wait...
Successfully ran lcg-cp
Successfully copied file crab3checkwrite.0.tmp to /store/user/atanasi on site T2_US_Nebraska
Attempting to delete file srm://dcache07.unl.edu:8443/srm/v2/server?SFN=/mnt/hadoop/user/uscms01/pnfs/unl.edu/data4/cms/store/user/atanasi/crab3checkwrite.0.tmp from site T2_US_Nebraska
Executing command: lcg-del --connect-timeout 180 -b -l -D srmv2 'srm://dcache07.unl.edu:8443/srm/v2/server?SFN=/mnt/hadoop/user/uscms01/pnfs/unl.edu/data4/cms/store/user/atanasi/crab3checkwrite.0.tmp'
Please wait...
Successfully ran lcg-del
Successfully deleted file srm://dcache07.unl.edu:8443/srm/v2/server?SFN=/mnt/hadoop/user/uscms01/pnfs/unl.edu/data4/cms/store/user/atanasi/crab3checkwrite.0.tmp from site T2_US_Nebraska
Success: Able to write to /store/user/atanasi on site T2_US_Nebraska
NOTE: you cannot write to a site if you did not ask permission

crab checkwrite --site=T2_US_Caltech

Attempting to copy (dummy) file crab3checkwrite.0.tmp to /store/user/atanasi on site T2_US_Caltech
Executing command: lcg-cp -v -b -D srmv2 --connect-timeout 180 /afs/cern.ch/work/a/atanasi/CRAB3-tutorial/CMSSW_7_0_5/src/crab3checkwrite.0.tmp 'srm://cit-se.ultralight.org:8443/srm/v2/server?SFN=/mnt/hadoop/store/user/atanasi/crab3checkwrite.0.tmp'
Please wait...
Failed running lcg-cp
  Stderr:
    Using grid catalog type: UNKNOWN
    Using grid catalog : (null)
    VO name: cms
    Checksum type: None
    Destination SE type: SRMv2
    [SE][Mkdir][SRM_FAILURE] httpg://cit-se.ultralight.org:8443/srm/v2/server: srm://cit-se.ultralight.org:8443/srm/v2/server?SFN=/mnt/hadoop/store/user/atanasi: Error:/bin/mkdir: cannot create directory `/mnt/hadoop/store/user/atanasi': Permission denied
    Ref-u uscms4275  /bin/mkdir /mnt/hadoop/store/user/atanasi
    [SE][PrepareToPut][SRM_FAILURE] httpg://cit-se.ultralight.org:8443/srm/v2/server: <none>
    lcg_cp: Invalid argument
    
Error: Unable to write to /store/user/atanasi on site T2_US_Caltech
       You may want to contact the site administrators sending them the 'crab checkwrite' output as printed above
NOTE: you cannot write to a site if you did not ask permission

Finally, there are cases in which crab checkwrite may fail to perform the check (e.g. because of a connection problem with the site, etc.). Here is an example:

crab checkwrite --site=T2_RU_SINP

Attempting to copy (dummy) file crab3checkwrite.0.tmp to /store/user/atanasi on site T2_RU_SINP
Executing command: lcg-cp -v -b -D srmv2 --connect-timeout 180 /afs/cern.ch/work/a/atanasi/CRAB3-tutorial/CMSSW_7_0_5/src/crab3checkwrite.0.tmp 'srm://lcg58.sinp.msu.ru:8446/srm/managerv2?SFN=/dpm/sinp.msu.ru/home/cms/store/user/atanasi/crab3checkwrite.0.tmp'
Please wait...
Failed running lcg-cp
  Stderr:
    Using grid catalog type: UNKNOWN
    Using grid catalog : (null)
    VO name: cms
    Checksum type: None
    Destination SE type: SRMv2
    [SE][Ls][] httpg://lcg58.sinp.msu.ru:8446/srm/managerv2: CGSI-gSOAP running on lxplus0111.cern.ch reports Error reading token data header: Connection closed
    [SE][PrepareToPut][] httpg://lcg58.sinp.msu.ru:8446/srm/managerv2: CGSI-gSOAP running on lxplus0111.cern.ch reports Error reading token data header: Connection closed
    lcg_cp: Communication error on send
    
Unable to check write permission to /store/user/atanasi on site T2_RU_SINP
Please try again later or contact the site administrators sending them the 'crab checkwrite' output as printed above
NOTE: you cannot write to a site if you did not ask permission

Thus, if crab checkwrite does not give a positive answer, it means either that 1) the user is really not allowed to write into the given LFN path in the given site (like my example above for T2_US_Caltech), in which case CRAB will not be able to do the stageouts, or that 2) crab checkwrite was unable to perform the check (like my example above for T2_RU_SINP), in which case nothing can be said about what will happen with CRAB stageouts. In case 1), if the user thinks that he/she should have permission to write into the specified site and LFN directory path, the recommendation is to contact the site administrators, sending them the crab checkwrite output. In case 2), the fact that crab checkwrite is unable to perform the check doesn't mean that CRAB will not be able to stage out files; all it means is that crab checkwrite could not do the check. The reason is that crab checkwrite uses gfal (or lcg) commands, while CRAB (actually ASO) uses FTS for doing stageouts, and a site may have a problem with gfal (or lcg) commands while stageouts with FTS work fine. The user can submit a small test task (e.g. 1 job) and see whether the files are staged out.

Note: There are many sites that have not yet implemented a write permission policy, but this doesn't mean that users are free to use storage space on those sites. In terms of the crab checkwrite command, a user may see that the command succeeds for a given site even if he/she has not requested the write permission. However, if the user starts to write into such a site, he/she may be banned. Thus, users should ALWAYS ask for write permission before attempting to write into a site.

crab getlog

This command retrieves the log files of a number of jobs specified by the -q/--quantity option (to retrieve all job logs use -q="all"). This option is ignored if --jobids is used, which specifies the job ids (a comma-separated list of integers) of the logs to retrieve. By default the retrieved files are stored in the results directory under the corresponding CRAB project directory; to change this, use the --outputpath option. For more options run crab getlog --help.

crab getoutput

This command retrieves the output files of a number of jobs specified by the -q/--quantity option (to retrieve all job outputs use -q="all"). This option is ignored if --jobids is used, which specifies the job ids (a comma-separated list of integers) of the outputs to retrieve. By default the retrieved files are stored in the results directory under the corresponding CRAB project directory; to change this, use the --outputpath option. For more options run crab getoutput --help.

Note: crab getoutput is designed for inspection/debugging purposes. Users must not use it to download a large fraction of the output dataset. If local access is really needed, the user should ask for a central replica of the output dataset at a site where he/she has an account.

crab kill

This command is used to completely and finally terminate all CRAB server actions on a task. It addresses the use case in which the internal CRAB submission, bookkeeping and retry logic could not bring a task to successful completion, even when helped by crab resubmit commands, and the user wants to make sure that all activity on the task has stopped before creating a recovery task.

This will forcefully terminate all running or pending jobs and transfers, with no resubmission possible.

crab preparelocal

Once a task has been submitted, this command prepares a directory and the corresponding scripts to execute a job locally.
It can also execute a specific job on the fly if the --jobid option is passed. The instructions to submit jobs to an HTCondor pool are available here.
For more options run crab preparelocal --help.

crab proceed

This command is used to resume the submission to the Grid for a task in UPLOADED status after a dry run. For example, if we are satisfied with the splitting result in the example shown in crab submit --dryrun, we can submit these jobs to the Grid with:

crab proceed -d crab_projects/crab_tutorial_May2015_Data_analysis

Sending the request to the server
To check task progress, use 'crab status'
Task continuation request successfully sent to the CRAB3 server
Log file is /afs/cern.ch/user/a/atanasi/CRAB3-tutorial/CMSSW_7_3_5_patch2/src/crab_projects/crab_tutorial_May2015_Data_analysis/crab.log

crab purge

The crab purge command can be used to remove 1) task directories from the CRAB scheduler machines (schedd for short), and/or 2) files from the CRAB cache. The command requires a task to be specified via the --dir/-d option. Only files corresponding to the specified task will be removed. Only tasks in a terminal state (COMPLETED, FINISHED, FAILED, KILLED or KILLFAILED) can be purged. By default, both the schedd and the CRAB cache will be purged. If the user wants to only purge the schedd or only purge the CRAB cache, then he/she has to add the option --schedd or --cache respectively (specifying both options will result in not purging anything). Purging the schedds and the CRAB cache is something a user would do to avoid filling up his/her quota on these resources. However, since files on these resources are automatically deleted after some period, a user should rarely need to use the crab purge command for this. For more information about user quota on the schedds see User quota in the CRAB scheduler machines. For more information about user quota on the CRAB cache see User quota in the CRAB cache.
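
The option logic described above can be summarised in a few lines of Python. This is only an illustrative sketch, not the CRAB client code: the function name purge_targets and the choice to raise an error for a task that is not in a terminal state are assumptions made for the example.

```python
# Terminal task states listed in the text above.
TERMINAL_STATES = {"COMPLETED", "FINISHED", "FAILED", "KILLED", "KILLFAILED"}

def purge_targets(task_status, schedd=False, cache=False):
    """Return the set of resources that would be purged (hypothetical helper)."""
    if task_status not in TERMINAL_STATES:
        # Assumption: a non-terminal task is rejected with an error.
        raise ValueError("only tasks in a terminal state can be purged")
    if schedd and cache:
        return set()             # both options given: nothing is purged
    if schedd:
        return {"schedd"}
    if cache:
        return {"cache"}
    return {"schedd", "cache"}   # default: purge both

print(purge_targets("COMPLETED"))              # both resources
print(purge_targets("KILLED", schedd=True))    # {'schedd'}
```

Note in particular the case with both options, which purges nothing rather than both resources.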

Below is an example of what the crab purge command would print out when executing it for a task that has not been purged before (either automatically or by the user) and assuming the purging is successful.

crab purge --dir=<CRAB-project-directory>

Getting the tarball hash key
Checking task status
Task status: <status-of-this-task>
Tarball hashkey: <hash-key-of-this-tasks-input-sandbox>
Attempting to remove task file from crab server cache
Success: Successfully removed task files from crab server cache
Getting the schedd address
Attempting to remove task from schedd
Success: Successfully removed task from schedd
Log file is <location-of-crab.log-file-for-this-task>

crab remake

The crab remake command creates a CRAB project directory for a given task with empty directories input and results and with a basic .requestcache file. The command needs as input the task name, which the user can obtain for example from the crab tasks command or from the dashboard monitoring web page (see section Task monitoring in the CRAB tutorial (introductory)). Below is an example of what the crab remake command would print out on the screen when executing it for a given task.

crab remake --task=<task-name>

Remaking <crab-project-directory>
Remaking the .requestcache for <task-name>
Success: Finish making <crab-project-directory>/.requestcache 
Log file is <current-working-directory>/crab.log

The crab remake command can be used to recover a CRAB project directory that has been lost. A recovered CRAB project directory is useful, for example, for storing outputs and logs in the results directory and for executing CRAB commands that require the CRAB project directory to be provided in the --dir/-d option (this option may accept a task name as well in the future, but this is not the case so far).

crab report

The main purpose of the crab report command is to provide the user with the following two lists of luminosity sections:

  1. the lumis that have been successfully processed;
  2. the lumis that should be, but were not (yet) processed.

The terms "successfully processed" and "not processed" are ambiguous, because the publication step is not considered part of the condor job, meaning that a job can be in status finished, but its output files not yet published. To avoid confusion, the report command provides for point 1. two lists of lumis:

  • the lumis that have been processed by jobs in status finished (no matter the status of the publication);
  • the lumis in the output dataset(s) in DBS (no matter which task has published them).

On the other hand, for point 2. the report command provides an option for the user to specify if the listed lumis should be the ones in jobs that are not in status finished or the lumis that were not published.

The files containing the processed/published lumis are useful for calculating the luminosity in the analysis, while the files containing the not processed/published lumis are useful for running a recovery task.

Additional lumis summary files are also provided. The following is a list of all the lumis summary files returned by crab report:

  • inputDatasetLumis.json: Contains the lumis in the input dataset. Extracted from DBS at task submission time.
  • inputDatasetDuplicateLumis.json: Contains the lumis in the input dataset that are split across more than one input file. Extracted from DBS at task submission time.
  • lumisToProcess.json: Contains the lumis that the task had to process. This should be equal to the lumis in the input dataset filtered by the input lumi-mask and run-range and after applying the cut on the total number of lumis to process (Data.totalUnits). (Warning: since the cut on Data.totalUnits is applied after job splitting, it is not easy to predict which lumis will survive the cut.)
  • processedLumis.json: Contains the lumis that have been processed by jobs in status finished.
  • One of:
    • notFinishedLumis.json: Contains the lumis that had to be processed by jobs that are not in status finished. This file is provided if the value of the --recovery option is notFinished. This list of lumis is simply the lumis in lumisToProcess.json minus the lumis in processedLumis.json.
    • notPublishedLumis.json: Contains the lumis that the task had to process and are not published (if the task is publishing in more than one output dataset, then "published" here means published in all the output datasets). This file is provided if the value of the --recovery option is notPublished. This list of lumis is calculated as the lumis in lumisToProcess.json minus the lumis that are published (in all the task's output datasets). Cannot be provided if publication was disabled.
    • failedLumis.json: Contains the lumis that had to be processed by jobs that are in status failed. This file is provided if the value of the --recovery option is failed.
  • outputDatasetsLumis.json: Contains the lumis in each of the output datasets. Not provided if publication was disabled.
  • outputFilesDuplicateLumis.json: Contains the lumis in the output datasets that are split across more than one output file.

If a lumi summary file is empty, it is not saved to the output directory. For each saved file a message is given in the command output.
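
To make the "lumis to process minus processed lumis" calculation concrete, here is a minimal Python sketch of such a subtraction. It assumes the lumi files use the layout {"run": [[first_lumi, last_lumi], ...]}, which is not stated on this page; the run numbers and lumi ranges are invented for the example, and the function names are hypothetical.

```python
def expand(mask):
    """Turn {"run": [[first, last], ...]} into {"run": set of lumi numbers}."""
    return {run: {lumi for first, last in ranges
                  for lumi in range(first, last + 1)}
            for run, ranges in mask.items()}

def compact(lumis):
    """Turn a set of lumi numbers into sorted [first, last] ranges."""
    ranges = []
    for lumi in sorted(lumis):
        if ranges and lumi == ranges[-1][1] + 1:
            ranges[-1][1] = lumi          # extend the current range
        else:
            ranges.append([lumi, lumi])   # start a new range
    return ranges

def subtract(to_process, processed):
    """Lumis in to_process that are not in processed, run by run."""
    done = expand(processed)
    result = {}
    for run, lumis in expand(to_process).items():
        remaining = lumis - done.get(run, set())
        if remaining:                     # runs with nothing left are dropped
            result[run] = compact(remaining)
    return result

# Invented example data in the assumed layout.
to_process = {"251244": [[85, 86], [88, 93]], "251251": [[1, 31]]}
processed = {"251244": [[85, 86], [88, 90]], "251251": [[1, 31]]}
print(subtract(to_process, processed))  # {'251244': [[91, 93]]}
```

The result has the same layout as the inputs, so under the same assumption it could serve as the lumi-mask of a recovery task.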

The crab report command also provides a summary of how many files/events have been read and how many events have been written. The number of files/events read considers only primary input files; secondary input files (for example pile-up) are not considered.

The table below shows the options accepted by the crab report command with a short description.

Option       Description
--dir        Path to the CRAB project directory for which the command should be executed.
--outputdir  Directory where to write the lumis summary files. Defaults to the subdirectory results in the corresponding CRAB project directory. If the directory does not exist, it is created.
--recovery   This option is to specify the method to be used to calculate the "not processed" lumis. Accepted values are: notFinished (the default), notPublished, failed. See the explanation about the contents of the lumis summary files above for more details.
--proxy      Use the given proxy. Skip Grid proxy creation and myproxy delegation.
--instance   Use the given instance of CRAB server.

Below is an example of the output of crab report in a task with publication:

Summary from jobs in status 'finished':
  Number of files processed: 
  Number of events read: 
  Number of events written in EDM files: 
  Number of events written in TFileService files: 
  Number of events written in other type of files: 
  Processed lumis written to processedLumis.json
  Warning: 'notFinished' lumis written to notFinishedLumis.json
           The 'notFinished' lumis were calculated as: the lumis to process minus the processed lumis.
Summary from output datasets in DBS:
  Number of events:
    <output-dataset-name-1>: 
    <output-dataset-name-2>: 
    
  Output datasets lumis written to outputDatasetsLumis.json
Additional report lumis files:
  Input dataset lumis (from DBS, at task submission time) written to inputDatasetLumis.json
  Lumis to process written to lumisToProcess.json

If some lumi were split across input files (we usually refer to this as a "duplicate lumi"), an additional message would appear in the "Additional report lumis files" section:

  Input dataset duplicate lumis (from DBS, at task submission time) written to inputDatasetDuplicateLumis.json

Note that CRAB can usually handle a duplicate lumi by making sure all portions of the lumi are analyzed by a single job, so this message is mostly for information. A related warning message should also appear in the output of the crab status command. But if for some reason CRAB was not able to handle the duplicate lumi, the output files would also have the lumi split across files and there would be an additional message in the output of crab report in the "Summary from jobs in status 'finished'" section:

  Warning: Duplicate lumis in output files written to outputFilesDuplicateLumis.json

crab resubmit

This command is meant to complement the internal CRAB automatic resubmission of failed jobs. Therefore it only works on failed jobs. (Note: in practice finished jobs can also be resubmitted, but the usefulness of resubmitting such jobs is questionable and doing so can interfere with ongoing file transfers, so this should be done with extreme care.)

When jobs are still in failed status after CRAB tried its best, it is possible that another submission attempt succeeds in cases like:

  • a failure that CRAB deemed fatal is determined, after detailed examination, to be random and transient
  • all CRAB resubmissions failed due to a problem that is now solved (e.g. a site down for a day or a lost file now recovered)
  • the failure can be prevented by changing some of the job requirements in the resubmission (site white list, memory or time limit etc.)
  • other similar situations

The crab resubmit command resubmits specific jobs from a task and allows some parameters to be overridden in the resubmitted jobs. If no job ids are given, all (and only) jobs in status failed will be resubmitted. Successfully completed jobs (i.e. jobs in status finished) can be resubmitted if the job id is explicitly given AND the --force option is used.
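
The selection rule in the previous paragraph can be sketched in Python as follows. The function name and the status dictionary are hypothetical, and the sketch assumes that explicitly requested jobs which are not eligible are simply skipped; the real client may instead report an error.

```python
def jobs_to_resubmit(statuses, jobids=None, force=False):
    """Pick the job ids to resubmit; statuses maps job id -> status string."""
    if jobids is None:
        # No ids given: all (and only) failed jobs.
        return sorted(j for j, s in statuses.items() if s == "failed")
    selected = []
    for job in jobids:
        status = statuses[job]
        # Finished jobs qualify only when listed explicitly AND force is set.
        if status == "failed" or (status == "finished" and force):
            selected.append(job)
    return selected

statuses = {1: "failed", 2: "finished", 3: "running", 4: "failed"}
print(jobs_to_resubmit(statuses))                      # [1, 4]
print(jobs_to_resubmit(statuses, [2, 4]))              # [4]
print(jobs_to_resubmit(statuses, [2, 4], force=True))  # [2, 4]
```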

The table below shows the options accepted by the crab resubmit command with a short description.

Option           Description
--publication    Resubmit (all) the failed publications. This option cannot be specified together with any of the following options: --jobids, --force, --sitewhitelist, --siteblacklist, --maxjobruntime, --maxmemory, --numcores, --priority.
--jobids         List of job ids to resubmit. Only jobs in status failed or finished can be resubmitted. If not specified, all jobs in status failed will be resubmitted. Should be a comma-separated list of job ids or ranges of job ids.
                 Example: --jobids=1,5-10,15, which will expand to 1,5,6,7,8,9,10,15. To resubmit jobs in finished status, the option --force should also be used.
--force          This option is to force the resubmission of jobs that have finished successfully (i.e. jobs that are in finished status), in which case the job ids have to be also explicitly given in the --jobids option.
--sitewhitelist  List of sites where the resubmitted jobs are allowed to run. Affects only the jobs that will be resubmitted by the current invocation of crab resubmit. If this option is not specified, CRAB will use the original site whitelist defined when the task was first submitted. To specify an empty site whitelist (i.e. no site whitelist) define --sitewhitelist=''. Should be a comma-separated list of CMS site names. The wildcard * is accepted and will expand to all matching CMS site names.
                 Example: --sitewhitelist=T2_US_Florida,T2_US_MIT.
--siteblacklist  List of sites where the resubmitted jobs are not allowed to run. Affects only the jobs that will be resubmitted by the current invocation of crab resubmit. If this option is not specified, CRAB will use the original site blacklist defined when the task was first submitted. To specify an empty site blacklist (i.e. no site blacklist) define --siteblacklist=''. Should be a comma-separated list of CMS site names. The wildcard * is accepted and will expand to all matching CMS site names.
                 Example: --siteblacklist=T1_*, which will expand to all CMS site names starting with T1_.
--maxjobruntime  The requested maximum walltime (in minutes) for the resubmitted jobs. Affects only the jobs that will be resubmitted by the current invocation of crab resubmit.
--maxmemory      The requested maximum memory (in MB) for the resubmitted jobs. Affects only the jobs that will be resubmitted by the current invocation of crab resubmit.
--numcores       The requested number of cores for the resubmitted jobs. Affects only the jobs that will be resubmitted by the current invocation of crab resubmit.
--priority       The priority of the task with respect to the other tasks owned by the user. Affects only the jobs that will be resubmitted by the current invocation of crab resubmit. The default priority of a task is 10. The task priority is a "base" number from where the priority of the jobs in the task (the real meaningful parameter) is calculated: job_priority = task_priority + crab_retry_count(job id) + 10 for the first 5 jobs in a task (job id ≤ 5) and job_priority = task_priority + crab_retry_count(job id) otherwise. The crab_retry_count includes the retries of the whole DAG node and the retries of the post-job itself (if any).
--proxy          Use the given proxy. Skip Grid proxy creation and myproxy delegation.
--instance       Use the given instance of CRAB server.
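
As an illustration of the --jobids syntax, the short Python sketch below expands a list of ids and inclusive ranges into the individual job ids. The function name expand_jobids is hypothetical and the code is not taken from the CRAB client.

```python
def expand_jobids(spec):
    """Expand a string like '1,5-10,15' into a list of job ids."""
    ids = []
    for part in spec.split(","):
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            ids.extend(range(first, last + 1))  # ranges include both ends
        else:
            ids.append(int(part))
    return ids

print(expand_jobids("1,5-10,15"))  # [1, 5, 6, 7, 8, 9, 10, 15]
```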

Note: When resubmitting jobs, the budget of automatic retries is restored in full. In other words, each resubmitted job will be automatically retried (after a recoverable error) as many times as in a first/original submission. The retry count itself, however, is not reset; it keeps increasing by 1 for each retry. The retry count is shown by crab status --long in the "Retries" column.
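
The job priority formula given in the description of the --priority option can be written out as a small worked example. The function below is only a transcription of that formula into Python; its name and argument names are chosen for the example.

```python
def job_priority(task_priority, job_id, retry_count):
    """Job priority as described for --priority: the first 5 jobs get +10."""
    bonus = 10 if job_id <= 5 else 0
    return task_priority + retry_count + bonus

# With the default task priority of 10:
print(job_priority(10, job_id=3, retry_count=0))  # 20
print(job_priority(10, job_id=6, retry_count=2))  # 12
```

Since the retry count is not reset by a resubmission, under this formula the priority of a job keeps growing with every retry.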

crab status

Needs to be documented.

crab submit

Needs to be documented.

Task name
CRAB defines the name of the task at submission time using the following information: submission date and time, the username and the request name specified in the CRAB configuration. The task name has the form <YYMMDD>_<hhmmss>:<username>_crab_<request-name>.
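
For scripts that need to take a task name apart, the format above can be matched with a regular expression. The following Python sketch is an illustration only: the function name is hypothetical, the timestamp in the example is invented, and the pattern assumes that the first occurrence of _crab_ separates the username from the request name.

```python
import re
from datetime import datetime

# <YYMMDD>_<hhmmss>:<username>_crab_<request-name>
TASK_NAME = re.compile(r"^(\d{6}_\d{6}):(.+?)_crab_(.+)$")

def parse_task_name(name):
    """Split a task name into (submission time, username, request name)."""
    match = TASK_NAME.match(name)
    if match is None:
        raise ValueError(f"not a CRAB task name: {name!r}")
    stamp, username, request = match.groups()
    return datetime.strptime(stamp, "%y%m%d_%H%M%S"), username, request

# Invented submission time, for illustration only.
print(parse_task_name("150506_134232:atanasi_crab_tutorial_May2015_Data_analysis"))
```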

Note: The crab submit command will not overwrite a CRAB project directory that already exists; instead it will abort the submission and print a message like this:

Working area '<CRAB-project-directory>' already exists 
Please change the requestName in the config file

Note: The CRAB project directory is created even if the submission fails. Therefore, if something goes wrong with the submission, or if the submission is simply interrupted, and the user wishes to execute the submission command again without changing the request name, he/she should first remove the CRAB project directory created by the failed submission.

crab submit --wait

The crab submit command has an option --wait, which makes CRAB repeatedly check the status of the task after submission to the server until one of the following happens: the task is successfully submitted by the server to the grid submission infrastructure (the task status is SUBMITTED, UPLOADED or UNKNOWN), the submission to the grid submission infrastructure fails (the task status is FAILED), or a maximum of 15 minutes has elapsed. For example:

crab submit -c <CRAB-configuration-file> --wait

would give a screen output that looks something like this for a successful submission:

Will use CRAB configuration file <CRAB-configuration-file>
Sending the request to the server
Success: Your task has been delivered to the CRAB3 server.
Waiting for task to be processed
Checking task status
Task status: NEW
Please wait...
Task status:QUEUED
Please wait...
Task status:UNKNOWN
The CRAB3 server finished processing your task. Use 'crab status' to see if your jobs have been submitted successfully.

Log file is <CRAB-project-directory>/crab.log

or like this for a failed submission:

Will use CRAB configuration file <CRAB-configuration-file>
Sending the request to the server
Success: Your task has been delivered to the CRAB3 server.
Waiting for task to be processed
Checking task status
Task status: NEW
Please wait...
Task status: FAILED
Error: The submission of your task has failed. You can do 'crab status' to get the error message.
The CRAB3 server finished processing your task. Use 'crab status' to see if your jobs have been submitted successfully.

Log file is <CRAB-project-directory>/crab.log

In each Please wait... occurrence, the CRAB client waits 30 seconds before doing another crab status. The --wait option is then useful to free the user from having to keep executing crab status until he/she sees that the task has been either successfully submitted or the submission failed.
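
The waiting behaviour described above amounts to a simple polling loop. The Python sketch below shows the idea under stated assumptions: get_status stands for any hypothetical callable that returns the current task status, and the loop is not the actual CRAB client implementation.

```python
import time

# Statuses at which the --wait loop stops, as listed in the text.
FINAL_STATUSES = {"SUBMITTED", "UPLOADED", "UNKNOWN", "FAILED"}

def wait_for_task(get_status, interval=30, timeout=15 * 60, sleep=time.sleep):
    """Poll get_status() every `interval` seconds, for at most `timeout` seconds."""
    waited = 0
    while True:
        status = get_status()
        if status in FINAL_STATUSES or waited >= timeout:
            return status
        sleep(interval)   # the "Please wait..." step
        waited += interval

# Usage with a fake status source and no real sleeping:
fake = iter(["NEW", "QUEUED", "UNKNOWN"])
print(wait_for_task(lambda: next(fake), sleep=lambda seconds: None))  # UNKNOWN
```

Passing the sleep function as an argument keeps the loop easy to try out without waiting for real.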

crab submit --dryrun

Choosing the splitting parameters in the CRAB configuration so that the task is split efficiently into jobs is in general not trivial. Choosing a small value for Data.unitsPerJob may end up generating too many short-running jobs, while a big value may end up generating few but very long-running jobs. In the second case, jobs may not even finish within the requested (or allowed) runtime limit (~21h:50m by default), in which case they will be killed once the runtime limit is reached, having wasted time and resources. Submitting one test job is probably the best way of estimating how long a job would run. Ideally the user should estimate how long it takes to analyze a basic unit (a file, a luminosity section or a set of events), then extrapolate to estimate the job runtime and define the splitting by targeting 8 hours of job runtime.
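
The extrapolation just described is simple arithmetic, sketched below in Python. The per-unit time (10 minutes per lumi) and the task size (324 lumis) are invented inputs, the function names are hypothetical, and the job count is only a rough estimate since the actual splitting done by CRAB may differ.

```python
import math

def units_per_job(seconds_per_unit, target_hours=8):
    """Number of units that fit in the target runtime (at least 1)."""
    return max(1, int(target_hours * 3600 // seconds_per_unit))

def number_of_jobs(total_units, per_job):
    """Rough job count if every job gets `per_job` units."""
    return math.ceil(total_units / per_job)

per_job = units_per_job(600)                  # 10 minutes per lumi
print(per_job, number_of_jobs(324, per_job))  # 48 7
```

The resulting value would be a candidate for Data.unitsPerJob, to be checked with a test job or a dry run.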

The --dryrun option was introduced in the crab submit command in order to answer the following two questions without actually submitting any job:

  1. Given a choice of splitting parameters, how many jobs will the task have?
  2. How long would it take to run these jobs (and how much memory would they use)?

When submitting a task using the --dryrun option, the CRAB server will do the splitting of the task in jobs as usual, but will not submit the jobs to the Grid and instead it will:

  • write the splitting results into a json file named splitting-summary.json;
  • create an archive file named dry-run-sandbox.tar.gz containing the file splitting-summary.json and all the files necessary to run a job (job wrapper, user sandbox, etc);
  • upload the archive file dry-run-sandbox.tar.gz to the CRAB User File Cache.

The CRAB client will then download the archive file dry-run-sandbox.tar.gz from the CRAB User File Cache, unpack it in a temporary directory and run a mini test job over a few events on the user's local machine. When the test job finishes, the CRAB client will print out a summary with the result of the splitting and the expected job's performance (memory consumption and job runtime).

As an example, let's do a dry run for the example presented in the CRAB tutorial Running CMSSW analysis with CRAB on Data:

crab submit -c crabConfig_tutorial_Data_analysis.py --dryrun

Will use CRAB configuration file crabConfig_tutorial_Data_analysis.py
Sending the request to the server
Success: Your task has been delivered to the CRAB3 server.
Waiting for task to be processed
Checking task status
Task status: NEW
Please wait...
Task status: QUEUED
Please wait...
Task status: UPLOADED

Creating temporary directory for local test run (needed for timing estimates) in /tmp/atanasi/tmp1Jrjkh
Executing test, please wait...
Using LumiBased splitting

Task consists of 17 jobs to process 324 lumis
The estimated memory requirement is 681 MB
The longest job will process 20 lumis, with an estimated processing time of 218 minutes
The average job will process 19 lumis, with an estimated processing time of 177 minutes
The shortest job will process 4 lumis, with an estimated processing time of 24 minutes

Dry run requested: task paused
To continue processing, use 'crab proceed'

Log file is /afs/cern.ch/user/a/atanasi/CRAB3-tutorial/CMSSW_7_3_5_patch2/src/crab_projects/crab_tutorial_May2015_Data_analysis/crab.log

The important information to extract from the dry run is that the task will process 324 lumis in 17 jobs, that the average job will analyze about 19 lumis, and that its runtime is estimated at ~3 hours. It is worth emphasizing that the estimated job runtime may differ significantly from the actual job runtime (it may be overestimated or underestimated).
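The job counts in the summary above can be cross-checked with simple arithmetic. The sketch below assumes an even split with Data.unitsPerJob = 20; that value is not stated in the output, it is only inferred from the longest job processing 20 lumis. The actual splitting is done by the CRAB server and need not be this regular.

```python
import math


def split_summary(total_units, units_per_job):
    """Summarize an even split of total_units into jobs of units_per_job units,
    with the remainder going to the last job."""
    n_jobs = math.ceil(total_units / units_per_job)
    last_job = total_units - (n_jobs - 1) * units_per_job
    return {
        "jobs": n_jobs,
        "longest": min(units_per_job, total_units),
        "shortest": last_job,
        "average": total_units / n_jobs,
    }


# 324 lumis, assumed 20 lumis per job: 16 full jobs plus one job with 4 lumis.
print(split_summary(324, 20))
```

With these inputs the sketch gives 17 jobs, a longest job of 20 lumis, a shortest job of 4 lumis and an average of about 19 lumis, in agreement with the dry run output.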

To skip the estimates of runtime and memory consumption, use the --skip-estimates option. In this case the CRAB client will not run the local test job, so splitting results will be shown faster.

As printed at the end of the crab submit --dryrun output, the user can submit the current task to the Grid using the crab proceed command.

crab tasks

Lists all tasks in the CRAB server database owned by the user. It also shows the task status as written in the database, which is of limited use since the task status in the database is no longer updated once the task is SUBMITTED.

By default, tasks submitted in the last 30 days are shown. This is to align with the fact that tasks that were not updated in the last 30 days are removed from the Grid schedulers. A task that was submitted more than 30 days ago may still be available in the Grid schedulers if an update to the task (a resubmission for example) was made within the last 30 days. So in general the tasks shown by default are only a subset of the user's tasks still available in the Grid schedulers.

The table below shows the options accepted by the crab tasks command with a short description.

Option Description
--fromdate Show tasks submitted since the given date. Format of date must be YYYY-MM-DD. This option cannot be used together with --days.
--days Show tasks submitted in the last N days. This option cannot be used together with --fromdate.
--status Filter the output to keep only the tasks that are in the given status.
--proxy Use the given proxy. Skip Grid proxy creation and myproxy delegation.
--instance Use the given instance of CRAB server.
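The constraints in the table (the YYYY-MM-DD date format, and --days and --fromdate being mutually exclusive) can be sketched as a small helper that assembles the command line. The helper name is hypothetical, and the --option=value spelling is an assumption modelled on the --dir=<CRAB-project-directory> usage shown elsewhere on this page.

```python
import datetime


def crab_tasks_command(days=None, fromdate=None, status=None):
    """Build the argument list for 'crab tasks' from the options in the table.

    fromdate is a datetime.date; it is formatted as YYYY-MM-DD.
    """
    if days is not None and fromdate is not None:
        raise ValueError("--days and --fromdate cannot be used together")
    cmd = ["crab", "tasks"]
    if days is not None:
        cmd.append("--days=%d" % days)
    if fromdate is not None:
        cmd.append("--fromdate=%s" % fromdate.strftime("%Y-%m-%d"))
    if status is not None:
        cmd.append("--status=%s" % status)
    return cmd


# Tasks submitted since 1 May 2015 that are in status SUBMITTED.
print(" ".join(crab_tasks_command(fromdate=datetime.date(2015, 5, 1),
                                  status="SUBMITTED")))
```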

crab uploadlog

Needs to be documented.

CRAB project directory

The submission command creates a CRAB project directory for the corresponding task, where the CRAB and CMSSW configurations are cached for later usage, avoiding interference with other projects. The CRAB project directory is named crab_<request-name>, where request-name is as specified by the parameter General.requestName in the CRAB configuration. The CRAB project directory is created inside the directory specified by the parameter General.workArea (which defaults to the current working directory). Thus, using the parameter General.requestName, the user can choose the project name, so that it can later be distinguished from other CRAB projects in the same working area.
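The naming rule above amounts to the following sketch. The function is hypothetical and only mirrors the rule as stated; the example values are inferred from the log file path in the dry run output shown earlier, not taken from the tutorial configuration itself.

```python
import os


def crab_project_dir(request_name, work_area=None):
    """Path of the CRAB project directory: <workArea>/crab_<requestName>.

    If no work area is given, the current working directory is used,
    as described for the General.workArea default.
    """
    base = work_area if work_area else os.getcwd()
    return os.path.join(base, "crab_" + request_name)


# Assumed values: General.workArea = 'crab_projects',
# General.requestName = 'tutorial_May2015_Data_analysis'.
print(crab_project_dir("tutorial_May2015_Data_analysis", "crab_projects"))
```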

The CRAB project directory contains:

  • A crab.log file, containing log information from the CRAB commands that were executed using this project directory.
  • A .requestcache file with cached information of the task request and CRAB configuration.
  • A directory called inputs, containing:
    • A file named PSet.pkl containing a pickled version of the process object from the user's CMSSW parameter-set configuration.
    • A python file named PSet.py which basically loads the pickled process object saved in PSet.pkl.
    • A zipped tarball of the input sandbox with the user's code.
  • A directory called results, where the files retrieved via crab getlog, crab getoutput and crab report are put.
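A quick way to compare a directory against the layout listed above is sketched below. The helper is hypothetical and checks only the top-level entries named in the list; it says nothing about their contents, and some entries (results, for instance) may legitimately appear only after the corresponding commands have been run.

```python
from pathlib import Path

# Top-level entries of a CRAB project directory, as listed above.
EXPECTED_ENTRIES = ["crab.log", ".requestcache", "inputs", "results"]


def missing_entries(project_dir):
    """Return the expected top-level entries not found in project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list means that all four entries are present.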

The .requestcache file

Each CRAB project directory contains a file named .requestcache with cached information of the task request and CRAB configuration. This information includes, among other things, the task name. When a user runs a command crab <command> --dir=<CRAB-project-directory>, CRAB will look for that file inside the specified directory, extract the task name from the file and use it as input for the command. If CRAB does not find the .requestcache file in the specified directory, or the directory itself doesn't exist, it will print a message saying so:

Cannot find .requestcache file in CRAB project directory <CRAB-project-directory>

or

<CRAB-project-directory> is not a valid CRAB project directory.

Or if the task name is not known by the server:

Error contacting the server.
Server answered with: Execution error
Log file is <CRAB-project-directory>/crab.log

This evidently means that, if the .requestcache file is lost or corrupted, the user will not be able to execute commands for the corresponding task. The crab remake command can be used in this case to re-create the CRAB project directory, including the (relevant piece of the) .requestcache file, so that other CRAB commands can be executed again for that task.
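The local checks described in this section can be summarized in a short sketch. It is not the CRAB client implementation: the function name is hypothetical, and which of the two quoted messages belongs to which condition is an assumption. It can be used to verify a project directory before deciding whether crab remake is needed.

```python
import os


def check_project_dir(project_dir):
    """Return None if project_dir looks usable for 'crab <command> --dir=...',
    otherwise the corresponding message quoted in this section."""
    if not os.path.isdir(project_dir):
        # Assumption: this message is the one used for a missing directory.
        return "%s is not a valid CRAB project directory." % project_dir
    if not os.path.isfile(os.path.join(project_dir, ".requestcache")):
        return ("Cannot find .requestcache file in CRAB project directory %s"
                % project_dir)
    return None
```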

-- AndresTanasijczuk - 2015-01-08

Topic revision: r37 - 2019-01-23 - SergeyPolikarpov
 