
Software Guide on CRAB

This page contains documentation about CRAB3. The old CRAB2 page has been moved to SWGuideCrab2.

CRAB is a utility to submit CMSSW jobs to distributed computing resources.

Introduction

By using CRAB you will be able to:

  • Access CMS data and Monte Carlo samples, which are distributed to CMS-aligned centres worldwide.
  • Exploit the CPU and storage resources at CMS-aligned centres.

Before starting to learn about CRAB, you may want to get an overview of the Grid model and of a typical analysis workflow. For that purpose, read the relevant chapters of the CMS Offline WorkBook. For detailed documentation about CRAB, see the CRAB3 documentation section further down on this page.

Differences between CRAB2 and CRAB3

For those who already know about CRAB2 and have to face the transition to CRAB3, here is a list of architecture improvements (or just differences) in the new version.

  • Asynchronous copy of the output files to the destination storage element by an external service called Asynchronous Stage Out (ASO). This provides managed file transfers and releases the worker node as soon as the processing job has finished. Impact: the user will no longer have jobs that run and then fail at the end because they cannot move their data.
  • Automatic publication of the output dataset in DBS. No need to run the crab -publish command anymore.
  • Automatic job resubmission for certain failures.
  • Job splitting takes place in the server:
    • The crab -create command is therefore not available anymore and the workflow starts directly from the submission command.
    • The submission command no longer accepts the range-of-jobs option (all jobs that are created by the server are also submitted to run).
  • The CRAB3 configuration file is written in Python.

There are also features that are not yet implemented in CRAB3:

  • Storage in an arbitrary location is not supported. (CRAB3 only supports storage in /store/user/ and /store/group/.)
  • Local submission is not implemented.
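As a quick illustration of the storage restriction above, the following is a minimal sketch of a pre-submission check. The helper name `is_supported_output_lfn` and the example paths are hypothetical and are not part of CRAB; only the two prefixes come from the list above.

```python
# Hypothetical helper, not part of the CRAB client.
# The two prefixes are the only storage locations listed above as supported.
SUPPORTED_LFN_PREFIXES = ("/store/user/", "/store/group/")


def is_supported_output_lfn(lfn_dir_base: str) -> bool:
    """Return True if the path lies under /store/user/ or /store/group/."""
    return lfn_dir_base.startswith(SUPPORTED_LFN_PREFIXES)


if __name__ == "__main__":
    # Example paths are made up for illustration.
    for path in (
        "/store/user/jdoe/analysis",
        "/store/group/phys_example/ntuples",
        "/eos/cms/tmp/jdoe",
    ):
        verdict = "supported" if is_supported_output_lfn(path) else "not supported"
        print(f"{path}: {verdict}")
```

A check like this only catches the wrong prefix; it says nothing about whether you have write permission at the destination site.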

An up-to-date roadmap of what works and what does not in CRAB3 is in CRAB3Functionalities.

The plan for migrating the user community from CRAB2 to CRAB3 is in CRAB3Migration.

How running CRAB3 differs from running CRAB2

  • Commands do not require the - in front of them. That is, a CRAB2 command crab -<command> became crab <command> in CRAB3.
  • The --continue/-c option was replaced by --dir/-d.
  • The option used to specify the configuration file in the submission command is --config/-c.
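The renaming rules above can be sketched as a small translator for simple CRAB2 command lines. This is an illustration only: the function `crab2_to_crab3` is hypothetical, it handles just the leading dash and the --continue/-c option, and it rejects -create and -publish because this page says they are no longer available or needed. Any other CRAB2 option is passed through unchanged and may not exist in CRAB3.

```python
from typing import List

# Commands that this page describes as unavailable or unnecessary in CRAB3.
NOT_IN_CRAB3 = {"create", "publish"}


def crab2_to_crab3(argv: List[str]) -> List[str]:
    """Rewrite a simple CRAB2 command line following the rules listed above."""
    out: List[str] = []
    command_seen = False
    for arg in argv:
        if not command_seen and arg.startswith("-"):
            # Rule 1: 'crab -<command>' becomes 'crab <command>'.
            command = arg.lstrip("-")
            if command in NOT_IN_CRAB3:
                raise ValueError(f"'crab -{command}' is not used in CRAB3")
            out.append(command)
            command_seen = True
        elif command_seen and arg in ("-c", "--continue"):
            # Rule 2: --continue/-c was replaced by --dir/-d.
            out.append("--dir")
        else:
            out.append(arg)
    return out


if __name__ == "__main__":
    old = ["crab", "-status", "-c", "crab_20170616_143152"]
    print(" ".join(crab2_to_crab3(old)))
```

Note that -c keeps a meaning in CRAB3, but a different one: in the submission command it selects the configuration file, which is why old scripts that pass -c need a careful review rather than a blind search and replace.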

CRAB3 documentation

CRAB3 releases
Summary of CRAB3 releases since October 2014 with list of new features, improvements and bug fixes.

Documentation for beginner users

Quickstart guide for CRAB3
A "cheat sheet" for getting started with CRAB3.
CRAB tutorial (introductory)
This is the most complete and recommended documentation about CRAB3 for beginners.
CRAB Troubleshooting
What to do if something does not work. A must-read.
CRAB tutorial (advanced)
Exercises for CRAB3 Advanced Tutorial 22 May 2015. Covers: dryrun, recovery task, scriptExe, LHE, CRABAPI library
CRAB configuration file
Documentation of the CRAB3 configuration file parameters.
CRAB commands
Documentation of the CRAB3 commands.
CRAB user functions
Documentation of the CRAB3 user functions.
CRAB advanced topics
Provisional page with documentation of different topics about CRAB3 (e.g. generating MC with LHE files, using scriptExe).
Tips and tricks about CRAB
Tips and tricks about CRAB3.
CRAB3 frequently asked questions
You may also want to consult the CRAB2 frequently asked questions page as some questions are common for CRAB2 and CRAB3.

Documentation for more expert users

Data handling in CRAB
An overview of how CRAB3 handles input data, output data and data publication.
CRAB3 task flow
A description of the steps involved in a typical CRAB3 task with description of CRAB3 architecture.
CRAB3 job states
In-depth information about job states in CRAB3.
CRAB3 client API
Details on how to use the CRAB3 Client library API.

Documentation for CRAB3 operators, developers, admins, experts

DEPRECATION NOTE:

TECHNICAL DOCUMENTATION IS BEING MOVED TO https://cmscrab.docs.cern.ch via https://gitlab.cern.ch/crab3/cms-crab-docs, check status of migrations of each topic below

New operator TODO list
TODO list new operators should go through

Installation

CRAB3 frontend installation
DEPRECATED documentation about how to deploy the CRAB server frontend (i.e. CRAB REST Interface and CRAB Cache) on a private VM for development. Production, pre-production and test instances are deployed via K8s.
CRAB Docker and Kubernetes
Documentation on how to build/deploy CRAB Server on Docker + Kubernetes
CRAB TaskWorker Install CRAB3 Task Worker installation
Documentation of the current Puppet profile of the project
CRAB Publisher installation
Documentation about how to deploy the CRAB Standalone Publisher (i.e. not part of ASO server) WORK IN PROGRESS
TW and Publisher deployment on Docker
Documentation about how to deploy CRAB TaskWorker and Publisher as Docker containers.
CRAB3 Schedd (Deprecated) CRAB3 Schedd installation and deployment
Documentation for CRAB3 Schedd deployment.
CRAB3 client (Deprecated) CRAB3 client installation
Documentation for users and operators of the CRAB3 client.
ASO installation and deployment
DEPRECATED/OBSOLETE Documentation for ASO deployment and operations.
Notes about server deployment
Notes about the current deployment of CRAB3 servers.
CRAB logstash deployment on K8s
Documentation on how to deploy/operate logstash on CRAB k8s cluster

Operators duties

  • Constant:
    • act on current plan agreed in weekly CRAB DevOps meeting, log relevant operational changes in e-log
    • check Slack/Mattermost/mail routinely for important communications or sudden problem reports
  • Every day:
    • check the dashboards at least twice a day, over the last 2 and 7 days, and look for unusual patterns
  • Every week:
  • Every month:
    • pay attention to the monthly mails from the automatic Oracle procedure which drops old partitions, and ensure that old partitions are properly removed
    • make sure to keep the documentation up to date with whatever changes have been made recently
  • Every year:
    • make sure that the yearly certificate renewal for all services and hosts which we operate is done automatically and properly, well before the current certificates expire, to guarantee smooth operations

Technical documents

Twiki pages:

CRAB Overview CrabOverview(Deprecated)
A high level overview of CRAB components and functions
Operator debugging tips
Handy commands for operators.
CRAB3 State Machine
Documentation about the Task status tracking inside the CRAB Server database
CMS job exit codes
List of known exit codes for CMSSW and CRAB3.
Notes about CRAB3 service Certificates management
Documentation about the management of all the service certificates in the CRAB3 machines.
Notes about CRAB3 Oracle Database management
Documentation about the management of the Oracle database in the CRAB3 project.
Inside Documentation
Miscellaneous collection of useful insights into how CRAB pieces work internally
CRAB3 Puppet (deprecated) CRAB3 Puppet profile
Documentation of the current Puppet profile of the project
Rucio Cheat Sheet
quick ref. to what's useful/used from Rucio in CRAB
CRAB Web Site
A place to put files for access via HTTP

Other useful documents, which are not in this twiki:

getting started on using S3 for CRAB
quick ref. and history of how we set things up
CRAB K8s clusters
Guide to CRAB's own K8s clusters in CMS CRAB OpenStack project
CRAB Production update
How to plan and execute a major CRAB update involving REST/TW/PUB/Client, from May 2021
S3 for CRAB
High level design of how to use S3 in CRAB
S3 CRAB Cache
Design of S3-based CRABCache service
CRAB python3 memory studies
A study of why the Python 3 version of CRAB3 uses much more memory (about 5x) than the Python 2 version.

Code repositories

CrabCodeDevelopment
Short guide to using GitHub for CRAB
CRAB Client
CRAB Server
REST, CrabCache, TW, new Publisher
ASO (old)
old ASO + Publisher in separate server
Submission Infrastructure Scripts
includes e.g. all scripts that we run in various ways on the CRAB schedds
Oracle Database management scripts
Puppet

Build and Release management

Notes about CRAB3 release management
Documentation about releasing CRAB3 patches
Notes about CRAB3 validation
Some notes about the deployment procedure and validation process of new CRAB3 versions
Notes about CRAB3 CI/CD
CRAB CI/CD documentation
Notes about CRAB3 testing
CRAB testing documentation

Monitoring

CRAB Monitoring links for users (and ops):
CRAB Monitoring links for operators
Old Monitoring:

Links to relevant projects

FNAL LPC Submission
submission via CRAB3 to FNAL LPC
glideInWMS for CMS documentation
main entry to Submission Infrastructure

Local Submission

CRAB vs CAF in Run 2
Notes about CAF use cases after discussion between Stefano and several DPG people.
Local Resource Provisioning
Notes on the different possibilities to submit to resources dedicated to local users.
Submitting jobs to the CERN HTCondor pool using 'crab preparelocal'
A different approach on how to independently submit CRAB jobs to a local batch system.

Getting started

This section describes how to begin using CRAB to perform your analysis on the Grid.

Prerequisites

To use CRAB to submit CMSSW jobs to the Grid, you must meet some prerequisites, which you can find in the CRAB Prerequisites page.

Basic workflows

Some basic workflows are documented in the CRAB3 Tutorial page. If you have never used CRAB before, it is a good place to learn how to use the tool.

Getting support

BEFORE CONTACTING Computing Tools, PLEASE CONSULT THE CRAB TROUBLESHOOTING GUIDE AT THIS LINK:
CRAB3Troubleshoot

and the CRAB3 Frequently Asked Questions page (and maybe even the CRAB2 Frequently Asked Questions page).

A common mistake is to test the code locally only on a small number of events which is not representative of the whole dataset. If the log of a failed job contains a CMSSW error (e.g. exit code 134, 139), please run your code interactively on the very same event on which the CRAB job failed.
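When reading such exit codes, it can help to know that values above 128 often follow the common Unix shell convention of 128 plus the number of the signal that killed the process. The sketch below decodes a code under that assumption; the function name `describe_exit_code` is hypothetical, the signal numbering is that of the machine running the script, and the CMS job exit codes page remains the authoritative list.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit code using the 128 + signal-number shell convention."""
    if 128 < code < 128 + signal.NSIG:
        try:
            name = signal.Signals(code - 128).name
            return f"{code}: process terminated by {name}"
        except ValueError:
            # No signal with that number is defined on this platform.
            pass
    return f"{code}: not a signal-style exit code"


if __name__ == "__main__":
    for code in (134, 139, 1):
        print(describe_exit_code(code))
```

On Linux, 134 decodes to SIGABRT and 139 to SIGSEGV, i.e. an abort and a segmentation fault in the user code, which is why rerunning interactively on the failing event is the recommended next step.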

All CRAB users must subscribe to the CERN Computing Announcements HyperNews Forum. It is very low traffic, and if there is a global problem you will learn about it faster that way. Note: this HN forum will be deprecated soon as part of the CMS migration from HyperNews to CMS-Talk, but the replacement had not been defined yet as of February 23, 2022.

To send feedback and/or get support about CRAB3, please make an entry in the Computing Tools CMS-Talk forum. You can do it either via e-mail to cmstalk+comptools@dovecotmta.cern.ch or using the web page above. All CRAB users may find it useful to subscribe to this forum.

When contacting Computing Tools, you must:

  • write a brief description of the problem and copy/paste text of relevant error messages
  • try to be specific ("I have errors, please help me" will not get you a useful reply)
  • give pointers to relevant task information, e.g. by copy/pasting the first few lines from the crab status output. In general make sure that you report at least:
    • task name (and a job number, if applicable)
    • the URL printed in green by the crab status command at the line: Task URL to use for HELP
  • if a problem on the client side is suspected, upload the crab.log file (use the crab uploadlog command) corresponding to the task for which you are reporting the problem
  • For example, this command:
    crab status -d crab_20170616_143152
    produces:
    CRAB project directory:       /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab_20170616_143152
    Task name:          170616_123158:belforte_crab_20170616_143152
    Grid scheduler:          crab3@vocms0197.cern.ch
    Status on the CRAB server: SUBMITTED
    Task URL to use for HELP:    https://cmsweb.cern.ch/crabserver/ui/task/170616_123158%3Abelforte_crab_20170616_143152
    Dashboard monitoring URL:    http://dashb-cms-job.cern.ch/dashboard/templates/task-analysis/#user=belforte&refresh=0&table=Jobs&p=1&records=25&activemenu=2&status=&site=&tid=170616_123158%3Abelforte_crab_20170616_143152
    Status on the scheduler:    COMPLETED
    
    Jobs status:                    finished      100.0% (2/2)
    So you would have to paste in your mail at least this URL:
    https://cmsweb.cern.ch/crabserver/ui/task/170616_123158%3Abelforte_crab_20170616_143152
    although copying/pasting the full text above would be even better.
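If you want to pull the two key fields out of saved crab status output before writing your message, something like the following sketch may help. It assumes the "label: value" layout shown in the example above; the function name `extract_support_fields` is hypothetical and this is not a CRAB feature.

```python
# Excerpt of the crab status output shown above.
SAMPLE_STATUS = """\
CRAB project directory:       /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab_20170616_143152
Task name:          170616_123158:belforte_crab_20170616_143152
Grid scheduler:          crab3@vocms0197.cern.ch
Status on the CRAB server: SUBMITTED
Task URL to use for HELP:    https://cmsweb.cern.ch/crabserver/ui/task/170616_123158%3Abelforte_crab_20170616_143152
Status on the scheduler:    COMPLETED
"""

# Labels of the lines that a support request should always include.
WANTED_LABELS = ("Task name", "Task URL to use for HELP")


def extract_support_fields(status_text: str) -> dict:
    """Return the wanted 'label: value' pairs found in crab status output."""
    found = {}
    for line in status_text.splitlines():
        # Split at the first colon only: values may contain colons themselves.
        label, sep, value = line.partition(":")
        if sep and label.strip() in WANTED_LABELS:
            found[label.strip()] = value.strip()
    return found


if __name__ == "__main__":
    for label, value in extract_support_fields(SAMPLE_STATUS).items():
        print(f"{label}: {value}")
```

Splitting at the first colon matters here because both the task name and the URL contain colons of their own.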

Other relevant CMS HyperNews forums

Useful links

-- MarcoCalloni - 27-Jan-2010

Topic revision: r194 - 2023-03-21 - StefanoBelforte



 