Configuration: Basics

Introduction

CRAB is configured by a configuration file called crab.cfg. The configuration file should be located within the CMSSW user project directory, at the same location as the CMSSW parameter-set to be used by CRAB. Its basic content is described in the following.

Basic crab.cfg

The minimal CRAB configuration file has the following content:

[CRAB]
jobtype                = cmssw
scheduler              = edg 

[CMSSW]

datasetpath            = /RelVal120Higgs-ZZ-4Mu/FEVT/CMSSW_1_2_0-FEVT-1166242770
pset                   = io.cfg

total_number_of_events = 100
events_per_job         = 10

output_file            = output.root

[USER]
return_data            = 1

use_central_bossDB     = 0

use_boss_rt            = 0

[EDG]
lcg_version            = 2
rb                     = CERN
proxy_server           = myproxy.cern.ch 
virtual_organization   = cms
retry_count            = 2
lcg_catalog_type       = lfc
lfc_host               = lfc-cms-test.cern.ch
lfc_home               = /grid/cms

The CRAB configuration file is structured into sections, and it matters in which section a specific configuration item is listed. The sections in the configuration file given above are:

[CRAB]
[CMSSW]
[USER]
[EDG]
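
Because each item is looked up in its own section, it can help to read the file back and check where every item ended up. The following is a minimal sketch and not part of CRAB itself: it assumes that crab.cfg follows ordinary INI syntax, so that Python's standard configparser module can read it, and it uses a shortened inline copy of the file above instead of a file on disk. The helper name load_crab_cfg is hypothetical.

```python
import configparser

# Shortened inline copy of the crab.cfg shown above (assumption: plain INI syntax).
CRAB_CFG = """
[CRAB]
jobtype                = cmssw
scheduler              = edg

[CMSSW]
datasetpath            = /RelVal120Higgs-ZZ-4Mu/FEVT/CMSSW_1_2_0-FEVT-1166242770
pset                   = io.cfg
total_number_of_events = 100
events_per_job         = 10
output_file            = output.root

[USER]
return_data            = 1
"""


def load_crab_cfg(text):
    """Parse crab.cfg text and return the parser (hypothetical helper)."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


cfg = load_crab_cfg(CRAB_CFG)

# Section names are case-sensitive and reported in file order.
print(cfg.sections())

# An item is only found in the section where it was written.
print(cfg.get("CRAB", "scheduler"))
print(cfg.has_option("USER", "scheduler"))
print(cfg.getint("CMSSW", "events_per_job"))
```

To check a real file, parser.read("crab.cfg") can be used in place of read_string.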

Basic parameters

[CRAB] section

| Parameter | Description |
| jobtype | The jobtype defines the kind of job CRAB should run. As CMSSW only knows one jobtype, this is always cmssw. |
| scheduler | The scheduler defines which GRID middleware is to be used by CRAB. There are three different schedulers for EGEE and one special scheduler only for OSG, listed below. |

| Scheduler | Description |
| edg | Default access mode to all EGEE and OSG resources using the resource broker. |
| glite | New access mode to all EGEE and OSG resources using the new gLite resource broker. |
| glitecoll | New access mode to all EGEE and OSG resources using the new gLite resource broker in high performance bulk mode. |
| condor_g | Direct access mode to OSG sites only. Requires a local Condor scheduler (see Local user interface for sh family or Local user interface for csh family). |
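
A mistyped scheduler name is easy to overlook, so a small check before submission can be useful. The sketch below is illustrative only and not part of CRAB: the table of names is copied from the scheduler list above, and the function name check_scheduler is hypothetical.

```python
# Scheduler names and summaries as listed in this section.
SCHEDULERS = {
    "edg": "Default access to EGEE and OSG resources via the resource broker",
    "glite": "Access to EGEE and OSG resources via the gLite resource broker",
    "glitecoll": "gLite resource broker in high performance bulk mode",
    "condor_g": "Direct access to OSG sites only (needs a local Condor scheduler)",
}


def check_scheduler(name):
    """Return the summary for a scheduler name, or raise ValueError (hypothetical helper)."""
    key = name.strip()  # tolerate trailing blanks such as "edg "
    if key not in SCHEDULERS:
        known = ", ".join(sorted(SCHEDULERS))
        raise ValueError(f"unknown scheduler {key!r}; expected one of: {known}")
    return SCHEDULERS[key]


print(check_scheduler("edg "))
```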

[CMSSW] section

| Parameter | Description |
| datasetpath | The datasetpath identifies the dataset you want to access. It can be queried by using the CMS data discovery page: http://cmsdbs.cern.ch/discovery/. More information is given at Dataset discovery and job configuration. |
| pset | The name of the CMSSW parameter-set of your CMSSW job. The parameter-set has to be in the same directory as the CRAB configuration file. |
| total_number_of_events | Total number of events to be processed by CRAB. If set to -1, all events of the selected dataset are processed. More information is given at Dataset discovery and job configuration. |
| events_per_job | Number of events per job. CRAB creates as many jobs as needed to process total_number_of_events. Because of constraints on the job splitting, the number of jobs may be larger than the arithmetic value total_number_of_events/events_per_job. More information is given at Dataset discovery and job configuration. |
| output_file | Comma-separated list of output filenames. This is usually the filename set in the PoolOutputModule of the CMSSW parameter-set, but the list can also contain user-specific output files such as histogram files. CRAB uses these names when generating the output filenames of the individual jobs and automatically adds a job identifier to each, so that the user can distinguish them. For example, if the output filename is output.root and the selected CRAB configuration results in 10 jobs, the output files of the individual jobs are named output_00001.root, output_00002.root, ... |
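
The relation between total_number_of_events, events_per_job and the per-job output names can be sketched as follows. This is an illustration under stated assumptions, not CRAB's actual splitting code: the job count computed here is only the arithmetic lower bound (CRAB may create more jobs), the five-digit suffix is inferred from the output_00001.root example above, and all function names are hypothetical.

```python
import os


def minimum_job_count(total_events, events_per_job):
    """Arithmetic lower bound on the number of jobs (rounded up)."""
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    if total_events < 0:
        # -1 means "all events"; the dataset size is then needed first.
        raise ValueError("total_events must be known to estimate the job count")
    return (total_events + events_per_job - 1) // events_per_job


def split_output_files(value):
    """Split the comma-separated output file list into clean names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def job_output_names(output_file, n_jobs):
    """Per-job names, assuming a zero-padded five-digit job suffix."""
    stem, ext = os.path.splitext(output_file)
    return [f"{stem}_{job:05d}{ext}" for job in range(1, n_jobs + 1)]


n_jobs = minimum_job_count(100, 10)
print(n_jobs)  # 10
for name in split_output_files("output.root, histo.root"):
    print(job_output_names(name, n_jobs)[:2])
```

With the values from the basic crab.cfg (100 events, 10 per job), the lower bound is 10 jobs and the first two names for output.root are output_00001.root and output_00002.root.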

[USER] section

| Parameter | Description |
| return_data | Defines the way CRAB handles user output. The default is 1, which uses the GRID middleware sandbox. Attention: the sandbox is limited to 100 MB. More information is given at Output handling. |
| use_central_bossDB | BOSS specific parameter. |
| use_boss_rt | BOSS specific parameter. |

[EDG] section

| Parameter | Description |
| lcg_version | EGEE resource broker specific information. |
| rb | Defines which resource broker configuration should be used. If set to CERN, the official CERN configuration is downloaded from cmsdoc.cern.ch. If set to CNAF, the configuration for the CNAF resource broker is downloaded. If this parameter is commented out, the default of the user interface in use applies. |
| proxy_server | Defines the grid proxy server name. |
| virtual_organization | Has to be: cms |
| retry_count | Resource broker parameter. Defines how often the resource broker should try to resubmit a job before giving up. |
| lcg_catalog_type | LFC catalog specific parameter. |
| lfc_host | LFC catalog specific parameter. |
| lfc_home | LFC catalog specific parameter. |

Topic revision: r1 - 2007-03-19 - OliverGutsche
 