Configuration: Input and Output handling

Introduction

CRAB transfers code and files to and from the worker node on which the job actually runs in several ways. While there is only one standard way of handling input, output has to be handled differently depending on the output data volume.

By default, CRAB uses the built-in input and output capabilities of the GRID middleware.

  • Using the EGEE resource broker with the schedulers edg, glite, or glitecoll, you will use the sandbox functionality of the resource broker. CRAB sends the user's input, consisting of the packed user code and possible additional input files, to the resource broker during submission. After job completion, CRAB retrieves the output from the resource broker. There is no direct communication between the user's machine and the target GRID farm; all transfers are handled by the resource broker.
  • Using the OSG direct submission, the local Condor scheduler communicates with the target GRID farm and takes care of transporting input and output. There is direct communication between the user's machine and the target GRID farm. For this reason, the local Condor scheduler has to be running and connected to the network at all times during job execution.

Please note that input and output sandboxes are limited to 100 MB each. If the input sandbox limit is exceeded, the job will fail to submit. If the output sandbox limit is exceeded, the output files will be truncated, which can leave them corrupt. Standard output, standard error and logs are handled by CRAB automatically and count towards the 100 MB limit.
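As a rough pre-submission check against the sandbox limit described above, the total size of the files you intend to ship can be summed locally. The following is only a sketch: the function names are hypothetical, it is not part of CRAB, and it assumes that 100 MB means 100 * 1024 * 1024 bytes, which the text above does not specify.

```python
import os

# Assumption: "100 MB" is taken as 100 * 1024 * 1024 bytes.
SANDBOX_LIMIT_BYTES = 100 * 1024 * 1024


def sandbox_size(paths):
    """Return the total size in bytes of the regular files listed in `paths`.

    Paths that do not exist (or are not regular files) are skipped.
    """
    return sum(os.path.getsize(p) for p in paths if os.path.isfile(p))


def fits_in_sandbox(paths, limit=SANDBOX_LIMIT_BYTES):
    """Return True if the listed files together stay within `limit` bytes."""
    return sandbox_size(paths) <= limit
```

Keep in mind that standard output, standard error and logs also count towards the limit, so a result close to the limit should be treated as too large.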

For larger output, CRAB can stage out the output directly to a storage element. As a rule of thumb, if the output contains a CMSSW root file with more than 10 events, please use the storage element output handling.

All output modes are described below.

Default mode using the GRID middleware sandbox

Output files other than the standard output, standard error and logs must be declared in the [CMSSW] section of the CRAB configuration file as a comma-separated list:

output_file            = output.root

To use the GRID middleware sandbox, set return_data in the [USER] section of the config file to true:

return_data = 1
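To illustrate how these two settings fit together, the sketch below reads them with Python's standard configparser module. It assumes that the CRAB configuration file is close enough to INI syntax for configparser to parse it; the helper names and the second file name histos.root are hypothetical and used only for illustration.

```python
import configparser

# Hypothetical example configuration; histos.root is an invented file name.
EXAMPLE_CFG = """
[CMSSW]
output_file = output.root, histos.root

[USER]
return_data = 1
"""


def _parse(cfg_text):
    """Parse configuration text without value interpolation."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser


def read_output_files(cfg_text):
    """Return the output_file entry of [CMSSW] as a list of file names."""
    raw = _parse(cfg_text).get("CMSSW", "output_file", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


def uses_sandbox(cfg_text):
    """Return True if return_data in [USER] is set to 1."""
    value = _parse(cfg_text).get("USER", "return_data", fallback="0")
    return value.strip() == "1"
```

For the example text above, read_output_files(EXAMPLE_CFG) yields the two declared file names and uses_sandbox(EXAMPLE_CFG) is true.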

Storage element mode

The storage element mode uses a specified GRID server called a storage element to store the output in a specified directory. This is useful to users who create large output files. To use this mode, set copy_data to true in the [USER] section:

copy_data = 1

and set return_data to false to prevent your sandbox from overflowing (you will still get your standard output and error from CRAB through your sandbox):

return_data = 0

Specify the address of the GRID server and the full path of the directory where your output should be stored. For example, a storage element at CERN using the srm protocol looks like:

storage_element = srm.cern.ch
storage_path = /srm/managerv1?SFN=/castor/cern.ch/user/u/username/subdir

An srm storage element at FNAL looks like:

storage_element = cmssrm.fnal.gov 
storage_path = /srm/managerv1?SFN=/resilient/username/subdir

You will need to get the storage_element and storage_path names from the server where you wish to send your data.
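The storage element settings above depend on each other, so a quick consistency check before submission can be useful. The sketch below is not a CRAB feature: the function name is hypothetical, and it assumes the [USER] section can be parsed with Python's configparser.

```python
import configparser


def check_storage_element_mode(cfg_text):
    """Return a list of problems with the storage element settings.

    An empty list means the [USER] settings look consistent with the
    storage element mode described in the text.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    user = parser["USER"] if parser.has_section("USER") else {}

    problems = []
    if user.get("copy_data", "0").strip() != "1":
        problems.append("copy_data is not set to 1")
    if user.get("return_data", "0").strip() == "1":
        problems.append("return_data is still 1; the sandbox may overflow")
    for key in ("storage_element", "storage_path"):
        if not user.get(key, "").strip():
            problems.append(key + " is missing")
    return problems
```

With the CERN example values from above and copy_data = 1, return_data = 0, the function returns an empty list.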

Your user account needs permission to access the server and write into that directory. Additionally, since CRAB jobs carry your certificate proxy but not your user information, the directory must be group writeable (in this case, the group is cms). To set permissions at any site running dCache behind its storage element server (any OSG site such as FNAL), ssh to the site and use mkdir and chmod as normal. For example, at FNAL, to use the resilient space, which is not written to tape:

mkdir /pnfs/cms/WAX/resilient/username/subdir
chmod +775 /pnfs/cms/WAX/resilient/username/subdir

replacing username with your username. Note that not all filesystem commands will work on dCache.

To set permissions at any site running Castor (such as CERN), ssh to the site and use rfmkdir and rfchmod. For example, at CERN:

rfmkdir /castor/cern.ch/user/u/username/subdir 
rfchmod +775 /castor/cern.ch/user/u/username/subdir

replacing username with your username.

Each output file specified by output_file will be placed in the specified directory on the defined storage element. CRAB jobs cannot overwrite existing files, but they will create the directory structure if it does not exist.

Storage element mode using the LCG catalog

The second mode is a GRID registration option using LCG tools. It can only be used when copy_data (the previous option) is set to true. It is primarily useful as a backup if the file transfer to your storage element fails: in that case, it will attempt to copy your output to a different storage element. If the copy to the storage element you specified succeeds, this option still registers your file and leaves your output at that storage element. Your output can then be found through its Logical File Name (LFN), which is specified in the crab.cfg file. This option does not register your output in the DBS/DLS databases. To use this mode, set register_data to true in the [USER] section:

register_data = 1

Specify the logical file name (LFN) directory. This directory should be unique as this will be a global registration name. For example:

lfn_dir = myusername/subdir

Make sure the LFC (file catalog) options are set in crab.cfg under [EDG]. In most cases, they should be set to the values:

lcg_catalog_type = lfc 
lfc_host = lfc-cms-test.cern.ch 
lfc_home = /grid/cms

Each output file specified by output_file will be registered with the LFN lfn:lfc_home/lfn_dir/output_file_job#.ext. This output mode currently does not function on OSG sites.
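To make the naming pattern concrete, the sketch below assembles an LFN from the settings shown above. It is an interpretation of the pattern lfn:lfc_home/lfn_dir/output_file_job#.ext, not CRAB's own code: the function name is hypothetical, and the underscore separator and the placement of the job number before the extension are assumptions read from that pattern.

```python
import posixpath


def build_lfn(lfc_home, lfn_dir, output_file, job_number):
    """Assemble an LFN following lfn:lfc_home/lfn_dir/output_file_job#.ext.

    Assumption: the job number is appended to the file stem with an
    underscore, directly before the extension.
    """
    stem, ext = posixpath.splitext(output_file)
    name = "{}_{}{}".format(stem, job_number, ext)
    # Strip slashes from lfn_dir so a leading "/" cannot discard lfc_home.
    return "lfn:" + posixpath.join(lfc_home, lfn_dir.strip("/"), name)


# Values taken from the examples above; job number 1 is illustrative.
print(build_lfn("/grid/cms", "myusername/subdir", "output.root", 1))
```

Under these assumptions, the call prints lfn:/grid/cms/myusername/subdir/output_1.root.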

See Job output retrieval below for instructions on how to get output back to your computer.

Topic revision: r1 - 2007-03-19 - OliverGutsche