How to run a private MC production

This page collects some notes on how to privately generate Monte Carlo samples in CMSSW using the LHC Computing Grid. The examples below have been tested with CMSSW_2_1_11 and CRAB_2_4_0.

Disclaimer: The recipes described here are intended simply as notes and are not guaranteed in any way to be up to date, correct, or complete. Please refer to the CMS Software Guide for a complete reference.

Details about crab and the grid are available at:

The GEN-RAW-HLT step

To prepare the GEN-RAW-HLT cfg.py file, proceed as follows:

  1. Prepare a cff.py file with the source definition and place it in the python directory of your package: MyDir/MyPack/python/my_source_cff.py
  2. Compile and install the file
    cd MyDir/MyPack/python/
    scramv1 b
    
  3. Go to your test directory to prepare the cfg.py file. For this step, it is very convenient to use the cmsDriver.py utility.
    eval `scramv1 ru -sh`
    cmsDriver.py MyDir/MyPack/my_source_cff --eventcontent FEVTSIM \
         --fileout=gen_raw_hlt.root \
         --python_filename=gen_raw_hlt.py \
         -s GEN:ProductionFilterSequence,SIM,DIGI,L1,DIGI2RAW,HLT \
         --datatier GEN-SIM-RAW-HLT \
         --conditions FrontierConditions_GlobalTag,IDEAL_V9::All -n 100 --no_exec
    
  4. Prepare the corresponding crab.cfg
    [CRAB]
    jobtype                   = cmssw
    scheduler               = glitecoll
    
    [CMSSW]
    
    datasetpath           = None
    output_file      = gen_raw_hlt.root
    pset             = gen_raw_hlt.py
    increment_seeds  = sourceSeed
    
    total_number_of_events  = 1000
    events_per_job          = 1000
    
    
    [USER]
    debug_wrapper = 1 
    return_data = 0
    
    copy_data = 1
    storage_element = srm01.lip.pt 
    storage_path = /srm/managerv2?SFN=/lustre/lip.pt/data/cms
    lfn=/store/user
    
    publish_data = 1
    publish_data_name = <output_dataset_name>
    dbs_url_for_publication = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet
    
    use_central_bossDB      = 0
    use_boss_rt             = 0
    
    thresholdLevel          = 10
    eMail                   = pasquale.musella@cern.ch
    
    [EDG]
    rb                      = CERN
    proxy_server            = myproxy.cern.ch
    virtual_organization    = cms
    retry_count             = 2
    lcg_catalog_type        = lfc
    lfc_host                = lfc-cms-test.cern.ch
    lfc_home                = /grid/cms
    
    ce_black_list=T1*, ucsd.edu, lal.in2p3.fr
    
    # ce_white_list=lip.pt
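
As a rough cross-check of the job splitting implied by the [CMSSW] section above, the sketch below parses a trimmed copy of that section with Python's standard configparser module and estimates the number of jobs as total_number_of_events divided by events_per_job, rounded up. The helper name expected_jobs and the rounding rule are assumptions made for illustration; CRAB's own splitting logic may differ in detail.

```python
import configparser
import math

# Trimmed copy of the crab.cfg shown above (only the keys needed here).
CRAB_CFG = """
[CRAB]
jobtype   = cmssw
scheduler = glitecoll

[CMSSW]
datasetpath            = None
output_file            = gen_raw_hlt.root
pset                   = gen_raw_hlt.py
increment_seeds        = sourceSeed
total_number_of_events = 1000
events_per_job         = 1000
"""


def expected_jobs(cfg_text):
    """Estimate the number of jobs from a crab.cfg text (hypothetical helper).

    Assumption: jobs = ceil(total_number_of_events / events_per_job).
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    cmssw = parser["CMSSW"]
    total = cmssw.getint("total_number_of_events")
    per_job = cmssw.getint("events_per_job")
    if total <= 0 or per_job <= 0:
        # A private generation job has no input dataset, so both numbers
        # must be given explicitly for this estimate to make sense.
        raise ValueError("both event counts must be positive")
    return math.ceil(total / per_job)


n_jobs = expected_jobs(CRAB_CFG)
print("expected number of jobs:", n_jobs)
```

With the values above, all 1000 events go into a single job; lowering events_per_job spreads the same events over more jobs.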
    

The RECO step

  1. The reco step runs on the output of the previous step
    eval `scramv1 ru -sh`
    cmsDriver.py reco --eventcontent FEVTSIM \
         --filein=file:gen_raw_hlt.root \
         --python_filename=reco.py \
         --fileout=reco.root \
         -s RAW2DIGI,RECO \
         --datatier GEN-SIM-HLT-RECO \
         --conditions FrontierConditions_GlobalTag,IDEAL_V9::All -n 100 --no_exec
    
  2. Prepare the corresponding crab.cfg
    [CRAB]
    jobtype                 = cmssw
    scheduler               = glitecoll
    
    [CMSSW]
    
    # my dataset: path of the dataset published by the GEN-RAW-HLT step
    datasetpath             = <published_dataset_path>
    dbs_url=https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet
    output_file             = reco.root
    pset                    = reco.py
    # increment_seeds           = sourceSeed
    
    total_number_of_events  = -1
    events_per_job          = 1000
    
    
    [USER]
    debug_wrapper = 1 
    return_data             = 0
    
    copy_data = 1
    storage_element = srm01.lip.pt 
    storage_path = /srm/managerv2?SFN=/lustre/lip.pt/data/cms
    lfn=/store/user
    
    publish_data = 1
    publish_data_name = WgammaMuNu
    dbs_url_for_publication = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet
    
    use_central_bossDB      = 0
    use_boss_rt             = 0
    
    thresholdLevel          = 10
    eMail                   = pasquale.musella@cern.ch
    
    [EDG]
    rb                      = CERN
    proxy_server            = myproxy.cern.ch
    virtual_organization    = cms
    retry_count             = 2
    lcg_catalog_type        = lfc
    lfc_host                = lfc-cms-test.cern.ch
    lfc_home                = /grid/cms
    
    ce_black_list=T1*, ucsd.edu, lal.in2p3.fr
    
    ce_white_list=lip.pt
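
The two cmsDriver.py invocations on this page differ only in a few options, and the file written by the first step must be the one read by the second. The sketch below assembles both command lines from the options quoted above and prints them without running anything. The function cms_driver_command is a hypothetical helper introduced for illustration, not part of CMSSW; every flag it emits is copied from the commands on this page.

```python
import shlex

# Conditions string used by both steps on this page.
CONDITIONS = "FrontierConditions_GlobalTag,IDEAL_V9::All"


def cms_driver_command(name, steps, datatier, fileout, python_filename,
                       filein=None, n_events=100):
    """Build a cmsDriver.py argument list (hypothetical helper)."""
    cmd = ["cmsDriver.py", name, "--eventcontent", "FEVTSIM"]
    if filein is not None:
        cmd.append("--filein=" + filein)
    cmd += [
        "--fileout=" + fileout,
        "--python_filename=" + python_filename,
        "-s", ",".join(steps),
        "--datatier", datatier,
        "--conditions", CONDITIONS,
        "-n", str(n_events),
        "--no_exec",  # only write the cfg.py file, do not run it
    ]
    return cmd


gen_cmd = cms_driver_command(
    "MyDir/MyPack/my_source_cff",
    ["GEN:ProductionFilterSequence", "SIM", "DIGI", "L1", "DIGI2RAW", "HLT"],
    datatier="GEN-SIM-RAW-HLT",
    fileout="gen_raw_hlt.root",
    python_filename="gen_raw_hlt.py",
)

reco_cmd = cms_driver_command(
    "reco",
    ["RAW2DIGI", "RECO"],
    datatier="GEN-SIM-HLT-RECO",
    fileout="reco.root",
    python_filename="reco.py",
    filein="file:gen_raw_hlt.root",  # output of the GEN-RAW-HLT step
)

# Print the commands so they can be inspected or pasted into a shell.
print(shlex.join(gen_cmd))
print(shlex.join(reco_cmd))
```

Keeping the shared options in one place makes it harder for the two steps to drift apart, for example by using different conditions.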
    

Publishing data into DBS

Instructions on how to publish data in DBS can be found here.

  • First of all, make sure that you have write access to a T2.
  • The relevant parameters to be set in the USER section of the crab configuration file are:
    [USER] 
    publish_data = 1
    publish_data_name = <your_dataset_name>
    dbs_url_for_publication = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet
    
  • It is very important to set the lfn parameter to /store/user; otherwise you will not be able to read back your data.
    Note that crab automatically appends /user_name to the lfn.
    copy_data = 1
    storage_element = srm01.lip.pt
    storage_path = /srm/managerv2?SFN=/lustre/lip.pt/data/cms
    lfn=/store/user
    
  • Because of a problem in the file copy (which should have been fixed in CRAB_2_4_X), you should make sure that the parent of the destination directory exists in your storage area. If your dataset has a parent, this will be
    /<parent_dataset_root>/<your_dataset_name>
    
    If your sample does not have a parent (datasetpath is None in the [CMSSW] section, e.g. if you are at the GEN step), then it will be
    /<your_dataset_name>/<your_dataset_name>
    
  • To publish the dataset once the jobs have finished, it is enough to run
    crab -publish
    
    The dataset will appear at this url: https://cmsweb.cern.ch/dbs_discovery/_advanced?dbsInst=cms_dbs_prod_local_07&userMode=expert
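
The rule for the directory that must already exist before the copy can be written down as a small helper. This is only a sketch of the two cases described above: the function name parent_directory, the example user name jdoe, and the dataset name WgammaMuNu_reco are hypothetical, and the way the result is joined to the user area (lfn plus /user_name) is an assumption based on the note about the lfn.

```python
def parent_directory(dataset_name, parent_dataset_root=None):
    """Return the directory that should exist before the file copy.

    With a parent dataset:    /<parent_dataset_root>/<your_dataset_name>
    Without (datasetpath = None): /<your_dataset_name>/<your_dataset_name>
    """
    name = dataset_name.strip("/")
    root = name if parent_dataset_root is None else parent_dataset_root.strip("/")
    return "/" + root + "/" + name


# GEN step: no parent dataset, so the name is repeated.
gen_dir = parent_directory("WgammaMuNu")

# Later step: the parent dataset root comes first (names are hypothetical).
reco_dir = parent_directory("WgammaMuNu_reco", parent_dataset_root="WgammaMuNu")

# Assumption: the directory lives under the user area, i.e. the lfn
# followed by the /user_name that crab appends automatically.
lfn = "/store/user"
user_name = "jdoe"  # hypothetical user name
print(lfn + "/" + user_name + gen_dir)
print(lfn + "/" + user_name + reco_dir)
```

The printed paths are relative to the storage_path configured in the [USER] section; creating them on the storage element is site specific and is not covered here.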

Topic attachments

  • crab_gen_raw_hlt.cfg (r1, 1.2 K, 2008-11-14 14:05, PasqualeMusella)
  • crab_pat.cfg (r1, 1.2 K, 2008-11-14 14:06, PasqualeMusella)
  • crab_reco.cfg (r1, 1.1 K, 2008-11-14 14:06, PasqualeMusella)

Topic revision: r4 - 2008-11-14 - PasqualeMusella