-- OleksiyAtramentov - 10 Mar 2009

Private MC Production and Publication in DBS

I have put together instructions for generating private MC, with an emphasis on publishing the corresponding dataset(s) in DBS.

There are three important steps:

Getting write permission at a T2/T3 site

For your samples to be useful to other members of the group, you are supposed to stage out your files to a T2/T3 site. Your institution either has its own storage element (SE) or is assigned one. (You can also ask a friendly institution to give you write access to their SE.)

To use the chosen SE you need to know the storage_element, storage_path, and user_remote_dir values for the [USER] section of your crab.cfg.

If you already know your admin, ask him/her for these values.

If you don't, a map between your institution and its storage element is not easy to come by. Eric Vaandering gave me this link to US CMS sites. I am still looking for the rest of the T2 sites.

For example, if you are at a US institution, look up your SE from this link. For most universities the host site is FNAL, hence: storage_element = cmssrm.fnal.gov, storage_path = /srm/managerv2?SFN=/11, and user_remote_dir = /store/user/<your HN name>

It is my understanding that the storage_path is supposed to be deduced by crab from the name of the SE, but this does not work yet, so your admin has to tell you the value.

Now that you know your SE, what is left is to ask the same site admin to grant you write permission. The admin will have to create a directory for you: /store/user/<your HN name>.

If you are at FNAL, your files will be put here: /pnfs/cms/WAX/11/store/user/<your HN name>.
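To see how the three [USER] values fit together, here is a minimal sketch that joins them into the remote location a stage-out would target. The helper names are hypothetical, `yourname` is a placeholder for your HN name, and the `srm://` URL form is an assumption about how the endpoint is addressed, not something crab is confirmed to build this way. The `/pnfs/cms/WAX` prefix is the FNAL-specific mount quoted above.

```python
def srm_target(storage_element, storage_path, user_remote_dir):
    """Join the three [USER] settings into one SRM-style URL (assumed form)."""
    return "srm://" + storage_element + storage_path + user_remote_dir


def fnal_pnfs_dir(storage_path, user_remote_dir, mount="/pnfs/cms/WAX"):
    """Map the SFN part of storage_path onto the FNAL pnfs mount."""
    # Everything after 'SFN=' is the root of the storage area, e.g. '/11'.
    sfn_root = storage_path.split("SFN=", 1)[1]
    return mount + sfn_root + user_remote_dir


if __name__ == "__main__":
    # 'yourname' is a placeholder, not a real account.
    print(srm_target("cmssrm.fnal.gov", "/srm/managerv2?SFN=/11", "/store/user/yourname"))
    print(fnal_pnfs_dir("/srm/managerv2?SFN=/11", "/store/user/yourname"))
```

The second helper reproduces the FNAL path given above; other sites will use a different mount point, so ask your admin for it.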

You are almost done. Now choose dbs_url_for_publication. I understand that some groups have started running their own DBS instances; I am using a test one that works: dbs_url_for_publication=https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet

Gen to Raw Step

For some reason (related to the cff's at the HLT stage, I guess) the GEN-RAW stage has to be done separately.

You have two ways to get a proper cfg.py file. You can either go to the DBS and copy and paste the config from there, or you can start with a simple cff file and generate a config with proper provenance using the cmsDriver.py utility. Let's look at the latter.

As an example, you can take the configuration files here: Configuration/GenProduction/python/*_cff.py

Then, once you have your cff.py file, do this:

cmsDriver.py your_cff.py --eventcontent FEVTSIM \
     --fileout=gen_raw_hlt.root \
     --python_filename=gen_raw_hlt.py \
     -s GEN:ProductionFilterSequence,SIM,DIGI,L1,DIGI2RAW,HLT \
     --datatier GEN-SIM-RAW-HLT \
     --conditions FrontierConditions_GlobalTag,IDEAL_V11::All -n 10 --no_exec

After this step you can run cmsRun gen_raw_hlt.py and check the generated events in the gen_raw_hlt.root file.
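If you prepare configs for several samples, it can help to assemble the cmsDriver.py call from a script rather than retyping it. The following is a minimal sketch, assuming only the options already shown in the command above; the function name `cms_driver_command` is hypothetical, and the script only builds and prints the command line, it does not run cmsDriver.py.

```python
import shlex


def cms_driver_command(cff, steps, eventcontent, datatier, conditions,
                       fileout, python_filename, n_events=10):
    """Return the cmsDriver.py argument list for a config-only (--no_exec) run."""
    return [
        "cmsDriver.py", cff,
        "--eventcontent", eventcontent,
        "--fileout=" + fileout,
        "--python_filename=" + python_filename,
        "-s", ",".join(steps),          # processing steps, comma separated
        "--datatier", datatier,
        "--conditions", conditions,
        "-n", str(n_events),
        "--no_exec",                    # write the config, do not run it
    ]


# Same values as in the GEN-RAW command above.
gen_raw = cms_driver_command(
    cff="your_cff.py",
    steps=["GEN:ProductionFilterSequence", "SIM", "DIGI", "L1", "DIGI2RAW", "HLT"],
    eventcontent="FEVTSIM",
    datatier="GEN-SIM-RAW-HLT",
    conditions="FrontierConditions_GlobalTag,IDEAL_V11::All",
    fileout="gen_raw_hlt.root",
    python_filename="gen_raw_hlt.py",
)
print(" ".join(shlex.quote(arg) for arg in gen_raw))
```

Keeping the arguments as a list also makes it easy to hand them to `subprocess.run` inside a CMSSW environment, if you prefer to launch the command from Python.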

Once the local cmsRun test works, you can proceed with generating and publishing the GEN-RAW dataset.

The following crab_gen2raw.cfg will take care of this step:

[CRAB]
jobtype                    = cmssw
scheduler                =  glite
#scheduler              = condor_g
[CMSSW]

datasetpath         = None
output_file           = gen_raw_hlt.root
pset                     = gen_raw_hlt.py
increment_seeds  = sourceSeed

total_number_of_events  = 10
events_per_job          = 10

[USER]
debug_wrapper = 1 
return_data = 0

copy_data = 1

storage_element = cmssrm.fnal.gov
storage_path = /srm/managerv2?SFN=/11
user_remote_dir = /store/user/oatramen
publish_data = 1
publish_data_name = wjet_test
dbs_url_for_publication = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet

eMail                   = oleksiy.atramentov@cern.ch

[EDG]
rb                            = CERN
proxy_server            = myproxy.cern.ch
virtual_organization = cms
retry_count              = 2
lcg_catalog_type       = lfc
lfc_host                    = lfc-cms-test.cern.ch
lfc_home                   = /grid/cms

ce_black_list=T1*, ucsd.edu, lal.in2p3.fr

Don't forget to issue crab -publish after your jobs have finished.
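Before submitting, it can be worth checking that the [USER] section really contains everything stage-out and publication rely on. The sketch below is an illustration only: the function name and the rule set are my assumptions, derived from the values used in the example crab.cfg above, and Python's configparser is just a stand-in, since crab's own parser may accept or reject slightly different syntax.

```python
import configparser

# Settings taken from the example crab.cfg above (assumed to be the
# ones that matter for stage-out and publication).
REQUIRED_USER_KEYS = (
    "storage_element", "storage_path", "user_remote_dir",
    "publish_data_name", "dbs_url_for_publication",
)
REQUIRED_FLAGS = {"copy_data": "1", "return_data": "0", "publish_data": "1"}


def check_publication_settings(cfg_text):
    """Return a list of problems found in the text of a crab.cfg."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. 'eMail'
    try:
        parser.read_string(cfg_text)
    except configparser.Error as err:
        # Catches, for instance, a malformed section header.
        return ["cannot parse config: %s" % err]
    if not parser.has_section("USER"):
        return ["missing [USER] section"]
    user = parser["USER"]
    problems = ["missing key: " + k for k in REQUIRED_USER_KEYS if not user.get(k)]
    for key, wanted in REQUIRED_FLAGS.items():
        if user.get(key, "").strip() != wanted:
            problems.append("%s should be %s" % (key, wanted))
    return problems


if __name__ == "__main__":
    with open("crab_gen2raw.cfg") as handle:
        for problem in check_publication_settings(handle.read()):
            print(problem)
```

An empty list means the sketch found nothing to complain about; it does not guarantee that the SE or the DBS instance will actually accept your files.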

Now, to look up your dataset, run the query find dataset where dataset like *your_dataset_name* in this DBS instance:

https://cmsweb.cern.ch/dbs_discovery/_advanced?dbsInst=cms_dbs_prod_local_09&userMode=expert

Raw to Reco Step

Before you run on the dataset you have just created, produce the corresponding config using the cmsDriver.py utility:

cmsDriver.py reco --eventcontent FEVTSIM \
     --filein=file:gen_raw_hlt.root \
     --python_filename=reco.py \
     --fileout=reco.root \
     -s RAW2DIGI,RECO \
     --datatier GEN-SIM-HLT-RECO \
     --conditions FrontierConditions_GlobalTag,IDEAL_V11::All -n 10 --no_exec

This produces the reco.py config, which you then reference in your final crab_raw2reco.cfg:

[CRAB]
jobtype                   = cmssw
#scheduler               = glite
scheduler        = condor_g
[CMSSW]

datasetpath          = /oleksiy_wjet_test/-oleksiy_wjet_test-cf7e1503d47247a8e532ee6618579d8c/USER
dbs_url                = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet
output_file           = reco.root
pset                     = reco.py
increment_seeds  = sourceSeed

total_number_of_events  = -1
events_per_job          = 1000

[USER]
debug_wrapper = 1 
return_data = 0

copy_data = 1

storage_element = cmssrm.fnal.gov
storage_path = /srm/managerv2?SFN=/11
user_remote_dir = /store/user/oatramen
publish_data = 1
publish_data_name = oleksiy_wjet_test-reco
dbs_url_for_publication = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet

eMail                   = oleksiy.atramentov@cern.ch

[EDG]
rb                      = CERN
proxy_server            = myproxy.cern.ch
virtual_organization    = cms
retry_count             = 2
lcg_catalog_type        = lfc
lfc_host                = lfc-cms-test.cern.ch
lfc_home                = /grid/cms

ce_black_list=T1*, ucsd.edu, lal.in2p3.fr

Note that you have to explicitly specify the URL of the DBS instance where you published the GEN-RAW dataset:

dbs_url = https://cmsdbsprod.cern.ch:8443/cms_dbs_prod_local_09_writer/servlet/DBSServlet
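Since the RAW to RECO jobs can only find the dataset in the instance it was published to, a quick consistency check between the two crab configs may save a failed submission. This is a minimal sketch under the same caveat as any config check done outside crab: the function names are hypothetical, and Python's configparser is only assumed to read these files the way crab does.

```python
import configparser


def read_option(cfg_text, section, option):
    """Return one option from crab.cfg text, or None if it is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get(section, option, fallback=None)


def reads_from_published_dbs(gen2raw_cfg, raw2reco_cfg):
    """True if step 2 reads from the DBS instance that step 1 published to."""
    published_to = read_option(gen2raw_cfg, "USER", "dbs_url_for_publication")
    read_from = read_option(raw2reco_cfg, "CMSSW", "dbs_url")
    return published_to is not None and published_to == read_from


if __name__ == "__main__":
    with open("crab_gen2raw.cfg") as first, open("crab_raw2reco.cfg") as second:
        print(reads_from_published_dbs(first.read(), second.read()))
```

The comparison is a plain string match, so two URLs that point to the same instance but are spelled differently would be reported as a mismatch.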

That's it! Now you can proceed to the production of PAT files with a hacked pat::Photon that carries additional shower-shape and conversion information. Follow this link.

Topic revision: r3 - 2009-03-17 - OleksiyAtramentov