9.6 How to Pick Events (Interactive and CRAB3)


Goals of this page:

This page helps users copy a subset of events from a dataset. The utility is a Python script called edmPickEvents.py. It can run an interactive job for a small number of events (an additional command-line option now lets you pick more than 20 events interactively), or it can set up a CRAB job.

As a recent improvement, this script lets you access events from data that is not local by using xrootd and CRAB3.

If you plan on making very large "skims" (e.g., all W candidates), please consider collaborating with others so as to minimize the number of identical collections we have.

How to set up the environment to run edmPickEvents.py

The script edmPickEvents.py is part of the PhysicsTools/Utilities package. The version that uses DAS instead of the phased-out DBS2 is integrated into CMSSW_5_3_18 and later. The version that uses CRAB3 instead of CRAB2 is integrated into CMSSW_5_3_29 and later (in the CMSSW_5_3_X release cycle) and into CMSSW_7_4_7 and later.

ssh lxplus.cern.ch
cmsrel CMSSW_7_4_7
cd CMSSW_7_4_7/src/
cmsenv

How to Run edmPickEvents.py

The script edmPickEvents.py can be run interactively or via a CRAB job. If you have only a few events to pick, run it interactively. You can now pick more than 20 events interactively using the new command-line option:

edmPickEvents.py --maxEventsInteractive=30 "/MET/Run2015A-PromptReco-v1/RECO"  events.txt

If you have a lot of events to pick, submit a CRAB job. This version of the script is compatible with CRAB3 and produces a CRAB configuration file that can be submitted directly. You can change some of the CRAB parameters if needed (for example, the site where the output files are stored or the number of luminosity sections).
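The choice between the two modes can be sketched in a few lines of Python. This is only an illustration: the function name build_pick_command and the prefer_interactive switch are hypothetical, and the default cap of 20 interactive events is an assumption inferred from the "more than 20 events" remark above. Only the --maxEventsInteractive and --crab options come from this page.

```python
# Assumed default cap on interactive picks, inferred from the text above.
DEFAULT_MAX_INTERACTIVE = 20


def build_pick_command(dataset, events_file, n_events, prefer_interactive=False):
    """Return an edmPickEvents.py argument list suited to n_events (hypothetical helper)."""
    if n_events <= DEFAULT_MAX_INTERACTIVE:
        # Few events: a plain interactive run is enough.
        return ["edmPickEvents.py", dataset, events_file]
    if prefer_interactive:
        # Many events, but the user still wants an interactive run:
        # raise the cap with the command-line option shown above.
        option = "--maxEventsInteractive=%d" % n_events
        return ["edmPickEvents.py", option, dataset, events_file]
    # Many events: let the script prepare a CRAB job instead.
    return ["edmPickEvents.py", dataset, events_file, "--crab"]
```

The returned list could be passed to subprocess.call inside a CMSSW environment; the sketch itself never runs the script.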

Run edmPickEvents.py Interactively

To run the script interactively, do the following, for example:

edmPickEvents.py  "/MET/Run2015A-PromptReco-v1/RECO" 248038:25:12714964

where

  • /MET/Run2015A-PromptReco-v1/RECO is the dataset you want to pick events from. You can only specify one dataset per job.

  • 248038:25:12714964 is one event you want to pick. The syntax is Run:Lumi:Event. You can pick more than one event by separating the events with commas: Run1:Lumi1:Event1,Run2:Lumi2:Event2.

or

edmPickEvents.py "/MET/Run2015A-PromptReco-v1/RECO" events.txt

where

  • events.txt is a text file that contains, for example, 176163:41:69046624 (and other events if desired, one per line)

and the screen output would look like this:

edmCopyPickMerge outputFile=pickevents.root \
  eventsToProcess=248038:12714964 \
  inputFiles=/store/data/Run2015A/MET/RECO/PromptReco-v1/000/248/038/00000/546E74FC-5D14-E511-9DCE-02163E0145BA.root


In this case, the user can either paste the edmCopyPickMerge command printed above into the shell, or run edmPickEvents.py with the --runInteractive flag, which runs it for you (warning: this can take a long time).

This will create a ROOT file called pickevents.root in the directory you executed the command from. Also note that the edmCopyPickMerge script locates the edmPickEvents.py configuration file and then runs it with cmsRun.
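Before running the script it can help to check that every entry follows the Run:Lumi:Event syntax described above. The following is a minimal sketch, not part of edmPickEvents.py: the helper names (parse_event_spec, read_events, to_run_event) are hypothetical. The last helper only mirrors what the sample screen output suggests, namely that eventsToProcess carries Run:Event without the luminosity section.

```python
import re

# Three colon-separated non-negative integers: Run:Lumi:Event.
EVENT_SPEC = re.compile(r"^(\d+):(\d+):(\d+)$")


def parse_event_spec(spec):
    """Split 'Run:Lumi:Event' into a tuple of three ints; raise ValueError otherwise."""
    match = EVENT_SPEC.match(spec.strip())
    if match is None:
        raise ValueError("expected Run:Lumi:Event, got %r" % spec)
    return tuple(int(group) for group in match.groups())


def read_events(lines):
    """Parse an iterable of lines (for example an open events.txt), skipping blank lines."""
    return [parse_event_spec(line) for line in lines if line.strip()]


def to_run_event(events):
    """Format each event as 'Run:Event', the form seen in the sample output above."""
    return ["%d:%d" % (run, event) for run, _lumi, event in events]
```

For example, read_events(open("events.txt")) raises a ValueError on the first malformed line, so a typo is caught before a long interactive job starts.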

Run edmPickEvents.py with CRAB

BEWARE: the CRAB config that edmPickEvents.py writes might still be for CRAB2 if you use a release older than CMSSW_7_4_7!

If you are running over a large number of events, or if you just don't want to wait for a long job to finish, you can use edmPickEvents.py to set up a CRAB job for you.

To run a CRAB job:

  • First set up the CRAB environment following the instructions at SWGuideCrab.

  • Then run the script as follows (in this case we are picking the events listed in events.txt via a CRAB job):

edmPickEvents.py "/MET/Run2015A-PromptReco-v1/RECO" events.txt --crab 

When you run this, it gives the following screen output:

Please visit CRAB twiki for instructions on how to setup environment for CRAB:
https://twiki.cern.ch/twiki/bin/viewauth/CMS/SWGuideCrab

Setup your environment for CRAB and edit pickevents_crab.py to make any desired changed.  Then run:

crab submit -c pickevents_crab.py

This will create the configuration file called pickevents_crab.py for you.

  • If desired, you can modify the contents of pickevents_crab.py. Most people should find the defaults sufficient.

  • Then submit the CRAB job:

crab submit -c pickevents_crab.py

  • It will create a CRAB working directory named after the current time, in the form YYMMDD_HHMMSS.
  • You can check the status of your job via:

crab status

Once the CRAB job finishes successfully, you can retrieve the output ROOT files, which CRAB3 places in the results subdirectory of the CRAB working directory:

crab getoutput

To merge the CRAB output, you can do the following:

ls -1 pick*root | perl -ne 'print "file:$_"' > myFiles.txt
edmCopyPickMerge inputFiles_load=myFiles.txt outputFile=pickevents_merged.root \
   maxSize=1000000

where maxSize=1000000 means 1 million kB (about 1 GB).

(In cmsRun, local files must be noted with the prefix file:.)
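If Perl is not at hand, the file list for inputFiles_load can be produced with a few lines of Python instead. This is a minimal sketch of the same idea as the ls/perl one-liner above. The function name write_file_list and its parameters are hypothetical; unlike the one-liner, it writes each path with the directory it was given rather than a bare file name.

```python
import glob
import os


def write_file_list(directory, list_path, pattern="pick*root"):
    """Write one 'file:<path>' line per matching ROOT file and return the matched paths."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    with open(list_path, "w") as handle:
        for path in paths:
            # cmsRun needs the 'file:' prefix for local files.
            handle.write("file:%s\n" % path)
    return paths
```

Calling write_file_list(".", "myFiles.txt") in the directory that holds the CRAB output gives a list that can be passed to edmCopyPickMerge as inputFiles_load=myFiles.txt.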

Review Status

Editor/Reviewer and date Comments
DinkoFerencek - 22 June 2015 No longer necessary to merge a branch from the cms-analysis-tools GitHub area
RamanKhurana - 8 July 2015 Added instructions for CMSSW_7_4_X compatible with CRAB3
DinkoFerencek - 12 July 2015 Updated instructions indicating that the version compatible with CRAB3 is now integrated in CMSSW

-- SudhirMalik - 09-Sep-2010

Topic revision: r22 - 2017-08-03 - PatriziaBarria

