Chapter 3: Getting Started with Data Analysis



3.1 Analysis Overview: an Introduction

Complete: 5
Detailed Review status

Goals of this page:

This page presents a big-picture overview of performing an analysis at CMS.

  • The first task is to describe how the data flows within CMS, from data taking through various layers of skimming. This also introduces the concept of a data tier (RECO, AOD) and defines each of them, as well as the PAT data format, which is described in detail in Chapter 4. This is the scope of this section.
  • We need to understand the most important CMS data formats, RECO and AOD, so they are described next. PAT is also mentioned, although it will be described in detail later.
  • Finally, we explore two options for a quantitative analysis of CMS events:
    • FW Lite -- using ROOT enhanced with libraries that can understand CMS data formats and aid in fetching object collections from the event
    • the full Framework -- using C++ modules in cmsRun

The data flow, from detector to analysis

(For a more thorough overview, please see WorkBookComputingModel; this section necessarily distills the information which was presented there in much more detail.)

To enable the most effective access to CMS data, the data are first split into Physics Datasets (PDs) and then the events are filtered. The division into Physics Datasets is based on the trigger decision. The primary datasets are structured and placed to make life as easy as possible, e.g. to minimize the need for an average user to run over very large amounts of data. The datasets group or split triggers in order to balance their sizes.

However, the Primary Datasets will be too large to make direct access by users reasonable or even feasible. The main strategy in dealing with such a large number of events is to filter them, and do that in layers of ever-tighter event selection. (After all, the Level 1 trigger and HLT are doing the same online.) The process of selecting events and saving them in output is called `skimming'. The intended modus operandi of CMS analysis groups is the following:

  1. the primary datasets and skims are produced; they are defined using the trigger information (for stability) and produced centrally on Tier 1 systems
  2. the secondary skims are produced by the physics groups (say a Higgs group) by running on the primary skims; the secondary skims are usually produced by group members running on the Tier 2 clusters assigned to the given group
  3. optionally, the user then skims once again, applying an ever tighter event selection
  4. the final sample (with almost-final cuts) can then be analyzed with FW Lite. It can also be analyzed with the full Framework; however, we recommend FW Lite as it is interactive and far more portable

The primary skims (step 1 above) reduce the size of the primary datasets in order to reduce the time of subsequent layers of skimming. The target of the primary skims is a reduction of about a factor of 10 in size with respect to the primary datasets.

The secondary skimming (step 2 above) must be tight enough to make the secondary skims feasible in terms of size. And yet it must not be too tight since otherwise certain analyses might find themselves starved for data. However, in this case what is `tight' is analysis-dependent, so it is vital for the group members to be involved in the definition of their group's secondary skims!

The user selection (step 3) is made on the Tier 2 by the user, and it's the main opportunity to reduce the size of the samples the user will need to deal with (and lug around). In many cases, this is where the preliminary event selection is done, and thus it is the foundation of the analysis. It is expected that the user may need to re-run this step (e.g., in case of finding out that the cuts were too tight), but this is not a problem since the tertiary skims are being run on the secondary skims which are already reduced in size.

That being said, it is important to tune the user's skim to be as close to `just right' as possible: the event selection should be looser than the one expected after the final cut optimization, but not too loose -- otherwise the skimming would not serve its purpose. If done right, this not only saves your own time, but also preserves the collaboration's CPU resources.
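
To make this concrete, here is a minimal sketch of what such a user skim could look like as a cmsRun configuration. The input file name, the selection (a simple dimuon requirement implemented with a CandViewCountFilter), and the module labels are purely illustrative; a real skim would use whatever selection is appropriate for the analysis.

import FWCore.ParameterSet.Config as cms

process = cms.Process("USERSKIM")

# illustrative input: a group-skim file produced in step 2
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("file:groupSkim.root")
)

# illustrative event selection: require at least two reconstructed muons
process.dimuonFilter = cms.EDFilter("CandViewCountFilter",
    src = cms.InputTag("muons"),
    minNumber = cms.uint32(2)
)
process.skimPath = cms.Path(process.dimuonFilter)

# write out only the events that pass the skim path
process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string("userSkim.root"),
    SelectEvents = cms.untracked.PSet(SelectEvents = cms.vstring("skimPath"))
)
process.end = cms.EndPath(process.out)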

Reduction in event size: CMS Data Formats and Data Tiers

(For a more thorough overview, please see WorkBookComputingModel; this section necessarily distills the information which was presented there in much more detail.)

In addition to the reduction of the number of events, in steps 1-3 it is also possible to reduce the size of each event by

  • removing unneeded collections (e.g. after we make PAT candidates, for most purposes the rest of the AOD information is not needed); this is called stripping or slimming.
  • removing unneeded information from objects; this is called thinning. It is an advanced topic; it is still experimental and not covered here.

Stripping, slimming and thinning in the context of analysis are discussed further below.
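
As a sketch of how stripping/slimming is expressed in practice: the PoolOutputModule of a cmsRun configuration accepts a list of outputCommands that keep or drop whole collections. The fragment below assumes an output module called process.out, and the module labels are only examples; in a real job you would keep whatever collections your analysis needs.

process.out.outputCommands = cms.untracked.vstring(
    "drop *",                            # start by dropping everything ...
    "keep *_muons_*_*",                  # ... then keep the reco::Muon collection
    "keep recoTracks_generalTracks_*_*"  # and, for instance, the general tracks
)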

Starting from the detector output ("RAW" data), the information is progressively refined and what is not needed is dropped. This defines the CMS data tiers. Each bit of data in an event must be written in a supported data format. A data format is essentially a C++ class, where a class defines a data structure (a data type with data members). The term data format can refer either to the format of the data written using the class (i.e., the data format as a sort of template), or to the instantiated class object itself. The DataFormats package and the SimDataFormats package (for simulated data) in the CMSSW repository contain all the supported data formats that can be written to an Event file. So, for example, if you wish to add data to an Event, your EDProducer module must instantiate one or more of these data format classes.

Data formats (classes) for reconstructed data, for example, include reco::Track, reco::TrackExtra, and many more. See the Offline Guide section SWGuideRecoDataTable for the full listing.

Data Tiers: Reconstructed (RECO) Data and Analysis Object Data (AOD)

Event information from each step in the simulation and reconstruction chain is logically grouped into what we call a data tier, which has already been introduced in the Workbook section describing the Computing Model. Examples of data tiers include RAW and RECO, and for MC, GEN, SIM and DIGI. A data tier may contain multiple data formats, as mentioned above for reconstructed data. A given dataset may consist of multiple data tiers, e.g., the term GenSimDigi includes the generation (MC), simulation (Geant) and digitization steps. The most important tiers from a physicist's point of view are RECO (all reconstructed objects and hits) and AOD (a smaller subset of RECO which is needed by analysis).

RECO data contains objects from all stages of reconstruction. AOD data are derived from the RECO information to provide data for physics analyses in a convenient, compact format. Typically, physics analyses don't require you to rerun the reconstruction process on the data. Most physics analyses can run on AOD data.

(Figure: whats_in_aod_reco.gif -- illustration of what is contained in the AOD and RECO data tiers.)

RECO

RECO is the name of the data-tier which contains objects created by the event reconstruction program. It is derived from RAW data and provides access to reconstructed physics objects for physics analysis in a convenient format. Event reconstruction is structured in several hierarchical steps:

  1. Detector-specific processing: Starting from detector data unpacking and decoding, detector calibration constants are applied and cluster or hit objects are reconstructed.
  2. Tracking: Hits in the silicon and muon detectors are used to reconstruct global tracks. Pattern recognition in the tracker is the most CPU-intensive task.
  3. Vertexing: Reconstructs primary and secondary vertex candidates.
  4. Particle identification: Produces the objects most closely associated with physics analyses. Using a wide variety of sophisticated algorithms, standard physics-object candidates are created (electrons, photons, muons, missing transverse energy and jets), together with heavy-quark and tau-decay tagging.

The normal completion of the reconstruction task will result in a full set of these reconstructed objects usable by CMS physicists in their analyses. You would only need to rerun these algorithms if your analysis requires you to take account of such things as trial calibrations, novel algorithms etc.

Reconstruction is expensive in terms of CPU and is dominated by tracking. The RECO data tier will provide compact information for analysis to avoid the necessity of accessing the RAW data for most analyses. Following the hierarchy of event reconstruction, RECO will contain objects from all stages of reconstruction. At the lowest level these will be reconstructed hits, clusters and segments. Based on these objects, reconstructed tracks and vertices are stored. At the highest level, reconstructed jets, muons, electrons, b-jets, etc. are stored. A direct reference from high-level objects to low-level objects will be possible, to avoid duplication of information. In addition, the RECO format will preserve links to the RAW information.

The RECO data includes quantities required for typical analysis usage patterns such as: track re-finding, calorimeter reclustering, and jet energy calibration. The RECO event content is documented in the Offline Guide at RECO Data Format Table.

AOD

AOD data are derived from the RECO information to provide data for physics analysis in a convenient, compact format. AOD data are usable directly by physics analyses. AOD data will be produced by the same, or subsequent, processing steps as produce the RECO data, and AOD data will be made easily available at multiple sites to CMS members. The AOD will contain enough information about the event to support all the typical usage patterns of a physics analysis. Thus, it will contain a copy of all the high-level physics objects (such as muons, electrons, taus, etc.), plus a summary of the RECO information sufficient to support typical analysis actions such as track refitting with improved alignment or kinematic constraints, and re-evaluation of the energy and/or position of ECAL clusters based on analysis-specific corrections. The AOD, because of its limited size (which does not allow it to contain all the hits), will typically not support the application of novel pattern-recognition techniques, nor the application of new calibration constants, which would typically require the use of RECO or RAW information.

The AOD data tier will contain physics objects: tracks with associated Hits, calorimetric clusters with associated Hits, vertices, jets and high-level physics objects (electrons, muons, Z boson candidates, and so on).

Because the AOD data tier is relatively compact, all Tier-1 computing centres are able to keep a full copy of the AOD, while they will hold only a subset of the RAW and RECO data tiers. The AOD event content is documented in the Offline Guide at AOD Data Format Table.

PAT

The information is stored in RECO and AOD in a way that uses the least amount of space and allows for the greatest flexibility. This is particularly true for DataFormats that contain objects that link to each other. However, accessing these links between RECO or AOD objects requires more experience with C++. To simplify the user's analysis, a set of new data formats was created which aggregates the related RECO information. These new formats, along with the tools used to make and manipulate them, are called the Physics Analysis Toolkit, or PAT. PAT is de facto the way users access the physics objects which are the output of RECO.

PAT's content is flexible -- it is up to the user to define it. For this reason, PAT is not a data tier. The content of PAT may change from one analysis to another, and even more so from one PAG to another. However, PAT defines a standard for the physics objects and for the variables stored in those physics objects. It is like a menu in a restaurant -- every patron can choose different things from the menu, but everybody is reading from the same menu. This facilitates sharing both tools and people between analyses and physics groups.

PAT is discussed in more detail in WorkBookPATTupleCreationExercise. Here we continue the story of defining the user content of an analysis, in which PAT plays a crucial role.

Group and user skims: RECO, AOD and PAT-tuples

Now we can refine the descriptions of primary, group and user-defined skims, with some examples. In almost all cases the primary skims will read AOD and produce AOD with a reduced number of events. (During the physics commissioning, the primary skims may also read and write RECO instead of AOD.) The group and user skims may also read and write AOD (or RECO). However, they could also produce PAT-tuples, as decided by the group or the user. As an illustration, these steps could be:

  1. primary skims read AOD, write AOD.
  2. a group-wide skim filters events in AOD and produces PAT-tuples with generous content. (Such inclusive PAT-tuples are sometimes necessary, as they need to serve multiple efforts within the group.)
  3. the user modifies the PAT workflow to read PAT and produce another version of PAT, but with much smaller content (stripping/slimming), and possibly even with compressed PAT objects (thinning).

All the operations that involve skimming, stripping, and thinning are done within the full Framework. Therefore, every user needs to know at least what these jobs do in each of the steps, even if s/he does not need to make any changes to any of the processing steps. However, it is more likely that some changes will be needed, especially in the last stage, where the skimming and further processing is run by the user. In some cases, the user may even need to write Framework modules such as EDProducers (to add new DataFormats to the events) or EDAnalyzers (to compute quantities that require access to conditions).

In the above example, the end of the skimming chain produces a "PAT-tuple", which should be small enough to easily fit onto a laptop. Moreover, it should also fit within the memory of the ROOT process, thus facilitating interactive speed on a par with plain TTrees. However, to be able to read CMS data (RECO, AOD or PAT) from ROOT, we need to teach ROOT to understand the CMS DataFormats by loading the DataFormats libraries themselves, accompanied by a couple of helper classes that simplify the user's manipulation of CMS "events" in ROOT. ROOT with these additional libraries loaded is called Framework-lite, or FW Lite.
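
To give a flavour of what this looks like in practice, here is a minimal FW Lite sketch in Python (FW Lite is covered properly in WorkBookFWLite). It assumes a CMSSW environment has been set up (cmsenv) and that the file contains a pat::Muon collection labelled selectedLayer1Muons, as in the PAT-tuple examples later in this chapter; the file name is hypothetical.

from DataFormats.FWLite import Events, Handle

events = Events("MyPATTuple.root")          # hypothetical PAT-tuple file
muons  = Handle("std::vector<pat::Muon>")   # handle for the PAT muon collection

for event in events:
    event.getByLabel("selectedLayer1Muons", muons)
    for mu in muons.product():
        print("muon pt = %.1f GeV, eta = %.2f" % (mu.pt(), mu.eta()))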

Tools for interactive analysis: FW Lite, edmBrowser, Fireworks

The interactive stage is where most of the analysis is actually done, and where most of the `analysis time' is actually spent. Every analysis is different, and many take a number of twists and turns towards their conclusions, solving an array of riddles along the way. However, most analyses need (or could benefit from):

  • a way to examine the content of CMS data files, especially PAT-tuples. CMS has several tools that can examine the file content, including stand-alone executables edmDumpEventContent (which dumps a list of the collections present in the file to the terminal), and edmBrowser (which has a nice graphical interface).
  • a way to obtain the history of the file. The CMS files contain embedded information sufficient to tell the history of the objects in the file. This information is called provenance, and is crucial for the analysis, as it allows the user to establish with certainty what kind of operations (corrections, calibrations, algorithms) were performed on the data present in the file. The stand-alone executable edmProvDump prints the provenance to the screen.
  • a way to visualize the event. CMS has two event viewers: Iguana is geared toward the detailed description of the event, and is described later, in the advanced section of the workbook. In contrast, the main objective of Fireworks is to display analysis-level quantities. Moreover, Fireworks is well-suited for investigating events in CMS data files, since it can read them directly.
  • a way to manipulate data quantitatively. In HEP, most quantitative analysis of data is performed within the ROOT framework. ROOT has, over the years, subsumed an impressive collection of statistical and other tools, including the fitting package RooFit and the multivariate analysis toolkit TMVA. ROOT can access the CMS data directly, provided the DataFormats libraries are loaded, turning it into FW Lite.

The following pages in this Chapter of the WorkBook will illustrate each of these steps, especially data analysis (including making plots) in FW Lite (WorkBookFWLite). But first, the choice of a release is discussed (WorkBookWhichRelease), and the ways to get data are illustrated (WorkBookDataSamples). At the end of the exercise in WorkBookDataSamples, we will end up with one or more small files, which we explore next, first using command-line utilities, and then with graphical tools like edmBrowser (WorkBookEdmInfoOnDataFile) and Fireworks event display (WorkBookFireworks).

Review status

Reviewer/Editor and Date (copy from screen) Comments

PetarMaksimovic - 20 Jun 2009 Created.
PetarMaksimovic - 30 Nov 2009 Some clean-up.
XuanChen - 17 Jul 2014 Changed the links from cvs to github

I went through chapter 3 section 1. The information is relevant and clear. I created a few links that were suggested by Kati L. P., at "edmDumpEventContent", "edmBrowser", "provenance", "edmProvDump", "Iguana", and "Fireworks".

Responsible: SalvatoreRappoccio
Last reviewed by: PetarMaksimovic - 2 March 2009



3.2 Which CMSSW release to use

Complete: 5
Detailed Review status

For the impatient user!

  • For data and MC:
    ssh lxplus.cern.ch
    cmsrel CMSSW_9_4_0
    cd CMSSW_9_4_0/src
    cmsenv
    

Current Analysis Release

The currently recommended Analysis Release is 8.0.X, which was used to process all 2016 data. For 2017 data analysis, 9.4.X will be used.

For the latest and greatest, always update to the latest tags in the bug fix page.

You can also find more Frequently Asked Questions here. If you are having trouble, please look through these FAQs first, since your question may already be answered there.

Goals of this page

This page is intended to provide a reference point for users wanting to know which CMSSW release to use for analysing Monte Carlo data produced with different releases and for running CMSSW WorkBook exercises and tutorials. It contains information about the latest releases used, notifications of upcoming releases, and important notes about the intercompatibility of releases for data/MC and tutorial running.

Contents

CMSSW releases

CMSSW code development proceeds in releases indicated by a series of numbers such as 7_4_15. The first number in the series indicates the major cycle, the second a major release with new features with respect to the preceding one, and the third a release with updates and bug fixes to the preceding one.

You can find the up-to-date list of available releases at https://cmssdt.cern.ch/SDT/cgi-bin/ReleasesXML. These are the releases which are deployed to the different sites where you can run analysis jobs over the grid (see Chapter 5), and only these releases can be expected to work. As a general rule, you should always use the latest release in a release cycle, i.e. the one with the highest value of the third number.

You can list the releases installed on the local user interface by typing

scram list -a

Each major release is preceded by a series of prereleases, indicated by pre in the release name. The purpose of these prereleases is to test the code that is going to be released, and it is very likely that they contain errors which are then fixed in the release itself. For this reason you should not start developing your code on a prerelease (unless you are doing agreed code development which you need to test and which will be included in the release). Furthermore, to save disk space, public prerelease areas, including the libraries, are removed fairly soon after the release, and you will then not be able to continue working in your local prerelease area.

The new releases and the (pre)releases to be removed are announced in the software release announcements hypernews forum. Subscribing to this hypernews is a rather low-traffic way of finding out that a (pre)release is being proposed for deletion; a prior warning is always sent to this forum.

The CMSSW release schedule is summarized in the Release Schedule page.

Available releases

MC data are not compatible between major release series; as such, you cannot analyse Monte Carlo created with a much earlier release using code from a later release. This will improve as backwards compatibility increases and the analysis code stabilises.

10_3_X is used for PbPb data collection, processing and MC Campaign

10_2_X is used for 2018 processing and MC Production

10_1_X is used for HLT during pp data-taking

10_0_X is used for 2018 development

9_4_X is used for V2 DIGI-RECO and reprocessing all 2017 data

9_3_X is used for HGCal phase 2 production samples and 2017 V2 GEN-SIM samples

9_2_X is used for trigger and prompt reconstruction for 2017 data taking

9_1_X is used for phase 2 production samples and 2017 development

9_0_X is used for phase 2 production samples and 2017 development

8_1_X is used for 2017 development

8_0_X is used for 2016 data taking and MC

7_6_X is in use for reprocessing the 2015 data and MC starting in November

7_5_X was used for heavy ion data in 2015

7_4_X is the currently recommended analysis release for 2015 data and MC

7_3_X is the currently recommended Analysis Release for the Phys14 exercise on MC.

5_3_X is the currently recommended Analysis Release for 2012 Data/MC.

4_4_4 is the currently recommended analysis release for data and MC reconstructed with 4.4.x.

4_2_8_patch7 is the currently recommended analysis release for data and MC reconstructed with 4.2.x.

4_1_5 is the currently recommended analysis release for data and MC reconstructed with 4.1.x.

3_8_X was the "Moriond 2011" recommended analysis release. There is a break in compatibility in the PAT-tuples themselves in 387 to account for a (required) change in PAT jet energy corrections.

3_7_X series is the past release for special cases of ICHEP2011 analysis.

3_6_X series is the past release for the ICHEP2011 analysis of 7 TeV collisions and Monte Carlo. It is the recommended analysis release for analyzing 35x RECO and RE-RECO.

3_5_X series was meant for first data processing for 7 TeV collisions.

3_4_X Beginning from 3_4_X, releases are only available for the SLC5 (Scientific Linux 5 or slc5_ia32_gcc434 ) architecture. The 3_4_2 release is also available as an analysis release for use with MC produced with 3_1_X. The 3_4_2 release can read the MC samples produced with 3_1_X and yield identical results, so it is strongly recommended to use 3_4_2 or later in analysis of the December 2009 collision data.

3_3_X series was meant for December 2009 collisions data taking.

3_2_X series was used for data-taking in CRAFT 2009.

3_1_X series was used in a big MC data production from summer 2009.

3_0_X series is a technical release which should not be used.

2_2_X is compatible with the large number of MC data samples generated in 2009. It is also used to read the real data from the cosmic runs in 2008 and 2009. The 2_2_X series will be used until a sufficient amount of MC or real data is available with 3_1_X.

Some older releases may be still available for finishing ongoing studies.

Backward compatibility is always ensured for raw data, and within a release cycle for all other official data as well.

Release Notes

The summary of the new features added in each cycle for every software area can be found in this page, linked under the Offline main page. A detailed list of differences in terms of package tags can be extracted from the CMS Tag Collector.

Need a release which is being deprecated?

CMSSW releases have a limited lifetime and will eventually be deprecated and removed. If a release is officially deprecated, it is strongly implied that CMS wants users to move to later, non-deprecated releases.

Under very rare and special circumstances (e.g. a student finishing a thesis), we do allow sites to keep deprecated releases. If you need to use a CMSSW release that is being centrally deprecated and keep it at a particular site, the correct channel is to place your request with the convener of the PAG/POG/DPG group your analysis is attached to; this convener might then forward your request to sites associated with the PAG/POG/DPG group, and the local site admin might decide to keep a release locally (and might also store the corresponding data in a local site area).

Review status

Reviewer/Editor and Date (copy from screen) Comments
HamedBakhshianSohi - 2015-02-22 release 7_3 information is added
KatiLassilaPerini - 15 May 2009 update the available release with Andreas Pfeiffer
CMSUserSupport - 27 Apr 2007 updated the version list to include 1_4_x and 1_5_x

Responsible: SudhirMalik
Last reviewed by: PetarMaksimovic - 03-Dec-2009



3.3.1 Copy and Merge Files

Complete: 5
Detailed Review status

Contents:

Goals of this page:

The goal of this page is to learn how to work with data samples by copying a few events from a data file to your local area. You will also learn how to merge data files. Note that the data ROOT files are in EDM format, so they are also called EDM files. The full information on finding data samples is given in WorkBookLocatingDataSamples.

Access Data In a CMSSW Job

When you copy a few events to your working directory, as in the examples below, it is mainly to run test code quickly or just for the sake of learning. Normally, however, you would access the data in your cmsRun job from a local storage element (for example, castor at CERN, or at Fermilab). You will read more about this in Section 4.1.1. Before that you will need some experience using cmsRun and understanding python configuration files, since you will be doing more work than just copying data.

Examples of accessing the data

When you start your analysis, you will locate your data in the Data Aggregation System (DAS), which is described in WorkBookLocatingDataSamples. On this page we give instructions on how to get started with a small data sample. The Dataset Bookkeeping Service (DBS) is no longer used.

Note: Successfully running the examples on this page requires a grid certificate, e.g.:

> source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh

> glite-voms-proxy-init --voms cms

Users intending to reproduce this exercise at the LPC should log into cmslpc-sl5 nodes and execute instead, on bash:

source /cvmfs/cms.cern.ch/cmsset_default.sh
voms-proxy-init -voms cms

and on tcsh:

source /cvmfs/cms.cern.ch/cmsset_default.csh
voms-proxy-init -voms cms

For details please read section:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookStartingGrid

Copy Data Locally

The first example of accessing data is to copy a small amount of data from the local storage element (e.g. castor at CERN) to your own area and to study the data directly with FW Lite. You may choose a different data source by looking at DAS, and verify the CMSSW version using the EDM tools. Let us start with a very simple python configuration script, shown below, and call it copy_cfg.py:

import FWCore.ParameterSet.Config as cms

# Give the process a name
process = cms.Process("PickEvent")

# Tell the process which files to use as the source
process.source = cms.Source ("PoolSource",
          fileNames = cms.untracked.vstring ("/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root")
)

# tell the process to only run over 100 events (-1 would mean run over
#  everything)
process.maxEvents = cms.untracked.PSet(
            input = cms.untracked.int32 (100)
)

# Tell the process what filename to use to save the output
process.Out = cms.OutputModule("PoolOutputModule",
         fileName = cms.untracked.string ("MyOutputFile.root")
)

# make sure everything is hooked up
process.end = cms.EndPath(process.Out)

Save these lines in a file named copy_cfg.py.

Before you run this script, first set up the CMSSW release as shown below (the cmsrel command is needed only if you do not already have the CMSSW directory):

ssh lxplus.cern.ch
cd ~/scratch0
cmsrel CMSSW_5_3_7
cd CMSSW_5_3_7/src
cmsenv
source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh

glite-voms-proxy-init --voms cms 

and then run the script as follows:

cmsRun copy_cfg.py

Users intending to reproduce this exercise on LPC machines should log into cmslpc-sl5.fnal.gov with their respective usernames and do instead, on bash:

source /cvmfs/cms.cern.ch/cmsset_default.sh
voms-proxy-init -voms cms
cd nobackup/
export SCRAM_ARCH=slc5_amd64_gcc462
cmsrel CMSSW_5_3_7
cd CMSSW_5_3_7/src/
cmsenv
cmsRun copy_cfg.py

and on tcsh:

source /cvmfs/cms.cern.ch/cmsset_default.csh
voms-proxy-init -voms cms
cd nobackup/
setenv SCRAM_ARCH slc5_amd64_gcc462
cmsrel CMSSW_5_3_7
cd CMSSW_5_3_7/src/
cmsenv
cmsRun copy_cfg.py

When you run this command the output will look like this:

26-Feb-2014 17:19:17 CET  Initiating request to open file root://eoscms//eos/cms/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root?svcClass=default
140226 17:19:17 30642 Xrd: GoToAnotherServer: Going to: lxfsra06a03.cern.ch:1095
26-Feb-2014 17:19:18 CET  Successfully opened file root://eoscms//eos/cms/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root?svcClass=default
Begin processing the 1st record. Run 1, Event 1, LumiSection 666666 at 26-Feb-2014 17:19:21.639 CET
Begin processing the 2nd record. Run 1, Event 2, LumiSection 666666 at 26-Feb-2014 17:19:21.640 CET
...................
...................
...................
Begin processing the 99th record. Run 1, Event 84, LumiSection 666682 at 26-Feb-2014 17:19:21.852 CET
Begin processing the 100th record. Run 1, Event 85, LumiSection 666682 at 26-Feb-2014 17:19:21.853 CET
26-Feb-2014 17:19:43 CET  Closed file root://eoscms//eos/cms/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root?svcClass=default

=============================================

MessageLogger Summary

 type     category        sev    module        subroutine        count    total
 ---- -------------------- -- ---------------- ----------------  -----    -----
    1 fileAction           -s file_close                             1        1
    2 fileAction           -s file_open                              2        2

 type    category    Examples: run/evt        run/evt          run/evt
 ---- -------------------- ---------------- ---------------- ----------------
    1 fileAction           PostEndRun                        
    2 fileAction           pre-events       pre-events       

Severity    # Occurrences   Total Occurrences
--------    -------------   -----------------
System                  3                   3



The execution of the above command will result in copying 100 events from /store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root to an output file called MyOutputFile.root.
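
To check what the newly created file contains, you can run the EDM utilities described in section 3.3.2 on it, for example:

edmDumpEventContent MyOutputFile.root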

Introduction to copyPickMerge_cfg.py and edmCopyPickMerge

However, there is a simpler and more elegant way to copy events, which avoids having to modify a file like copy_cfg.py every time you need to change the input/output file names or the number of events.

You may look inside copyPickMerge_cfg.py to find out that it is very similar to the copy_cfg.py configuration above, except that it is set up so that you can change many options (e.g., the input and output files) from the command line instead of having to edit the file.

The important lines to observe inside copyPickMerge_cfg.py are:

   fileNames = cms.untracked.vstring (options.inputFiles),

takes the name of the input file(s) as a string.

   input = cms.untracked.int32 (options.maxEvents)

is used to specify the number of events to be read/copied, and

   fileName = cms.untracked.string (options.outputFile)

is used to specify the name of the output ROOT file. They serve the same purpose as the following three lines taken from the copy_cfg.py above:

...
          fileNames = cms.untracked.vstring ("/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root")
...
            input = cms.untracked.int32 (100)
...
         fileName = cms.untracked.string ("MyOutputFile.root")
...

but there is no need to edit this file every time a change is needed, instead, the input parameters are just given from the command line.

You may copy/paste the code lines inside copyPickMerge_cfg.py in your local directory, and you could accomplish the same thing you did with copy_cfg.py above by:

cmsRun copyPickMerge_cfg.py inputFiles=/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root outputFile=MyOutputFile.root maxEvents=100

Since part of the beauty of copyPickMerge_cfg.py is that you don't have to edit it, it is kept in CMSSW in the PhysicsTools/Utilities/Configuration package. To facilitate using it, there is an EDM utility called edmCopyPickMerge, located in the same package, that locates the python configuration copyPickMerge_cfg.py and runs it with cmsRun. If you do not initialize the grid environment, including the certificate, the data file from which you are trying to copy events must be available locally. If you have the grid certificate initialized as mentioned above, i.e.

source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh
glite-voms-proxy-init --voms cms

the script will try to find the right files for you from a remote storage element.

To copy, say, 100 events from a file that is available locally, simply use edmCopyPickMerge as follows:

edmCopyPickMerge \
  inputFiles=/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root \
  outputFile=MyOutputFile.root \
  maxEvents=100

When you execute the above command, the output should look like this.

26-Feb-2014 17:36:16 CET  Initiating request to open file root://eoscms//eos/cms/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root?svcClass=default
140226 17:36:16 4176 Xrd: GoToAnotherServer: Going to: lxfsra06a03.cern.ch:1095
26-Feb-2014 17:36:17 CET  Successfully opened file root://eoscms//eos/cms/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root?svcClass=default
Begin processing the 1st record. Run 1, Event 1, LumiSection 666666 at 26-Feb-2014 17:36:20.551 CET
Begin processing the 2nd record. Run 1, Event 2, LumiSection 666666 at 26-Feb-2014 17:36:20.552 CET
..........................
..........................
..........................
Begin processing the 100th record. Run 1, Event 85, LumiSection 666682 at 26-Feb-2014 17:36:20.770 CET
26-Feb-2014 17:36:42 CET  Closed file root://eoscms//eos/cms/store/relval/CMSSW_5_3_15/RelValPyquen_ZeemumuJets_pt10_2760GeV/DQM/PU_STARTHI53V10A_TEST_feb14-v3/00000/FE0AF9FB-C196-E311-8678-0025904CF75A.root?svcClass=default

=============================================

MessageLogger Summary

 type     category        sev    module        subroutine        count    total
 ---- -------------------- -- ---------------- ----------------  -----    -----
    1 fileAction           -s file_close                             1        1
    2 fileAction           -s file_open                              2        2

 type    category    Examples: run/evt        run/evt          run/evt
 ---- -------------------- ---------------- ---------------- ----------------
    1 fileAction           PostEndRun                        
    2 fileAction           pre-events       pre-events       

Severity    # Occurrences   Total Occurrences
--------    -------------   -----------------
System                  3                   3

A successful copy of 100 events will result in an output ROOT file called MyOutputFile_numEvent100.root. If you do not specify the name of the output file, a file with the default name output_numEvent100.root is created. Make sure you have enough disk space to write the file out.

If you do not have the data file available locally, you can also run a Grid job. For more information on this and other details, have a look at WorkBookPickEvents.

Merge EDM files

To merge EDM files, one can again use the edmCopyPickMerge utility, which is available in any current CMSSW version.

To merge several files together:

edmCopyPickMerge inputFiles=first.root,second.root,third.root outputFile=output.root maxSize=100000

where the input files are first.root, second.root, and third.root, and the output file is output.root; or

edmCopyPickMerge inputFiles_load=listOfInputFiles.txt outputFile=output.root maxSize=100000

where listOfInputFiles.txt is a text file containing a list of input files (one file per line), output.root is the output file, and maxSize gives the maximum size of the output file in kB (e.g., maxSize=1000000 corresponds to roughly 1 GB; the examples above use 100000, i.e. about 100 MB).

Important: In cmsRun, when giving it local files as input, the file names must be prefixed by file:. For example, first.root would be written file:first.root.
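
For example, the merge command above would then read:

edmCopyPickMerge inputFiles=file:first.root,file:second.root,file:third.root outputFile=output.root maxSize=100000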

How to copy a particular event

Note: edmPickEvents.py is a tool that will find the necessary files and run the configuration file below given a dataset name and a list of events.

There is a standard config file that helps you extract single events from CMS data files. The file and the events can be specified on the command line:

cmsRun pickEvent_cfg.py inputFiles=file1.root \
       eventsToProcess=123592:334:755009,123592:23:392793,123592:42:79142 \
       outputFile=output.root

The config file pickEvent_cfg.py is as follows:

import FWCore.ParameterSet.Config as cms
from FWCore.ParameterSet.VarParsing import VarParsing

options = VarParsing ('analysis')
# add a list of strings for events to process
options.register ('eventsToProcess',
                                  '',
                                  VarParsing.multiplicity.list,
                                  VarParsing.varType.string,
                                  "Events to process")
options.parseArguments()

process = cms.Process("PickEvent")
process.source = cms.Source ("PoolSource",
          fileNames = cms.untracked.vstring (options.inputFiles),
          eventsToProcess = cms.untracked.VEventRange (options.eventsToProcess)                               
)

process.Out = cms.OutputModule("PoolOutputModule",
        fileName = cms.untracked.string (options.outputFile)
)

process.end = cms.EndPath(process.Out)

Note: In 123592:334:755009, the first entry is the RUN number, the second entry is the LUMI block number, and the third entry is the EVENT number. If a specified event is not found, the config file will not complain, but it will also not write that event to the output; so one needs to know which events to copy. Also make sure you have permission to write the output file (output.root, as shown above) to the chosen directory, and that you have enough disk space for the copy.
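
If you prefer, the list of events can also be written directly into the configuration instead of being passed on the command line. A minimal sketch, reusing the run:lumi:event values from the example above:

process.source.eventsToProcess = cms.untracked.VEventRange(
    '123592:334:755009',
    '123592:23:392793',
    '123592:42:79142'
)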

Important: In cmsRun, when giving it local files as input, the file names must be prefixed by file:. For example, first.root would be written file:first.root.

Find Collision Data

The updated information on

Review status

Reviewer/Editor and Date (copy from screen) Comments
XuanChen - 07 Jul 2014 Updated cvs to github
AntonioMorelosPineda - 26-Feb-2014 Update sample file
HengneLiUVa - 24-May-2013 add note of requirement of grid env.
AntonioMorelosPineda - 18-May-2013 Updates to 5_3_7 files
KatiLassilaPerini - 24-Mar-2011 Updates to 4_1_3 files
KatiLassilaPerini 11 Dec 2009 this page now explains how to get some events quickly, all further details are in WorkBookLocatingDataSamples and in WorkBookDataManagementBackground
SudhirMalik- 4 Nov 2009 updated examples to CMSSW_3_3_1, updated DBS snapshots
KatiLassilaPerini - 28 Feb 2008 removed the LPC samples

I went through chapter 3 section 3 subsection 1. The information is relevant and clear.

I updated a comment on DAS; DBS is no longer used.

Responsible: SudhirMalik
Last reviewed by: AntonioMorelosPineda - 18 May 2013



3.3.2 EDM tools to list event content and examine provenance

Complete: 4
Detailed Review status

Goals of this page:

This page describes how to:

  • see which collections are present in the events, using edmDumpEventContent
  • find out how these collections were made, using edmProvDump

Contents

Listing the content of a data file with edmDumpEventContent

Shows which products exist in the file:

NOTE: You may use the file copied in the previous section (e.g. MyOutputFile.root) to explore these EDM tools. The exact contents may be different, but the concept remains the same.

unix> edmDumpEventContent --help
usage: edmDumpEventContent [options] templates.root
Prints out info on edm file.

options:
  -h, --help     show this help message and exit
  --name         print out only branch names
  --all          Print out everything: type, module, label, process, and
                 branch name
  --regex=REGEX  Filter results based on regex

unix> edmDumpEventContent simemu100.root
L1GlobalTriggerObjectMapRecord    "hltL1GtObjectMap"      ""            "HLT."         
L1GlobalTriggerReadoutRecord      "hltGtDigis"            ""            "HLT."         
L1MuGMTReadoutCollection          "hltGtDigis"            ""            "HLT."         
(snipped)

If you want to print out the --all information, but only for those branches that, for example, contain the words recoMuon or caloJet:

unix> edmDumpEventContent --all --regex recoMuon --regex caloJet relvalZmumu_300_pre10.root
vector<reco::CaloJet>             "iterativeCone5CaloJets"    ""            "VALIDATION."   : recoCaloJets_iterativeCone5CaloJets__VALIDATION
vector<reco::CaloJet>             "kt4CaloJets"           ""            "VALIDATION."   : recoCaloJets_kt4CaloJets__VALIDATION
vector<reco::CaloJet>             "kt6CaloJets"           ""            "VALIDATION."   : recoCaloJets_kt6CaloJets__VALIDATION
vector<reco::CaloJet>             "sisCone5CaloJets"      ""            "VALIDATION."   : recoCaloJets_sisCone5CaloJets__VALIDATION
vector<reco::CaloJet>             "sisCone7CaloJets"      ""            "VALIDATION."   : recoCaloJets_sisCone7CaloJets__VALIDATION
vector<reco::Muon>                "muons"                 ""            "VALIDATION."   : recoMuons_muons__VALIDATION

Just running it on a PAT-tuple (e.g. the output of a "StarterKit" process) would result in

unix> edmDumpEventContent my-pat-tuple.root

vector             "iterativeCone5CaloJets"    ""            "RECO."        
vector             "kt4CaloJets"           ""            "RECO."        
vector             "kt6CaloJets"           ""            "RECO."        
vector             "sisCone5CaloJets"      ""            "RECO."        
vector             "sisCone7CaloJets"      ""            "RECO."        
vector             "selectedLayer1Electrons"    ""            "StarterKit."  
vector                  "selectedLayer1Jets"    ""            "StarterKit."  
vector                  "layer1METs"            ""            "StarterKit."  
vector                 "selectedLayer1Muons"    ""            "StarterKit."  
vector               "selectedLayer1Photons"    ""            "StarterKit."  
vector                  "selectedLayer1Taus"    ""            "StarterKit." 

Learning about the history of a file with edmProvDump

Prints out all the tracked parameters which were used to create this file.

usage: edmProvDump [options] <root-file>
--sort - sorts the resulting dump so that it can be reliably diff'd to a dump from a different file.
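
A concrete invocation on the PAT-tuple from the edmDumpEventContent example above would be:

unix> edmProvDump my-pat-tuple.root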

Doing it on the above PAT-tuple would give

Event filtering information for 4 processing steps is available.
The ParameterSets will be printed out, with the oldest printed first.
ParameterSetID: 1258c9d7981c36123064b138d9233d2d
{
  SelectEvents: vstring tracked  = {'generation_step'}
}{
}{
}
     -------------------------
ParameterSetID: 38559c871fba28d992ead51549367f83
{
}{
}{
}
     -------------------------
ParameterSetID: 38559c871fba28d992ead51549367f83
{
}{
}{
}
     -------------------------
ParameterSetID: b850a832c27d7cd97df0b911d2dfd8c6
{
  SelectEvents: vstring tracked  = {'p'}
}{
}{
}
     -------------------------
Processing History:
  HLT '' '"CMSSW_3_1_1"' [1]  (63a3825c6b98e7b503180638b154b2a1)
    RECO '' '"CMSSW_3_1_1"' [1]  (51094fc9c1ad98ce2f8f56ab219380b4)
      StarterKit '' '"CMSSW_3_1_1"' [1]  (31561e50c5737ab27fd03e377818e261)
  RECO '' '"CMSSW_3_1_1"' [2]  (51094fc9c1ad98ce2f8f56ab219380b4)
    StarterKit '' '"CMSSW_3_1_1"' [1]  (31561e50c5737ab27fd03e377818e261)
---------Event---------
Module: genCandidatesForMET HLT
 PSet id:b3d215eebed2e6ff54e7fee00fc2aec3
 products: {
  recoGenParticlesRefs_genCandidatesForMET__HLT.
}
 parameters: {
  @module_label: string tracked  = 'genCandidatesForMET'
  @module_type: string tracked  = 'InputGenJetsParticleSelector'
  excludeFromResonancePids: vuint32 tracked  = {}
  excludeResonances: bool tracked  = false
  ignoreParticleIDs: vuint32 tracked  = {1000022,2000012,2000014,2000016,1000039,5000039,4000012,9900012,9900014,9900016,39,12,13,14,16}
  partonicFinalState: bool tracked  = false
  src: InputTag tracked  = genParticles
  tausAsJets: bool tracked  = false
}{
}{
}
...
and so on for each of the modules used in the previous processing of this file. At the end, the Event Setup producers (ESSources and ESModules) will also be shown:
...
ESModule: TrackerGeometricDetESModule StarterKit
 parameters: {
  @module_label: string tracked  = ''
  @module_type: string tracked  = 'TrackerGeometricDetESModule'
  fromDDD: bool tracked  = true
}{
}{
}
ESModule: TrackerRecoGeometryESProducer StarterKit
 parameters: {
  @module_label: string tracked  = ''
  @module_type: string tracked  = 'TrackerRecoGeometryESProducer'
}{
}{
}
ESModule: TransientTrackBuilderESProducer StarterKit
 parameters: {
  @module_label: string tracked  = ''
  @module_type: string tracked  = 'TransientTrackBuilderESProducer'
  ComponentName: string tracked  = 'TransientTrackBuilder'
}{
}{
}
ESModule: VolumeBasedMagneticFieldESProducer StarterKit
 parameters: {
  @module_label: string tracked  = ''
  @module_type: string tracked  = 'VolumeBasedMagneticFieldESProducer'
  overrideMasterSector: bool tracked  = false
  paramLabel: string tracked  = 'parametrizedField'
  scalingFactors: vdouble tracked  = {1,1,0.994,1.004,1.004,1.005,1.004,1.004,0.994,0.965,0.958,0.958,0.953,0.958,0.958,0.965,0.918,0.924,0.924,0.906,0.924,0.924,0.918,0.991,0.998,0.998,0
.978,0.998,0.998,0.991,0.991,0.998,0.998,0.978,0.998,0.998,0.991,0.991,0.998,0.998,0.978,0.998,0.998,0.991}
  scalingVolumes: vint32 tracked  = {+14100,+14200,+17600,+17800,+17900,+18100,+18300,+18400,+18600,+23100,+23300,+23400,+23600,+23800,+23900,+24100,+28600,+28800,+28900,+29100,+29300,+29
400,+29600,+28609,+28809,+28909,+29109,+29309,+29409,+29609,+28610,+28810,+28910,+29110,+29310,+29410,+29610,+28611,+28811,+28911,+29111,+29311,+29411,+29611}
  useParametrizedTrackerField: bool tracked  = true
  version: string tracked  = 'grid_1103l_090322_3_8t'
}{
}{
}
ESModule: ZdcHardcodeGeometryEP StarterKit
 parameters: {
  @module_label: string tracked  = ''
  @module_type: string tracked  = 'ZdcHardcodeGeometryEP'
}{
}{
}

Review status

Reviewer/Editor and Date (copy from screen) Comments
PetarMaksimovic - 02 Aug 2009 Created.
Responsible: PetarMaksimovic
Last reviewed by: SudhirMalik - 26 Mar 2010

3.4 Physics Analysis Oriented Event Display ( Fireworks / cmsShow )

Fireworks for all release cycles is available as part of the CMSSW distribution. See the section Using cmsShow from a CMSSW release for details.

Latest version of Fireworks


May 26th 2019: release based on CMSSW_10_6_X


CMSSW_10_6_X data
  Linux 64-bit: %FWLINUX_106%
  Mac 10.14 (Mojave, conda tool required; see installation instructions): %FWOSX_106%

CMSSW_10_2_X data
  Linux 64-bit: %FWLINUX_102%
  Mac 10.14 (Mojave, system headers required): %FWOSXMOJAVE_102%
  Mac 10.13 (High Sierra): %FWOSX_102%

CMSSW_9_4_X data
  Linux 64-bit: %FWLINUX_9_4%
  Mac High Sierra: %FWOSX_9_4%

Supported platforms
  • Linux: 64-bit SLC6 is the official platform. Fireworks works also on all newer GNU/Linux distributions; some Xorg / driver combinations can cause problems (e.g. Intel graphics cards).
  • Mac HighSierra: install Xcode command-line tools: xcode-select --install.
  • Mac Mojave: install the Xcode command-line tools (xcode-select --install) and manually install the system headers: open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg

Contents

Introduction

Fireworks is the CMS event-display project and cmsShow is the official name of the executable. Both names are used interchangeably so don't get confused.

The core of Fireworks is built on top of the Event Data Model (EDM) and the light version of the software framework (FW Lite). The Event Visualization Environment (EVE) of ROOT is used to manage 3D and 2D views, selection, and user interaction with the graphics windows. Several EVE components were developed in collaboration between Fireworks and ROOT. In event-display operation, simple plugins are registered into the system to perform the conversion from EDM collections into their visual representations. As a guiding principle, Fireworks shows only what is available in the EDM event data; no reconstruction or result enhancement is performed internally. The visibility of collection elements can be filtered via a generic expression (the PAT parser is used internally).

When run in standalone mode (as cmsShow), Fireworks reads data from an EDM ROOT file (or a collection of files). Remote file access is possible for castor, dcache, http, and xrootd. Flexible event filtering is supported by managing a set of filters that can be enabled, disabled, and-ed, or or-ed on the fly. TTree::Draw() is used internally to select events matching the individual filtering expressions. Full event navigation, as well as direct event access by providing run, lumi and event ids, is possible.
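
For example, entering a filter expression such as

$Electrons.pt()>20&&$MET.pt()>30

keeps only events with at least one electron above 20 GeV and missing Et above 30 GeV; the same expression is used in the tutorial in section 3.4.2.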

When run within the full CMSSW framework (as cmsShowFF), data are read from the in-memory edm::Event after the standard event processing is complete. A user can investigate the registered paths, modules and their execution status. Furthermore, module parameters can be changed and the event processing repeated with the changed parameters. Only minimal event navigation is supported, and event filtering is not supported in this mode.

The following considerations were made to make Fireworks usable by everybody:

  • Given that event displays are not everyday tools, the program interface is optimized for intuitive and simple use.
  • A 3D accelerator is recommended, but not required. The new event display is routinely used and tested on non-accelerated computers to make sure the performance is reasonable.
  • It is also distributed as a stand-alone player with all necessary components: no need for remote use over X11.

Availability

Fireworks is available as part of the CMSSW release and as a stand-alone tarball. The stand-alone option is available for both 64-bit Linux (built on SLC5, but it should run on all Linux distributions) and Mac OS X. All necessary components, including a proper version of ROOT, are distributed inside the tarball.

Using cmsShow from a CMSSW release

Fireworks has been integrated into CMSSW since release CMSSW_3_1_0_pre6. To run cmsShow, set up the environment and run cmsShow with your sample file:

cmsShow data.root

At CERN you may try to use the code from the AFS area without creating a SCRAM working area, e.g.:

cd /afs/cern.ch/cms/slc5_ia32_gcc434/cms/cmssw/CMSSW_3_9_5
eval `scramv1 runtime -csh` # or -sh if use bash
cd /tmp
cmsShow rfio:/castor/cern.ch/cms/store/relval/CMSSW_3_9_5/RelValTTbar/GEN-SIM-RECO/MC_39Y_V6-v1/0009/14F2D01A-9BFA-DF11-8B0F-0018F34D0D62.root

For advanced users it is also possible to run cmsShow within the full CMSSW framework. In this mode the event display is executed with the cmsShowFF script, with a python configuration file as an argument. See more info at WorkBookCmsShowFF.

Using cmsShow tarball

All that you need to get started is a proper download of the stand-alone event display for your operating system and a data file to look at. The display is distributed with 2 small EDM ROOT files containing collision data (data.root) and Monte Carlo events (mc.root). Running cmsShow is simple.
Unpack the tarball and run it:

tar xzf cmsShow-version.tar.gz 
cd cmsShow-version
./cmsShow data.root

Solution For Unsupported Operating Systems

Virtual Machine

It is possible to run cmsShow on a 32-bit OS or on Windows, if the computer has a 64-bit architecture, by creating a 64-bit Linux virtual machine. Detailed instructions are at VBox-Chapter01; here is an extract tested on Fedora and Snow Leopard:

Windows

There is no native support for Fireworks on Windows. You can set up a virtual machine or try the following solutions:

  • Use Exceed3D:
    • Exceed3D is a more expensive version of the usual Exceed package, which allows one to log in to Linux machines from Windows.
    • If you have it installed, Fireworks can be used straightforwardly, although it may be slightly slower than on a Linux machine.

  • Running on a remote desktop
    It is possible to get the event display running with a proper version of Xming-mesa. The default OpenGL that Windows provides with Xming is not usable. One way to get proper OpenGL support is to use Xming with Mesa drivers (software rendering). Just install this X-server and use it. Useful links:
    Xming main web site
    Xming-mesa-6-9-0-31-setup.exe

ChangeLog

  • cmsShow-10.2, released 09-18-2018
    • Update DataFormats to CMSSW_10_2_4
    • Add more option for TGeo geometry display
    • Fix bug in parsing of parameters in OSX version

  • cmsShow-10.0, released 08-03-2018
    • Update DataFormats to CMSSW_10_0

  • cmsShow-9.4-1, released 16-01-2018
    • Fix X flip in pixel reco geometry

  • cmsShow-9.4, released 16-01-2018
    • Update DataFormats to CMSSW_9_4_2

  • cmsShow-9.2-1, released 21-07-2017
    • OSX: use native Cocoa GUI
    • Add option to shift projection origin and origin of Jets to one of primary vertices

  • cmsShow-9.2, released 20-06-2017
    • Update DataFormats to CMSSW_9_2_2
    • Fixes in pixel geometry
    • Add option to draw Jets from the BeamSpot

  • cmsShow-9.0-2, released 15-05-2017
    • Use 2017 geometry by default

  • cmsShow-9.0-1.mac.tar.gz, released 26-04-2017
    • Fix crash in view controller (cocoa only)

  • cmsShow-9.0, released 26-04-2017
    • CMSSW_9_0_pre3 DataFormats
    • native Cocoa GUI on OSX

  • cmsShow-8.1-2, released 06-02-2017 for Linux OS
    • Fix crash on the next event: changes in gcc dependencies

  • cmsShow-8.1, released 06-01-2017
    • Update data formats to 8_1_0_pre10

  • cmsShow-8.0-2, released 01-08-2016
    • Improve ecal rechits detail views for reco::Electron, reco::Photon and reco::Muon
    • Add possibility to open ecal rechit detail view for any edm type
    • Fix visibility of 4th ME station in 3D view

  • cmsShow-8.0-1, released 18-05-2016
    • Introduce color palettes
    • Fix problem with detail view plugin detection
    • Improve table view content in standard configurations
    • Set clipping with offset in 3D region view

  • cmsShow-7.6-1, released 18-05-2016
    • Fix problem with detail view plugin detection

  • cmsShow-8.0, released 15-03-2016
    • CMSSW_8_0_2 DataFormats
    • Update reco geometry file (tag 2015)

  • cmsShow-7.6, released 29-01-2016
    • CMSSW_7_6_3 DataFormats

Links to further documentation

Presentations

Support and Feedback

Most error messages are not informative enough to tell what is wrong, so we need more information and detailed instructions on how to reproduce your problem. Therefore, whenever you report a problem, please include the following information:

  • the type of your operating system and its version,
  • the output of "glxinfo",
  • if applicable, a crash report produced by running with the debug option: cmsShow -d,
  • instructions on how to reproduce the problem.

Please send all your requests and comments to hn-cms-visualization (remove SPAMNOT)

Review status

Reviewer/Editor and Date (copy from screen) Comments
AntonioMorelosPineda 15 Feb 2013 Upgrade to workbook style
MatevzTadel - 08 Mar 2011 Created.


Responsible: MatevzTadel
Last reviewed by: AntonioMorelosPineda - 15 Feb 2013




3.4.2 Fireworks Tutorial

Getting started

WorkBookFireworks gives the full set of instructions needed to get started. Here we just do the absolute minimum:

Linux box on FNAL network

/uscmst1/prod/sw/cmsshow/cmsShow22/cmsShow /uscmst1/prod/sw/cmsshow/cmsShow22/data.root

Linux box on CERN network

/afs/cern.ch/cms/fireworks/cmsShow22/cmsShow /afs/cern.ch/cms/fireworks/cmsShow22/data.root

Standalone Linux box

cd /tmp
wget http://cern.ch/cms-sdt/fireworks/cmsShow22.tar.gz
tar xzf cmsShow22.tar.gz
cd cmsShow22
./cmsShow data.root

Overview

The following 10-minute video gives a clear introduction to the most basic aspects of the event display. (A QuickTime player is required.)

Video Introduction

  • Demonstration of various views
  • Adding and removing collections
  • Usage of configuration files
  • Event filtering

Physics Case

Drell-Yan (Z+jets)

In this example we will look at simulated Drell-Yan events from the Fall08 Madgraph dataset: /ZJets-madgraph/Fall08_IDEAL_V9_v1/GEN-SIM-RECO. To demonstrate some important features of the event display we will look only at events with at least one electron with pT greater than 20 GeV and missing Et greater than 30 GeV. If you access the original dataset files, you can limit the events you look at by typing
$Electrons.pt()>20&&$MET.pt()>30
in the event filter field. A small preselected data file can be downloaded from the web (~60 MB). No filtering is needed with this small file.
wget http://cern.ch/cms-sdt/fireworks/data/zjets_el20_met30.root

Drell-Yan events with Z->ee or Z->mumu have no natural sources of missing energy once the missing Et is corrected for the low response of muons in the calorimeter. By requiring one energetic electron we will see mostly Z->ee events.

Run 2 Event 7

This is a typical example of Z->ee+1jet, where the calorimeter response underestimates the jet energy and leads to missing Et well aligned with this jet. Put the Rho-Phi view in the main view to see the transverse energy balance more clearly.

Run 2 Event 39

We see one electron not balanced by anything. To understand what happened, put the Rho-Z view in the main view. Pay attention to the very forward region. You may find significant energy deposition there, as well as a GenParticle pointing toward it. Move your mouse over the particle line and you will get information about it: "gen e-, Pt: 15.9GeV". This is a case of limited tracker acceptance. The missing Et that we show here does not take the very forward detector into account, which may lead to significant fake missing Et due to beam-induced background.

Run 2 Event 175

Go to this event by typing 175 in the Event field. Switch to the 3D view or the Lego view as the main view. Here we clearly see a case of an isolated electron and muon in the final state. To understand what happened, we may look at the generator-level information since it is available. Right-click on the GenParticles item in the List View and you get the Collection controller. Let's specify that we only want to see leptons and neutrinos: in Filter, type "abs(pdgId())>10&&abs(pdgId())<17". Now in Select you may try to highlight the different lepton types one by one and they will be highlighted in all views. When you put abs(pdgId())==15, you will see 2 back-to-back tracks. So we have a case of Z->tautau, where one tau decays to e+2nus and the other to muon+2nus. The missing Et is aligned with the muon due to the low response of muons in the calorimeter (the MET was not corrected for muons).

Run 2 Event 443

2 muons and 2 electrons are reconstructed in this event. Opening the lists of muons and electrons in the list view and selecting them one by one, you may notice that muon1 and electron1 are very close to each other and that there is significant energy deposition in ECAL (red). The muon looks pretty good and it cannot radiate that much at this energy. So what do we have? It can be Z->mumugamma (internal radiation). Let's look at the generator-level photons by filtering GenParticles with "pdgId()==22&&et()>10". Switch off electrons and muons to clean up the area and you will see 2 generator-level photons pointing into this area (why 2 and not 1 is a good question for the Madgraph experts).

Run 2 Event 487

Another case of Z->tautau->emu+4neutrinos.

Run 2 Event 20

A very nice example of fake missing Et due to jet energy underestimation in the calorimeter.

-- DmytroKovalskyi - 15 Jan 2009



3.5.1 Getting Started with FWLite

Complete: 5

Contents


Detailed Review status

3.5.1.1 The goals of this page:

In this section you will learn:

  • what FWLite is.
  • how to download it if you want to install it as a stand-alone package on your personal computer (laptop or desktop).
  • how to set up the ROOT environment (by adding commands to rootlogon.C).
  • how to explore CMS data interactively -- analogous to TTree::Draw().

FWLite can be downloaded with apt-get using the instructions from this link. You can also jump directly to instructions on how to write ROOT macros that analyze CMS data directly -- analogous to the result of TTree::MakeClass() -- in WorkBookFWLiteEventLoop, or to the examples in WorkBookFWLiteExamples.

3.5.1.2 What is FWLite

FWLite is plain ROOT that is capable of reading CMS DataFormats, together with some helper classes. You can use FWLite to analyse the data. CMS uses ROOT to make data objects persistent. CMS data formats are thus "ROOT-aware"; that is, once the shared libraries containing the ROOT-friendly description of CMS data formats are loaded into a ROOT session, these objects can be accessed and used directly from within ROOT like any other ROOT class!

FWLite (pronounced "framework-light") is just that -- a ROOT session with CMS data format libraries loaded. In addition, CMS provides a couple of classes that greatly simplify the access to the collections of CMS data objects. Moreover, these classes (Event and Handle) have the same name as analogous ones in the Full Framework; this mnemonic trick helps in making the code to access CMS collections very similar between the FWLite and the Full Framework (CMSSW). To learn about ROOT have a look to WorkBookBasicROOT or to the ROOT homepage.

3.5.1.3 How to download FWLite for your laptop

So now you are curious and would like to try FWLite. You want to analyse your data file -- say, a file with skimmed PAT-tuples that your group has produced -- and you may be in one of these two situations:

  1. You are on a Scientific Linux system with CMSSW installed.
  2. You are on some other machine, your own desktop or even more likely your own laptop.

Read the subsection below which applies to your case.

FWLite on a machine with CMSSW installed

If you are using a machine with CMSSW installed (currently available on either Scientific Linux 4 or Scientific Linux 5), then you don't have to do anything! FWLite is a part of the full framework (CMSSW).

Installing CMSSW Full Framework

If you wish to install the CMSSW full framework, you can follow the link here to install via apt-get.

FWLite on other flavors of Linux, Windows, and Mac

The final stage of the analysis is usually very interactive, does not require access to conditions, and is based on skimmed, stripped (and possibly thinned) data samples. The size of such a sample is usually small enough for it to be copied to your desktop or laptop. Analysis in ROOT is particularly suitable for laptops, and can proceed without a network connection. The traditional way of exploring the data interactively is to make a TTree in the Full Framework, copy the ROOT file with it to one's laptop, install ROOT, and run. Since we want to enhance ROOT with CMS data format libraries, we need to install the CMS version of ROOT (which is safer anyway, as CMS patches ROOT when required) and also install the libraries with the CMS data formats and the helper classes.

Most laptops run Windows, Mac OS, or one of the modern flavors of Linux (e.g. Ubuntu). The pages below are dedicated to installations on each of these systems:

You are encouraged to try them out and give feedback to the Software Development Tools group by sending email to the sw-developmenttool HN.

3.5.1.4 How to set up your environment

(Note: this is a copy of WorkBookSetComputerNode#RooT, but with additional libraries loaded in the rootlogon.C file.)

Before running ROOT on your data file, set up ROOT to load all the necessary CMS-specific libraries by default by creating a $HOME/rootlogon.C file with the following commands:

{
  // Set up FW Lite for automatic loading of CMS libraries
  // and data formats.   As you may have other user-defined setup
  // in your rootlogon.C, the CMS setup is executed only if the CMS
  // environment is set up.
  //
  TString cmsswbase = getenv("CMSSW_BASE");
  if (cmsswbase.Length() > 0) {
    //
    // The CMSSW environment is defined (this is true even for FW Lite)
    // so set up the rest.
    //
    cout << "Loading FW Lite setup." << endl;
    gSystem->Load("libFWCoreFWLite.so");
    FWLiteEnabler::enable();
    //AutoLibraryLoader::enable();
    gSystem->Load("libDataFormatsFWLite.so");
    gSystem->Load("libDataFormatsPatCandidates.so");
   }
}

Then edit your .rootrc file (this file can be in your home directory or in the current working directory, the latter taking precedence; if it does not exist yet, you can create it) to point to rootlogon.C; e.g., add the line:

#  Tell ROOT where to find rootlogon.C: 
Rint.Logon: $(HOME)/rootlogon.C

Finally, you can start ROOT

$ root -l              #the -l, lower case L, is optional, and if used, omits the logo

and you are ready to play!

Additional information for Mac OS X Users

There are a few additional steps that are useful for Mac OS X users; they are outlined in the README.txt of the FWLite tarballs and summarized here. Instead of the usual CMSSW setup, please use (assuming you downloaded CMSSW_4_1_3_FWLITE and moved that folder under /Applications/FWLite):

for csh:

setenv SCRAM_ARCH osx106_amd64_gcc421
setenv VO_CMS_SW_DIR /Applications/FWLite/CMSSW_4_1_3_FWLITE/
cd $VO_CMS_SW_DIR/
source $VO_CMS_SW_DIR/fwlite_setup.csh

for bash:

export SCRAM_ARCH=osx106_amd64_gcc421
export VO_CMS_SW_DIR=/Applications/FWLite/CMSSW_4_1_3_FWLITE/
cd $VO_CMS_SW_DIR/
source $VO_CMS_SW_DIR/fwlite_setup.sh

This will allow you to use FWLite effectively on the Mac. You can then do as normal:

cd $VO_CMS_SW_DIR/work/CMSSW_4_1_3_FWLITE/src/
cmsenv

or make a new project area as you prefer.

There are a few tweaks you may have to apply (depending on your Mac OS X version) to get python working correctly. They must be set again after each "cmsenv" command. These are the known fixes if you have python problems:

setenv PYTHONDIR /Applications/FWLite/CMSSW_3_5_6_FWLITE/osx105_ia32_gcc401/external/python/2.6.4-cms
setenv PYTHONHOME /Applications/FWLite/CMSSW_3_5_6_FWLITE/osx105_ia32_gcc401/external/python/2.6.4
setenv LD_LIBRARY_PATH $ROOTSYS/lib:$PYTHONDIR/lib:$LD_LIBRARY_PATH
setenv PYTHONPATH $ROOTSYS/lib:$PYTHONPATH

3.5.1.5 Your first plot in FWLite

From the ROOT prompt, open the MYCOPY.root file that you made in WorkBookDataSamples. It contains a collection of RECO objects. Opening the file may take some time.

root[0] TFile f("MYCOPY.root");
Look at the ROOT TBrowser to see which collections are in the file.
root[1] new TBrowser;

Double-click on ROOT Files, then MYCOPY.root, then Events (the event tree). Drill down through recoTracks_generalTracks__RECO to recoTracks_generalTracks__RECO.obj.

recoTracks_generalTracks_fwlite.png

What you see here are the methods of the stored recoTracks. Since covariance() returns a class, the icon shows a folder.

While you still have the ROOT browser displayed, go back to your ROOT session and set an alias for each of the branches you want to access (we'll choose recoTracks and recoTrackExtras). Use the branch name that's visible in the TBrowser, and add obj at the end, as follows:

root [8] Events->SetAlias("tracks","recoTracks_generalTracks__RECO.obj");        
root [9] Events->SetAlias("trackExtras","recoTrackExtras_generalTracks__RECO.obj");

Note: SetAlias commands work only if the ROOT browser is still up.

You can look at the examples contained in the file PhysicsTools/RecoExamples/test/trackPlots.C and see how some of them work. Note that the names of the objects in trackPlots.C may be different, but it gives you a sense of how to use Draw(), etc.

// plot chi-squared divided by n. degrees of freedom
root [10] Events->Draw("tracks.normalizedChi2()", "tracks.chi2()<1000")        

(This image is shown for all the 20 events, each containing several tracks, from the full MYCOPY.root file.)

Now if you run this same "Events.Draw(...)" command in Bare ROOT (after setting the alias), you'll get the message
Part of the Variable "tracks.chi2()" exists but some of it is not accessible or useable.

Now run:

root[5] Events.Draw("trackExtras.recHitsSize()");

Of course, this approach does not scale when you want to simultaneously make more than a couple of histograms, and, especially, if you need to compute anything non-trivial. Eventually, you'll need to have a ROOT macro, which will do all the work. This is covered in section 3.5.2 Event Loop in FWLite.

Review status

Reviewer/Editor and Date (copy from screen) Comments
ElizaMelo - 16 Dec 2018 review & update the input file and rootlogon.C
AntonioMorelosPineda - 13 May 2013 review & update a link
RogerWolf - 22 Mar 2011 Update to CMSSW_4_1_3

Responsible: RogerWolf
Last reviewed by: AntonioMorelosPineda - 15 May 2013



3.5.2 Event Loop in FWLite

Complete: 5

Contents


Detailed Review status

The goals of this page:

In this page you will learn how to

  • write ROOT macros that analyze CMS data directly
  • compile these ROOT macros to make them faster and safer

Those users who like command line options can have a look at SWGuideCommandLineParsing for command line parsing instructions.

The structure of a FWLite macro

Let us illustrate the access to EDM collections in FWLite with an example. But before we begin, the following lines need to be executed from within ROOT

{
   gSystem->Load("libFWCoreFWLite.so"); 
   FWLiteEnabler::enable();
   gSystem->Load("libDataFormatsFWLite.so");
   gSystem->Load("libDataFormatsPatCandidates.so");
}

As discussed in the previous section, it is best to place these lines inside the rootlogon.C file, so that they are executed automatically and you don't need to worry about them.

Here's a sketch of a macro that loops over events, and in each event retrieves an ED object:

void print_data()
{
   #include "DataFormats/FWLite/interface/Handle.h"
   TFile file("MYCOPY.root");

   fwlite::Event ev(&file);

   for( ev.toBegin(); ! ev.atEnd(); ++ev) {
       fwlite::Handle<std::vector<...> > objs;
       objs.getByLabel(ev,"....");
       // now can access data
       std::cout <<" size "<<objs.ptr()->size()<<std::endl;
       ...
   }
}

ALERT! Note: When building a macro, there is a particular order of calls to ROOT methods which is required for ROOT to function properly:

  1. Start the autoloader
  2. Create the TFile
  3. Load in the helper library gSystem->Load("libDataFormatsFWLite")
  4. Include "DataFormats/FWLite/interface/Handle.h" to get the fwlite::Handle
  5. Create an fwlite::Event by passing to the constructor a pointer to the TFile. The fwlite::Event allows you to use the same information you use when accessing data in cmsRun using the edm::Event.
  6. Create a for loop over events
    1. Start loop by calling toBegin() on the fwlite::Event
    2. For each iteration of the loop call atEnd on the fwlite::Event
    3. At the end of each iteration, increment the fwlite::Event by using the operator++ method
  7. When looping over the events
    1. Create an fwlite::Handle< ... > where the template argument is the C++ class you want to get from the event
    2. Call the getByLabel method of the fwlite::Handle passing the event and the strings or edm::InputTag used to specify the object
    3. The object retrieved from the handle is either a single number, or a collection of CMS objects (e.g. Tracks). In the latter case, you can loop the vector of objects using the standard iterator notation (same as in the FullFramework).
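
Putting these steps together, a concrete, filled-in version of the sketch could look as follows; it uses the MYCOPY.root file and the generalTracks collection from the previous section (adapt the file name and label to your own sample):

void print_data()
{
   #include "DataFormats/FWLite/interface/Handle.h"
   TFile file("MYCOPY.root");

   fwlite::Event ev(&file);

   for( ev.toBegin(); ! ev.atEnd(); ++ev) {
       // retrieve the collection of reconstructed tracks from the event
       fwlite::Handle<std::vector<reco::Track> > tracks;
       tracks.getByLabel(ev,"generalTracks");
       // now the data can be accessed
       std::cout << " number of tracks " << tracks.ptr()->size() << std::endl;
   }
}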

If you save the above macro in a file called print_data.C, you can load it and execute it from the interactive ROOT session with

root[] .L print_data.C
root[] print_data()

or simply

.x print_data.C

However, note that in the above case the macro print_data.C is interpreted by CINT. This is okay for rapid prototyping; however, for real data processing one should always compile it with ACLiC, for several reasons:

  • The macro will be significantly faster, as ACLiC invokes the native compiler (so on Linux it's the same version of GCC used to build ROOT)
  • CINT is good, but it is not 100% reliable in its computations, and it is better not to risk running into one of the rare cases which are not correctly handled by CINT
  • Compiled code can be debugged, profiled, or processed with valgrind (especially when linked into a standalone executable)

Compiling the macro with ACLiC

Compiling the macro on the fly (while it's being loaded) is trivial. One simply adds a "+" at the end of the macro name:

   TFile f("....root");
   .x print_data.C+

ROOT then first invokes GCC from the release, makes a shared library, and loads it. Note that "++" would always cause compilation, while "+" would recompile an already existing macro only if the source code is newer than the compiled code, just like make.

ALERT! Note: There are several pieces of information you need to know about how ROOT compiles a macro:

  1. ROOT runs the CINT interpreter over the macro before compiling it in order to determine if it needs to generate dictionaries for some of the classes/functions mentioned in the macro. Unfortunately many of CMS's header files contain code that CINT can not parse. This means we must hide those headers from CINT but make sure they are visible to the compiler.
  2. After compiling the code, ROOT links the compiled macro against all libraries which have been loaded. If the macro uses a function or variable from a library which has not yet been loaded, then ROOT will issue a 'missing symbol' error.

With the above in mind, here are the steps needed to arrive at a compiled ROOT macro:

  1. Protect all headers using a
    #if !defined(__CINT__) && !defined(__MAKECINT__) #endif
    block
  2. Introduce the macro block as a function with the same name as the file (this keeps CINT from trying to fully parse the internals of the routine)
  3. In ROOT
    1. Load and start the autoloader
    2. Load libDataFormatsFWLite
    3. Create a TFile from one of the files you want to read. This will cause all the libraries for every class in the file to be loaded.
    4. Compile/link/execute the macro by doing .x <filename>++

An example macro template is shown below. The example assumes that the file is named print_data.C

#if !defined(__CINT__) && !defined(__MAKECINT__)
#include "DataFormats/FWLite/interface/Handle.h"
#include "DataFormats/FWLite/interface/Event.h"
//Headers for the data items
...
#endif
void print_data() {
  TFile file("....root");

   fwlite::Event ev(&file);

   for( ev.toBegin(); ! ev.atEnd(); ++ev) {
       fwlite::Handle<std::vector<...> > objs;
       objs.getByLabel(ev,"....");
       //now can access data
       std::cout <<" size "<<objs.ptr()->size()<<std::endl;
       ...
   }
}

ALERT! Note: A note on the preprocessor switches (testing on CINT and MAKECINT): The first step that ROOT takes when compiling a macro is to run the CINT interpreter over the macro in order to determine what class or function 'dictionaries' it must create. After that step, the regular C++ compiler is used to build the code. Unfortunately, CINT is incapable of properly parsing many of our header files. However, it turns out the headers are not needed by CINT but only by the compiler, therefore adding

#if !defined(__CINT__) && !defined(__MAKECINT__)
...
#endif

around the header files avoids the problem with CINT.

However, the compiler still needs to know where to find our header files. FWLite pre-configures ROOT to find the CMS headers via the environment variables CMSSW_BASE and CMSSW_RELEASE_BASE, and also to find the header files of the standard externals (e.g. boost and CLHEP).

Running over a list of files

The fwlite::ChainEvent allows you to run over a list of files while still using the same information you use when accessing data in cmsRun via the edm::Event:

  1. Start the autoloader
  2. Load in the helper library gSystem->Load("libDataFormatsFWLite")
  3. Create a std::vector<std::string> to hold the list of file names
  4. Include "DataFormats/FWLite/interface/Handle.h" to get the fwlite::Handle
  5. push_back each file name into the std::vector<std::string>
  6. Create an fwlite::ChainEvent by passing to the constructor the vector
  7. Create a for loop
    1. Start loop by calling toBegin() on the fwlite::ChainEvent
    2. For each iteration of the loop call atEnd on the fwlite::ChainEvent
    3. At the end of each iteration, increment the fwlite::ChainEvent by using the operator++ method
  8. When looping over the events
    1. Create an fwlite::Handle< ... > where the template argument is the C++ class you want to get from the event
    2. Call the getByLabel method of the fwlite::Handle passing it the event and the strings used to denote the object

An example macro template is shown below:

{
   gSystem->Load("libFWCoreFWLite.so"); 
   AutoLibraryLoader::enable();
   gSystem->Load("libDataFormatsFWLite.so");

   #include "DataFormats/FWLite/interface/Handle.h"
   vector<string> fileNames;
   fileNames.push_back("....root");

   fwlite::ChainEvent ev(fileNames);

   for( ev.toBegin(); ! ev.atEnd(); ++ev) {
       fwlite::Handle<std::vector<...> > objs;
       objs.getByLabel(ev,"....");
       //now can access data
       std::cout <<" size "<<objs.ptr()->size()<<std::endl;
       ...
   }
}

The compilation is identical to what is done for the case of a single file above, except in the macro you replace the use of fwlite::Event with fwlite::ChainEvent. ALERT! Note: You must still pick one of the files to be used in the fwlite::ChainEvent and open it with a TFile on the ROOT command line in order to force the proper dictionaries to be opened.

3.5.2.4 Using edm::EventBase

It is now possible to use the edm::EventBase interface so that the user does not need to use the fwlite::Handle any more. This allows the user to write functions that can be used directly in FWLite or in the full framework, using the same "get" methods. Note that this does not work if one uses CINT to interpret the macros. A snippet of how to do this is given below:

for (ev.toBegin(); ! ev.atEnd(); ++ev) {
    edm::EventBase const & event = ev;

    // This snippet can be used in EITHER FWLite or the Full Framework
    edm::Handle<vector<reco::Vertex> > vertices;
    event.getByLabel( edm::InputTag("offlinePrimaryVertices"), vertices);

    // ...

  }
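
To illustrate the benefit, here is a minimal sketch of a helper function written against edm::EventBase (the function is not part of the WorkBook examples; its name and the quality cuts are purely illustrative). Because both the full-framework edm::Event and the event obtained from an fwlite::Event implement this interface, the very same function can be called from an FWLite loop and from the analyze() method of an EDAnalyzer:

#if !defined(__CINT__) && !defined(__MAKECINT__)
#include "FWCore/Common/interface/EventBase.h"
#include "FWCore/Utilities/interface/InputTag.h"
#include "DataFormats/Common/interface/Handle.h"
#include "DataFormats/VertexReco/interface/Vertex.h"
#endif

// count well-reconstructed primary vertices (the quality cuts are illustrative)
unsigned int countGoodVertices(const edm::EventBase& event)
{
  edm::Handle<std::vector<reco::Vertex> > vertices;
  event.getByLabel(edm::InputTag("offlinePrimaryVertices"), vertices);

  unsigned int nGood = 0;
  for (std::vector<reco::Vertex>::const_iterator v = vertices->begin(); v != vertices->end(); ++v) {
    if (!v->isFake() && v->ndof() > 4) ++nGood;
  }
  return nGood;
}

In FWLite you would call countGoodVertices(event) inside the loop shown above; in the full framework the same call works inside analyze(), since edm::Event derives from edm::EventBase.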

In the meantime, FWLite and the full framework can indeed be used in parallel on the basis of truly compiled executables! If you want to know more about this, have a look at the collection of examples of compiled executables with FWLite in WorkBookFWLiteExamples.

Review status

Reviewer/Editor and Date (copy from screen) Comments
RogerWolf - 22 Mar 2011 Update for CMSSW_4_1_3

Responsible: RogerWolf



3.5.3 Examples of FWLite macros

Contents


Detailed Review status

3.5.3.1. Introduction

You should view FWLite as a way to access bare ROOT input files of the EDM, with the capability to read and recognise CMSSW DataFormats while the I/O and CPU performance advantages of bare ROOT access are retained. It should be emphasised that FWLite is not an exclusive alternative to the use of the full framework. Rather, we propose to use FWLite and the full framework in parallel, depending on the requirements of your analysis. Many efforts have been made during the development of FWLite to achieve a user interface that maximally facilitates the interchange of code between both paradigms. AT (Analysis Tools) strongly recommends NOT to rely on CINT within ROOT with interactive macros! We recommend to use compiled code instead.

On this page you will find examples of compiled FWLite executables that can be used as starting points for your own analysis. The emphasis is to start from the simplest and most basic skeletons and to give examples of how to increase the level of complexity step by step. Finally, we will point you to a fully equipped example showing how to do an analysis both with FWLite and within the full EDM framework using the same code. We will assume that you work on lxplus, but following the instructions given above you can run these examples on any other computer system that has FWLite installed.

3.5.3.2. Setting up of the environment

First of all connect to lxplus and go to some working directory. You can choose any directory, provided that you have enough space. You need a certain minimum of free disk space to do the exercises described here. We recommend that you use your ~/scratch0 space. In case you don't have this (or do not even know what it is), check your quota by typing fs lq and follow this link. If you don't have enough space, you may instead use the temporary space (/tmp/your_user_name), but be aware that this is lost once you log out from lxplus (or approximately within a day). In the following we will assume that you have such a ~/scratch0 directory.

ssh lxplus
[ ... enter password ... ]
cd scratch0/

Create a local release area, enter it, and set up the environment (using the cmsenv command):

cmsrel CMSSW_5_3_12
cd CMSSW_5_3_12/src 
cmsenv

Alternatively, for a more recent release, create the local release area, enter it, and set up the environment (again using the cmsenv command):

setenv SCRAM_ARCH slc6_amd64_gcc530
cmsrel CMSSW_8_1_0_pre16
cd CMSSW_8_1_0_pre16/src
cmsenv

ALERT! Note: Please, for the time being, use the set of tags described here on top of the recommended release, in order to be able to follow the examples given below.

Check out the FWLite package, which contains the examples that will be discussed in the following, so that you can inspect them in your favourite editor. You can use the addpkg command, which automatically picks up the version of the package corresponding to your release.

git cms-addpkg PhysicsTools/FWLite

ALERT! Note: We are going to go through the examples in the FWLite package. The addpkg tool picks the right version of this package from the release and copies it to your local release area.

ALERT! Note: Don't forget to recompile the package whenever you change any of the executables. You have to rehash the environment after each compilation to make sure that the command cache of the shell is refreshed. You can do this using the cmsenv command or, better, the command rehash.

ALERT! Note: In the current implementation all examples require a patTuple.root file unless stated otherwise. Have a look at WorkBookPATTupleCreationExercise to learn how to create such a PAT tuple.

3.5.3.3. Example 1: Access to event information

The first example shows how to access simple event information. We are going to show how to access the run-wise luminosity and peak luminosity from the edm::Event content. To run the example, type the following command in the src directory of your local release area (as long as you don't change any code in the local release area you will not have to compile):

FWLiteLumiAccess inputFiles=root://eoscms//eos/cms/store/relval/CMSSW_5_3_11_patch6/RelValTTbar/GEN-SIM-RECO/START53_LV3_Feb20-v1/00000/72E794B2-A29A-E311-A464-02163E009E91.root

ALERT! Note: As you can see, we made use of the command line argument inputFiles as described in SWGuideCommandLineParsing. Here we use a RECO input file from which we want to read the LumiBlock information.

We will now explain how this executable was compiled. To compile an individual executable as part of a CMSSW package, the only thing you have to do is make it known to the scram build mechanism via a corresponding line in the appropriate BuildFile.xml. We followed the general convention of keeping individual executables in a dedicated bin directory of the package. There you can find the corresponding BuildFile.xml, and within this file the following lines:

<use   name="root"/>
<use   name="boost"/>
<use   name="rootcintex"/>
<use   name="FWCore/FWLite"/>
<use   name="DataFormats/FWLite"/>
<use   name="DataFormats/Luminosity"/>
<use   name="FWCore/PythonParameterSet"/>
<use   name="CommonTools/Utils"/>
<use   name="PhysicsTools/FWLite"/>
<use   name="PhysicsTools/Utilities"/>
<use   name="PhysicsTools/SelectorUtils"/>
<environment>
  <bin   file="FWLiteLumiAccess.cc"></bin>
  <bin   file="FWLiteHistograms.cc"></bin>
  <bin   file="FWLiteWithPythonConfig.cc"></bin>
  <bin   file="FWLiteWithSelectorUtils.cc"></bin>
</environment>

ALERT! Note: As you can see, the BuildFile.xml contains nothing more than compiler directives in xml style. The first lines indicate the libraries that are used by the executables located in the directory, followed by a list of .cc files to be picked up for compilation. Each source file will be compiled into an individual executable with the name of the file (omitting the .cc ending). You can invoke the executable from the shell prompt after refreshing the command cache after compilation (using cmsenv or, better, rehash), as shown above. The important line for this example is:

<bin   file="FWLiteLumiAccess.cc"></bin>

In the FWLiteLumiAccess.cc file you can find a basic skeleton to write an FWLite executable and to access some extra information from the edm::Event content:

int main(int argc, char ** argv){
  // load framework libraries
  gSystem->Load( "libFWCoreFWLite" );
  AutoLibraryLoader::enable();

  // initialize command line parser
  optutl::CommandLineParser parser ("Analyze FWLite Histograms");

  // parse arguments
  parser.parseArguments (argc, argv);
  std::vector<std::string> inputFiles_ = parser.stringVector("inputFiles");
  
  for(unsigned int iFile=0; iFile<inputFiles_.size(); ++iFile){
    // open input file (can be located on castor)
    TFile* inFile = TFile::Open(inputFiles_[iFile].c_str());
    if( inFile ){
      fwlite::Event ev(inFile);
      fwlite::Handle<LumiSummary> summary;
      
      std::cout << "----------- Accessing by event ----------------" << std::endl;
      
      // get run and luminosity blocks from events as well as associated 
      // products. (This works for both ChainEvent and MultiChainEvent.)
      for(ev.toBegin(); !ev.atEnd(); ++ev){
	// get the Luminosity block ID from the event
	std::cout << " Luminosity ID " << ev.getLuminosityBlock().id() << std::endl;
	// get the Run ID from the event
	std::cout <<" Run ID " << ev.getRun().id()<< std::endl;
	// get the Run ID from the luminosity block you got from the event
	std::cout << "Run via lumi " << ev.getLuminosityBlock().getRun().id() << std::endl;
	// get the integrated luminosity (or any luminosity product) from 
	// the event
	summary.getByLabel(ev.getLuminosityBlock(),"lumiProducer");
      }

      //...

    }
  }
  return 0;
}

ALERT! Note: Each C++ executable starts with a main function. We allow arguments to be passed on to the executable. The next lines enable the ROOT AutoLibraryLoader, which should be enabled for each FWLite executable. Next we make use of command line parsing as described in SWGuideCommandLineParsing. The command line option that we read in is inputFiles. In the following, the input files are looped over and the events within each file are accessed and looped over. Within the event loop you can find the corresponding lines showing how to access luminosity and run information from the edm::Event content. You can find the implementation of this executable in the FWLiteLumiAccess.cc file in the bin directory of the package.

3.5.3.4. Example 2: Plotting histograms

The second example shows how to book and fill histograms from object collections in the event. To run the example, type the following command in the src directory of your local release area (again this is possible without compiling):

FWLiteHistograms inputFiles=root://eoscms//eos/cms/store/relval/CMSSW_5_3_11_patch6/RelValTTbar/GEN-SIM-RECO/START53_LV3_Feb20-v1/00000/72E794B2-A29A-E311-A464-02163E009E91.root outputFile=analyzeFWLite.root maxEvents=-1 outputEvery=20

The corresponding line for compilation in the BuildFile.xml is the following:

<bin   file="FWLiteLumiHistograms.cc"></bin>

ALERT! Note: In this example more command line options are used:

  • inputFiles: to pass on a vector of input file paths.
  • outputFile: to pass an output file name.
  • maxEvents: to pass on the maximal number of events to loop.
  • outputEvery: to indicate after how many events a report should be sent to the prompt.

You can find the read out of these options in the main function of the FWLiteHistograms.cc file:

// ...

  // initialize command line parser
  optutl::CommandLineParser parser ("Analyze FWLite Histograms");

  // set defaults
  parser.integerValue ("maxEvents"  ) = 1000;
  parser.integerValue ("outputEvery") =   10;
  parser.stringValue  ("outputFile" ) = "analyzeFWLiteHistograms.root";

  // parse arguments
  parser.parseArguments (argc, argv);
  int maxEvents_ = parser.integerValue("maxEvents");
  unsigned int outputEvery_ = parser.integerValue("outputEvery");
  std::string outputFile_ = parser.stringValue("outputFile");
  std::vector<std::string> inputFiles_ = parser.stringVector("inputFiles");

  // book a set of histograms
  fwlite::TFileService fs = fwlite::TFileService(outputFile_.c_str());
  TFileDirectory dir = fs.mkdir("analyzeBasicPat");
  TH1F* muonPt_  = dir.make<TH1F>("muonPt"  , "pt"  ,   100,   0., 300.);
  TH1F* muonEta_ = dir.make<TH1F>("muonEta" , "eta" ,   100,  -3.,   3.);
  TH1F* muonPhi_ = dir.make<TH1F>("muonPhi" , "phi" ,   100,  -5.,   5.);  
  TH1F* mumuMass_= dir.make<TH1F>("mumuMass", "mass",    90,  30.,  120.);

  // ...

ALERT! Note: As you see, we provide default values for the options maxEvents, outputEvery and outputFile. In the lines that follow, a set of histograms is booked making use of the TFileService, which takes automatic care of the file handling for the different input and output files. To learn more about the TFileService have a look at SWGuideTFileService.

In the event loop we do some gymnastics for the event report and to stop the loop after maxEvents events have been processed. We access the collection of reco::Muon's, loop over the muons and fill the histograms.

for(ev.toBegin(); !ev.atEnd(); ++ev, ++ievt){
	edm::EventBase const & event = ev;
	// break loop if maximal number of events is reached 
	if(maxEvents_>0 ? ievt+1>maxEvents_ : false) break;
	// simple event counter
	if(outputEvery_!=0 ? (ievt>0 && ievt%outputEvery_==0) : false) 
	  std::cout << "  processing event: " << ievt << std::endl;

	// Handle to the muon collection
	edm::Handle<std::vector<Muon> > muons;
	event.getByLabel(std::string("muons"), muons);
	
	// loop muon collection and fill histograms
	for(std::vector<Muon>::const_iterator mu1=muons->begin(); mu1!=muons->end(); ++mu1){
	  muonPt_ ->Fill( mu1->pt () );
	  muonEta_->Fill( mu1->eta() );
	  muonPhi_->Fill( mu1->phi() );	  
	  if( mu1->pt()>20 && fabs(mu1->eta())<2.1 ){
	    for(std::vector<Muon>::const_iterator mu2=muons->begin(); mu2!=muons->end(); ++mu2){
	      if(mu2>mu1){ // prevent double counting
		if( mu1->charge()*mu2->charge()<0 ){ // check only muon pairs of unequal charge 
		  if( mu2->pt()>20 && fabs(mu2->eta())<2.1 ){
		    mumuMass_->Fill( (mu1->p4()+mu2->p4()).mass() );
		  }
		}
	      }
	    }
	  }
	}

        //...

You can find the implementation of this executable in the FWLiteHistograms.cc file in the bin directory of the package.

3.5.3.5. Example 3: Using python configuration

You can get the same results using the python configuration language you are used to from the full framework. It is much more powerful for passing on an arbitrary number of parameters and is not restricted to a single command line. In case you are not familiar with the python configuration language used within CMS, have a look at WorkBookConfigFileIntro. To run the example, type the following command in the src directory of your local release area:

FWLiteWithPythonConfig PhysicsTools/FWLite/test/fwliteWithPythonConfig_cfg.py

The corresponding line in the BuildFile.xml is the following:

<bin   file="FWLiteWithPythonConfig.cc"></bin>

ALERT! Note: For this example we completely replaced the command line parsing by the python configuration file mechanism. You can find the fwliteWithPythonConfig_cfg.py configuration file in the test directory of the package. It contains the following parameters:

import FWCore.ParameterSet.Config as cms

process = cms.PSet()

process.fwliteInput = cms.PSet(
    fileNames   = cms.vstring('rfio:///castor/cern.ch/cms/store/relval/CMSSW_4_1_3/RelValTTbar/GEN-SIM-RECO/START311_V2-v1/0037/648B6AA5-C751-E011-8208-001A928116C6.root'), ## mandatory
    maxEvents   = cms.int32(-1),                             ## optional
    outputEvery = cms.uint32(10),                            ## optional
)
    
process.fwliteOutput = cms.PSet(
    fileName  = cms.string('analyzeFWLiteHistograms.root'),  ## mandatory
)

process.muonAnalyzer = cms.PSet(
    ## input specific for this analyzer
    muons = cms.InputTag('muons')
)

You can find the readout of the configuration file in the main function of the FWLiteWithPythonConfig.cc file:

// ...

  // parse arguments
  if ( argc < 2 ) {
    std::cout << "Usage : " << argv[0] << " [parameters.py]" << std::endl;
    return 0;
  }

  if( !edm::readPSetsFrom(argv[1])->existsAs<edm::ParameterSet>("process") ){
    std::cout << " ERROR: ParametersSet 'process' is missing in your configuration file" << std::endl; exit(0);
  }
  // get the python configuration
  const edm::ParameterSet& process = edm::readPSetsFrom(argv[1])->getParameter<edm::ParameterSet>("process");
  fwlite::InputSource inputHandler_(process); fwlite::OutputFiles outputHandler_(process);


  // now get each parameter
  const edm::ParameterSet& ana = process.getParameter<edm::ParameterSet>("muonAnalyzer");
  edm::InputTag muons_( ana.getParameter<edm::InputTag>("muons") );

  // book a set of histograms
  fwlite::TFileService fs = fwlite::TFileService(outputHandler_.file().c_str());
  TFileDirectory dir = fs.mkdir("analyzeBasicPat");
  TH1F* muonPt_  = dir.make<TH1F>("muonPt"  , "pt"  ,   100,   0.,  300.);
  TH1F* muonEta_ = dir.make<TH1F>("muonEta" , "eta" ,   100,  -3.,    3.);
  TH1F* muonPhi_ = dir.make<TH1F>("muonPhi" , "phi" ,   100,  -5.,    5.);  
  TH1F* mumuMass_= dir.make<TH1F>("mumuMass", "mass",    90,   30., 120.);

  // ...

ALERT! Note: The first part just hands over the single argument, the path to the configuration file. From this file the edm::ParameterSet is derived (via edm::readPSetsFrom, as shown above). The parameters are then read out in the same way as in the full framework case. In addition to the parameters used in the previous example, the label of the muon collection is also treated as a parameter. You can find the implementation of this executable in the FWLiteWithPythonConfig.cc file in the bin directory of the package.

3.5.3.6. Example 4: Using event selectors

In this example we will introduce you to some more advanced features of the SelectorUtils package. We will make use of an EventSelector to apply event selections, rather than doing the selection in the main event loop. This allows you to keep track of the statistics of the event selection. More details about Selectors can be found in SWGuidePATSelectors. This particular example creates a simple W selector that selects events with muons with pt>20 GeV and MET>20 GeV (when using PAT, the MET will by default be the type-1 and muon corrected calorimeter MET). To run the example, type the following command in the src directory of your local release area:

FWLiteWithSelectorUtils PhysicsTools/FWLite/test/fwliteWithSelectorUtils_cfg.py
  processing event: 10
  ...
     0 :              Muon Pt         22
     1 :                  MET         86

The last lines correspond to the number of events that have passed the muon pt cut and the MET cut, respectively. This can be made as complicated as you wish and automatically keeps track of your cut flow.

ALERT! Note: This example is an extension of Example 3. The configuration file has therefore been slightly modified:

import FWCore.ParameterSet.Config as cms

process = cms.PSet()

process.fwliteInput = cms.PSet(
    fileNames   = cms.vstring('rfio:///castor/cern.ch/cms/store/relval/CMSSW_4_1_3/RelValTTbar/GEN-SIM-RECO/START311_V2-v1/0037/648B6AA5-C751-E011-8208-001A928116C6.root'),                                ## mandatory
    maxEvents   = cms.int32(-1),                             ## optional
    outputEvery = cms.uint32(10),                            ## optional
)

process.fwliteOutput = cms.PSet(
    fileName  = cms.string('analyzeFWLiteHistograms.root'),  ## mandatory
)

process.selection = cms.PSet(
        muonSrc      = cms.InputTag('muons'),
        metSrc       = cms.InputTag('metJESCorAK5CaloJetMuons'),
        muonPtMin    = cms.double(20.0),
        metMin       = cms.double(20.0),
       #cutsToIgnore = cms.vstring('MET')
)

ALERT! Note: The additional edm::ParameterSet selection is passed on to the W boson selector in the implementation of the executable:

// get the python configuration
  const edm::ParameterSet& process = edm::readPSetsFrom(argv[1])->getParameter<edm::ParameterSet>("process");
  fwlite::InputSource inputHandler_(process); fwlite::OutputFiles outputHandler_(process);

  // initialize the W selector
  edm::ParameterSet selection = process.getParameter<edm::ParameterSet>("selection");
  WSelector wSelector( selection ); pat::strbitset wSelectorReturns = wSelector.getBitTemplate();
  
  // book a set of histograms
  fwlite::TFileService fs = fwlite::TFileService(outputHandler_.file().c_str());
  TFileDirectory theDir = fs.mkdir("analyzeBasicPat");
  TH1F* muonPt_  = theDir.make<TH1F>("muonPt", "pt",    100,  0.,300.);
  TH1F* muonEta_ = theDir.make<TH1F>("muonEta","eta",   100, -3.,  3.);
  TH1F* muonPhi_ = theDir.make<TH1F>("muonPhi","phi",   100, -5.,  5.); 

The definition of the Selector can be found as an own class in the WSelector.h file in the interface directory of the package. The class is derived from the EventSelector base class of the SelectorUtils package:

class WSelector : public EventSelector {

public:
  /// constructor
  WSelector(edm::ParameterSet const& params) :
    muonSrc_(params.getParameter<edm::InputTag>("muonSrc")),
    metSrc_ (params.getParameter<edm::InputTag>("metSrc")) 
  {
    double muonPtMin = params.getParameter<double>("muonPtMin");
    double metMin    = params.getParameter<double>("metMin");
    push_back("Muon Pt", muonPtMin );
    push_back("MET"    , metMin    );
    set("Muon Pt"); set("MET");
    wMuon_ = 0; met_ = 0;
    if ( params.exists("cutsToIgnore") ){
      setIgnoredCuts( params.getParameter<std::vector<std::string> >("cutsToIgnore") );
    }
    retInternal_ = getBitTemplate();
  }
  /// destructor
  virtual ~WSelector() {}
  /// return muon candidate of W boson
  pat::Muon const& wMuon() const { return *wMuon_;}
  /// return MET of W boson
  pat::MET  const& met()   const { return *met_;  }

  /// here is where the selection occurs
  virtual bool operator()( edm::EventBase const & event, pat::strbitset & ret){
    ret.set(false);
    // Handle to the muon collection
    edm::Handle<std::vector<pat::Muon> > muons;    
    // Handle to the MET collection
    edm::Handle<std::vector<pat::MET> > met;
    // get the objects from the event
    bool gotMuons = event.getByLabel(muonSrc_, muons);
    bool gotMET   = event.getByLabel(metSrc_, met   );
    // get the MET, require to be > minimum
    if( gotMET ){
      met_ = &met->at(0);
      if( met_->pt() > cut("MET", double()) || ignoreCut("MET") ) 
	passCut(ret, "MET");
    }
    // get the highest pt muon, require to have pt > minimum
    if( gotMuons ){
      if( !ignoreCut("Muon Pt") ){
	if( muons->size() > 0 ){
	  wMuon_ = &muons->at(0);
	  if( wMuon_->pt() > cut("Muon Pt", double()) || ignoreCut("Muon Pt") ) 
	    passCut(ret, "Muon Pt");
	}
      } 
      else{
	passCut( ret, "Muon Pt");
      }
    }
    setIgnored(ret);
    return (bool)ret;
  }

protected:
  /// muon input
  edm::InputTag muonSrc_;
  /// met input
  edm::InputTag metSrc_;
  /// muon candidate from W boson
  pat::Muon const* wMuon_;
  /// MET from W boson
  pat::MET const* met_;
};

The selection is applied in the following lines in the main function of the executable code in the FWLiteWithSelectorUtils.cc file:

if ( wSelector(event, wSelectorReturns ) ) {
	  pat::Muon const & wMuon = wSelector.wMuon();
	  muonPt_ ->Fill( wMuon.pt()  );
	  muonEta_->Fill( wMuon.eta() );
	  muonPhi_->Fill( wMuon.phi() );
	} 
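
As an illustration of how the cut flow could be extended (this is not part of the WorkBook example; the cut name and the muonEtaMax parameter are invented for this sketch), one more requirement, say on the muon pseudorapidity, would be registered in the constructor of the selector and tested in its operator():

// in the constructor, after the existing push_back calls:
double muonEtaMax = params.getParameter<double>("muonEtaMax");  // hypothetical new parameter in the 'selection' PSet
push_back("Muon Eta", muonEtaMax);
set("Muon Eta");

// in operator(), once the leading muon has been found:
if( fabs(wMuon_->eta()) < cut("Muon Eta", double()) || ignoreCut("Muon Eta") )
  passCut(ret, "Muon Eta");

The corresponding entry, e.g. muonEtaMax = cms.double(2.1), would then be added to the selection PSet in the configuration file.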

As a feature of the base class you can switch off all parts of the selection in our example by uncommenting the following line in the configuration file (indicating the corresponding selection step):

cutsToIgnore = cms.vstring('MET')    # <<<----- Uncomment this line

This configuration parameter is picked up in these lines in the executable code:

if ( params.exists("cutsToIgnore") )
    setIgnoredCuts( params.getParameter<std::vector<std::string> >("cutsToIgnore") );

ALERT! Note: In general the Selector works with the old PAT feature of strbitset. You can operate it completely without this feature though, by changing the argument of the if statement in the main function of the code from if ( wSelector(event, wSelectorReturns ) ) to if ( wSelector(event) ) and adding the following hook to the constructor of your selector:

retInternal_ = getBitTemplate();

ALERT! Note: Recently the Selectors have been sped up in CPU performance by introducing selection index caching. By (un)commenting the .h files according to the following lines in the FWLiteWithSelectorUtils.cc file:

//#include "PhysicsTools/FWLite/interface/WSelector.h"
#include "PhysicsTools/FWLite/interface/WSelectorFast.h"

you can see the effect (don't forget to recompile and to refresh the command cache of your shell). You can have a look into the WSelectorFast.h file in the interface directory of the package to check the differences with respect to the common WSelector.h file in the same directory.

You can find the implementation of this executable in the FWLiteWithSelectorUtils.cc file in the bin directory of the package. You can find a more elaborate example of a V+Jets selector on CMS.VplusJetsSelectors.

3.5.3.7 Example 5:Using FWLite and full framework in parallel

In this example we will show you how to write analysis code that can be used both in FWLite and in the full framework by following only a few very simple rules. It makes use of the BasicAnalyzer class, a powerful tool of the UtilAlgos package. You can concentrate on the actual analysis code, and with fewer than 10 extra lines you will be able to turn it either into a FWLite executable as described above or into a full framework EDAnalyzer. In return for those 10 extra lines you no longer have to care about the FWLite event loop or any python configuration parameter parsing in the FWLite case; this is taken care of for you. You can combine this style of programming with any of the examples and FWLite features shown on this page. This is the recommended way of doing an analysis according to the AT (Analysis Tools) group. The example given here is equivalent to Example 3. To run it, type the following command in the src directory of your local release area:

FWLiteWithBasicAnalyzer PhysicsTools/FWLite/test/fwliteWithPythonConfig_cfg.py

ALERT! Note: As you see, it also makes use of the same configuration file as Example 3. The example itself, though, originates from the PhysicsTools/UtilAlgos package. Having a look into the FWLiteWithBasicAnalyzer.cc file there, you will recognise that the implementation of the executable is significantly shorter:

int main(int argc, char* argv[]) 
{
  // load framework libraries
  gSystem->Load( "libFWCoreFWLite" );
  AutoLibraryLoader::enable();

  // only allow one argument for this simple example which should be the
  // the python cfg file
  if ( argc < 2 ) {
    std::cout << "Usage : " << argv[0] << " [parameters.py]" << std::endl;
    return 0;
  }
  if( !edm::readPSetsFrom(argv[1])->existsAs<edm::ParameterSet>("process") ){
    std::cout << " ERROR: ParametersSet 'plot' is missing in your configuration file" << std::endl; exit(0);
  }

  WrappedFWLiteMuonAnalyzer ana(edm::readPSetsFrom(argv[1])->getParameter<edm::ParameterSet>("process"), std::string("muonAnalyzer"), std::string("analyzeBasicPat"));
  ana.beginJob();
  ana.analyze();
  ana.endJob();
  return 0;
}

ALERT! Note: What remains is the parameter check to read in the python configuration file. The edm::ParameterSet is handed over to the WrappedFWLiteMuonAnalyzer object, which is treated in the following in close analogy to an EDAnalyzer. This object has been declared by a simple template expansion in the first lines:

#include "PhysicsTools/FWLite/interface/BasicMuonAnalyzer.h"
#include "PhysicsTools/UtilAlgos/interface/FWLiteAnalyzerWrapper.h"

typedef fwlite::AnalyzerWrapper<BasicMuonAnalyzer> WrappedFWLiteMuonAnalyzer;

The FWLiteAnalyzerWrapper.h is just a simple tool. You can use it as a black box, or have a short glimpse into its implementation in the UtilAlgos package if you like. E.g., you will find there the event loop that you know from Example 3; it is already implemented for you, so that you don't have to program it over and over again for each new executable that you plan to write.

The key element that drives the analysis is the BasicMuonAnalyzer class, which is defined in the interface directory of the package:

class BasicMuonAnalyzer : public edm::BasicAnalyzer {

 public:
  /// default constructor
  BasicMuonAnalyzer(const edm::ParameterSet& cfg, TFileDirectory& fs);
  /// default destructor
  virtual ~BasicMuonAnalyzer(){};
  /// everything that needs to be done before the event loop
  void beginJob(){};
  /// everything that needs to be done after the event loop
  void endJob(){};
  /// everything that needs to be done during the event loop
  void analyze(const edm::EventBase& event);

 private:
  /// input tag for muons
  edm::InputTag muons_;
  /// histograms
  std::map<std::string, TH1*> hists_;
};

ALERT! Note: Those already familiar with the structure of EDAnalyzer's will immediately recognise the similarities. The class derives from the BasicAnalyzer class, which is an abstract base class. It has a beginJob, an endJob and an analyze function in analogy to an EDAnalyzer; this is a requirement of the base class. In addition we added a std::map for histogramming and an input tag for a muon collection.

You can find the implementation in the BasicMuonAnalyzer.cc file in the src directory of the package:

/// default constructor
BasicMuonAnalyzer::BasicMuonAnalyzer(const edm::ParameterSet& cfg, TFileDirectory& fs): 
  edm::BasicAnalyzer::BasicAnalyzer(cfg, fs),
  muons_(cfg.getParameter<edm::InputTag>("muons"))
{
  hists_["muonPt"  ] = fs.make<TH1F>("muonPt"  , "pt"  ,  100,  0., 300.);
  hists_["muonEta" ] = fs.make<TH1F>("muonEta" , "eta" ,  100, -3.,   3.);
  hists_["muonPhi" ] = fs.make<TH1F>("muonPhi" , "phi" ,  100, -5.,   5.); 
  hists_["mumuMass"] = fs.make<TH1F>("mumuMass", "mass",   90, 30., 120.);
}

/// everything that needs to be done during the event loop
void 
BasicMuonAnalyzer::analyze(const edm::EventBase& event)
{
  // define what muon you are using; this is necessary as FWLite is not 
  // capable of reading edm::Views
  using reco::Muon;

  // Handle to the muon collection
  edm::Handle<std::vector<Muon> > muons;
  event.getByLabel(muons_, muons);

  // loop muon collection and fill histograms
  for(std::vector<Muon>::const_iterator mu1=muons->begin(); mu1!=muons->end(); ++mu1){
    hists_["muonPt" ]->Fill( mu1->pt () );
    hists_["muonEta"]->Fill( mu1->eta() );
    hists_["muonPhi"]->Fill( mu1->phi() );
    if( mu1->pt()>20 && fabs(mu1->eta())<2.1 ){
      for(std::vector<Muon>::const_iterator mu2=muons->begin(); mu2!=muons->end(); ++mu2){
	if(mu2>mu1){ // prevent double counting
	  if( mu1->charge()*mu2->charge()<0 ){ // check only muon pairs of unequal charge 
	    if( mu2->pt()>20 && fabs(mu2->eta())<2.1 ){
	      hists_["mumuMass"]->Fill( (mu1->p4()+mu2->p4()).mass() );
	    }
	  }
	}
      }
    }
  }
}

ALERT! Note: You find the histogram booking in the constructor and the analysis implementation in the analyze function of the class. It is not the topic of this chapter, but you can transform this very code into an EDAnalyzer with a few lines like these:

#include "PhysicsTools/PatExamples/interface/BasicMuonAnalyzer.h"
#include "PhysicsTools/UtilAlgos/interface/EDAnalyzerWrapper.h"

typedef edm::AnalyzerWrapper<BasicMuonAnalyzer> WrappedEDMuonAnalyzer;

#include "FWCore/Framework/interface/MakerMacros.h"
DEFINE_FWK_MODULE(WrappedEDMuonAnalyzer);

You can indeed find such a plugin definition at the end of the modules.cc file in the plugins directory of the PhysicsTools/UtilAlgos package. Try to run the plugin:

addpkg PhysicsTools/UtilAlgos V08-02-09-01
scram b -j 4
cmsRun PhysicsTools/UtilAlgos/test/cmsswWithPythonConfig_cfg.py 

You will get the exact same result as with the FWLite executable, since exactly the same class BasicMuonAnalyzer is used.

ALERT! Note: These lines imply that you have checked out the PhysicsTools/UtilAlgos package with a tag greater than or equal to V08-02-09-01.

Appendix: Using DB Information in FWLite

In this appendix we explain how to access database information in FWLite: we first need to create an FWLite-readable ROOT file that "caches" the payloads from the Conditions Database (CondDB). As an example we choose B-Tag information. The first part accesses the database and dumps the payloads to a ROOT file:

cd RecoBTag/PerformanceDB/test
cmsRun testFWLite_Writer_cfg.py

Obviously, for this operation you need to be logged in to lxplus or somewhere else with an appropriate DB connection. You can find the python configuration file that we used here. We will go through it step by step. The following lines connect to the database and access the appropriate payloads:

process.load("CondCore.DBCommon.CondDBCommon_cfi") 
process.load ("RecoBTag.PerformanceDB.PoolBTagPerformanceDB100426") 
process.load ("RecoBTag.PerformanceDB.BTagPerformanceDB100426") 
process.PoolDBESSource.connect = 'frontier://FrontierProd/CMS_COND_31X_PHYSICSTOOLS'

The specific records you want to cache are specified with the "toGet" parameter:

process.PoolDBESSource.toGet = cms.VPSet(
    cms.PSet(
        record = cms.string('PerformanceWPRecord'),
        tag = cms.string('BTagPTRELSSVMwp_v1_offline'),
        label = cms.untracked.string('BTagPTRELSSVMwp_v1_offline')
    ),
    cms.PSet(
        record = cms.string('PerformancePayloadRecord'),
        tag = cms.string('BTagPTRELSSVMtable_v1_offline'),
        label = cms.untracked.string('BTagPTRELSSVMtable_v1_offline')
    )
)

The actual module that dumps the payloads is specified here:

process.myrootwriter = cms.EDAnalyzer("BTagPerformaceRootProducerFromSQLITE",
                                  name = cms.string('PTRELSSVM'),
                                  index = cms.uint32(1001)
                                  )

The root file containing the cached payloads will be accessed within FWLite as shown below:

TestPerformanceFWLite_ES

The source code of this file is located here.

To access the dumped DB information, the syntax is:

fwlite::ESHandle< PerformancePayload > plHandle;
es.get(testRecID).get(plHandle,"PTRELSSVM");
fwlite::ESHandle< PerformanceWorkingPoint > wpHandle;
es.get(testRecID).get(wpHandle,"PTRELSSVM");

if ( plHandle.isValid() && wpHandle.isValid() ) {
  BtagPerformance perf(*plHandle, *wpHandle);

From there you can access the payloads as you would in the full framework:

// check beff, berr for eta=.6, et=55;
    BinningPointByMap p;

    std::cout <<" test eta=0.6, et=55"<<std::endl;

    p.insert(BinningVariables::JetEta,0.6);
    p.insert(BinningVariables::JetEt,55);
    std::cout <<" nbeff/nberr ?"<<perf.isResultOk(PerformanceResult::BTAGNBEFF,p)<<"/"<<perf.isResultOk(PerformanceResult::BTAGNBERR,p)<<std::endl;
    std::cout <<" beff/berr ?"<<perf.isResultOk(PerformanceResult::BTAGBEFF,p)<<"/"<<perf.isResultOk(PerformanceResult::BTAGBERR,p)<<std::endl;
    std::cout <<" beff/berr ="<<perf.getResult(PerformanceResult::BTAGBEFF,p)<<"/"<<perf.getResult(PerformanceResult::BTAGBERR,p)<<std::endl;

Review status

Reviewer/Editor and Date: RogerWolf - 09 Nov 2010
Comments: Revision for Nov Tutorial

Responsible: RogerWolf



3.5.4 FWLite.Python (using PyROOT)

Contents:


Detailed Review status

Goals of this page:

In this section you will learn how to use FWLite.Python.

Introduction

This document assumes that you already know Python (although you may be able to learn a fair bit just by following the examples). Please see the references for some links on learning Python.

Why Python? CMS already uses it for its configuration files. It is a powerful language that many consider both much easier to read and easier to write than C++. FWLite.Python also makes it possible to play with CMS data interactively.

Note: Many people are already using ROOT's CINT to interactively access CMS data. This is fine if one only uses the TBrowser or the Draw() command, but for anything more complicated CINT is very unreliable (meaning it crashes if you are lucky, or runs without any error messages but gives you the wrong answer if you are not).

FWLite.Python is designed to be intuitive if you are experienced with both Python and FWLite or the full framework.

Simple Annotated Example

For a complete example, see patZpeak.py. This can be used either as a script or typed in interactively.

To start, we want to import ROOT and the needed pieces of FWLite.Python:

import ROOT
from DataFormats.FWLite import Events, Handle

You then create an Events object by giving it either:

  • a VarParsing options object (see the Command line option parsing twiki for details),
  • a string containing an input file name, or
  • a list of strings containing input file names.
In this case, I'll use a single input file:

events = Events ('ZmumuPatTuple.root')
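
Alternatively, to make the input configurable from the command line, you could pass a VarParsing options object instead, as mentioned in the list above. A minimal sketch (the command-line values in the comment are illustrative):

from FWCore.ParameterSet.VarParsing import VarParsing

options = VarParsing ('analysis')
options.parseArguments()   # e.g. run with: inputFiles=ZmumuPatTuple.root maxEvents=100
events = Events (options)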

We then create a handle. Unlike in cmsRun and FWLite in C++, this is a CPU-intensive operation, so it is recommended to do this outside of the event loop.

# create handle outside of loop
handle  = Handle ('std::vector<pat::Muon>')

Now we make a label:

# a label is just a tuple of strings that is initialized just
# like an edm::InputTag
label = ("selectedPatMuons")

By default, PyROOT will open windows. Make sure you tell ROOT if you don't want this:

ROOT.gROOT.SetBatch()        # don't pop up canvases

Create any histograms you want to fill:

# Create histograms, etc.
ROOT.gROOT.SetStyle('Plain') # white background
zmassHist = ROOT.TH1F ("zmass", "Z Candidate Mass", 50, 20, 220)

And now the event loop:

# loop over events
for event in events:

Let's get the data as we would in FWLite and cmsRun:

    # use getByLabel, just like in cmsRun
    event.getByLabel (label, handle)

Now use what you got:

    # get the product
    muons = handle.product()

Since I'm running over Z → μμ, let's make a Z peak:

    # use muons to make Z peak
    numMuons = len (muons)
    if numMuons < 2: continue
    for outer in xrange (numMuons - 1):
        outerMuon = muons[outer]
        for inner in xrange (outer + 1, numMuons):
            innerMuon = muons[inner]
            if outerMuon.charge() * innerMuon.charge() >= 0:
                continue
            inner4v = ROOT.TLorentzVector (innerMuon.px(), innerMuon.py(),
                                           innerMuon.pz(), innerMuon.energy())
            outer4v = ROOT.TLorentzVector (outerMuon.px(), outerMuon.py(),
                                           outerMuon.pz(), outerMuon.energy())
            zmassHist.Fill( (inner4v + outer4v).M() )

Groovy. Let's make a plot and quit:

# make a canvas, draw, and save it
c1 = ROOT.TCanvas()
zmassHist.Draw()
c1.Print ("zmass_py.png")

References

Python introductions:

-- CharlesPlager - 13-Jan-2010 -- PetarMaksimovic - 08 Jul 2009



The information on this page is deprecated and not part of the WorkBook any more.

3.5.5 How to Make and Run FWLite Executables

Contents:

Goals

This TWiki page will lead you through the steps necessary to copy, run, and understand two FWLite executable examples, as well as modify one of them to create a third.

Setting up CMSSW Environment

This software is available for software releases after CMSSW_3_3_1.

Please visit here for help setting up the CMSSW environment.

Build FWLite Executable - JetPt.exe

To create necessary source files and compile them into an executable:

cd $CMSSW_BASE/src
newFWLiteAna.py Analysis/SimpleExamples/myJetPt  --copy=jetPt --newPackage
scram b

In this case, newFWLiteAna.py:

  • Makes the necessary directory structure ( $CMSSW_BASE/src/Analysis/SimpleExamples/bin ). This mimics CMS's standard directory structure.
  • Makes (if necessary) $CMSSW_BASE/src/Analysis/SimpleExamples/bin/BuildFile and adds to it the lines necessary for a new FWLite executable (the BuildFile is CMS's equivalent of a Makefile ).
  • Creates $CMSSW_BASE/src/Analysis/SimpleExamples/bin/myJetPt.cc (which is a copy of the default jetPt.cc, the name given in the heading).

scram b invokes CMS's build system and creates an executable, myJetPt.exe. Note that the executable created is myJetPt.exe and not JetPt.exe, as the heading may suggest.

If this is the first time you are compiling this executable and you are using the tcsh shell, you need to run the rehash command. You do not need to run this command if you are recompiling. If you don't know which shell you are using, you are probably using tcsh.

Also NOTE: the --newPackage option is needed ONLY if this is the first time you are creating Directory/SubDirectory, in this case Analysis/SimpleExamples.

Now you are ready to use myJetPt.exe.

Run JetPt.exe

myJetPt.exe inputFiles=/afs/cern.ch/cms/Tutorials/RelValTTbar_334.root \
     outputEvery=10 outputFile=myJetPt

Let us look at the command we just executed. In the above command we have the following:

  • The executable myJetPt.exe
  • The options inputFiles=... , outputEvery=... , outputFile=...

The option:

  • inputFiles=/afs/cern.ch/cms/Tutorials/RelValTTbar_334.root
    tells the executable what input file to use.
  • outputEvery=10
    tells the executable to print out lines like Processing Event: 1000 and gives you a measure of progress of running the code.
  • outputFile=myJetPt
    tells the executable the name of the output root file where your histograms are written. In this case it is myJetPt.root

For further information on command line options see Command_line_option_parsing

Here is the output you see on the screen as a result of executing the command above:

unix > myJetPt.exe inputFiles=/afs/cern.ch/cms/Tutorials/RelValTTbar_334.root \
     outputEvery=10 outputFile=myJetPt

------------------------------------------------------------------

Integer options:
    jobid          = -1             - jobID given by CRAB,etc. (-1 means append nothing)
    maxevents      = 0              - Maximum number of events to run over (0 for whole file)
    outputevery    = 10             - Output something once every N events (0 for never)
    section        = 0              - This section (from 1..totalSections inclusive)
    totalsections  = 0              - Total number of sections

Bool options:
    logname        = false          - Print log name and exit

String options:
    outputfile     =                - Output filename
        'myJetPt.root'
    storeprepend   =                - Prepend location on files starting with '/store/'
        ''
    tag            =                - A 'tag' to append to output file (e.g., 'v2', etc.)
        ''

String Vector options:
    inputfiles     =                - List of input files
        /afs/cern.ch/cms/Tutorials/RelValTTbar_334.root
        
    secondaryinputfiles =                - List of secondary input files (a.k.a. two-file-solution
        
------------------------------------------------------------------
Processing Event: 10
Processing Event: 20
Processing Event: 30
Processing Event: 40
Processing Event: 50
Processing Event: 60
Processing Event: 70
Processing Event: 80
Processing Event: 90
Processing Event: 100
EventContainer Summary: Processed 100 events.
TH1Store::write(): Successfully written to 'myJetPt.root'.


The first part of the printout is a listing of all the variables you can set from the command line and their values when the program is run. You can use --noPrint if you wish to turn this output off. See Command_line_option_parsing for more details.

NOTE: The data file used here has 100 events only.

You should now see a root file called, in this case, myJetPt.root.

Before you open the root file, you may want to have a rootlogon.C file set up in your current directory, as described in section 3.5.1.3.

In order to open the root file do:

root -l myJetPt.root

Then at the root prompt do:

root [0] jetPt->Draw()

If you choose a logarithmic Y-scale you should see a plot like this:

(figure: jetPt distribution)

edm::Handle

Starting in CMSSW 3.3.1, one can use edm::Handles inside of FWLite. There are two advantages to this:

1. You can use the same function calls in cmsRun and FWLite.

2. As a result, you can share code libraries that work on events between cmsRun and FWLite.

Details of the Code - JetPt.cc

You can browse the full code JetPt.cc here

The code has several parts:

CMS includes

// -*- C++ -*-

// CMS includes
#include "DataFormats/FWLite/interface/Handle.h"
#include "DataFormats/PatCandidates/interface/Jet.h"

#include "CMS.PhysicsTools/FWLite/interface/EventContainer.h"
#include "CMS.PhysicsTools/FWLite/interface/CommandLineParser.h" 

// Root includes
#include "TROOT.h"

using namespace std;

If you want to use other objects such as electrons, photons, etc., you would add some of these include files:

       #include "DataFormats/PatCandidates/interface/MET.h"
       #include "DataFormats/PatCandidates/interface/Photon.h"
       #include "DataFormats/PatCandidates/interface/Electron.h"
       #include "DataFormats/PatCandidates/interface/Tau.h"
       #include "DataFormats/TrackReco/interface/Track.h"

Main Subroutine

All executables in C++ need an int main() function. This is what is run when the executable starts.

///////////////////////////
// ///////////////////// //
// // Main Subroutine // //
// ///////////////////// //
///////////////////////////

int main (int argc, char* argv[]) 
{

Command Line Options

Declare the command line option parser. Give the parser a short summary of what this program does (this will be visible when the --help option is used).

        ////////////////////////////////
        // ////////////////////////// //
        // // Command Line Options // //
        // ////////////////////////// //
        ////////////////////////////////


       // Tell people what this analysis code does and setup default options.
       optutl::CommandLineParser parser ("Plots Jet Pt");

Change any defaults or add any new command line options

For example, if you do not specify the option outputFile=myJetPt when running myJetPt.exe, the default name jetPt.root specified in the code line below is used.

        ////////////////////////////////////////////////
        // Change any defaults or add any new command //
        //      line options you would like here.     //
        ////////////////////////////////////////////////
        parser.stringValue ("outputFile") = "jetPt.root";

After adding any new options or changing any defaults, tell the parser to parse the command line options.

        // Parse the command line arguments
        parser.parseArguments (argc, argv);

For more details on command line parsing see Command_line_option_parsing

Create Event Container

We create an event container that initializes itself from the command line options parser (e.g., input files, output file, etc.).

        //////////////////////////////////
        // //////////////////////////// //
        // // Create Event Container // //
        // //////////////////////////// //
        //////////////////////////////////

       // This object 'eventCont' is used both to get all information from the
       // event as well as to store histograms, etc.
       fwlite::EventContainer eventCont (parser);


Begin Run (e.g., book histograms, etc)

Here is where you create ("book") any histograms you want to fill later in the code.

        ////////////////////////////////////////
        // ////////////////////////////////// //
        // //         Begin Run            // //
        // // (e.g., book histograms, etc) // //
        // ////////////////////////////////// //
        ////////////////////////////////////////

        // Setup a style
        gROOT->SetStyle ("Plain");

        // Book those histograms!
        eventCont.add( new TH1F( "jetPt", "jetPt", 1000, 0, 1000) );

Event Loop

Loop over the events in the input files you specified at the command line. Also make a cast to edm::EventBase, so that the same code can be used in the same way in FWLite and in the full framework.

        //////////////////////
        // //////////////// //
        // // Event Loop // //
        // //////////////// //
        //////////////////////

        for (eventCont.toBegin(); ! eventCont.atEnd(); ++eventCont) 
        {

Extract Info From Event

Here the code:

  • Creates a "handle"
  • Tries to hook the handle up to a branch containing the selectedLayer1Jets
  • Makes sure that the handle hook-up was successful

Note: The following code is very similar to how you extract information from the event in cmsRun.

           //////////////////////////////////
           // Take What We Need From Event //
           //////////////////////////////////
           edm::Handle< vector< pat::Jet > > jetHandle;
           eventCont.getByLabel (jetLabel, jetHandle);

A handle can be treated as a pointer to whatever data format it is holding. For example, we can write const vector< pat::Jet > &jetVec = *jetHandle and then use jetVec, or we can use the handle directly as a pointer, as the code does below:

   
         // Loop over the jets
         const vector< pat::Jet >::const_iterator kJetEnd = jetHandle->end();
         for (vector< pat::Jet >::const_iterator jetIter = jetHandle->begin();
              kJetEnd != jetIter; 
              ++jetIter) 
         {         

Fill pt()

The following line of code does two distinct things:

  • gets pt() for this jet
  • fills the histogram stored in the event container

            eventCont.hist("jetPt")->Fill (jetIter->pt());
         } // for jetIter
      } // for eventCont

Clean Up Job

Nothing much to do since histograms are automatically saved. Return 0 to indicate that this program was successful.

      ////////////////////////
      // ////////////////// //
      // // Clean Up Job // //
      // ////////////////// //
      ////////////////////////

     // Histograms will be automatically written to the root file
     // specified by command line options.

     // All done!  Bye bye.
     return 0;
}

FWLite Executable Z peak Example

Build the executable. Here is a link to the zPeak.cc

newFWLiteAna.py Analysis/SimpleExamples/myZPeak --copy=zPeak
scram b
rehash
Also NOTE: Now you DO NOT need the --newPackage option, as Analysis/SimpleExamples already exists.

Run the executable

myZPeak.exe inputFiles=/afs/cern.ch/cms/Tutorials/RelValZMM_334.root \
     outputEvery=10 outputFile=myZPeak 

You should now see a root file called myZPeak.root. If you open this root file you can browse to the following plot:

(figure: Z candidate mass peak)

Let us look at the code zPeak.cc to see how we looped over the muons. The snippet that loops over the muons is here:

The following loop effectively does

  • outer = [0, N-2]
    • inner = [outer+1, N-1]

where the N muons are indexed from 0 to N-1. This ensures that while combining two muons together, we do not try to combine a muon with itself, nor try any combination more than once.

 // O.k.  Let's loop through our muons and see what we can find.
      const vector< pat::Muon >::const_iterator kEndIter       = muonVec.end();
      const vector< pat::Muon >::const_iterator kAlmostEndIter = kEndIter - 1;
      for (vector< pat::Muon >::const_iterator outerIter = muonVec.begin();
           kAlmostEndIter != outerIter;
           ++outerIter)
      {
         for (vector< pat::Muon >::const_iterator innerIter = outerIter + 1;
              kEndIter != innerIter;
              ++innerIter)
         {

The following ensures that only pairs of muons with opposite charges will be used as a Z candidate:

            // make sure that we have muons of opposite charge
            if (outerIter->charge() * innerIter->charge() >= 0) continue;

            // if we're here then we have one positively charged muon
            // and one negatively charged muon.

Here we get the 4-momentum of two muons, add them, get the invariant mass, and fill the histogram.

            eventCont.hist("Zmass")->Fill( (outerIter->p4() + innerIter->p4()).M() );
         } // for innerIter
      } // for outerIter

Modify Z peak

We are going to use myZPeak.cc as the basis for a new executable:

cd $CMSSW_BASE/src
newFWLiteAna.py Analysis/SimpleExamples/myZPeakWithCuts  --copy=Analysis/SimpleExamples/bin/myZPeak.cc 

Edit myZPeakWithCuts.cc. Change

parser.stringValue ("outputFile") = "zpeak1.root";

to

parser.stringValue ("outputFile") = "zpeak2.root";

and change

vector< pat::Muon > const & muonVec = *muonHandle;

to

      // Create a new vector only with muons that pass our cuts
      vector< pat::Muon > muonVec;
      for (vector< pat::Muon >::const_iterator iter = muonHandle->begin();
           muonHandle->end() != iter;
           ++iter)
      {
         if (iter->pt() > 20 && std::abs(iter->eta()) < 2.5)
         {
            muonVec.push_back( *iter );
         }
      }

Then do:

scram b

Remember to run rehash the first time you successfully compile.

Run the code:

myZPeakWithCuts.exe inputFiles=/afs/cern.ch/cms/Tutorials/RelValZMM_334.root \
     outputEvery=10 outputFile=myZPeakModified

You should now see a root file called myZPeakModified.root.

Using the ROOT commands below, you can superimpose the modified Z peak and the Z peak we made before.

root -l 
root [0] TFile::Open("myZPeak.root")
(class TFile*)0x8366578
root [1] Zmass->Draw()
<TCanvas::MakeDefCanvas>: created default TCanvas with name c1
root [2] TFile::Open("myZPeakModified.root")
(class TFile*)0x85776a0
root [3] Zmass->SetLineColor(kRed)
root [4] Zmass->Draw("Same")
root [5] 

The superimposed plots are:

(figure: Z mass peaks, with and without cuts, superimposed)

References

Command line option parsing

Command line option parsing is a method of letting you set the values of different variables when running your FWLite executable from the command line. By default several options are hooked up for you (e.g., inputFiles is the list (std::vector) of files to run over, outputFile is the name of the root file where your histograms will be stored).

You will learn how to add new command line options, change the default values of the existing options, and easily set these options from the command line.

Defining Variables

To define options, one gives:

  • A name,
  • A type,
    • kInteger,
    • kDouble,
    • kString,
    • kBool,
    • kIntegerVector,
    • kDoubleVector, or
    • kStringVector.
  • A description, and
  • (If desired for a non-vector type,) a default value. If this is not given for a non-vector type, 0/""/false is chosen.

For example, here are two variables hooked up in btagTemplates.cc:

   parser.addOption ("mode",         optutl::CommandLineParser::kInteger, 
                      "Normal(0), VQQ(1), LF(2), Wc(3)", 
                      0);   
   parser.addOption ("sampleName",   optutl::CommandLineParser::kString, 
                      "Sample name (e.g., top, Wqq, etc.)");   

Here are the six default options that are automatically hooked up:

   parser.addOption ("inputFiles",    kStringVector,
                      "List of input files");
   parser.addOption ("totalSections", kInteger,
                      "Total number of sections",
                       0);
   parser.addOption ("section",       kInteger,
                      "This section (from 1..totalSections inclusive)",
                      0);
   parser.addOption ("maxEvents",     kInteger,
                      "Maximum number of events to run over (0 for whole file)",
                      0);
   parser.addOption ("outputFile",    kString,
                      "Output filename",
                      "output.root");
   parser.addOption ("outputEvery",   kInteger,
                      "Output something once every N events (0 for never)",
                     100);

Accessing Options

To access these variables, you use one of the following access functions:
   int         &integerValue  (std::string key);
   double      &doubleValue   (std::string key);
   std::string &stringValue   (std::string key);
   bool        &boolValue     (std::string key);
   IVec        &integerVector (std::string key);
   DVec        &doubleVector  (std::string key);
   SVec        &stringVector  (std::string key);

Note that these are references and not const references, so you can use these functions to set variables as well. For example, if you wanted to change the default of maxEvents to 10, you can put
   parser.integerValue ("maxEvents") = 10;
before the parser.parseArguments() function call.

Setting Command Line Options

The general way to set a command line option is varName=value. Here are the rules:

  • General idea: varName=value
  • kBool values should be set with 1 for true and 0 for false.
  • Vector values can be set multiple times (e.g., inputFiles=one.root inputFiles=two.root), with a comma-separated list (e.g., inputFiles=three.root,four.root), or any mixture of the two.
  • Vector values can be loaded from a text file (e.g., inputFiles_load=fileListingRootFiles.txt).
  • If you have defined default values for a vector in the code, you can clear the default from the command line (e.g., inputFiles_clear=true).
  • --help will print out all values, their descriptions, and their default values as well as the usage string (set by optutl::setUsageString()) and then exit.
  • By default, all variables, their descriptions, and their current values are printed before the program continues running. The --noPrint option suppresses these printouts.
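
As an illustration of these rules, a single invocation might combine several of them (all file names here are illustrative):

myJetPt.exe inputFiles=one.root,two.root inputFiles_load=moreFiles.txt maxEvents=500 outputFile=test.root --noPrint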

Try the options: an example

Type
newJetPt.exe --help 
in the Analysis/SimpleExamples/newJetPt/bin directory. You will see all the available options, as below:

jetPt.exe - Plots Jet Pt
------------------------------------------------------------------

Integer options:
    jobid          = -1             - jobID given by CRAB,etc. (-1 means append nothing)
    maxevents      = 0              - Maximum number of events to run over (0 for whole file)
    outputevery    = 0              - Output something once every N events (0 for never)
    section        = 0              - This section (from 1..totalSections inclusive)
    totalsections  = 0              - Total number of sections

Bool options:
    logname        = false          - Print log name and exit

String options:
    outputfile     =                - Output filename
        'output.root'
    storeprepend   =                - Prepend location on files starting with '/store/'
        ''
    tag            =                - A 'tag' to append to output file (e.g., 'v2', etc.)
        ''

String Vector options:
    inputfiles     =                - List of input files
        
------------------------------------------------------------------

Among these options, the most important ones are inputfiles and outputfile. The logname option can be used in CRAB jobs to give the log file and the output file the same name. The section and totalsections options are used to split the list of input data files. For example, if your relval_ttbar_300pre8.filelist [ Get the data files first] has 50 files, you can split the input of these 50 files by setting, say, totalsections to 10 and section to 5 to process the fifth group of files.
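
For instance, the fifth of ten sections of such a file list could be processed with a command like the following (the file list name is taken from the example above; the output file name is illustrative):

myJetPt.exe inputFiles_load=relval_ttbar_300pre8.filelist totalSections=10 section=5 outputFile=myJetPt_section5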

Histogram storage:

fwlite::EventContainer not only contains what is necessary to access data in the event, but also what is needed for users to store histograms. These histograms are automatically saved at the end of the job in the file specified by the outputFile command line option.

1. eventCont.add() is used to store a newly created histogram:

 eventCont.add( new TH1F("ZMass", "Mass of Z candidate", 100, 50, 150) );
 

If you want to store histograms in a directory, simply add the name of the directory after the histogram pointer:

 eventCont.add( new TH1F("ZMass", "Mass of Z candidate", 100, 50, 150),
                "parentDir/subDir");
 

In both cases, the histogram will be accessed using the name "ZMass" (see below).

2. eventCont.hist() is used to access the histograms:

 eventCont.hist ("myHistogram")->Fill (myVariable);

Backporting to 3.1.x or 3.2.x

If you are using CMSSW_3_1_X or CMSSW_3_2_X, you need to check out and build the following tags:

Package Tag
PhysicsTools/FWLite V02-00-06
DataFormats/FWLite V00-13-00

Responsible: CharlesPlager and SudhirMalik
Last reviewed by: SudhirMalik - 4 Feb 2010
