Pre-exercises for the PAT tutorial

Introduction

These are the pre-exercises that you are expected to complete before taking the PAT tutorial. We assume that you have a scratch area at lxplus. We will use release CMSSW_7_4_1_patch4 and log in to lxplus.cern.ch. For all PAT tutorials we assume that your working directory looks like /afs/cern.ch/user/m/malik/scratch0/CMSSW_7_4_1_patch4/src. We will refer to this as the WORKING DIRECTORY. Of course, in place of m/malik you will have your own initial and login name. If you do not know whether you have enough disk space, check your quota by typing fs lq and follow this link. If you do not have enough space (500 MB), you may instead use the temporary space (/tmp/your_user_name), but be aware that this is lost once you log out of lxplus (or within something like a day). If you have any questions or suggestions about these exercises, please contact Sudhir Malik or Phat Srimanobhas.

NOTE-1: If you need to set up a working area somewhere else, say at the Fermilab cmslpc machines, then you need to set up a working directory accordingly. We do not provide instructions on how to set it up there, but we assume that you are working in your WORKING DIRECTORY, whatever that may be. For example, at the Fermilab cmslpc machines your working area could be /uscms_data/d2/malik/. Again, make sure you have enough working space.

NOTE-2: If you need to increase your working area at CERN, you can do so as follows:

  • Login to the "account" web page ( https://account.cern.ch/account/ )
  • Go to "Applications and Resources" and to the "Manage"-link next to "Linux and AFS".
  • You can extend the quota for your $home (up to 2 GB).
  • You can ask for "workspace" in AFS (quota 20 GB). The new style AFS path to your workspace is /afs/cern.ch/work/u/username
  • You can link it to your home using ln -s /afs/cern.ch/work/u/username $HOME/work. You will then see a work directory in your home when you log in to lxplus.

Obtain a CERN computer account:

To get a CERN account, please have a look at Get Account at CERN. If you do not have a scratch area then look here.

*NOTE: If you need an account elsewhere, you need to contact your local cluster admins and follow their instructions.

Exercise 1 - Cut and Paste

Log in to lxplus5.cern.ch

The exercises often require copying and pasting from the instructions, so we first make sure that this works for you. To verify that cut and paste to/from a terminal window works, first copy the script runThisCommand.py as follows

cp /afs/cern.ch/cms/Tutorials/TWIKI_DATA/runThisCommand.py .

and then cut and paste the following and then hit return

./runThisCommand.py "asdf;klasdjf;kakjsdf;akjf;aksdljf;a" "sldjfqewradsfafaw4efaefawefzdxffasdfw4ffawefawe4fawasdffadsfef"

The response should be your username followed by an alphanumeric string of characters unique to your username, for example for a user named malik:

success: malik znyvx 

QUESTION 1 - What is the alphanumeric string of characters unique to your username?

If the command is executed without any cut and paste:

somebody@cmslpc11> ./runThisCommand.py

the result will likely be:

Error: You must provide the secret key

Pasting incorrectly, the result will likely be:

Error: You didn't paste the correct input string

If you are not running on lxplus (for example, locally on a laptop), the result will be:

bash: ./runThisCommand.py: No such file or directory

OR, for example:

Unknown user: malik.

Exercise 2 - Simple Edit Exercise

The purpose of this exercise is to ensure that the user can edit files.

Log in to lxplus5.cern.ch and run this command:

cp /afs/cern.ch/cms/Tutorials/TWIKI_DATA/editThisCommand.py .

Then open editThisCommand.py and edit the 11th line adding a # (hash character) as the first character of the line. Explicitly change the following three lines:

# Please comment the line below out by adding a '#' to the front of
# the line.
raise RuntimeError, "You need to comment out this line with a #"

to:

# Please comment the line below out by adding a '#' to the front of
# the line.
#raise RuntimeError, "You need to comment out this line with a #"

Save the file and execute the command:

user@cmslpc12> ./editThisCommand.py

If this is successful, the result will be:

success:   malik 0xB888EFD

QUESTION 2 - What is the line beginning with "success" in your case?

If the file has not been successfully edited, an error message will result such as:

Traceback (most recent call last):
  File "./editThisCommand.py", line 11, in ?
    raise RuntimeError, "You need to comment out this line with a #"
RuntimeError: You need to comment out this line with a #

Exercise 3 - Setup a release area CMSSW_7_4_1_patch4

To set up a CMSSW release area, go into your scratch folder (cd $HOME/scratch0) and issue the commands below. You can copy/paste the commands all at once. The cmsrel command creates the release area; once you have the CMSSW_7_4_1_patch4 folder, you do not need to execute it again. For exhaustive information on the runtime environment, check out WorkBookSetComputerNode#CreateWork.

cmsrel CMSSW_7_4_1_patch4
cd CMSSW_7_4_1_patch4/src
cmsenv

Make sure you have executed cmsenv before answering the question below.

QUESTION 3.1 - What is the output of the command echo $CMSSW_RELEASE_BASE ?

QUESTION 3.2 - What is the output of the command scram tool list ?

Exercise 4 - Find data in the DAS (Data Aggregation Service)

Go to the url DAS and enter the following in the search field:

dataset dataset=/SingleMu/Run2012B-22Jan2013-v1/AOD

QUESTION 4.1 - What is the size of this dataset? How many files does it have? At which site is the data available?

QUESTION 4.2 - What release was this dataset collected in? (If you see more than one release, just answer one)

Then click on "Files" to see the files that this dataset contains. It should look like this:

       File: /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
       ...................................................

Alternatively, you can search for the dataset /SingleMu/Run2012B-22Jan2013-v1/AOD from the command prompt of your lxplus account. First, in your WORKING DIRECTORY (CMSSW_7_4_1_patch4/src), set the environment with cmsenv. Then you can use the das_client.py script, e.g.:

das_client.py --query="dataset=/SingleMu/Run2012B-22Jan2013-v1/AOD" 

(48) scratch0/CMSSW_7_4_1_patch4/src: das_client.py --query="dataset=/SingleMu/Run2012B-22Jan2013-v1/AOD" 

Showing 1-10 out of 1 results, for more results use --idx/--limit options

/SingleMu/Run2012B-22Jan2013-v1/AOD

You can also run other search combinations from the command line. More information about accessing data in the DBS can be found in WorkBookLocatingDataSamples.
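The dataset name used in the queries above follows the three-field pattern /PrimaryDataset/ProcessedDataset/DataTier. As a minimal sketch, the helper below splits such a name and builds the query string passed to das_client.py. The function names split_dataset_name and das_query are hypothetical and only serve this illustration; the sketch does not contact DAS.

```python
def split_dataset_name(dataset):
    """Split '/Primary/Processed/Tier' into its three fields."""
    parts = dataset.strip().split("/")
    # A well-formed name starts with '/' and has exactly three non-empty fields.
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError("expected /Primary/Processed/Tier, got %r" % dataset)
    return tuple(parts[1:])


def das_query(dataset):
    """Build the query string used with das_client.py --query=... above."""
    split_dataset_name(dataset)  # raises ValueError if the name is malformed
    return "dataset=" + dataset


if __name__ == "__main__":
    name = "/SingleMu/Run2012B-22Jan2013-v1/AOD"
    print(split_dataset_name(name))
    print(das_query(name))
```

Validating the name before building the query catches a truncated paste early, which is the most common way such a query goes wrong.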

Exercise 5 - GIT

Version control for CMSSW is done centrally with git at github (https://github.com/cms-sw/cmssw, https://cms-sw.github.io). In order to set up git properly, please follow the computing concepts guideline WorkBookComputingConcepts#GitHub. When you reach the point where you can successfully perform git cms-addpkg PhysicsTools/PatExamples, you are set for the PAT tutorial. If you would like to be able to store your changes at github, you need to continue that tutorial to the end, but this can also wait until later.

QUESTION 5.1 - What is the output of the command git branch ?

Exercise 6 - EDM ( Event Data Model framework) standalone utilities - edmFileUtil, edmDumpEventContent, edmProvDump, edmEventSize

As a reminder, make sure CMSSW has been set up as in Exercise 3.

The overall collection of CMS software, referred to as CMSSW, is built around a Framework, an Event Data Model (EDM), and Services needed by the simulation, calibration and alignment, and reconstruction modules that process event data so that physicists can perform analysis. The primary goal of the Framework and EDM is to facilitate the development and deployment of reconstruction and analysis software.

The CMS Event Data Model (EDM) is centered around the concept of an Event. An Event is a C++ object container for all RAW and reconstructed data related to a particular collision. To understand what is in a data file and more, several EDM utilities are available. In this exercise, you will use four of these EDM utilities. They will be very useful during the PAT tutorial and afterwards. More about these EDM utilities can be found at WorkBookEdmUtilities. Together with the CMS code repository at github and the CMS code cross-referencer LXR, they are very useful for understanding and writing CMS code. Browsing the repository and using LXR will be done in Exercises 8 and 9 respectively.

If you are not logged in at CERN lxplus, please read the *ENDNOTE at the end of Exercise 6.

  • Use edmFileUtil to find the physical file name (PFN) corresponding to the logical file name (LFN) from the RECO data file located above.
    • To do this execute
      edmFileUtil -d /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
    • If you are working on lxplus this will return:
root://eoscms//eos/cms/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root?svcClass=default
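The LFN and PFN shown above differ only by a prefix and a suffix. The sketch below reproduces that specific lxplus example by string concatenation. It is only an illustration: the real mapping is resolved by the site configuration, so the prefix and suffix constants are assumptions copied from the output above, and lfn_to_pfn is a hypothetical helper name.

```python
# Assumption: prefix and suffix are taken verbatim from the lxplus output above.
EOS_PREFIX = "root://eoscms//eos/cms"
EOS_SUFFIX = "?svcClass=default"


def lfn_to_pfn(lfn, prefix=EOS_PREFIX, suffix=EOS_SUFFIX):
    """Return the physical file name for a logical file name (illustration only)."""
    if not lfn.startswith("/store/"):
        raise ValueError("a logical file name is expected to start with /store/")
    return prefix + lfn + suffix


if __name__ == "__main__":
    lfn = ("/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/"
           "002F5062-346F-E211-BF00-1CC1DE04DF20.root")
    print(lfn_to_pfn(lfn))
```

At another site the prefix will differ, which is why edmFileUtil -d remains the reliable way to obtain the PFN.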

  • Use edmFileUtil to find details of data file
    • To do this execute
      edmFileUtil /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
    • If you are working on lxplus this will return something like this, where the last line is important:
/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
120611 21:43:25 001 Xrd: GoToAnotherServer: Going to: lxfsre11a05.cern.ch:1095
/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root (1 runs, 58 lumis, 420 events, 90217658 bytes)
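The important last line of the edmFileUtil output has a fixed shape: the file name followed by the run, lumi, event and byte counts in parentheses. As a minimal sketch, assuming the line looks exactly like the example above, it can be turned into numbers with a regular expression. The name parse_summary is hypothetical.

```python
import re

# Matches e.g. "<file> (1 runs, 58 lumis, 420 events, 90217658 bytes)".
SUMMARY_RE = re.compile(
    r"^(?P<name>\S+) \((?P<runs>\d+) runs, (?P<lumis>\d+) lumis, "
    r"(?P<events>\d+) events, (?P<bytes>\d+) bytes\)$"
)


def parse_summary(line):
    """Return a dict with the counts from a summary line, or None if it does not match."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    info = {key: int(match.group(key)) for key in ("runs", "lumis", "events", "bytes")}
    info["name"] = match.group("name")
    return info


if __name__ == "__main__":
    line = ("/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/"
            "002F5062-346F-E211-BF00-1CC1DE04DF20.root "
            "(1 runs, 58 lumis, 420 events, 90217658 bytes)")
    print(parse_summary(line))
```

Lines that do not match, such as the Xrd redirection message, simply yield None, so the function can be applied to every line of the output.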

  • Use edmDumpEventContent to see what class names etc. to use in order to access the objects in the RECO data file located above
    • To do this execute
      edmDumpEventContent --all --regex caloJet /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root > EdmDumpEventContent.txt
      cat EdmDumpEventContent.txt
    • If you want to make sure that the file exists at lxplus you can do cmsLs /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root. This path gives the physical location of the file at lxplus.
    • Open and look at the file EdmDumpEventContent.txt. It has information divided into four variable width columns. The first column is the C++ class type of the data, the second is module label, the third is product instance label and the fourth is process name. More information is available at Identifying Data in the Event.
    • QUESTION 6.1 - How many types of CaloJet module labels are there? What are their names?
    • NOTE: Instead of the above, try without the option --regex caloJet . This will dump the entire event content - a file with many lines
      • To do this execute
          edmDumpEventContent --all /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root > EdmDumpEventContent.txt 
    • To aid in understanding the full history of an analysis, the framework accumulates provenance for all data stored in the standard ROOT output files. Using the command edmProvDump one can print out all the tracked parameters used to create the data file. For example, one can see which modules were run and the CMSSW version used to make the RECO file. In executing the command below it is important to follow the instructions carefully, otherwise a large number of warning messages may appear. The ROOT warning messages can be ignored.
    • To do this execute
        edmProvDump /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root  > EdmProvDump.txt
    • NOTE: EdmProvDump.txt is a very large file on the order of 10000-20000 lines. Open and look at this file and locate Processing History ( about 10-15 lines from the top).
    • QUESTION 6.2- Which version of CMSSW_?_?_? was used in the RECO data process?
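The four-column layout of EdmDumpEventContent.txt described above can also be read programmatically. The sketch below assumes that each data row holds the C++ class type followed by three double-quoted strings (module label, product instance label, process name); that row shape is an assumption about the tool's output, and the sample rows use made-up labels, not real event content. The names parse_rows and module_labels are hypothetical.

```python
import re

# Assumed row shape: <C++ type>   "<module>"   "<instance>"   "<process>"
ROW_RE = re.compile(
    r'^(?P<type>\S.*?)\s+"(?P<module>[^"]*)"\s+"(?P<instance>[^"]*)"'
    r'\s+"(?P<process>[^"]*)"\s*$'
)

# Illustrative text only: the labels below are invented for this sketch.
SAMPLE = '''
Type                     Module            Label     Process
-------------------------------------------------------------
vector<reco::CaloJet>    "exampleJetsA"    ""        "RECO"
vector<reco::CaloJet>    "exampleJetsB"    ""        "RECO"
vector<reco::Muon>       "exampleMuons"    ""        "RECO"
'''


def parse_rows(text):
    """Return (type, module, instance, process) tuples for every data row."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:  # header, separator and blank lines do not match
            rows.append(match.group("type", "module", "instance", "process"))
    return rows


def module_labels(rows, type_substring):
    """Return the sorted, distinct module labels whose class type contains type_substring."""
    return sorted({module for ctype, module, _, _ in rows if type_substring in ctype})


if __name__ == "__main__":
    print(module_labels(parse_rows(SAMPLE), "CaloJet"))
```

Reading your own file with open("EdmDumpEventContent.txt").read() in place of SAMPLE gives a quick cross-check of a count made by eye.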

  • Execute edmEventSize to determine the size of different branches in the data file. Further details may be found here: SWGuideEdmEventSize.
    • Execute
      edmEventSize -v root://eoscms//eos/cms/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root > EdmEventSize.txt 
    • Open and look at file EdmEventSize.txt and locate the line containing the text recoCaloJets_ak5CaloJets__RECO.. There are two numbers following this text that measure the plain and the compressed size of this branch.
    • QUESTION 6.3 - What are these two numbers?
*ENDNOTE: If you are working, for example, at cmslpc at Fermilab, you need to search for files locally. DO NOT look for a castor directory at Fermilab. Hence, as an example, to check that the file exists at Fermilab, do:

  • ls /pnfs/cms/WAX/11/store/data/Run2012A/SingleMu/AOD/PromptReco-v1/000/195/606/C45062A2-88B1-E111-82DC-001D09F290CE.root
and run edmDumpEventContent as follows:

  •   edmDumpEventContent --all --regex caloJet dcap://pnfs/cms/WAX/11/store/data/Run2012A/SingleMu/AOD/PromptReco-v1/000/195/606/C45062A2-88B1-E111-82DC-001D09F290CE.root 

Exercise 7 - BuildFile.xml

SCRAM uses a file called BuildFile.xml in each package directory which describes what the package will produce and what dependencies the package has. For more reading look at WorkBookBuildFilesIntro and SWGuideBuildFile.

Since you already have the CMSSW_7_4_1_patch4 release area set up, execute the following steps (as a reminder, we assume your working directory to be like /afs/cern.ch/user/m/malik/scratch0):

cd ~/scratch0/CMSSW_7_4_1_patch4/src
git cms-addpkg PhysicsTools/PatExamples    # can be omitted if you have PhysicsTools/PatExamples already from exercise 5
scram b

scram b invokes the compilation in all subdirectories of your current directory. When you compile, the output on screen should look like the following. No error messages means successful compilation. You will see several messages scroll by, but none with the word "error".

Resetting caches
>> Local Products Rules ..... started
>> Local Products Rules ..... done
>> Building CMSSW version CMSSW_7_4_1_patch4 ----
>> Entering Package PhysicsTools/PatExamples
>> Creating project symlinks
  src/PhysicsTools/PatExamples/python -> python/PhysicsTools/PatExamples
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/bin/PatMuonEDMAnalyzer.cc 
Entering library rule at PhysicsTools/PatExamples
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerBTag.cc 
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerJEC.cc 
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/PatBTagCommonHistos.cc 
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/PatMuonAnalyzer.cc 
>> Building shared library tmp/slc5_amd64_gcc462/src/PhysicsTools/PatExamples/src/PhysicsToolsPatExamples/libPhysicsToolsPatExamples.so
Copying tmp/slc5_amd64_gcc462/src/PhysicsTools/PatExamples/src/PhysicsToolsPatExamples/libPhysicsToolsPatExamples.so to productstore area:
Leaving library rule at PhysicsTools/PatExamples
>> Building binary PatMuonEDMAnalyzer
...
...
...
@@@@ Refreshing Plugins:edmPluginRefresh
>> Pluging of all type refreshed.
gmake[1]: Leaving directory `/localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4'
189.149u 13.452s 3:38.02 92.9%  0+0k 0+0io 18170pf+0w

Look at the file PhysicsTools/PatExamples/BuildFile.xml and then delete it:

rm PhysicsTools/PatExamples/BuildFile.xml

You can later restore it from git by doing the following (but not now).

git checkout PhysicsTools/PatExamples/BuildFile.xml

Now go back to the src directory and clean the area

scram b clean

You should now see the following type of output on the screen:

Reading cached build data
Cleaning ProductStore directories:
/bin/rm -rf  logs/slc5_amd64_gcc462 lib/slc5_amd64_gcc462 include cfipython/slc5_amd64_gcc462 objs/slc5_amd64_gcc462 doc test/slc5_amd64_gcc462 python bin/slc5_amd64_gcc462
Resetting project cache:
/bin/rm -f .SCRAM/slc5_amd64_gcc462/ProjectCache.db*
/bin/rm -f .SCRAM/slc5_amd64_gcc462/DirCache.db*
/bin/rm -f .SCRAM/slc5_amd64_gcc462/RuntimeCache.db*
/bin/rm -rf .SCRAM/slc5_amd64_gcc462/MakeData
Cleaning up the compiled .pyc and generated __init__.py files in the src directory.
/bin/rm -rf tmp/slc5_amd64_gcc462

Now you compile again as follows

scram b

Now you will see a lot of error messages, with output ending as follows.

Resetting caches
>> Local Products Rules ..... started
>> Local Products Rules ..... done
>> Building CMSSW version CMSSW_7_4_1_patch4 ----
>> Entering Package PhysicsTools/PatExamples
>> Creating project symlinks
  src/PhysicsTools/PatExamples/python -> python/PhysicsTools/PatExamples
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/bin/PatMuonEDMAnalyzer.cc 
Entering library rule at PhysicsTools/PatExamples
>> Compiling  /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerBTag.cc 
In file included from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Provenance/interface/ProcessConfigurationRegistry.h:4:0,
                 from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Provenance/interface/ProvenanceFwd.h:40,
                 from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Common/interface/HandleBase.h:29,
                 from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Common/interface/Handle.h:30,
                 from /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerBTag.cc:1:
/afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/FWCore/Utilities/interface/ThreadSafeRegistry.h:6:28: fatal error: boost/thread.hpp: No such file or directory
compilation terminated.
gmake: *** [tmp/slc5_amd64_gcc462/src/PhysicsTools/PatExamples/src/PhysicsToolsPatExamples/AnalysisTasksAnalyzerBTag.o] Error 1

Error messages mean failure to compile. The PhysicsTools/PatExamples package did not compile successfully because the BuildFile.xml is missing.
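The rule of thumb used in this exercise, that a successful build shows no line containing the word "error", can be checked mechanically on a saved build log. The sketch below is a simple text scan under that assumption and not a feature of scram; error_lines is a hypothetical helper name, and the sample lines are shortened from the output above.

```python
import re

# Whole-word, case-insensitive match, so "fatal error:" and "Error 1" both count.
ERROR_RE = re.compile(r"\berror\b", re.IGNORECASE)


def error_lines(log_text):
    """Return the lines of a build log that contain the word 'error'."""
    return [line for line in log_text.splitlines() if ERROR_RE.search(line)]


if __name__ == "__main__":
    log = "\n".join([
        ">> Entering Package PhysicsTools/PatExamples",
        "ThreadSafeRegistry.h:6:28: fatal error: boost/thread.hpp: No such file or directory",
        "compilation terminated.",
        "gmake: *** [AnalysisTasksAnalyzerBTag.o] Error 1",
    ])
    for line in error_lines(log):
        print(line)
```

An empty result is consistent with a clean build; any returned line points to where the compilation stopped.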

As suggested above you can copy the BuildFile.xml back and compile again.

QUESTION 7 - Did you succeed in compilation after copying back the BuildFile.xml?

Exercise 8 - The GIT repository

The GIT repository for the CMS Software is at https://github.com/cms-sw/cmssw.

QUESTION 8 - Locate the Package/SubPackage called PhysicsTools/PatAlgos and PhysicsTools/PatExamples. In the PatExamples subpackage, in which subdirectory are the configuration (*_cfi, *_cff, *_cfg) files present? In which subdirectory are the various analyzers?

Note the "s" for plural in PatExamples.

Exercise 9 - LXR search

LXR is a code browser that links the code by its references. For example, if you want to know where a class is defined, you can simply click on it in the browser and it searches for all occurrences of the class. The cross-referenced display of the CMS source code is here: CMSSW LXR. You can also cross-reference ROOT, Geant4, TriDAS and CMSSW code from the main link here.

QUESTION 9 - Use LXR to find the exact location of the file called patTemplate_cfg.py.

Note : You will have to use the general search link on the top left of page CMSSW LXR and then field "Files named".

Exercise 10 - About python configuration files

CMSSW configs are written in Python. Hence, some Python guides:

For total beginners: http://docs.python.org/tutorial/

For people who know programming, but not Python: http://diveintopython.org/

Please read through WorkBookConfigFileIntro, it will give you a basic understanding of the elements in a cmssw configuration file. For more info on the python configs, you can refer to SWGuideAboutPythonConfigFile, WorkBookConfigFileIntro and SWGuidePythonTips.

Now, to look at the features of these files let us take an example that you will also study in the PAT exercises. This is the PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py config file. The first step is to get this file in your working directory in case you have not checked out the PhysicsTools/PatAlgos package. To do this go to your working area (e.g. ~/scratch0/CMSSW_7_4_1_patch4/src/) and execute the following step:

git cms-addpkg PhysicsTools/PatAlgos

Let's run this config file. To do this we need to edit this file first, with your favorite editor (which is vim of course):

vim PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py

and ADD this code at the end of the file:

process.source.fileNames = ['file:/afs/cern.ch/work/h/htholen/public/PAT_TUTORIAL/CMSSW_7_0_0_pre11_relval.root']  

NOTE: Python is indentation sensitive. Make sure that process.source.fileNames has no space at the beginning of the line, otherwise you will get a syntax error.

NOTE: We are reading a special file with some 300 events from an AFS area BECAUSE the RelVal samples have a short shelf life. As a result, this data cannot be accessed after about a month. Hence we saved these events in an AFS area so that the example runs fine without reading it directly from the eoscms area. Those events are taken from the RelVal sample file /store/relval/CMSSW_7_0_0_pre11/RelValProdTTbar/AODSIM/START70_V4-v1/00000/D0516C65-766A-E311-B744-00259059642E.root.

NOTE: You can find information on copying a few events from a data file to your local area by using edmCopyPickMerge or copyPickMerge_cfg.py at this link.
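The effect of a stray leading space can be seen without running cmsRun. The sketch below compiles two one-line snippets with Python's built-in compile(): one starting in the first column and one with a single leading space. The variable name fileNames and the file name example.root are placeholders for this illustration, not the real configuration.

```python
# Two otherwise identical top-level statements; BAD starts with one space.
GOOD = "fileNames = ['file:example.root']\n"
BAD = " fileNames = ['file:example.root']\n"


def check(source):
    """Compile the snippet and report whether Python accepts its indentation."""
    try:
        compile(source, "<config>", "exec")
    except IndentationError as err:  # IndentationError is a kind of SyntaxError
        return "IndentationError: %s" % err.msg
    return "ok"


if __name__ == "__main__":
    print(check(GOOD))
    print(check(BAD))
```

The second snippet is rejected before any code runs, which is the same failure cmsRun reports when the appended line is indented.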

NOTE: The standard recipe for PAT is at SWGuidePATRecipes. In there, you can follow the recommendation for each CMSSW release. Under each release section, at the end, you will see "See the corresponding Release Notes for details.", you can click on "Release Notes", then you will find the last recommended version of PhysicsTools/PatAlgos, and related tags.

Now you can run the patTuple_standard_cfg.py configuration using

cmsRun PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py

As you can see, it creates a ROOT file called patTuple_standard.root. We assume that you know how to open a TBrowser in a ROOT session; if not, see WorkBookBasicROOT.

Before opening ROOT, make sure you have the following lines included in your rootlogon.C file

 gSystem->Load("libFWCoreFWLite.so");
 AutoLibraryLoader::enable();

as explained in WorkBookSetComputerNode#RooT, otherwise a lot of warning messages pop up and ROOT file may not open properly due to lack of the proper CMS-specific dictionary and library needed to open CMS EDM files.

Please open and look at the contents of patTuple_standard.root by clicking on it and then on Events. You will see several branches there, such as patMuons_selectedPatMuons__PAT, recoGenJets_selectedPatJets_genJets_PAT, etc.
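Branch names such as patMuons_selectedPatMuons__PAT are built from four fields joined by underscores: class type, module label, product instance label and process name. These are the same four columns reported by edmDumpEventContent in Exercise 6. An empty instance label shows up as a double underscore. The sketch below splits a branch name on that convention; split_branch_name is a hypothetical helper, and it assumes that none of the four fields contains an underscore itself.

```python
from collections import namedtuple

BranchName = namedtuple("BranchName", "type module instance process")


def split_branch_name(branch):
    """Split 'type_module_instance_process' (optionally ending in '.') into its fields."""
    fields = branch.rstrip(".").split("_")
    if len(fields) != 4:
        raise ValueError("expected four underscore-separated fields in %r" % branch)
    return BranchName(*fields)


if __name__ == "__main__":
    print(split_branch_name("patMuons_selectedPatMuons__PAT"))
    print(split_branch_name("recoCaloJets_ak5CaloJets__RECO."))
```

The trailing dot is stripped because tools such as edmEventSize print branch names with a final period.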

Let us first dig through this file to see why these branches got written to the patTuple_standard.root file.

We can get the first hint by looking at the file PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py, which has the following line in it:

from PhysicsTools.PatAlgos.patTemplate_cfg import *

Note: The import line from PhysicsTools.PatAlgos.patTemplate_cfg import * indicates the location of the patTemplate_cfg file. In the git repository the path name includes a /python/ directory, which is not part of the import line. The import line also omits the .py extension at the end of patTemplate_cfg; it is implied. The * means that everything defined in patTemplate_cfg.py is imported.
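The rule in the note above (insert /python/ after the package name and append .py) can be written down as a small function. This is only a sketch of the naming convention; module_to_repo_path is a hypothetical helper name, and the function does not check that the file actually exists in the repository.

```python
def module_to_repo_path(module):
    """Map 'Subsystem.Package.module' to 'Subsystem/Package/python/module.py'."""
    parts = module.split(".")
    if len(parts) < 3 or not all(parts):
        raise ValueError("expected Subsystem.Package.module, got %r" % module)
    subsystem, package = parts[0], parts[1]
    # The python/ directory is part of the repository path but not of the import.
    return "/".join([subsystem, package, "python"] + parts[2:]) + ".py"


if __name__ == "__main__":
    print(module_to_repo_path("PhysicsTools.PatAlgos.patTemplate_cfg"))
```

The resulting path is the one to look for in the github code browser or in LXR.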

A convenient way to look at the code is the github code browser. So let's have a look at https://github.com/cms-sw/cmssw/blob/CMSSW_7_0_X/PhysicsTools/PatAlgos/python/patTemplate_cfg.py.

The patTemplate_cfg.py imported above defines defaults that can be overridden in patTuple_standard_cfg.py. An example is the output file name patTuple.root, which can be overridden in patTuple_standard_cfg.py by changing the commented line, as has been done in patTuple_standard_cfg.py:

process.out.fileName = 'patTuple_standard.root'

Now let us have a look at one more thing here: Why do we have names of branches like patMuons_selectedPatMuons__PAT in patTuple.root ?

The branches like this get written because they are defined in the outputModule in patTemplate_cfg.py. The patEventContent is coming from the following include:

from PhysicsTools.PatAlgos.patEventContent_cff import patEventContentNoCleaning

If you look at patEventContent_cff.py, it has several groups of commands defined as patEventContentNoCleaning, patEventContent, patExtraAodEventContent and so on. We picked patEventContentNoCleaning. Whatever labels are defined in it get written to the patTuple.root (example 'keep *_selectedPatPhotons*_*_*',).
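A statement such as 'keep *_selectedPatPhotons*_*_*' is a wildcard pattern over the four underscore-separated fields of a branch name. The sketch below models the selection with Python's fnmatch module: commands are applied in order and the last matching one wins. This is a simplified model and not the framework's implementation. The helper name is_kept, the default of keeping unmatched branches, and the photon branch name used in the demonstration are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def is_kept(branch, commands, default=True):
    """Apply 'keep <pattern>' / 'drop <pattern>' commands in order; the last match wins."""
    kept = default  # assumption: a branch matched by no command is kept
    name = branch.rstrip(".")
    for command in commands:
        action, pattern = command.split(None, 1)
        if action not in ("keep", "drop"):
            raise ValueError("unknown command %r" % command)
        if fnmatchcase(name, pattern.strip()):
            kept = (action == "keep")
    return kept


if __name__ == "__main__":
    commands = ["drop *", "keep *_selectedPatPhotons*_*_*"]
    # Illustrative branch names following the type_module_instance_process pattern.
    for branch in ("patPhotons_selectedPatPhotons__PAT", "patMuons_selectedPatMuons__PAT"):
        print(branch, is_kept(branch, commands))
```

Starting the list with 'drop *' and then adding keep statements is what restricts the output file to the listed collections.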

But patEventContent_cff.py only contains commands that steer which output is stored to file. By browsing the code, starting from the initial script that you ran with cmsRun, patTuple_standard_cfg.py, can you find out how this data is actually produced? Please answer the following question:

QUESTION 10.1 - What are the names of the config files where selectedPatJets and countPatJets are originally defined? In which directory are they located?

Exercise 11 - Writing your first framework EDAnalyzer

This exercise is based on WorkBook section 4.1.2 where you learn first steps of interacting with the CMS framework to write a module in which you can put your analysis code.

Go through section 4.1.2 of the WorkBook (WorkBookWriteFrameworkModule). You will see this PLOT on that page. To reproduce it, you can use the data from EOS. Note that you can use the cmsLs command to list files and folders inside a specific path.

cmsLs /store/relval/CMSSW_7_4_1/RelValTTbar_13/GEN-SIM-DIGI-RECO/MCRUN2_74_V9_FastSim-v1/00000/

As said before, release validation samples (RelVal) have only short lifetimes. If you visit it again in the next few months, you may not see the files anymore. But you can start searching from /store/relval/.

To copy files, you can use cmsStage [file] [destination], for example

cmsStage /store/relval/CMSSW_7_4_1/RelValTTbar_13/GEN-SIM-DIGI-RECO/MCRUN2_74_V9_FastSim-v1/00000/023905DB-20EC-E411-948C-0025905A6066.root MyRelValTTbar.root

QUESTION 11.1 - Use python interactively with your demoanalyzer_cfg.py file as input and dump the content of process.maxEvents on the screen. What is the result? Hint: Type ipython -i demoanalyzer_cfg.py to open the ipython shell and Ctrl-D to close it. In ipython you have tab completion; try it by typing process.maxEv and pressing tab. You can also type process. and press tab to see all modules.

QUESTION 11.2 - Make this PLOT for only 100 events in the CMSSW_7_4_1_patch4 release (since the release in 4.1.2 could be different) and report the mean and RMS values.

Exercise 12 - Using FWLite (Framework Lite) for analyses

FWLite allows you to analyse the data in a root session with the CMSSW libraries loaded. Section 3.5 in the WorkBook deals with it (WorkBookFWLite, WorkBookFWLiteEventLoop, WorkBookFWLiteExamples, WorkBookFWLitePython). In particular, please have a look at section 3.5.3, WorkBookFWLiteExamples, before continuing with this exercise.

Run FWLiteLumiAccess with the following input files:

1. /afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/ZMM_CMSSW_5_2_3_patch3_numEvent100.root

2. /afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/AOD_DoubleMu_Run195013_numEvent1000.root

QUESTION 12.1 - What is the total luminosity from lumi sections in both cases? Are these numbers different for the two files? Is there any obvious reason for this difference? Hint: trick question alert!!

As a curiosity, you can also execute the command on one of the files from the collision dataset /DoubleMu/Run2012B-PromptReco-v1/AOD as follows:

FWLiteLumiAccess inputFiles=root://eoscms//eos/cms/store/data/Run2012B/DoubleMu/AOD/PromptReco-v1/000/193/774/0CDC3936-889B-E111-9F82-001D09F25041.root

Now run the FWLiteHistograms executable with the following input data file:

1. /afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/ZMM_CMSSW_5_2_3_patch3_numEvent100.root

The command for running multiple files is FWLiteHistograms inputFiles=file1,file2,file3....
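The inputFiles argument is a single comma-separated list without spaces. As a minimal sketch, the helper below assembles the command line for several files; fwlite_command is a hypothetical name, and the function only builds the argument list without running anything.

```python
def fwlite_command(executable, files):
    """Return the argument list for e.g. FWLiteHistograms inputFiles=file1,file2."""
    if not files:
        raise ValueError("at least one input file is required")
    # No spaces around the commas, so the shell passes the list as one argument.
    return [executable, "inputFiles=" + ",".join(files)]


if __name__ == "__main__":
    files = [
        "/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/ZMM_CMSSW_5_2_3_patch3_numEvent100.root",
        "/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/AOD_DoubleMu_Run195013_numEvent1000.root",
    ]
    print(" ".join(fwlite_command("FWLiteHistograms", files)))
```

The printed line can be pasted into a shell where the CMSSW environment has been set up with cmsenv.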

Once you execute the above, you will get an output file called analyzeFWLiteHistograms.root

QUESTION 12.2 - Open the analyzeFWLiteHistograms.root file and open the histogram called mumuMass. What is the mass of this peak? Do you think it is a Z peak? Why?

NOTE:

You can also run the above command on the collision data file below. The statistics are very low (just 1000 events), but you will see the peak.

/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/AOD_DoubleMu_Run195013_numEvent1000.root

Exercise 13 - Another very useful edm utility - edmConfigEditor

The edmConfigEditor is a graphical tool to visualize the workflow of all kinds of configuration files within the CMSSW framework and to edit their configurations.

You will do an exercise during the PAT tutorial called SWGuidePATConfigExercise. For details please have a look at SWGuideConfigEditor.

Here we will do a very simple exercise so that you know how to use it beforehand.

We will use it to inspect the configuration file patTuple_standard_cfg.py a little bit.

Please modify patTuple_standard_cfg.py to avoid a temporary problem when browsing the test samples from the release (so-called RelVal samples). Replace the following:

from PhysicsTools.PatAlgos.patInputFiles_cff import filesRelValProdTTbarAODSIM
process.source.fileNames = filesRelValProdTTbarAODSIM
by
process.source.fileNames = cms.untracked.vstring('/store/relval/CMSSW_7_4_1/RelValTTbar_13/GEN-SIM-DIGI-RECO/MCRUN2_74_V9_FastSim-v1/00000/023905DB-20EC-E411-948C-0025905A6066.root')

To use edmConfigEditor to inspect patTuple_standard_cfg.py, simply execute the following on the command line.

edmConfigEditor $CMSSW_RELEASE_BASE/src/PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py

This will pop up (depending on your connection speed) the config browser window. This window should look like this.

configbrowser.png

Look at the Tree View column and click on modules, then selectedPatMuons. Note that as you do this, the other two columns change correspondingly. After you have clicked on selectedPatMuons, note the src under Parameters in the third column. This shows the source used to create selectedPatMuons (which is patMuons).

QUESTION 13 - What is the source of muons in patMuons? Hint: Double click on the patMuons in the second column and then once the screen refreshes look for the line in the third column that says muonSource.

Exercise 14 - Obtain a Grid Certificate, CMS VO membership and SiteDB registration

NOTE: If you do not have these, this step can take a few days. If you have these, skip straight to the question. This exercise is not strictly necessary for the PAT tutorial, but it will be necessary in order to run jobs on the grid after having completed the tutorial.

  • A Grid Certificate, CMS VO membership and SiteDB registration will be needed for the next set of exercises. The registration process can be time-consuming (actions by several people are required), so it is important to start it as soon as possible. The three requirements can be summarized simply: A certificate ensures that you are who you claim to be. A registration in the VO recognizes you (identified by your certificate) as a member of CMS. SiteDB is a database and web interface that CMS uses to track sites; it is also used by the CRAB publication step to find out the hypernews username of a person from their Grid Certificate's DN (Distinguished Name). Please look at Get Your Grid Certificate, CMSVO and SiteDB registration to complete these three steps. All three steps must be completed before you can successfully submit jobs on the Grid.
To set up and use the CRAB environment at CERN, follow the instructions at Use_CRAB_at_CERN.

Now you can initialize your GRID proxy and verify that your GRID certificate has all the information needed by doing the following

voms-proxy-init -voms cms

QUESTION 14 - What is the output of the above command?

-- SudhirMalik - 27-Apr-2010

Topic revision: r119 - 2015-06-22 - PhatSrimanobhas
 