Pre-exercises for the PAT tutorial
Introduction
These are the pre-exercises that you are expected to complete before taking the PAT tutorial. We assume that you have a scratch area at lxplus. We will use the release
7_4_1_patch4
and log in to
lxplus.cern.ch
.
We assume that for all PAT tutorials your working directory is like: /afs/cern.ch/user/m/malik/scratch0/CMSSW_7_4_1_patch4/src
. We will refer to this as
WORKING DIRECTORY. Of course, in place of
m/malik
you will have your own initial and login name. If you are not sure whether you have enough disk space, please check your quota by typing
fs lq
and follow
this link
. If you don't have enough space (500 MB), you may instead use the temporary space (
/tmp/your_user_name), but be aware that this is lost once you log out of lxplus (or within something like a day).
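If you want to check the free space of a local area such as the temporary space from a script, the following is a minimal sketch. It assumes a Unix system; the function names are made up for this illustration. Note that it reports the free space of the filesystem, not your AFS quota, so for your scratch area
fs lq
remains the authoritative check.

```python
import os

REQUIRED_BYTES = 500 * 1024 * 1024  # the 500 MB quoted above


def free_bytes(path):
    """Free bytes available to the current user on the filesystem holding `path`."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize


def has_enough_space(path, required=REQUIRED_BYTES):
    """True if `path` sits on a filesystem with at least `required` free bytes."""
    return free_bytes(path) >= required
```

For example, has_enough_space("/tmp") tells you whether the temporary space can hold the tutorial area.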
If there are any questions or suggestions about these, please contact Sudhir Malik or Phat Srimanobhas.
NOTE-1: If you need to setup a working area somewhere else, say at Fermilab
cmslpc machines, then you need to have a working directory set up accordingly. We do not provide instructions on how to set it up there. But in this case, we assume that you are working in your
WORKING DIRECTORY whatever that may be. For example at Fermilab
cmslpc machines, your working area could be
/uscms_data/d2/malik/. Again, make sure you have enough working space.
NOTE-2: If you need to increase your working area at CERN, you can do so by following these steps.
- Login to the "account" web page ( https://account.cern.ch/account/
)
- Go to "Applications and Resources" and to the "Manage"-link next to "Linux and AFS".
- You can extend the quota for your $home (up to 2 GB).
- You can ask for "workspace" in AFS (quota 20 GB). The new style AFS path to your workspace is /afs/cern.ch/work/u/username
- You can link it to your home using
ln -s /afs/cern.ch/work/u/username $HOME/work
. You will see a work
directory in your home when you log in to lxplus.
Obtain a CERN computer account:
To obtain a CERN account, please have a look at
Get Account at CERN. If you do not have a scratch area, then look
here
.
*NOTE: If you need an account elsewhere, you need to contact your local cluster admins and follow their instructions.
Exercise 1 - Cut and Paste
Login to
lxplus5.cern.ch
As the exercises often require copying and pasting from the instructions, we first make sure that this works for you. To verify that cut and paste to/from a terminal window works, first copy the script
runThisCommand.py
as follows
cp /afs/cern.ch/cms/Tutorials/TWIKI_DATA/runThisCommand.py .
then cut and paste the following and hit return
./runThisCommand.py "asdf;klasdjf;kakjsdf;akjf;aksdljf;a" "sldjfqewradsfafaw4efaefawefzdxffasdfw4ffawefawe4fawasdffadsfef"
The response should be your username followed by an alphanumeric string of characters unique to your username, for example for a user named malik:
success: malik znyvx
QUESTION 1 - What is the alphanumeric string of characters unique to your username?
If the command is executed without any cut and paste:
somebody@cmslpc11> ./runThisCommand.py
the result will likely be:
Error: You must provide the secret key
If the input is pasted incorrectly, the result will likely be:
Error: You didn't paste the correct input string
If you are not running on
lxplus
(for example, locally on a laptop), the result will be:
bash: ./runThisCommand.py: No such file or directory
OR, for example:
Unknown user: malik.
Exercise 2 - Simple Edit Exercise
The purpose of this exercise is to ensure that the user can edit files.
Log in to
lxplus5.cern.ch and run this command:
cp /afs/cern.ch/cms/Tutorials/TWIKI_DATA/editThisCommand.py .
Then open
editThisCommand.py
and edit the 11th line by adding a
# (hash character) as the first character of the line. Explicitly, change the following three lines:
# Please comment the line below out by adding a '#' to the front of
# the line.
raise RuntimeError, "You need to comment out this line with a #"
to:
# Please comment the line below out by adding a '#' to the front of
# the line.
#raise RuntimeError, "You need to comment out this line with a #"
Save the file and execute the command:
user@cmslpc12> ./editThisCommand.py
If this is successful, the result will be:
success: malik 0xB888EFD
QUESTION 2 - What is the line beginning with "success" in your case?
If the file has not been successfully edited, an error message will result such as:
Traceback (most recent call last):
File "./editThisCommand.py", line 11, in ?
raise RuntimeError, "You need to comment out this line with a #"
RuntimeError: You need to comment out this line with a #
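The same edit can also be scripted. The sketch below is only an illustration of what "commenting out" means here; the helper name comment_out is made up, and it works on the text of the file rather than on the file itself.

```python
def comment_out(text, marker="raise RuntimeError"):
    """Prefix every line that starts with `marker` with a '#' character."""
    out = []
    for line in text.splitlines(True):  # True keeps the line endings
        if line.lstrip().startswith(marker):
            line = "#" + line
        out.append(line)
    return "".join(out)
```

Lines that are already commented out start with '#', not with the marker, so running the helper twice does not add a second '#'.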
Exercise 3 - Setup a release area CMSSW_7_4_1_patch4
To set up a CMSSW release area go into your scratch folder (
cd $HOME/scratch0
) and issue the commands below. You can copy/paste the commands all at once. The
cmsrel
command creates the release area; once you have the CMSSW_7_4_1_patch4 folder, you do not need to execute it again. For exhaustive information on the runtime environment, check out
WorkBookSetComputerNode#CreateWork.
cmsrel CMSSW_7_4_1_patch4
cd CMSSW_7_4_1_patch4/src
cmsenv
Make sure you have executed
cmsenv
before answering the question below.
QUESTION 3.1 - What is the output of the command echo $CMSSW_RELEASE_BASE
?
QUESTION 3.2 - What is the output of the command scram tool list
?
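If you are unsure whether cmsenv has been executed in the current shell, a quick check from Python could look like the sketch below. Only CMSSW_RELEASE_BASE is taken from the question above; that cmsenv also sets CMSSW_BASE and CMSSW_VERSION is an assumption of this sketch, and the function name is made up.

```python
import os

# CMSSW_RELEASE_BASE appears in QUESTION 3.1; the other two names are assumed.
EXPECTED_VARS = ("CMSSW_BASE", "CMSSW_RELEASE_BASE", "CMSSW_VERSION")


def missing_cmssw_vars(environ=None, names=EXPECTED_VARS):
    """Return the expected variables that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]
```

An empty list from missing_cmssw_vars() suggests that the environment is set; a non-empty list means you should run cmsenv again.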
Exercise 4 - Find data in the DAS (Data Aggregation Service)
Go to the url
DAS
and enter the following in the search field:
dataset dataset=/SingleMu/Run2012B-22Jan2013-v1/AOD
QUESTION 4.1 - What is the size of this data set? How many files does it have? At which site is the data available?
QUESTION 4.2 - What release was this dataset collected in? (If you see more than one release, just answer one)
Then click on "Files" to see the files that this dataset contains. It should look like this:
File: /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
...................................................
Alternatively, you can search from the command prompt of your lxplus account for the dataset
/SingleMu/Run2012B-22Jan2013-v1/AOD. First, in your WORKING DIRECTORY (CMSSW_7_4_1_patch4/src) set the environment with
cmsenv
. Then you can use the
das_client.py
macro, e.g.:
das_client.py --query="dataset=/SingleMu/Run2012B-22Jan2013-v1/AOD"
(48) scratch0/CMSSW_7_4_1_patch4/src: das_client.py --query="dataset=/SingleMu/Run2012B-22Jan2013-v1/AOD"
Showing 1-10 out of 1 results, for more results use --idx/--limit options
/SingleMu/Run2012B-22Jan2013-v1/AOD
You can also run other search combinations from the command line. More information about accessing data in the DBS can be found in
WorkBookLocatingDataSamples
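The dataset name used above consists of three fields separated by slashes. The sketch below splits such a name and builds the query string passed to das_client.py. The field names (primary dataset, processed dataset, data tier) and the function names are assumptions of this illustration; they are not spelled out on this page.

```python
def split_dataset(name):
    """Split '/Primary/Processed/TIER' into its three fields."""
    parts = name.split("/")
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError("expected /Primary/Processed/TIER, got %r" % name)
    primary, processed, tier = parts[1:]
    return {"primary": primary, "processed": processed, "tier": tier}


def das_query(name):
    """Build the value for the --query option shown above."""
    split_dataset(name)  # raises ValueError on malformed names
    return "dataset=" + name
```

For the dataset above, das_query returns exactly the string that follows --query= in the example command.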
Exercise 5 - GIT
Version control for CMSSW is done centrally with git at github (
https://github.com/cms-sw/cmssw
,
https://cms-sw.github.io
). In order to setup git properly, please follow the computing concepts guideline
WorkBookComputingConcepts#GitHub. When you reach the point where you can successfully perform
git cms-addpkg PhysicsTools/PatExamples
, you are set for the PAT tutorial. If you'd like to be able to store your changes at github, you need to continue the tutorial to the end, but this can also wait until later.
QUESTION 5.1 - What is the output of the command git branch
?
Exercise 6 - EDM ( Event Data Model framework) standalone utilities - edmFileUtil
, edmDumpEventContent
, edmProvDump
, edmEventSize
As a reminder, make sure CMSSW has been set up as in
Exercise 3.
The overall collection of CMS software, referred to as
CMSSW, is built around a Framework, an Event Data Model (
EDM), and Services needed by the simulation, calibration and alignment, and reconstruction modules that process event data so that physicists can perform analysis. The primary goal of the Framework and
EDM is to facilitate the development and deployment of reconstruction and analysis software.
The CMS Event Data Model (
EDM) is centered around the concept of an Event. An Event is a C++ object container for all RAW and reconstructed data related to a particular collision. To understand what is in a data file and more, several
EDM utilities are available. In this exercise, you will use four of these
EDM utilities. They will be very useful during the PAT tutorial and afterwards. More about these
EDM utilities can be found at
WorkBookEdmUtilities. These together with CMS code repository at
github
and the CMS code cross referencer
LXR
are very useful to understand and write CMS code. Browsing the repository and using the LXR will be done in Exercises 8 and 9, respectively.
If you are not logged in at CERN lxplus, please read the
*ENDNOTE at the end of Exercise 6.
- Use
edmFileUtil
to find the physical file name (PFN) corresponding to the logical file name (LFN) from the RECO data file located above.
root://eoscms//eos/cms/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root?svcClass=default
- Use
edmFileUtil
to find details of data file
/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
120611 21:43:25 001 Xrd: GoToAnotherServer: Going to: lxfsre11a05.cern.ch:1095
/store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root (1 runs, 58 lumis, 420 events, 90217658 bytes)
- Use
edmDumpEventContent
to see what class names etc. to use in order to access the objects in the RECO data file located above
- To do this execute
edmDumpEventContent --all --regex caloJet /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root > EdmDumpEventContent.txt
cat EdmDumpEventContent.txt
- If you want to make sure that the file exists at lxplus you can do
cmsLs /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root
. This path gives the physical location of the file at lxplus.
- Open and look at the file
EdmDumpEventContent.txt
. It has information divided into four variable width columns. The first column is the C++ class type of the data
, the second is module label
, the third is product instance label
and the fourth is process name
. More information is available at Identifying Data in the Event.
- QUESTION 6.1- How many types of CaloJet module labels are there? What are their names?
- NOTE: Instead of the above, try without the option
--regex caloJet
. This will dump the entire event content - a file with many lines
- To aid in understanding the full history of an analysis, the framework accumulates provenance for all data stored in the standard ROOT output files. Using the command
edmProvDump
one can print out all the tracked parameters used to create the data file. For example, one can see which modules were run and the CMSSW version used to make the RECO file. In executing the command below it is important to follow the instructions carefully, otherwise a large number of warning messages may appear. The ROOT
warning messages can be ignored.
- To do this execute
edmProvDump /store/data/Run2012A/SingleMu/AOD/22Jan2013-v1/20000/002F5062-346F-E211-BF00-1CC1DE04DF20.root > EdmProvDump.txt
- NOTE:
EdmProvDump.txt
is a very large file on the order of 10000-20000 lines. Open and look at this file and locate Processing History
( about 10-15 lines from the top).
- QUESTION 6.2- Which version of CMSSW_?_?_? was used in the RECO data process?
- Execute
edmEventSize
to determine the size of different branches in the data file. Further details may be found here: SWGuideEdmEventSize.
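To count module labels in
EdmDumpEventContent.txt
without reading it by eye, the four columns described above can be parsed as in the sketch below. It assumes that the last three columns are enclosed in double quotes in the dump; the sample label in the usage note is invented and is not an answer to QUESTION 6.1.

```python
import re

# Four columns: C++ type, then module label, instance label and process name.
# Assumption: the last three columns are double-quoted in the dump.
_ROW = re.compile(
    r'^(?P<type>\S.*?)\s+"(?P<module>[^"]*)"\s+"(?P<instance>[^"]*)"\s+"(?P<process>[^"]*)"'
)


def parse_row(line):
    """Return the four columns as a dict, or None for header and separator lines."""
    match = _ROW.match(line.strip())
    return match.groupdict() if match else None


def module_labels(lines):
    """Sorted, de-duplicated module labels found in `lines`."""
    rows = (parse_row(line) for line in lines)
    return sorted({row["module"] for row in rows if row})
```

A line such as vector<reco::CaloJet> "exampleCaloJets" "" "RECO" would be parsed into the type vector<reco::CaloJet>, the module label exampleCaloJets, an empty instance label and the process name RECO. Calling module_labels(open("EdmDumpEventContent.txt")) lists the labels in your own dump.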
*ENDNOTE : If you are working, for example, at
cmslpc at Fermilab, you need to search for files locally.
DO NOT look for a
castor directory at Fermilab. Hence, as an example, to check that the file exists at Fermilab, do:
and run
edmDumpEventContent as follows:
Exercise 7 - BuildFile.xml
SCRAM uses a file called BuildFile.xml in each package directory which describes what the package will produce and what dependencies the package has. For more reading look at
WorkBookBuildFilesIntro and
SWGuideBuildFile.
Since you already have the
CMSSW_7_4_1_patch4
release area set up, execute the following steps (as a reminder, we assume your working directory to be like
/afs/cern.ch/user/m/malik/scratch0
):
cd ~/scratch0/CMSSW_7_4_1_patch4/src
git cms-addpkg PhysicsTools/PatExamples # can be omitted if you have PhysicsTools/PatExamples already from exercise 5
scram b
scram b
invokes the compilation in all subdirectories of your current directory. When you compile, the output on screen should look like the following. The absence of error messages means that the compilation was successful. You will see several messages scroll by, but none with the word "error".
Resetting caches
>> Local Products Rules ..... started
>> Local Products Rules ..... done
>> Building CMSSW version CMSSW_7_4_1_patch4 ----
>> Entering Package PhysicsTools/PatExamples
>> Creating project symlinks
src/PhysicsTools/PatExamples/python -> python/PhysicsTools/PatExamples
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/bin/PatMuonEDMAnalyzer.cc
Entering library rule at PhysicsTools/PatExamples
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerBTag.cc
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerJEC.cc
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/PatBTagCommonHistos.cc
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/PatMuonAnalyzer.cc
>> Building shared library tmp/slc5_amd64_gcc462/src/PhysicsTools/PatExamples/src/PhysicsToolsPatExamples/libPhysicsToolsPatExamples.so
Copying tmp/slc5_amd64_gcc462/src/PhysicsTools/PatExamples/src/PhysicsToolsPatExamples/libPhysicsToolsPatExamples.so to productstore area:
Leaving library rule at PhysicsTools/PatExamples
>> Building binary PatMuonEDMAnalyzer
...
...
...
@@@@ Refreshing Plugins:edmPluginRefresh
>> Pluging of all type refreshed.
gmake[1]: Leaving directory `/localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4'
189.149u 13.452s 3:38.02 92.9% 0+0k 0+0io 18170pf+0w
Look at the BuildFile.xml in
PhysicsTools/PatExamples/BuildFile.xml
and then delete it:
rm PhysicsTools/PatExamples/BuildFile.xml
You can later restore it from the git repository by doing the following (
but not now).
git checkout PhysicsTools/PatExamples/BuildFile.xml
Now go back to the
src
directory and clean the area:
scram b clean
You should now see the following type of output on the screen:
Reading cached build data
Cleaning ProductStore directories:
/bin/rm -rf logs/slc5_amd64_gcc462 lib/slc5_amd64_gcc462 include cfipython/slc5_amd64_gcc462 objs/slc5_amd64_gcc462 doc test/slc5_amd64_gcc462 python bin/slc5_amd64_gcc462
Resetting project cache:
/bin/rm -f .SCRAM/slc5_amd64_gcc462/ProjectCache.db*
/bin/rm -f .SCRAM/slc5_amd64_gcc462/DirCache.db*
/bin/rm -f .SCRAM/slc5_amd64_gcc462/RuntimeCache.db*
/bin/rm -rf .SCRAM/slc5_amd64_gcc462/MakeData
Cleaning up the compiled .pyc and generated __init__.py files in the src directory.
/bin/rm -rf tmp/slc5_amd64_gcc462
Now compile again as follows:
scram b
You will now see a lot of error messages, ending in output like the following:
Resetting caches
>> Local Products Rules ..... started
>> Local Products Rules ..... done
>> Building CMSSW version CMSSW_7_4_1_patch4 ----
>> Entering Package PhysicsTools/PatExamples
>> Creating project symlinks
src/PhysicsTools/PatExamples/python -> python/PhysicsTools/PatExamples
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/bin/PatMuonEDMAnalyzer.cc
Entering library rule at PhysicsTools/PatExamples
>> Compiling /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerBTag.cc
In file included from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Provenance/interface/ProcessConfigurationRegistry.h:4:0,
from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Provenance/interface/ProvenanceFwd.h:40,
from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Common/interface/HandleBase.h:29,
from /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/DataFormats/Common/interface/Handle.h:30,
from /localhome/srimanob/SUSY/CMSSW/MC-Production/slc5_amd64_gcc462/PAT/CMSSW_7_4_1_patch4/src/PhysicsTools/PatExamples/src/AnalysisTasksAnalyzerBTag.cc:1:
/afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4/src/FWCore/Utilities/interface/ThreadSafeRegistry.h:6:28: fatal error: boost/thread.hpp: No such file or directory
compilation terminated.
gmake: *** [tmp/slc5_amd64_gcc462/src/PhysicsTools/PatExamples/src/PhysicsToolsPatExamples/AnalysisTasksAnalyzerBTag.o] Error 1
Error messages mean that the compilation failed. The
PhysicsTools/PatExamples
package did not compile successfully because the BuildFile.xml is missing.
As suggested above, you can now restore the BuildFile.xml and compile again.
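If you redirect the output of scram b to a file, the check described above (no line containing the word "error") can be automated roughly as follows. This is a naive sketch with a made-up function name; it would also flag a harmless line whose file path happens to contain "error".

```python
def error_lines(log_text):
    """Return all lines of a build log that mention 'error' (case-insensitive)."""
    return [line for line in log_text.splitlines() if "error" in line.lower()]
```

An empty result corresponds to the successful build shown first; for the failed build, both the "fatal error" line and the final gmake "Error 1" line are returned.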
QUESTION 7- Did you succeed in compilation after copying back the BuildFile.xml?
Exercise 8 - The GIT repository
The GIT repository for the CMS Software is at
https://github.com/cms-sw/cmssw
.
QUESTION 8. - Locate the Package/SubPackage called PhysicsTools/PatAlgos and PhysicsTools/PatExamples. In the PatExamples subpackage in which subdirectory are the configuration (*_cfi
, *_cff
, *_cfg
) files present? In which subdirectory are the various analyzers?
Note the "s" for plural in PatExamples.
Exercise 9 - LXR search
LXR is a code browser that links the code by its references. For example, if you want to know where a class is defined, you can simply click on it in the browser and it searches for all occurrences of the class. The cross-referenced display of the CMS source code is here:
CMSSW LXR
. You can also cross-reference ROOT, Geant4, TriDAS and CMSSW code from the main link
here
.
QUESTION 9 - Using the LXR, what is the exact location of the file called patTemplate_cfg.py
?
Note: You will have to use the
general search link at the top left of the page
CMSSW LXR
and then the field "Files named".
Exercise 10 - About python configuration files
CMSSW configs are written in Python. Hence, here are some Python guides:
For total beginners:
http://docs.python.org/tutorial/
For people who know programming, but not Python:
http://diveintopython.org/
Please read through
WorkBookConfigFileIntro; it will give you a basic understanding of the elements in a CMSSW configuration file. For more info on the Python configs, you can refer to
SWGuideAboutPythonConfigFile,
WorkBookConfigFileIntro and
SWGuidePythonTips.
Now, to look at the features of these files let us take an example that you will also study in the PAT exercises. This is the
PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
config file. The first step is to get this file in your working directory in case you have not checked out the
PhysicsTools/PatAlgos
package. To do this go to your working area (e.g.
~/scratch0/CMSSW_7_4_1_patch4/src/
) and execute the following step:
git cms-addpkg PhysicsTools/PatAlgos
Let's run this config file. To do this we need to edit this file first, with your favorite editor (which is vim of course):
vim PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
and ADD this code at the end of the file:
process.source.fileNames = ['file:/afs/cern.ch/work/h/htholen/public/PAT_TUTORIAL/CMSSW_7_0_0_pre11_relval.root']
NOTE: Python is sensitive to indentation. Make sure that
process.source.fileNames
has no space at the beginning of the line, otherwise you will get a syntax error.
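You can see the effect of a stray leading space without running cmsRun. The sketch below only compiles a piece of configuration text, so nothing is executed and no CMSSW environment is needed; the function name is made up for this illustration.

```python
def check_config_syntax(source, filename="<config>"):
    """Return None if `source` compiles, else a short error description."""
    try:
        compile(source, filename, "exec")  # syntax check only, nothing is run
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return "%s: %s" % (type(exc).__name__, exc.msg)
    return None
```

The line to be added passes this check when it starts in the first column, and is reported as an IndentationError when a single space is put in front of it.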
NOTE: We are reading a special file with some 300 events from the AFS area because the RelVal samples have a short shelf life. As a result, this data cannot be accessed after a month or so. Hence we saved these events in an AFS area so that the example runs fine without reading it directly from the
eoscms
area. Those events are taken from the RelVal sample file
/store/relval/CMSSW_7_0_0_pre11/RelValProdTTbar/AODSIM/START70_V4-v1/00000/D0516C65-766A-E311-B744-00259059642E.root
.
NOTE: You can find information on copying a few events from a data file to your local area using edmCopyPickMerge or copyPickMerge_cfg.py at
this link.
NOTE: The standard recipe for PAT is at
SWGuidePATRecipes. In there, you can follow the recommendation for each CMSSW release. Under each release section, at the end, you will see "See the corresponding Release Notes for details.", you can click on "Release Notes", then you will find the last recommended version of PhysicsTools/PatAlgos, and related tags.
Now you can run the
patTuple_standard_cfg.py
configuration using
cmsRun PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
As you can see, it creates a ROOT file called
patTuple_standard.root
. We assume that you know how to open a TBrowser in a ROOT session; if not, see
WorkBookBasicROOT.
Before starting
ROOT
, make sure you have the following lines included in your
rootlogon.C
file
gSystem->Load("libFWCoreFWLite.so");
AutoLibraryLoader::enable();
as explained in
WorkBookSetComputerNode#RooT; otherwise a lot of warning messages pop up and the
ROOT
file may not open properly for lack of the CMS-specific dictionaries and libraries needed to open
CMS EDM files.
Please open and look at the contents of
patTuple_standard.root
by clicking on it and then
Events
. You will see several branches there like
patMuons_selectedPatMuons__PAT
,
recoGenJets_selectedPatJets_genJets_PAT
etc.
Let us first dig through this file to see why these branches got written to the
patTuple_standard.root
file.
We can get the first hint by looking at the file
PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
, which contains the following line:
from PhysicsTools.PatAlgos.patTemplate_cfg import *
Note: The include line
from PhysicsTools.PatAlgos.patTemplate_cfg import *
indicates the location of the
patTemplate_cfg
file. In the git repository the path name includes a
/python/
directory, which is not needed in the include line. Also, in the include
PhysicsTools.PatAlgos.patTemplate_cfg
the
.py
extension is missing at the end of
patTemplate_cfg
but it is implied. A
*
means to include everything from the file
patTemplate_cfg.py
.
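The rule just described (insert the python directory, append the .py extension) can be written down as a small helper. This is a sketch of the naming convention only, with a made-up function name; it does not import anything from CMSSW.

```python
def module_to_repo_path(module):
    """Map 'Subsystem.Package.name' to 'Subsystem/Package/python/name.py'."""
    parts = module.split(".")
    if len(parts) < 3:
        raise ValueError("expected Subsystem.Package.module, got %r" % module)
    subsystem, package = parts[0], parts[1]
    rest = parts[2:]  # module name, possibly preceded by subdirectories
    return "/".join([subsystem, package, "python"] + rest) + ".py"
```

For PhysicsTools.PatAlgos.patTemplate_cfg this gives PhysicsTools/PatAlgos/python/patTemplate_cfg.py, which is the path of the file in the git repository.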
A convenient way to look at the code is the github code browser. So let's have a look at
https://github.com/cms-sw/cmssw/blob/CMSSW_7_0_X/PhysicsTools/PatAlgos/python/patTemplate_cfg.py
.
The
patTemplate_cfg.py
included above defines defaults that can be overridden in
patTuple_standard_cfg.py
. An example is the default output file name
patTuple.root
, which can be overridden by changing the commented line, as has been done in
patTuple_standard_cfg.py
:
process.out.fileName = 'patTuple_standard.root'
Now let us have a look at one more thing here: Why do we have names of branches like
patMuons_selectedPatMuons__PAT
in
patTuple.root
?
The branches like this get written because they are defined in the
outputModule
in
patTemplate_cfg.py
. The
patEventContent
is coming from the following include:
from PhysicsTools.PatAlgos.patEventContent_cff import patEventContentNoCleaning
If you look at
patEventContent_cff.py
, it has several groups of commands defined as
patEventContentNoCleaning
,
patEventContent
,
patExtraAodEventContent
and so on. We picked
patEventContentNoCleaning
. Whatever labels are defined in it get written to the
patTuple.root
(example
'keep *_selectedPatPhotons*_*_*',
).
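A branch name has four fields separated by underscores, which correspond to the four columns seen in Exercise 6: type, module label, product instance label and process name. The sketch below splits a branch name and tests it against keep patterns. It uses Python's fnmatch as a stand-in for the framework's own wildcard matching, which is an assumption of this illustration; the function names are made up.

```python
from fnmatch import fnmatchcase

FIELDS = ("type", "module", "instance", "process")


def split_branch(branch):
    """Split 'type_module_instance_process' into a dict of its four fields."""
    parts = branch.rstrip(".").split("_")
    if len(parts) != len(FIELDS):
        raise ValueError("expected four '_'-separated fields, got %r" % branch)
    return dict(zip(FIELDS, parts))


def is_kept(branch, keep_patterns):
    """True if any pattern (the part after 'keep ') matches the branch name."""
    return any(fnmatchcase(branch, pattern) for pattern in keep_patterns)
```

For patMuons_selectedPatMuons__PAT the instance label is empty, which is why two underscores appear in a row. The pattern *_selectedPatPhotons*_*_* matches any branch whose module label starts with selectedPatPhotons and therefore does not match the muon branch.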
But in
patEventContent_cff.py
there are only commands to steer which output is to be stored to file. By browsing the code, starting from the initial script that you've run with cmsRun,
patTuple_standard_cfg.py
, can you find out how this data is actually produced? Please answer the following question:
QUESTION 10.1 - What are the names of the config files where selectedPatJets
and countPatJets
are originally defined? In which directory are they located?
Exercise 11 - Writing your first framework EDAnalyzer
This exercise is based on WorkBook section
4.1.2, where you learn the first steps of interacting with the CMS framework to write a module in which you can put your analysis code.
Go through the section 4.1.2 of the WorkBook (
WorkBookWriteFrameworkModule). You will see this
PLOT on that page. To reproduce it, you can use the data from
EOS. Note that you can use the cmsLs command to list the files and folders inside a specific path.
cmsLs /store/relval/CMSSW_7_4_1/RelValTTbar_13/GEN-SIM-DIGI-RECO/MCRUN2_74_V9_FastSim-v1/00000/
As said before, release validation samples (RelVal) have only short lifetimes. If you look again in a few months, you may not find the files anymore. But you can start searching from
/store/relval/
.
To copy files, you can use cmsStage [file] [destination], for example
cmsStage /store/relval/CMSSW_7_4_1/RelValTTbar_13/GEN-SIM-DIGI-RECO/MCRUN2_74_V9_FastSim-v1/00000/023905DB-20EC-E411-948C-0025905A6066.root MyRelValTTbar.root
QUESTION 11.1 - Use python interactively with your demoanalyzer_cfg.py file as input and dump the content of process.maxEvents on the screen. What is the result? Hint: Type
ipython -i demoanalyzer_cfg.py
to open the ipython shell; press Ctrl-D to close it. ipython offers tab completion; try it with
process.maxEv
, then press
tab. You can also type
process.
and press
tab in order to see all modules.
QUESTION 11.2 - Make this PLOT for only 100 events in the CMSSW_7_4_1_patch4 release (since the release used in 4.1.2 could be different). What are the mean and RMS values?
Exercise 12 - Using FWLite (Framework Lite) for analyses
FWLite allows you to analyse the data in a root session with the CMSSW libraries loaded. Section 3.5 in the
WorkBook deals with it (
WorkBookFWLite,
WorkBookFWLiteEventLoop,
WorkBookFWLiteExamples,
WorkBookFWLitePython). In particular, please have a look at section 3.5.3,
WorkBookFWLiteExamples, before continuing with this exercise.
Run
FWLiteLumiAccess
with the following input files:
1.
/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/ZMM_CMSSW_5_2_3_patch3_numEvent100.root
2.
/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/AOD_DoubleMu_Run195013_numEvent1000.root
QUESTION 12.1 : What is the Total luminosity from lumi sections
in both cases? Are these numbers different for the two files? Is there any obvious reason for the difference? Hint: trick question alert!!
As a curiosity, you can also execute the command on one of the files from the collision dataset
/SingleMu/Run2011B-PromptReco-v1/AOD
as follows:
FWLiteLumiAccess inputFiles=root://eoscms//eos/cms/store/data/Run2012B/DoubleMu/AOD/PromptReco-v1/000/193/774/0CDC3936-889B-E111-9F82-001D09F25041.root
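The inputFiles value above is a logical file name (the part starting with /store/) with an access prefix in front. The sketch below builds such a name; the default prefix is copied from the command above and is an assumption for any other site, and the function name is made up.

```python
def lfn_to_pfn(lfn, prefix="root://eoscms//eos/cms", suffix=""):
    """Prepend an access prefix to a logical file name starting with /store/."""
    if not lfn.startswith("/store/"):
        raise ValueError("a logical file name starts with /store/, got %r" % lfn)
    return prefix + lfn + suffix
```

In Exercise 6 the physical file name additionally ends in ?svcClass=default, which can be passed as the suffix.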
Now run the
FWLiteHistograms
executable with the following input data file:
1.
/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/ZMM_CMSSW_5_2_3_patch3_numEvent100.root
The command for running multiple files is
FWLiteHistograms inputFiles=file1,file2,file3....
Once you execute the above, you will get an output file called
analyzeFWLiteHistograms.root
QUESTION 12.2 - Open the analyzeFWLiteHistograms.root
file and open the histogram that says mumuMass
. What is the mass of this peak? Do you think it is a Z peak? Why?
NOTE:
You can also run the above command on the collision data file below; the statistics are very low (just 1000 events), but you will see the peak.
/afs/cern.ch/cms/Tutorials/PAT_tutorial_Summer12/AOD_DoubleMu_Run195013_numEvent1000.root
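As a reminder of what a histogram such as mumuMass contains, the sketch below computes the invariant mass of a muon pair from the transverse momentum, pseudorapidity and azimuthal angle of each muon. It treats the muons as massless, which is a good approximation when their momenta are much larger than the muon mass. This is a textbook formula with a made-up function name, not the code of FWLiteHistograms.

```python
import math


def dimuon_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles, in the units of pt.

    m^2 = 2 * pt1 * pt2 * (cosh(eta1 - eta2) - cos(phi1 - phi2))
    """
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding
```

Two back-to-back central muons with a transverse momentum of 30 GeV each give a mass of 60 GeV, while two collinear muons give a mass of zero.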
Exercise 13 - Another very useful edm
utility - edmConfigEditor
The
edmConfigEditor
is a graphical tool to visualize the workflow of all kinds of configuration files within the CMSSW framework and to edit their configurations.
You will do an exercise during the PAT tutorial called
SWGuidePATConfigExercise. For details please have a look at
SWGuideConfigEditor.
Here we will do a very simple exercise so that you know how to use it beforehand.
We will use it to inspect the configuration file
patTuple_standard_cfg.py
a little bit.
Please modify the
patTuple_standard_cfg.py
to avoid a temporary problem when browsing the test samples from the release (so-called
RelVal samples). Replace the following:
from PhysicsTools.PatAlgos.patInputFiles_cff import filesRelValProdTTbarAODSIM
process.source.fileNames = filesRelValProdTTbarAODSIM
by
process.source.fileNames = cms.untracked.vstring('/store/relval/CMSSW_7_4_1/RelValTTbar_13/GEN-SIM-DIGI-RECO/MCRUN2_74_V9_FastSim-v1/00000/023905DB-20EC-E411-948C-0025905A6066.root')
To use
edmConfigEditor
to inspect
patTuple_standard_cfg.py
, simply execute the following on the command line:
edmConfigEditor $CMSSW_RELEASE_BASE/src/PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
This will pop up the config browser window (how quickly depends on your connection speed). This window should look like this.
Look at
Tree View column and click on
modules
→
selectedPatMuons
. Note that as you do this, the other two columns change correspondingly. After you have clicked on
selectedPatMuons
, note the
src
under
Parameters
in the third column. This shows the source used to create
selectedPatMuons
(which is
patMuons
).
QUESTION 13 - What is the source of muons in patMuons
? Hint: Double click on the
patMuons
in the second column and then once the screen refreshes look for the line in the third column that says
muonSource
.
Exercise 14 - Obtain a Grid Certificate, CMS VO membership and SiteDB registration
NOTE: If you do not have these, this step can take a few days. If you have these, skip straight to the question. This exercise is not strictly necessary for the PAT tutorial, but it will be necessary in order to run jobs on the grid after having completed the tutorial.
- A Grid Certificate, CMS VO membership and SiteDB registration will be needed for the next set of exercises. The registration process can be time-consuming (actions by several people are required), so it is important to start it as soon as possible. There are three main requirements, which can be summarized simply: A certificate ensures that you are who you claim to be. A registration in the VO recognizes you (identified by your certificate) as a member of CMS. SiteDB is a database and web interface that CMS uses to track sites; it is also used by the CRAB publication step to find out the hypernews username of a person from the DN (Distinguished Name) of their Grid Certificate. Please look at Get Your Grid Certificate, CMSVO and SiteDB registration to complete these three steps. All three steps must be completed before you can successfully submit jobs on the Grid.
To set up and use the CRAB environment at CERN, follow the instructions at
Use_CRAB_at_CERN.
Now you can initialize your GRID proxy and verify that your GRID certificate has all the information needed by doing the following
voms-proxy-init -voms cms
QUESTION 14 - What is the output of the above command?
--
SudhirMalik - 27-Apr-2010