MiBiCommon PAT and Ntuple

The analysis code

Set up the code

The code repository contains the PAT and ntuple producer (MiBiCommonPAT) and the ntuple tools (NtupleUtils).

Recipe for CMSSW_3_8_X (< 3_8_7)

   cmsrel CMSSW_3_8_4_patch2
   cd CMSSW_3_8_4_patch2/src
   cvs co -r V00-03-14-02 RecoEgamma/ElectronIdentification
   cvs co -r tag_before387 -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -r tag_before387 -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   cmsenv
   scramv1 b -j 8
   

Recipe for CMSSW_3_8_7

Starting from this CMSSW version, Jet Energy Corrections are stored in the database. The way to access them has changed slightly, hence the changes in the code. The correct GlobalTag to use is MC_38Y_V14.
   cmsrel CMSSW_3_8_7
   cd CMSSW_3_8_7/src
   cvs co -r V00-03-14-02 RecoEgamma/ElectronIdentification
   cvs co -r tag_387 -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -r tag_387 -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   cmsenv
   scramv1 b -j 8
   

Recipe for CMSSW_3_9_7

This release is needed to run on Dec22ReReco and Winter10 data samples.
   cmsrel CMSSW_3_9_7
   cd CMSSW_3_9_7/src
   cvs co -r V00-03-19 RecoEgamma/ElectronIdentification
   cvs co -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   cmsenv
   scramv1 b -j 8
   

Recipe for CMSSW_3_9_9

This release is intended for applying L1 Jet Energy Corrections. Be sure to use a GlobalTag at least as recent as START39_V9.
   cmsrel CMSSW_3_9_9
   cd CMSSW_3_9_9/src
   cvs co -r V00-03-19 RecoEgamma/ElectronIdentification
   cvs co -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   cmsenv
   scramv1 b -j 8
   

Recipe for CMSSW_4_1_3_patch2

This release is intended for analysing Run2011 data (until a newer release is available). See here for the PAT release notes.

   export SCRAM_ARCH=slc5_amd64_gcc434
   cmsrel CMSSW_4_1_3_patch2
   cd CMSSW_4_1_3_patch2/src
   cvs co -r V00-03-19 RecoEgamma/ElectronIdentification
   addpkg PhysicsTools/FWLite                              V02-03-13      
   addpkg PhysicsTools/PatAlgos                            V08-06-01-14   
   addpkg PhysicsTools/PatExamples                         V00-05-11
   addpkg PhysicsTools/PFCandProducer                      V04-07-02      
   addpkg PhysicsTools/SelectorUtils                       V00-03-06
   addpkg PhysicsTools/UtilAlgos                           V08-02-10
   cvs co -r CMSSW_4_1_3_patch2 RecoEcal/EgammaCoreTools                        # to get EcalTools - not in this release
   cvs co -r CMSSW_4_3_0_pre1 RecoEcal/EgammaCoreTools/src/EcalTools.cc         # to get EcalTools - not in this release
   cvs co -r CMSSW_4_3_0_pre1 RecoEcal/EgammaCoreTools/interface/EcalTools.h    # to get EcalTools - not in this release
   cvs co -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   cmsenv
   scramv1 b -j 9
   

Recipe for CMSSW_4_1_5

This release is intended for analysing Run2011 data (until a newer release is available). See here for the PAT release notes.

   export SCRAM_ARCH=slc5_amd64_gcc434
   cmsrel CMSSW_4_1_5
   cd CMSSW_4_1_5/src
   cvs co -r V00-03-19 RecoEgamma/ElectronIdentification
   cvs co -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   cvs co -r CMSSW_4_1_5 RecoEcal/EgammaCoreTools
   cvs co -r CMSSW_4_3_0_pre1 RecoEcal/EgammaCoreTools/src/EcalTools.cc
   cvs co -r CMSSW_4_3_0_pre1 RecoEcal/EgammaCoreTools/interface/EcalTools.h
   cvs co -r V06-04-07-01 TrackingTools/TrajectoryState 
   cmsenv
   scramv1 b -j 9
   

Recipe for CMSSW_4_2_3

This release is intended for analysing Run2011 data and running on Summer11 MC. PF2PAT is now fixed and includes the new JECs.

   cmsrel CMSSW_4_2_3
   cd CMSSW_4_2_3/src
   cmsenv
   addpkg PhysicsTools/PatAlgos V08-06-25
   addpkg PhysicsTools/PatExamples V00-05-17
   addpkg RecoJets/Configuration V02-04-16
   addpkg RecoJets/JetAlgorithms V04-01-00
   addpkg RecoJets/JetProducers V05-05-03
   cvs co -r V00-03-30 RecoEgamma/ElectronIdentification                # for eleID, see https://twiki.cern.ch/twiki/bin/viewauth/CMS/LikelihoodBasedEleID2011
   cvs co -r tagFor42X-v2 -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   scramv1 b -j 9
   

Recipe for CMSSW_4_2_4

This release is intended to be used for Summer analyses.

   cmsrel CMSSW_4_2_4
   cd CMSSW_4_2_4/src
   cmsenv
   cvs co -r V00-03-30 RecoEgamma/ElectronIdentification
   cvs co -r V02-04-17 RecoJets/Configuration
   cvs co -r tagFor42X-v6 -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   scramv1 b -j 8
   

Recipe for CMSSW_4_2_8_patch3

This release is intended to be used for Summer analyses.

   cmsrel CMSSW_4_2_8_patch3
   cd CMSSW_4_2_8_patch3/src
   cmsenv
   cvs co -r  V01-05-03 RecoVertex/PrimaryVertexProducer #  this is only needed if you need to re-reco vertices
   cvs co -r CMSSW_4_4_0_pre9 CommonTools/ParticleFlow
   cvs co -r tagFor42X-v7 -d PhysicsTools/MiBiCommonPAT UserCode/Bicocca/PhysicsTools/MiBiCommonPAT
   cvs co -d PhysicsTools/NtupleUtils UserCode/Bicocca/PhysicsTools/NtupleUtils
   scramv1 b -j 9
   

Run the code

Go to the PhysicsTools/MiBiCommonPAT/test folder:

  • To create the PATuples run
cmsRun makeMiBiCommonPAT_cfg.py

  • To create the Ntuples from the PATs
cmsRun test_PAT_to_NT.py

  • To create the Ntuples from the RECO/AOD
cmsRun FIXME.py

Run with crab

Start to produce PATs from data with this file:

PhysicsTools/MiBiCommonPAT/test/crab/crab_PAT_DATA.cfg

Two publication-oriented examples of multicrab cfgs are available:

PhysicsTools/MiBiCommonPAT/test/crab/multicrab_PAT_DATA.cfg
PhysicsTools/MiBiCommonPAT/test/crab/multicrab_PAT_MC.cfg

Start to create the ntuples with crab

multicrab -create (-submit)
multicrab -submit
multicrab -status
multicrab -get 

Tips for job-sitting

Get the list of samples to be processed:
      cvs co -r HEAD DBS/Clients/Python
      cd DBS/Clients/Python
      source setup.sh

     dbs search --query="find dataset where dataset like /VBF*/Spring11*/AODSIM" &> samples.txt
    
   

Get the samples for multicrab.cfg

     cat samples.txt | grep VBF | tr "/" " " | awk '{print "["$1"_"$2"] \nCMSSW.datasetpath = /"$1"/"$2"/"$3" \n"}'  &> new.txt
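
The one-liner above can also be written as a small reusable function. This is only a sketch: the function name samples_to_multicrab is hypothetical, it assumes each useful line of samples.txt is a dataset path of the form /Primary/Processed/Tier, and unlike the original it does not filter on "VBF" (any other line, such as banner text printed by dbs, is skipped).

```shell
#!/bin/sh
# Turn dataset paths (/Primary/Processed/Tier) into multicrab sections.
# Hypothetical helper: reads the files given as arguments, or stdin if none.
samples_to_multicrab() {
    # With "/" as separator a valid path has 4 fields, the first one empty.
    awk -F/ 'NF == 4 && $1 == "" {
        printf "[%s_%s]\nCMSSW.datasetpath = /%s/%s/%s\n\n", $2, $3, $2, $3, $4
    }' "$@"
}

# Usage:
#   samples_to_multicrab samples.txt > new.txt
```

The section name is built from the first two parts of the dataset path, as in the awk command above.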
   

Check status (not using multicrab -status)

       ls -d */ | awk '{print "crab -status -c "$1}' | /bin/sh
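
A variant that only touches directories that look like crab working directories may be safer when the folder also contains other things. The sketch below assumes a working directory is recognisable by the presence of log/crab.log (the file read by the commands in this section); the function name list_crab_dirs is hypothetical.

```shell
#!/bin/sh
# Print every sub-directory of $1 (default: current directory)
# that contains a log/crab.log file.
list_crab_dirs() {
    for dir in "${1:-.}"/*/; do
        if [ -f "${dir}log/crab.log" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Usage:
#   list_crab_dirs | while read -r d; do crab -status -c "$d"; done
```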
   

Get information for table filling

     ls -d */ |  grep W | awk '{print "echo "$1" ; cat "$1"/log/crab.log | grep events. | grep on "}' | /bin/sh
     ls -rd */ |  grep W | awk '{print "cat "$1"/log/crab.log | grep events. | grep on | awk @{print $9}@ "}'| tr "@" "'" | /bin/sh
     ls -rd */ |  grep W | awk '{print "cat "$1"/log/crab.log | grep events. | grep on | awk @{print $4}@ "}'| tr "@" "'" | /bin/sh
   
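Or (better!) get the number of events in each sample, using the first line if the working directories are named Primary_Processed_Tier and the second if they are named Primary_Processed: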
    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "cat "$1"_"$2"_"$3"/log/crab.log | grep events | grep on |  awk @{printf $9}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "cat "$1"_"$2"/log/crab.log | grep events | grep on |  awk @{printf $9}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
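
The loops in this section all rebuild the crab working-directory name from the dataset path. That mapping can be isolated in one helper, sketched here under the assumption that directories are named either Primary_Processed_Tier or Primary_Processed, as in the two variants of each loop; the function name dataset_to_workdir is hypothetical.

```shell
#!/bin/sh
# Map a dataset path to a working-directory name.
#   $1: dataset path, e.g. /Primary/Processed/Tier
#   $2: number of path parts to keep (default 3; use 2 for Primary_Processed)
dataset_to_workdir() {
    printf '%s\n' "$1" | awk -F/ -v depth="${2:-3}" '{
        name = $2
        for (i = 3; i <= depth + 1 && i <= NF; i++) name = name "_" $i
        print name
    }'
}

# Usage (prints the matching crab.log lines for each sample):
#   while read -r ds; do
#       log="$(dataset_to_workdir "$ds")/log/crab.log"
#       if [ -f "$log" ]; then grep events "$log" | grep on; fi
#   done < samples.txt
```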
   

Get the number of jobs created

    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "cat "$1"_"$2"_"$3"/log/crab.log | grep events | grep on |  awk @{printf $4}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "cat "$1"_"$2"/log/crab.log | grep events | grep on |  awk @{printf $4}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
   

Get the number of jobs done

    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "tail -20 "$1"_"$2"_"$3"/log/crab.log | grep Jobs | grep Done |  awk @{printf $2}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "tail -20 "$1"_"$2"/log/crab.log | grep Jobs | grep Done |  awk @{printf $2}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
   

Get the jobs in aborted status and automatically resubmit them (example may be tuned!)

    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " |   awk '{print "crab -status -c "$1"_"$2"_"$3}' | /bin/sh; done
    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "tail -20 "$1"_"$2"_"$3"/log/crab.log | grep Jobs | grep Done |  awk @{printf $2}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  

    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " |   awk '{print "crab -status -c "$1"_"$2}' | /bin/sh; done
    for fol in `cat samples.txt` ; do echo -n " " ; echo $fol | tr "/" " " | awk '{print "tail -20 "$1"_"$2"/log/crab.log | grep Jobs | grep Done |  awk @{printf $2}@ "}' | tr "@" "'" | /bin/sh; echo " "  ;done  
   

Copy from Castor to local directory

    /afs/cern.ch/user/a/amassiro/public/Castor/copyFolder.pl   /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/MC_v1   ./
   

Publication of PATs

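To publish the results, use the following command, replacing workingDirName with the name of your crab working directory (it has not been verified whether this works with multicrab):

crab -publish -c workingDirName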

The produced data will be published in the DBS instance ph02 with a name of the following form:

/ParentDatasetName/UserName-PublicationName-HorribleNumber/USER

For example, the PATs produced by dimatteo from the Electron dataset are named like this:

/Electron/dimatteo-MiBiCommonPAT_v4-53315cbd5072b51972b2ea4af838e3e3/USER
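
To pick the pieces of a published name apart, for instance when filling a table, something like the following can be used. It is a sketch only: the function name parse_published_name is hypothetical, and it assumes the user name contains no "-" and that the last "-"-separated piece is the number appended at publication.

```shell
#!/bin/sh
# Split /Parent/User-Publication-Number/USER into its parts.
parse_published_name() {
    printf '%s\n' "$1" | awk -F/ '{
        n = split($3, p, "-")
        pub = p[2]
        # The publication name may itself contain "-": rejoin the middle pieces.
        for (i = 3; i < n; i++) pub = pub "-" p[i]
        printf "parent=%s user=%s publication=%s hash=%s\n", $2, p[1], pub, p[n]
    }'
}

# Usage:
#   parse_published_name /Electron/dimatteo-MiBiCommonPAT_v4-53315cbd5072b51972b2ea4af838e3e3/USER
```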

To delete the data you produced, see https://twiki.cern.ch/twiki/bin/view/Main/LeonardoDiMatteo#Storage_managers and https://twiki.cern.ch/twiki/bin/view/Main/LeonardoDiMatteo#Publication

Ntuple analysis software

  • Perl script by A. Massironi to set up a brand-new work environment, available here
  • The same script, using curl instead of wget so that it also works on Mac OS X, available here
  • Script to recompile everything in case something goes wrong (example for AnalysisTTBar):

The script is self-explanatory. Short how-to:

wget http://cmsdoc.cern.ch/~amassiro/NtuplePackage/BuildFramework.txt
chmod 755 BuildFramework.txt
./BuildFramework.txt
Then follow the instructions printed by the script (they are very clear and it works smoothly).

Be careful to set "ARCHL" in NtuplePackage/src/Makefile and in TTbar/src/Makefile to match your machine's architecture.

If you want to avoid writing a long list of declarations in your test.cpp file, such as:

 std::vector<int>* runId = reader.GetInt("runId");
 std::vector<float>* HLT_Accept = reader.GetFloat("HLT_Accept");
 .......

Use the ROOT macro makeAUTOread.C and just modify its inputs.

How to use ntuples

  • Objects in the ntuples are stored as "vectors" of "something", where "something" may be: int, double, float, 3D vectors, 4D vectors.
  • To access these vectors:
    • create a reader: treeReader reader(ROOT_TREE);
    • get the vector: reader.Get###(NAME); where ### can be Int, Float, 3V, 4V, ... and NAME is the name of the branch, e.g. "electrons", "dx", ... (use the macro makeAUTOread.C to get a template!)
  • Then access the vector content as for a normal vector: reader.GetInt("nele")->at(10) (the "Get" method returns a pointer to the vector!)
  • Examples can be found here.
  • To skim DATA ntuples with a JSON file use "readJSONFile" and "AcceptEventByRunAndLumiSection" from here

PU reweight tool

The class PUclass is used to plot reweighted MC distributions. The main idea of MC reweighting is described in PileupMCReweightingRecipe.

Step 1: get the PU distributions for DATA and MC (see PileupMCReweightingRecipe). Let them be two vectors of double:

 std::vector<double> PUMC;
 std::vector<double> PUDATA;

Step 2: create PUclass object

 PUclass PU;

Step 3: calculate weight factors and save in PUclass

 // PUDATA and PUMC must have the same length; the weight is the DATA/MC ratio per bin
 for (size_t itVPU = 0; itVPU < PUMC.size(); ++itVPU) {
  PU.PUWeight.push_back(PUDATA.at(itVPU) / PUMC.at(itVPU));
 }

Step 4: write PUclass (it creates a new file to be loaded in macro/compiled code)

 PU.Write("autoWeight.cxx");

Step 5: load macro

 gROOT->ProcessLine(".L autoWeight.cxx");

The weight can now be used when plotting, by passing it in the "cut" field of the tree's "Draw" method. For example:

  tree->Draw("pT","autoWeight(nPU)","");

A complete example can be found in MCDATAComparisonPLOTTool.cpp (search for "PUclass").

Ntuples database

v0

  • To follow the status of the production go here

  • Location1 : cmsmi3:/media/dimatteo/MiBiCommonNT/v0/

  • Reference Release: 38X

  • Comments :
    • Old JEC

  • Requests for next iteration :
    • new calibrations for the forward region (check with PG)

v1

  • Initial day for production: 7 Dec 2010

  • Location1 : cmsmi3:/media/dimatteo/MiBiCommonNT/v1/
  • Location2 : /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/MC_v1
  • Location3 : MiBi cluster: /gwterax2/users/amassiro/MiBiCommonNT/Fall10_Dec10/

  • Reference Release: 387

  • Bugs in ......

  • Comments :
    • Updated JEC

  • Requests for next iteration :
    • ...

Winter10

  • Initial day for production: 11 Jan 2011

  • Location1.MC : /castor/cern.ch/user/a/abenagli/NTUPLES/MiBiCommonNT/MC_Winter10
  • Location1.DATA : /castor/cern.ch/user/a/abenagli/NTUPLES/MiBiCommonNT/DATA
  • Location3 : MiBi cluster: /gwteray/users/amassiro/MiBiCommonNT/Winter10/DATA/
  • Location3 : MiBi cluster: /gwteray/users/amassiro/MiBiCommonNT/Winter10/MC/

  • Reference Release: 397

  • Bugs in ......

  • Comments :
    • Update in 39X reconstruction (e.g. electrons)
    • DATA with Laser corrections and calibration constants

  • Requests for next iteration :
    • L1 jet corrections to be added
    • Tau information added in SimpleNtuple.cc
    • Add MC PU information
    • Filter number of samples to run on and do sample-path association (not all paths on all samples)

Spring11

  • link to the XS file
  • Initial day for production: April 2011
  • DATA
hercules /gwteray/users/amassiro/MiBiCommonNT/DATA/
cmsmi4 /data2/amassiro/NTUPLE/tempCopyDATA/
castor /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/DATA/Apr302011
  • MC
hercules /gwteray/users/amassiro/MiBiCommonNT/Spring11/
hercules /gwteray/users/benaglia/NTUPLES/MiBiCommonNT/Spring11/
    • the hercules folder /gwteray/users/amassiro/MiBiCommonNT/Spring11/ contains symbolic links to the other one
  • Reference Release: 415
  • Bugs in ......
  • Comments :
    • MC produced in CMSSW_3_11_3
    • DATA in CMSSW_412/414
  • Requests for next iteration :
    • Deterministic Annealing vertexing
    • PU beta-corrections for isolation

2011 data

| Primary Dataset | Date of run | Location on castor | Location on hercules | JSON file |
| DoubleElectron_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| DoubleMu_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| ElectronHad_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| Jet_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| MultiJet_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| Photon_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| SingleElectron_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |
| SingleMu_Run2011A-PromptReco-v1 | 1 Apr 2011 | /castor/cern.ch/user/a/amassiro/NTUPLES/MiBiCommonNT/Run2011/Apr1 | -- | -- |

Summer11 + May10thReReco

  • link to the google spreadsheet
  • DATA
castor /castor/cern.ch/user/d/dimatteo/MiBiCommonNT/DATA
cmsmi3 /media/dimatteo/Run2011A-May10ReReco-v1
  • MC
castor /castor/cern.ch/user/d/dimatteo/MiBiCommonNT/DATA (yes, this same path; sorry for the confusion)
cmsmi3 /media/dimatteo/Summer11
  • Reference Release: 423
  • Bugs in ......
  • Comments :
  • Requests for next iteration :
    • use of "offlinePrimaryVerticesBS"?
    • save new information for electrons (ok)
    • save new information for MET (AM)
    • new method for applying the L1 jet corrections (AM)
    • pfObject-based isolation
    • save new variables for muons (ok)

-- LeonardoDiMatteo - 20-Oct-2010

Topic revision: r51 - 2011-10-04 - MartinaMalberti