Re-Reconstruct CMSSW_1_2_0 samples with later (pre)releases
Purpose of this page
This page explains how developers and validators can re-reconstruct CMSSW_1_2_0 samples (the famous Xmas production) with later releases.
A note about OnSel production
We realized that the OnSel samples made with 12x were already produced in two steps: the first process (named "P" in all the cases I looked at) ran Gen+Sim, and the second (named "Rec1") ran digitization and reconstruction. Consider for example:
/store/mc/2007/2/4/mc-onsel-120_PU_pp_muX-DIGI-RECO-NoPU/0003/48ACCA3D-51BB-DB11-BD7B-0018FE294022.root
This slightly changes the instructions below:
- the process used for Gen+Sim does not need to be known; nothing has to be dropped from it.
- the process used for Digi+Reco is the one that needs to be dropped. So, "T" in the instructions below should become "Rec1".
Everything else stays the same.
To correctly identify which process is which: the process used for Gen+Sim should have products attached like
<Branch>PSimHits_g4SimHits_TrackerHitsTOBHighTof_P.</Branch>
while the Digi+Reco process has digis and reconstruction products like
<Branch>recoGenJets_midPointCone5GenJets__Rec1.</Branch>
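The rule of thumb above can be written down as a small check. This is only a sketch in plain Python, not a CMSSW tool: the function name classify_processes is hypothetical, and it assumes the branch names have already been copied out of the file listing as strings with exactly four underscore-separated fields.

```python
def classify_processes(branch_names):
    """Guess which process did Gen+Sim and which did Digi+Reco.

    Heuristic taken from the text above: PSimHits products belong to the
    Gen+Sim process, products whose type starts with 'reco' belong to the
    Digi+Reco process. Every name must have four fields (A_B_C_D).
    """
    roles = {"gen_sim": set(), "digi_reco": set()}
    for name in branch_names:
        # Strip the trailing dot of the branch name, then split A_B_C_D.
        product_type, _label, _instance, process = name.rstrip(".").split("_")
        if product_type.startswith("PSimHit"):
            roles["gen_sim"].add(process)
        elif product_type.startswith("reco"):
            roles["digi_reco"].add(process)
    return roles


example = [
    "PSimHits_g4SimHits_TrackerHitsTOBHighTof_P.",
    "recoGenJets_midPointCone5GenJets__Rec1.",
]
roles = classify_processes(example)
```

If either set ends up with more than one process name, the sample was probably not produced in the two-step way described here and should be inspected by hand.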
CMSSW_1_3_X
Please refer to the CMSSW_1_3_0_pre5 section below.
CMSSW_1_3_0_pre6
Please refer to the CMSSW_1_3_0_pre5 section below.
CMSSW_1_3_0_pre5
No new tags are needed to reprocess 120 samples with 130_pre5. However, 120-reco must be dropped as explained for 130_pre2.
So, here is the recipe:
- Get a 130pre5 release area
scramv1 p CMSSW CMSSW_1_3_0_pre5
Now you are ready to run re-reconstruction. The cfg is slightly more complicated than the one needed to re-reconstruct with CMSSW_1_2_2. In particular, you need to know the process name used during the 120 processing (130 will have standard process names, so this will not be an issue there, but for 120 samples it is).
You can find the process name in several ways:
- look at the cfg used for the 120 production; its first line reads "process T = {", and T is the process name you need.
- open the 120 root file with root and a TBrowser; in the products list, all product names have the form A_B_C_D, and the last field (D) is the process name.
- run a cfg on the root file, scheduling only the module EventContentAnalyzer. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
- run EdmProvDump filename.root. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
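Whichever listing you use, reading off the last field of A_B_C_D can be automated. The sketch below is an illustration in plain Python and not part of CMSSW: the helper names split_branch_name and process_names are hypothetical, and it assumes the product list has been saved as text lines in the form shown on this page (optionally wrapped in <Branch>...</Branch>, with a trailing dot).

```python
def split_branch_name(line):
    """Return (type, module label, instance, process) for one A_B_C_D name."""
    name = line.strip()
    # Remove the optional <Branch>...</Branch> wrapper of the listing.
    if name.startswith("<Branch>"):
        name = name[len("<Branch>"):]
    if name.endswith("</Branch>"):
        name = name[:-len("</Branch>")]
    name = name.rstrip(".")  # branch names end with a dot
    fields = name.split("_")
    if len(fields) != 4:
        raise ValueError("expected four fields in %r" % name)
    return tuple(fields)


def process_names(lines):
    """Collect the distinct process names (field D) from a product listing."""
    return sorted({split_branch_name(line)[3] for line in lines})
```

For example, process_names applied to the listing of a 120 file produced with "process T" should return a list containing only T; a second entry means the file was written by more than one process.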
Once you know the process name, you can run a cfg like
----------------- begin : cut here --------------------------
process Rec1 = {
  source = PoolSource {
    untracked vstring fileNames = {
      '/store/mc/2006/12/21/mc-physval-120-SingleMuPlus-Pt100/0000/78D178BA-B496-DB11-AA19-001560EDC951.root'
    }
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module RECO = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
  }
  #
  # drop all the INPUT stuff, BUT simulation
  #
  replace RECO.outputCommands += "drop *_*_*_T"
  replace RECO.outputCommands += SimG4CoreFEVT.outputCommands
  replace RECO.outputCommands += SimTrackerFEVT.outputCommands
  replace RECO.outputCommands += SimMuonFEVT.outputCommands
  replace RECO.outputCommands += SimCalorimetryFEVT.outputCommands
  replace RECO.outputCommands += RecoGenJetsFEVT.outputCommands
  include "Configuration/StandardSequences/data/FakeConditions.cff"
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  path p1 = {reconstruction}
  endpath outpath = {RECO}
}
----------------- end : cut here --------------------------
You need to edit three things here:
- the input file name
- the output file name
- the process name you found above: replace T with it in the following line:
replace RECO.outputCommands += "drop *_*_*_T"
In the output file (reco.root here), you can find
- simulation products from previous processing with 120 (ending with _T)
- reconstruction products from reprocessing (ending with _Rec1)
...
<Branch>PixelDigiSimLinkedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>PixelDigiedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>RPCDetIdRPCDigiMuonDigiCollection_muonRPCDigis__T.</Branch>
<Branch>RPCDetIdRPCRecHitsOwnedRangeMap_rpcRecHits__Rec1.</Branch>
<Branch>SiPixelClusteredmDetSetVector_siPixelClusters__Rec1.</Branch>
<Branch>SiStripClusteredmDetSetVector_siStripClusters__Rec1.</Branch>
<Branch>SiStripDigiedmDetSetVector_siStripDigis__T.</Branch>
...
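To see why the listing above mixes _T and _Rec1 products, it can help to replay the keep/drop commands by hand. The Python sketch below is a simplified model, not the CMSSW implementation: it assumes that commands are applied in order and that the last matching command decides, and it starts from "keep everything" instead of the real FEVTSIMEventContent. The three "keep" patterns are hypothetical stand-ins for the contents of the Sim*FEVT.outputCommands blocks, which are not reproduced on this page.

```python
from fnmatch import fnmatchcase


def matches(pattern, branch):
    """Compare a pattern such as *_*_*_T with a branch name, field by field."""
    if pattern == "*":
        return True
    p_fields = pattern.split("_")
    b_fields = branch.rstrip(".").split("_")
    if len(p_fields) != 4 or len(b_fields) != 4:
        return False
    return all(fnmatchcase(b, p) for p, b in zip(p_fields, b_fields))


def selected(branch, commands):
    """Apply 'keep'/'drop' commands in order; the last match wins."""
    keep = True  # simplification: start from 'keep everything'
    for command in commands:
        action, pattern = command.split()
        if matches(pattern, branch):
            keep = (action == "keep")
    return keep


# Hypothetical command list: the drop from the cfg, followed by
# illustrative keeps standing in for the simulation event content.
example_commands = [
    "drop *_*_*_T",
    "keep *_siPixelDigis_*_*",
    "keep *_siStripDigis_*_*",
    "keep *_muonRPCDigis_*_*",
]
```

With these commands, a digi branch from process T is dropped by the first command and kept again by a later one, an old reco branch from T stays dropped, and every Rec1 branch is untouched.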
CMSSW_1_3_0_pre4
A few tags are needed in pre4 to fix known problems. In addition, 120-reco must be dropped as explained for 130_pre2.
So, here is the recipe:
- Get a 130pre4 release area
scramv1 p CMSSW CMSSW_1_3_0_pre4
cd CMSSW_1_3_0_pre4/src
cvs co -r V00-05-04 RecoBTau/JetTracksAssociator
cvs co -r V00-04-03 DataFormats/BTauReco
scramv1 b
Now you are ready to run re-reconstruction. The cfg is slightly more complicated than the one needed to re-reconstruct with CMSSW_1_2_2. In particular, you need to know the process name used during the 120 processing (130 will have standard process names, so this will not be an issue there, but for 120 samples it is).
You can find the process name in several ways:
- look at the cfg used for the 120 production; its first line reads "process T = {", and T is the process name you need.
- open the 120 root file with root and a TBrowser; in the products list, all product names have the form A_B_C_D, and the last field (D) is the process name.
- run a cfg on the root file, scheduling only the module EventContentAnalyzer. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
Once you know the process name, you can run a cfg like
----------------- begin : cut here --------------------------
process Rec1 = {
  source = PoolSource {
    untracked vstring fileNames = {
      '/store/mc/2006/12/21/mc-physval-120-SingleMuPlus-Pt100/0000/78D178BA-B496-DB11-AA19-001560EDC951.root'
    }
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module RECO = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
  }
  #
  # drop all the INPUT stuff, BUT simulation
  #
  replace RECO.outputCommands += "drop *_*_*_T"
  replace RECO.outputCommands += SimG4CoreFEVT.outputCommands
  replace RECO.outputCommands += SimTrackerFEVT.outputCommands
  replace RECO.outputCommands += SimMuonFEVT.outputCommands
  replace RECO.outputCommands += SimCalorimetryFEVT.outputCommands
  replace RECO.outputCommands += RecoGenJetsFEVT.outputCommands
  include "Configuration/StandardSequences/data/FakeConditions.cff"
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  path p1 = {reconstruction}
  endpath outpath = {RECO}
}
----------------- end : cut here --------------------------
You need to edit three things here:
- the input file name
- the output file name
- the process name you found above: replace T with it in the following line:
replace RECO.outputCommands += "drop *_*_*_T"
In the output file (reco.root here), you can find
- simulation products from previous processing with 120 (ending with _T)
- reconstruction products from reprocessing (ending with _Rec1)
...
<Branch>PixelDigiSimLinkedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>PixelDigiedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>RPCDetIdRPCDigiMuonDigiCollection_muonRPCDigis__T.</Branch>
<Branch>RPCDetIdRPCRecHitsOwnedRangeMap_rpcRecHits__Rec1.</Branch>
<Branch>SiPixelClusteredmDetSetVector_siPixelClusters__Rec1.</Branch>
<Branch>SiStripClusteredmDetSetVector_siStripClusters__Rec1.</Branch>
<Branch>SiStripDigiedmDetSetVector_siStripDigis__T.</Branch>
...
CMSSW_1_3_0_pre3
With CMSSW_1_3_0_pre3, no additional tags are needed to reprocess 120 data. However, 120-reco must be dropped as explained for 130_pre2.
So, here is the recipe:
- Get a 130pre3 release area
scramv1 p CMSSW CMSSW_1_3_0_pre3
Now you are ready to run re-reconstruction. The cfg is slightly more complicated than the one needed to re-reconstruct with CMSSW_1_2_2. In particular, you need to know the process name used during the 120 processing (130 will have standard process names, so this will not be an issue there, but for 120 samples it is).
You can find the process name in several ways:
- look at the cfg used for the 120 production; its first line reads "process T = {", and T is the process name you need.
- open the 120 root file with root and a TBrowser; in the products list, all product names have the form A_B_C_D, and the last field (D) is the process name.
- run a cfg on the root file, scheduling only the module EventContentAnalyzer. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
Once you know the process name, you can run a cfg like
process Rec1 = {
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  source = PoolSource {
    untracked vstring fileNames = {
      '/store/mc/2006/12/21/mc-physval-120-SingleMuPlus-Pt100/0000/78D178BA-B496-DB11-AA19-001560EDC951.root'
    }
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module RECO = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
  }
  #
  # drop all the INPUT stuff, BUT simulation
  #
  replace RECO.outputCommands += "drop *_*_*_T"
  replace RECO.outputCommands += SimG4CoreFEVT.outputCommands
  replace RECO.outputCommands += SimTrackerFEVT.outputCommands
  replace RECO.outputCommands += SimMuonFEVT.outputCommands
  replace RECO.outputCommands += SimCalorimetryFEVT.outputCommands
  replace RECO.outputCommands += RecoGenJetsFEVT.outputCommands
  path p1 = {reconstruction}
  endpath outpath = {RECO}
}
You need to edit three things here:
- the input file name
- the output file name
- the process name you found above: replace T with it in the following line:
replace RECO.outputCommands += "drop *_*_*_T"
In the output file (reco.root here), you can find
- simulation products from previous processing with 120 (ending with _T)
- reconstruction products from reprocessing (ending with _Rec1)
...
<Branch>PixelDigiSimLinkedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>PixelDigiedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>RPCDetIdRPCDigiMuonDigiCollection_muonRPCDigis__T.</Branch>
<Branch>RPCDetIdRPCRecHitsOwnedRangeMap_rpcRecHits__Rec1.</Branch>
<Branch>SiPixelClusteredmDetSetVector_siPixelClusters__Rec1.</Branch>
<Branch>SiStripClusteredmDetSetVector_siStripClusters__Rec1.</Branch>
<Branch>SiStripDigiedmDetSetVector_siStripDigis__T.</Branch>
...
CMSSW_1_3_0_pre2
Here the situation is a bit nastier: several DataFormats differ between the 12x and 13x series, and hence the corresponding 120 products must be dropped. They cannot be copied to the new file, so you will only have the Rec1 products for reconstruction.
Here is the recipe:
- Get a 130pre2 release area
scramv1 p CMSSW CMSSW_1_3_0_pre2
- checkout and compile the following packages
cd CMSSW_1_3_0_pre2/src
cvs co -r V00-02-12-04 RecoEgamma/EgammaElectronAlgos
cvs co -r V00-04-08-01 RecoEgamma/EgammaElectronProducers
cvs co -r V00-06-07 RecoMuon/GlobalTrackFinder
scramv1 b
Now you are ready to run re-reconstruction. The cfg is slightly more complicated than the one needed to re-reconstruct with CMSSW_1_2_2. In particular, you need to know the process name used during the 120 processing (130 will have standard process names, so this will not be an issue there, but for 120 samples it is).
You can find the process name in several ways:
- look at the cfg used for the 120 production; its first line reads "process T = {", and T is the process name you need.
- open the 120 root file with root and a TBrowser; in the products list, all product names have the form A_B_C_D, and the last field (D) is the process name.
- run a cfg on the root file, scheduling only the module EventContentAnalyzer. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
Once you know the process name, you can run a cfg like
process Rec1 = {
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  source = PoolSource {
    untracked vstring fileNames = {
      '/store/mc/2006/12/21/mc-physval-120-SingleMuPlus-Pt100/0000/78D178BA-B496-DB11-AA19-001560EDC951.root'
    }
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module RECO = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
  }
  #
  # drop all the INPUT stuff, BUT simulation
  #
  replace RECO.outputCommands += "drop *_*_*_T"
  replace RECO.outputCommands += SimG4CoreFEVT.outputCommands
  replace RECO.outputCommands += SimTrackerFEVT.outputCommands
  replace RECO.outputCommands += SimMuonFEVT.outputCommands
  replace RECO.outputCommands += SimCalorimetryFEVT.outputCommands
  replace RECO.outputCommands += RecoGenJetsFEVT.outputCommands
  path p1 = {reconstruction}
  endpath outpath = {RECO}
}
You need to edit three things here:
- the input file name
- the output file name
- the process name you found above: replace T with it in the following line:
replace RECO.outputCommands += "drop *_*_*_T"
In the output file (reco.root here), you can find
- simulation products from previous processing with 120 (ending with _T)
- reconstruction products from reprocessing (ending with _Rec1)
...
<Branch>PixelDigiSimLinkedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>PixelDigiedmDetSetVector_siPixelDigis__T.</Branch>
<Branch>RPCDetIdRPCDigiMuonDigiCollection_muonRPCDigis__T.</Branch>
<Branch>RPCDetIdRPCRecHitsOwnedRangeMap_rpcRecHits__Rec1.</Branch>
<Branch>SiPixelClusteredmDetSetVector_siPixelClusters__Rec1.</Branch>
<Branch>SiStripClusteredmDetSetVector_siStripClusters__Rec1.</Branch>
<Branch>SiStripDigiedmDetSetVector_siStripDigis__T.</Branch>
...
CMSSW_1_2_2
With this release, reprocessing is quite easy, since the data formats have not changed.
There are only two issues:
- a wrong getByType is used in the Electron code;
- recoGenJets cannot be reproduced ("source" is missing from the Event).
So, here is the recipe:
scramv1 project CMSSW CMSSW_1_2_2
- get some packages (the ones you need to change)
cd CMSSW_1_2_2/src
project CMSSW
cvs co -r V00-02-12-03 RecoEgamma/EgammaElectronAlgos
cvs co -r V00-00-06-01 Configuration/StandardSequences
scramv1 b
Now you are ready to run re-reconstruction. A simple cfg allowing this is shown here
process Rec1 = {
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  source = PoolSource {
    untracked vstring fileNames = {
      '/store/mc/2006/12/21/mc-physval-120-SingleMuPlus-Pt100/0000/78D178BA-B496-DB11-AA19-001560EDC951.root'
    }
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module RECO = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
  }
  path p1 = {reconstruction}
  endpath outpath = {RECO}
}
You only have to change the input and output file names.
Please note that this cfg does not remove the old reconstruction from the root file. Hence, at the end, all the reconstructed products are duplicated in the Event. The newest ones (in this case with names ending in Rec1) are the ones to be used.
For example, this is a part of the Event Content after re-reconstruction:
...
<Branch>recoTracks_globalMuons__Rec1.</Branch>
<Branch>recoTracks_globalMuons__T.</Branch>
<Branch>recoTracks_pixelTracks__Rec1.</Branch>
<Branch>recoTracks_pixelTracks__T.</Branch>
<Branch>recoTracks_standAloneMuons__Rec1.</Branch>
<Branch>recoTracks_standAloneMuons__T.</Branch>
...
The products ending with Rec1 are the new ones; those ending with T are the ones from 120 processing (we used process T there).
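When both copies of each product sit in the Event, picking the new one can be scripted once the branch list is available as text. The following Python sketch is only an illustration, not a CMSSW utility: the function name newest_products is hypothetical, and it assumes branch names of the form A_B_C_D with an optional trailing dot.

```python
def newest_products(branch_names, new_process="Rec1"):
    """For each duplicated product, keep the branch written by new_process.

    Two branches are treated as the same product when they agree on
    everything except the last field (the process name).
    """
    chosen = {}
    for name in branch_names:
        clean = name.rstrip(".")
        # Split off the process name, e.g. 'recoTracks_globalMuons_' + 'Rec1'.
        key, process = clean.rsplit("_", 1)
        if key not in chosen or process == new_process:
            chosen[key] = clean
    return sorted(chosen.values())


duplicated = [
    "recoTracks_globalMuons__Rec1.",
    "recoTracks_globalMuons__T.",
    "recoTracks_pixelTracks__Rec1.",
    "recoTracks_pixelTracks__T.",
    "recoTracks_standAloneMuons__Rec1.",
    "recoTracks_standAloneMuons__T.",
]
latest = newest_products(duplicated)
```

Products that exist only for the old process (for example simulation products) are passed through unchanged, which matches the intent of reusing the 120 simulation.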
CMSSW_1_2_3 and following
With this release, reprocessing is quite easy, since the data formats have not changed.
So, here is the recipe:
scramv1 project CMSSW CMSSW_1_2_3
Now you are ready to run re-reconstruction. A simple cfg allowing this is shown here
process Rec1 = {
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  source = PoolSource {
    untracked vstring fileNames = {
      '/store/mc/2006/12/21/mc-physval-120-SingleMuPlus-Pt100/0000/78D178BA-B496-DB11-AA19-001560EDC951.root'
    }
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module RECO = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
  }
  path p1 = {reconstruction}
  endpath outpath = {RECO}
}
You only have to change the input and output file names.
Please note that this cfg does not remove the old reconstruction from the root file. Hence, at the end, all the reconstructed products are duplicated in the Event. The newest ones (in this case with names ending in Rec1) are the ones to be used.
For example, this is a part of the Event Content after re-reconstruction:
...
<Branch>recoTracks_globalMuons__Rec1.</Branch>
<Branch>recoTracks_globalMuons__T.</Branch>
<Branch>recoTracks_pixelTracks__Rec1.</Branch>
<Branch>recoTracks_pixelTracks__T.</Branch>
<Branch>recoTracks_standAloneMuons__Rec1.</Branch>
<Branch>recoTracks_standAloneMuons__T.</Branch>
...
The products ending with Rec1 are the new ones; those ending with T are the ones from 120 processing (we used process T there).
Digi and Reco reprocessing with CMSSW_1_3_X
No new tags are needed to reprocess 120 samples with 13X. However, 120-reco and 120-digi must be dropped as explained for 130_pre2.
So, here is the recipe:
scramv1 p CMSSW CMSSW_1_3_X
Now you are ready to run re-reconstruction and re-digitization. The cfg is slightly more complicated than the one needed to re-reconstruct with CMSSW_1_2_2. In particular, you need to know the process name used during the 120 digi and reco processing (130 will have standard process names, so this will not be an issue there, but for 120 samples it is).
You can find the process name in several ways:
- look at the cfg used for the 120 production; its first line reads "process T = {", and T is the process name you need.
- open the 120 root file with root and a TBrowser; in the products list, all product names have the form A_B_C_D, and the last field (D) is the process name.
- run a cfg on the root file, scheduling only the module EventContentAnalyzer. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
- run EdmProvDump filename.root. You will again get a list of the products in the Event, named like A_B_C_D; D is the process name.
Once you know the process name, you can run a cfg like the following (this is just ReleaseValidation/data/digi-reco-131.cfg with the addition of drop commands)
----------------- begin : cut here --------------------------
process Rec1 = {
  include "Configuration/StandardSequences/data/Reconstruction.cff"
  include "Configuration/StandardSequences/data/Simulation.cff"
  #
  # choose PileUp!
  #
  include "Configuration/StandardSequences/data/MixingNoPileUp.cff"
  include "Configuration/StandardSequences/data/FakeConditions.cff"
  source = PoolSource {
    untracked vstring fileNames = {'file:sim.root'}
    untracked int32 maxEvents = -1
  }
  include "Configuration/EventContent/data/EventContent.cff"
  module FEVT = PoolOutputModule {
    untracked string fileName = 'reco.root'
    using FEVTSIMEventContent
    # using RECOSIMEventContent
    # using AODSIMEventContent
    untracked PSet dataset = {
      untracked string dataTier = "GEN-SIM-DIGI-RECO"
    }
  }
  #
  # drop the old reconstruction and digitization - here it is assumed that digi + reco with 120 was done in one go
  # (the output module of this cfg is called FEVT, so the replace statements must act on FEVT)
  #
  replace FEVT.outputCommands += "drop *_*_*_T"
  replace FEVT.outputCommands += SimG4CoreFEVT.outputCommands
  replace FEVT.outputCommands += RecoGenJetsFEVT.outputCommands
  replace FEVT.outputCommands += "drop *_trackingtruth_*_T"
  # run digitization before reconstruction
  path p1 = {pdigi,reconstruction}
  endpath outpath = {FEVT}
}
----------------- end : cut here --------------------------
You need to edit three things here:
- the input file name
- the output file name
- the process name you found above: replace T with it in the two drop commands, which must act on the output module of this cfg (FEVT):
replace FEVT.outputCommands += "drop *_*_*_T"
replace FEVT.outputCommands += "drop *_trackingtruth_*_T"
In the output file (reco.root here), you can find
- simulation products (SimHits, SimTracks, SimVertices, NOT digis!) from previous processing with 120 (ending with _T)
- digitization and reconstruction products from reprocessing (ending with _Rec1)
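Since the process name has to be typed into two lines by hand, a tiny generator can reduce the chance of a typo. This Python sketch is hypothetical and not part of any release: the function name drop_commands is an assumption, and the module label defaults to FEVT because that is the output module declared in the cfg above.

```python
def drop_commands(process_name, module_label="FEVT"):
    """Build the two 'replace ... drop' lines for the given 120 process name."""
    # An underscore in the process name would break the A_B_C_D pattern.
    if not process_name or "_" in process_name:
        raise ValueError("invalid process name: %r" % process_name)
    patterns = [
        "*_*_*_%s" % process_name,
        "*_trackingtruth_*_%s" % process_name,
    ]
    return [
        'replace %s.outputCommands += "drop %s"' % (module_label, pattern)
        for pattern in patterns
    ]


lines_for_T = drop_commands("T")
```

The returned strings are meant to be pasted into the cfg in place of the two drop lines; the keep blocks (SimG4CoreFEVT, RecoGenJetsFEVT) still have to stay between them, as in the cfg above.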
Review Status
Responsible: Main.tboccali