Penn Data Samples
Submitting the Trimming Job
The trimming script is located at
~olivito/public/gridD3PDtest/egammaD3PDTrimmerForData.py
It lists the branches that we are currently saving.
There is a separate script:
egammaD3PDTrimmerForMC.py
to be used on MC.
To run the trimming job locally do:
python egammaD3PDTrimmer.py `less fileList.txt`
where fileList.txt contains a comma-separated list of the D3PDs to be trimmed.
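The comma-separated list does not have to be typed by hand. Here is a minimal sketch, assuming the D3PDs sit in one local directory and end in .root. The helper name build_file_list and the default pattern are illustrative and not part of the trimming scripts:

```python
import glob
import os


def build_file_list(directory, pattern="*.root"):
    """Return the matching files as one comma-separated string, sorted by name."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    return ",".join(paths)
```

Writing the returned string to fileList.txt gives a file in the format the local command above expects.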
To run the trimming job on the grid do:
- Set up the grid tools and pathena.
- Submit the job with prun:
prun --exec "python egammaD3PDTrimmer.py %IN" --athenaTag=15.6.8 --inDS data10_7TeV.00154817.physics_L1Calo.merge.NTUP_EGAMMA.f255_p150/ --outDS user10.JohnAlison.00154817.L1Calo.NTUP_EGAMMA.f255.p150.trimmed.v3 --outputs "trimmed*root"
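When many runs have to be submitted, it can help to assemble the command in a script instead of editing it by hand. The sketch below only builds and prints the command, using the same options as the example above. The function names build_prun_command and format_command are made up for illustration, and the defaults are simply copied from that example:

```python
import shlex


def build_prun_command(in_ds, out_ds, script="egammaD3PDTrimmer.py",
                       athena_tag="15.6.8", outputs="trimmed*root"):
    """Assemble the prun argument list (this does not submit anything)."""
    return [
        "prun",
        "--exec", "python {} %IN".format(script),
        "--athenaTag={}".format(athena_tag),
        "--inDS", in_ds,
        "--outDS", out_ds,
        "--outputs", outputs,
    ]


def format_command(args):
    """Quote each argument so the line can be pasted into a shell."""
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(format_command(build_prun_command(
        "data10_7TeV.00154817.physics_L1Calo.merge.NTUP_EGAMMA.f255_p150/",
        "user10.JohnAlison.00154817.L1Calo.NTUP_EGAMMA.f255.p150.trimmed.v3",
    )))
```

Quoting matters here because both the --exec string (which contains spaces) and the --outputs pattern (which contains *) must reach prun unchanged.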

Skip's Tips: Putting
"trimmed*root"
in the outputs is very important if your subjobs create more than one file (ROOT automatically starts a new file once the file size exceeds 1.9G). If you list a specific filename instead, that is the only output you will get. So only if you are sure your output will be less than 1.9G can you just use
trimmed.root
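To see why the wildcard matters, here is a small illustration. It assumes that the outputs pattern behaves like a shell-style glob and that ROOT names the overflow files trimmed_1.root, trimmed_2.root and so on. Both are assumptions for the sake of the example, and the function name collected is made up:

```python
import fnmatch


def collected(outputs_pattern, produced_files):
    """Return the produced files that the given outputs pattern would pick up."""
    return [f for f in produced_files
            if fnmatch.fnmatchcase(f, outputs_pattern)]


# Hypothetical subjob whose output went over the size limit twice.
produced = ["trimmed.root", "trimmed_1.root", "trimmed_2.root"]

print(collected("trimmed*root", produced))  # all three files
print(collected("trimmed.root", produced))  # only the first file
```

With the exact filename, the two overflow files would be left behind.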

Skip's Tips: There's a limit of ~7G on the work directory size for each subjob. If it exceeds this, the job fails. This often happens when trying to trim MC datasets. One solution is to reduce the number of files per subjob with the
--nFilesPerJob <N>
argument.
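A rough way to pick N is sketched below. It assumes that the work directory grows roughly in proportion to the size of the input files in the subjob. The overhead_factor parameter is a guess to be tuned, not a measured number, and the function name max_files_per_job is made up for illustration:

```python
def max_files_per_job(file_sizes_gb, limit_gb=7.0, overhead_factor=1.5):
    """Estimate how many input files fit in one subjob's work directory.

    Uses the largest input file as a conservative per-file size and
    never returns less than 1.
    """
    per_file = max(file_sizes_gb) * overhead_factor
    return max(1, int(limit_gb // per_file))
```

For example, with input files of at most 1G and the default factor, this suggests --nFilesPerJob 4.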
When you get the output from the grid, if you used
"trimmed*root"
the ntuples will be zipped up in files like
*.trimmedXYZroot.tgz
There's a script to untar all of these and name the ntuples uniquely:
~olivito/public/gridD3PDtest/untar_trimmed_ntuples.py <dir>
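For reference, the idea can be sketched as follows. This is not the script above, only an illustration of the approach, assuming each tarball holds one or more .root files and that prefixing each ntuple with the tarball name is enough to make it unique. The function name untar_trimmed is made up:

```python
import glob
import os
import shutil
import tarfile


def untar_trimmed(directory):
    """Extract the ntuples from every *.tgz in `directory` under unique names."""
    extracted = []
    for tgz in sorted(glob.glob(os.path.join(directory, "*.tgz"))):
        prefix = os.path.basename(tgz)[:-len(".tgz")]
        with tarfile.open(tgz, "r:gz") as tar:
            for member in tar.getmembers():
                if not (member.isfile() and member.name.endswith(".root")):
                    continue
                # Prefix with the tarball name so ntuples from different
                # subjobs cannot overwrite each other.
                target = os.path.join(
                    directory, prefix + "." + os.path.basename(member.name))
                source = tar.extractfile(member)
                with open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                extracted.append(target)
    return extracted
```

The function returns the list of ntuples it wrote, which can be fed straight into a file list for the next step.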
A list of egamma D3PD datasets can be found on
this page (no longer current as of 24/05/10).
At CERN
Because of a mistake in the command used for the grid jobs, we only got back a fraction of the statistics. We're now rerunning the trimming (and including HLT decision info). The status flag in the table tells whether the run has been reprocessed yet or not.
Data
MC
NOTE: when pc-penn-d-01 is mounted on another machine, the path to its data is:
/project/data_d01_0/egammaD3PDs
At Penn
Data
MC
-- JohnAlisonUpenn - 18-May-2010
-- DominickOlivito - 17-May-2010