Top Ntuple Analysis Package

  • CURRENT SVN: svn co svn+ssh://svn.cern.ch/reps/atlas-childers/childers/toparea/TopNtupleAnalysis, Browsable here

export SVNUSR="svn+ssh://svn.cern.ch/reps/atlasusr"
svn co $SVNUSR/childers/toparea/TopNtupleAnalysis/ TopNtupleAnalysis
  • The unfolding code:
  • CURRENT SVN: svn co svn+ssh://svn.cern.ch/reps/atlas-childers/childers/unfolding, Browsable here

  • SVN before the USER swap:
export SVNUSR="svn+ssh://svn.cern.ch/reps/atlasusr"
svn co $SVNUSR/childers/unfolding unfolding
cd unfolding/scripts ; source setup.sh ; cd ../ ; make clean ; make rooclean ; make
Note that you might first want to change the path in the setup script to your own ROOT trunk, or e.g. to the AFS build /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.00/x86_64-slc5-gcc43-dbg/root/. Taylor uses trunk, which contains memory-leak fixes he found, preventing leaks when running toys.
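For example, a minimal way to pick up the AFS ROOT build mentioned above (a sketch, assuming your setup sources ROOT via thisroot.sh; adapt it to how scripts/setup.sh actually sets up ROOT):

. /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.00/x86_64-slc5-gcc43-dbg/root/bin/thisroot.sh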

svn co svn+ssh://svn.cern.ch/reps/atlasgrp/Institutes/CERN/TopPhysics/TopNtupleAnalysis/trunk TopNtupleAnalysis

The main branches:

  • 01-00-00 = old package as it was before the reorganization
  • 02-00-00 = old packages with no code changes, but with RootCore setup
  • trunk = currently in development, to run over mini ntuples by Ian
tags:
  • 01-00-00 = same as branch 01-00-00, any developments of that branch should have tags with versions 01-XX-YY
  • 02-00-00 = same as branch 02-00-00, any developments of that branch should have tags with versions 02-XX-YY

Compilation of branch trunk, which runs over mini trees by Ian:

Tested only on ui5.farm.particle.cz in Prague. The package runs with RootCore-00-00-33 and TopRootCoreRelease-00-00-14. For installation using TopRootCoreRelease-00-00-17 or trunk, see the Alternative Installation Procedure below.

1. set TOP_INST_PATH

export TOP_INST_PATH=/path/of/installation
Note: it's best to just add this to your .bashrc
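For example (a minimal sketch; replace the path with your actual installation directory):

echo 'export TOP_INST_PATH=/path/of/installation' >> ~/.bashrc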

2. checkout TopNtupleAnalysis package

cd $TOP_INST_PATH
svn co svn+ssh://svn.cern.ch/reps/atlasgrp/Institutes/CERN/TopPhysics/TopNtupleAnalysis/trunk TopNtupleAnalysis

source $TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh
On ui5 in Prague, use the prague argument:
source $TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh prague

3. checkout and compile RootCore, BAT and KLFitter (~5 min)

$TOP_INST_PATH/TopNtupleAnalysis/scripts/RootCoreKLFitterAndBat.sh

4. checkout and compile other packages (~18 min)

$TOP_INST_PATH/TopNtupleAnalysis/scripts/checkoutRootCoreRelAndBuildAll.sh

If there is no error, it looks good. Log out and log back in.

If there was an error in steps 2, 3 or 4 (outside ui5.farm.particle.cz), there is probably something wrong in $TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh. Try to fix it!

5. IMPORTANT: When you log in next time, run the setup script to use the package:

source $TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh
(make sure TOP_INST_PATH is already defined) Note: if you are on the prague farm use:
source $TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh prague

Try to run ./TopNtupleAnalysis/bin/top_x; if it prints "And: Have fun! ;-)" at the end, everything went right and the package works. Now see the sections below on how to get mini trees, make links, and prepare filelists and scripts for running.

Recommendation for mc11c

If you are using MC mc11c, then you need to read the recommendations in Ian's README (attached below). Here are the steps to do it easily:

You need the new versions of the additional packages. You can install the package in a new directory and, in step 3, first check out TopRootCoreRelease, then replace TopRootCoreRelease/share/packages.txt with the attached packages.txt, and download the packages using build-all.sh. Do not compile them yet, because you first need to change the file TopD3PDCorrections/TopD3PDCorrections/QCDMMScale.h: replace MethodB(2) with MethodB(3). Then compile using build-all.sh.

Alternatively, you can change your current installation of the package: delete the additional packages (but keep TopNtupleAnalysis, KLFitter, BAT, RootCore and TopRootCoreRelease). Copy the attached packages.txt to TopRootCoreRelease/share and then:

rm TopNtupleAnalysis/obj/*
cd TopRootCoreRelease/share
./build-all.sh
and, as described above, make the change in TopD3PDCorrections/TopD3PDCorrections/QCDMMScale.h
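A minimal one-liner for that edit, assuming GNU sed is available:

sed -i 's/MethodB(2)/MethodB(3)/' TopD3PDCorrections/TopD3PDCorrections/QCDMMScale.h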

If you want to use the MV1 tagger, don't forget to set it in configs/analysis_cuts.cfg

Alternative Installation procedure

export TOP_INST_PATH=$YOURPATH # e.g. $PWD
cd $TOP_INST_PATH
# Set your CERN_USER environment variable
NEW:
export SVNUSR="svn+ssh://svn.cern.ch/reps/atlasusr"
svn co $SVNUSR/childers/toparea/TopNtupleAnalysis/ TopNtupleAnalysis
OLD:
svn co svn+ssh://$CERN_USER@svn.cern.ch/reps/atlasgrp/Institutes/CERN/TopPhysics/TopNtupleAnalysis/trunk TopNtupleAnalysis

in Bologna:
$TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh T3_BO # or other top site
in Prague:
$TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh prague
$TOP_INST_PATH/TopNtupleAnalysis/scripts/RootCoreKLFitterAndBat.sh
svn co svn+ssh://$CERN_USER@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/TopPhys/TopRootCoreRelease/tags/TopRootCoreRelease-11-00-00-05 TopRootCoreRelease

cd TopRootCoreRelease/share
./build-all.sh # keep fingers crossed

We'll provide a script for this asap!

For Bologna users only

Set up a ROOT version > 5.28/00. I suggest setting up ROOT via AFS:

export PATH="/afs/cern.ch/sw/lcg/external/Python/2.6.5/x86_64-slc5-gcc43-opt/bin:$PATH"
export LD_LIBRARY_PATH="/afs/cern.ch/sw/lcg/external/Python/2.6.5/x86_64-slc5-gcc43-opt/lib:$LD_LIBRARY_PATH"
. /afs/cern.ch/sw/lcg/external/gcc/4.3.2/x86_64-slc5/setup.sh
. /afs/cern.ch/sw/lcg/app/releases/ROOT/5.30.04/x86_64-slc5-gcc43-opt/root/bin/thisroot.sh

You can put these lines in your $HOME/.bashrc.

On our T3 user interface (uibo-atlas-01.cr.cnaf.infn.it) the libxml2 library is installed, but its headers are not. During the installation process two packages (namely GoodRunsLists and TopD3PDAnalysis) fail to compile. You have to modify their makefiles manually:

In file GoodRunsLists/cmt/Makefile.RootCore modify:

PACKAGE_CXXFLAGS = -I/usr/include/libxml2 -I/home/ATLAS/disipio/local/include/libxml2
INCLUDES +=  -I/usr/include/libxml2

In file TopD3PDAnalysis/cmt/Makefile.RootCore modify:

PACKAGE_CXXFLAGS = -g -I/home/ATLAS/disipio/local/include/libxml2
PACKAGE_BINFLAGS = -g -lxml2 -lXMLParser -L/home/ATLAS/disipio/local/lib

If you don't want to get these headers from my location (or you can't), this library can be fetched from xmlsoft.org.

Documentation (old):

cd $TOP_INST_PATH/TopNtupleAnalysis/doc/
./doall.sh
gv documentation.ps 
and this TWiki.

Something for package compilation: less doc/README_2012-01-01.txt

Location of the latest mini trees by Ian

  • To merge ntuples from Ian, use scripts/allSubdirectoriesMerge.sh (on iberis01 only!)
  • Merged at ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/truth-20121214/
  • Note on running btagsfup/btagsfdown: just make two such links in your input dir pointing to nominal, and the file list generator and the automatic systematic recognition from the input file name should pick up the right SF variation (see the link sketch at the end of this list). This should be automatic when running allSubdirectoriesMerge.sh.
  • Dec2012: /raid7_atlas2/qitek/iwatson/minituples-20121128/

  • Nov2012:
  • New ele/jet removal (ttj-style) minituples are being copied directly to Prague and Bologna independently; they are no longer located on starbuck-sfu. At Prague, see ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20121101/

  • Aug2012:
  • Full MC Will_*T1 and AlpgenJimmytt* nominal stuff and updated flavour-dep JES: ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20120824 , starbuck-sfu.cern.ch:/scratch/iwatson/minituples-20120824
  • All other MCs and data: merged at ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20120805-met-1-9/ with special loose1 for tagged at ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20120816-loose1/
  • Non-merged: starbuck-sfu.cern.ch:/scratch/iwatson/minituples-20120805-met-1-9/ with special loose1 for tagged at starbuck-sfu.cern.ch:/scratch/iwatson/minituples-20120816-loose1/

  • June 28th + fixed el/loose/ (Jul 3rd): Non-merged: starbuck-sfu.cern.ch:/scratch/iwatson/minituples-20120628/
  • Merged at ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20120628/
  • May 29th: Non-merged: starbuck-sfu.cern.ch:/scratch/iwatson/minituples-20120529
  • Merged at ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20120529/
  • May 18th: Non-merged: starbuck-sfu.cern.ch:/scratch/iwatson/minituples-20120518
  • Merged at ui5.farm.particle.cz:/raid7_atlas2/qitek/iwatson/minituples-20120518/
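A minimal sketch of the btagsfup/btagsfdown links mentioned in the note above (the input directory is hypothetical; use wherever your merged mini trees live):

cd /your/merged/input/dir   # hypothetical path
ln -s nominal btagsfup
ln -s nominal btagsfdown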

HOW TO RUN in Prague PBS Queue

Before first running the binary ./bin/top_x

One has to prepare links to config files with .C ending:
cd $TOP_INST_PATH/TopNtupleAnalysis/configs
./MakeLinks.sh

Create directory for output files with histograms.

cd $TOP_INST_PATH/TopNtupleAnalysis
mkdir output # or make this a link to some large writable storage area
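For example, to point the output directory at the large raid area instead (a sketch; the target path is only an illustration):

ln -s /raid7_atlas2/$USER/output output   # instead of mkdir output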

On the Prague farm, if you want the possibility to submit PBS jobs, do (you have to repeat it after some time):

source ./scripts/setup_voms.sh

Premerge the mini-trees we receive:

Change the path pathToRootFiles in the script to point to the core directory (containing the el/ and mu/ subdirs), then switch to the fast high-memory machine (ssh iberis01) and run
cd scripts/ ; ./allSubdirectoriesMerge.sh
(which uses merge.sh)

Filelist scripts

FIRST, run cd filelists/; ./GenSystCode.py to make links to nominal, where the link names encode the systematic name so it is correctly picked up by TopAnalysis.cxx (e.g. for the SF variations which do not have dedicated root files: btag eff, wjets, ...).

THEN, make sure you SWITCH ON THE SYSTEMATICS YOU WANT in scripts/mypythonbase.py. You need to prepare filelists, because a filelist is the input for TopNtupleAnalysis/bin/top_x. A filelist can contain one or more paths to root files; the MC signal filelist contains two root files. Filelist preparation (outside the Prague farm, change the paths in the script!):

cd $TOP_INST_PATH/TopNtupleAnalysis/filelists
./MakeFileLists.py    #change the basedir in it
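For illustration, a filelist is just a plain text file with one root file path per line (the file name and paths below are hypothetical; MakeFileLists.py generates the real ones):

cat filelists/MC_el_T1_McAtNlo_Jimmy_nominal.txt
/raid7_atlas2/<user>/minitrees/el/MC_el_T1_McAtNlo_Jimmy_nominal_1.root
/raid7_atlas2/<user>/minitrees/el/MC_el_T1_McAtNlo_Jimmy_nominal_2.root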

You need to calculate the MC signal split for subsamples using

cd scripts/ ; python GetMCentriesForSubsamples.py
Then change the split values in
./scripts/goliSendMCSigJobs_subsamples.py

Interactive testing at Prague

Now, you can prepare scripts for running or prepare submit scripts to send PBS jobs. To prepare scripts to run locally for TEST at iberis01 ONLY, not for production, do:
cd $TOP_INST_PATH/TopNtupleAnalysis
./scripts/test/SendMCSigJobs.py       # prepares scripts for running full MC signal, they will be prepared in $TOP_INST_PATH/TopNtupleAnalysis (you can change it)
./scripts/test/SendMCSigJobs_subsamples.py              # prepares scripts for running MC signal subsamples useful for unfolding
./scripts/test/SendMCBGJobs.py    # prepares scripts for running all MC backgrounds
./scripts/test/SendDataJobs.py    # prepares scripts for running all data. Updated recently (if not working, try "svn update")
Run ls -lart and grep a line from the execution scripts for the subsample you want to test/debug (see the example below).
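For example (the script name below is illustrative; use whichever script the Send*Jobs.py step actually generated):

ls -lart *.sh                          # the freshly generated run scripts appear last
grep top_x run_MC_el_T1_nominal.sh     # hypothetical name: shows the exact top_x command to copy and run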

Jobs submission at Prague

First, ensure
source ./scripts/setup_voms.sh
Next, to prepare submit scripts to send PBS jobs on the Prague farm, do this on ui5. Make sure you SWITCH ON ONLY THE SYSTEMATICS YOU WANT in scripts/mypythonbase.py. Having more switched on than you need is OK; it only affects the generation of submission scripts, not the actual job submission. Then prepare the submit scripts:
cd $TOP_INST_PATH/TopNtupleAnalysis
./scripts/goliSendMCSigJobs_subsamples.py              # prepares submit scripts for running MC signal subsamples used in unfolding
./scripts/goliSendJobs.py    # prepares submit scripts for background and data for all systematics. 
The submit scripts produced by ./scripts/goliSendMCSigJobs_subsamples.py or ./scripts/goliSendJobs.py will be in $TOP_INST_PATH/TopNtupleAnalysis/scripts/submit/. To submit jobs do:
cd $TOP_INST_PATH/TopNtupleAnalysis
./scripts/Qsub.sh 'expression'   # CAREFUL! this really sends PBS jobs, and for nominal, this is about 500 jobs!;)
All submit scripts whose names contain the 'expression' will be submitted. For details see the script ./scripts/Qsub.sh. The output root files with histograms will be in /raid7_atlas2/$USER/root/"systematics". Check the log file in /raid7_atlas2/$USER/logs/ to see if everything went OK. If you run locally, your output root files should be in $TOP_INST_PATH/TopNtupleAnalysis/output.

EXAMPLES:

./scripts/Qsub.sh nominal   # sends all data, MC sig and BG, DOES NOT send QCD!
./scripts/Qsub.sh _submit.subsample.MC.*nominal   # sends only MC signal and dilepton subsamples
./scripts/Qsub.sh loose    # for running QCD in mujets
./scripts/Qsub.sh JetEle   # for running over the special sample for QCD in ejets

Jobs diagnostics!

Run ./scripts/Qsub.sh >_sub_log.txt if you want to check which jobs were skipped due to already existing output files; see and check SkipAlreadyOutput=1 in the Qsub.sh script!

You should also see _outfilelistForChecks.txt with the expected output file list (created before the merge!), which you can then compare against the actual output using ./scripts/DiagnoseSubmissionLog.sh. This also checks the log files for bus errors, which leave an output file behind, but of a small size. You are also invited to go to the output dir and run ls -Slh *.root to look for root files of suspiciously small size at the end of the list.

Where is my output?

It should be at /raid7_atlas2/${USER}/root/*/. This is hardcoded in the central scripts config file mypythonbase.py and (unfortunately :) also in GenSubmitScriptGoli.sh and GenSubmitScriptGoli_subsamples.sh.

Jobs Monitoring in the PBS queue:

./logs/qs.sh
./logs/Qstat.sh
./logs/MyQstat.sh
Deleting all your jobs (DANGEROUS:)
./logs/QdelAll.sh
Deleting one job:
qdel JOBID

Jobs Output Merge

Make sure you SWITCH ON ONLY THE SYSTEMATICS YOU WANT in scripts/mypythonbase.py
cd $TOP_INST_PATH/TopNtupleAnalysis
./scripts/goliHadd.py    # PBS job on Prague farm. Merges full MC signal, BG and data for all systematics. You can change it to merge only some systematics.
./scripts/goliHadd_co.py    # optional, to make combined co=el+mu for TopPlots
./scripts/goliMergeMCToHalves.py    # PBS job on Prague farm. Merges subsamples to two halves (for unfolding)  for all systematics. You can change it to merge only some systematics. 
Only for local testing or for running on iberis01, you can use the interactive scripts
./scripts/test/Hadd.py    # merges full MC signal, BG and data locally.
./scripts/test/MergeMCToHalves.py    # merge subsamples to two halves (for unfolding) locally. 

Special links for Wjets systs!

Now you need to make links to all non-W samples in Wjets syst dirs, in order to run TopPlots properly:
cd scripts/; ./MakeNonWlinksInOutpuDirs.py

Cut Flow

  • Cut flow TeX table generation: (JK) Modify the path in scripts/cutflow/RunCutFlow2tex.sh, which internally runs scripts/cutflow/cutFlow2tex.py
  • top pT in bins and in samples composition: ./scripts/cutflow/CheckHistoSampleComposition.sh (jk, 13.1.2013)
  • Efficiency flow (only of cuts over the mini trees!) ./scripts/cutflow/RunEffFlow.sh (jk, 18.1.2013)
  • Running everything, also Full cut flow: ./scripts/cutflow/RunCutFlow2tex.sh; ./scripts/cutflow/RunEffFlow.sh ; ./scripts/cutflow/RunFullCutFlow.sh ; ./scripts/cutflow/RunFullEffFlow.sh (jk 31.1.2013)

Migration matrix plotting

Edit macros/RunMigra.sh to loop over your ttbar signal root files and run it as
  •  cd macros/ ; ./MakeLinks.sh ; ./RunMigra.sh 
    See output like Eps_MC_el_T1_McAtNlo_Jimmy_nominal/Kinem_TopFit_NTag1_Perm0Top1RecoVsPerm0Top1Truth_Pt2VsPt1_migra_text.eps
  • Plotting the resolution:
cd macros/config/ ; ./GenDiagNominalLhoodRecoCmp.py ; ./GenResoNominalLhoodRecoCmp.py and then run the generated commands ;-)

Efficiency correction

  • Again, verify the systematics list you want to derive the efficiency for as stored in scripts/mypythonbase.py.
  • Next, run the jobs which produce the parton after cuts with all SF and weights except for the luminosity weight:
  • ./scripts/SendGoliJobs_SigForEff.py ; ./scripts/Qsub.sh Eff
  • Then run the efficienies computation itself:
     cd macros/efficiency/ ; ./RunEfficiency.py 
    (this also merges all Alpgen ttbar NpX jobs and makes the total Alpgen efficiency)
  • The output will appear at /raid7_atlas2/${USER}/root/Efficiency/ as governed by scripts/mypythonbase.py
  • Move it to a standard structure for the unfolding team by running cd scripts ; ./moveEfficiencyFiles.py

Check Inputs For Unfolding!

  • Check all inputs needed for unfolding: ./scripts/CheckUnfoldingInputs.sh

Downloading stuff from ui5

  • ...to lxplus/laptop to run the unfolding:
  • around 10 GB is needed: you can either have the TopNtupleAnalysis package locally and use the following script, or just use the script on its own; it should still fetch the results, but make sure you list in it all the systematics you want to download!
  • ./scripts/GetFromUi5.py
  • You can copy to your AFS work space, which should be /afs/cern.ch/work/${YOURFIRSTLETTER}/${USER}/; you can request it (20 GB!) at CERN Accounts -> Application and resources tab -> Linux and AFS, or simply at My Services ;-)
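If you only need a single systematics directory, a plain rsync of the hardcoded output area works too (a sketch, assuming the same username on ui5 and on the target machine):

rsync -av ui5.farm.particle.cz:/raid7_atlas2/$USER/root/nominal/ /afs/cern.ch/work/${USER:0:1}/$USER/root/nominal/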

Theory Plots Preparation

  • Plotting the MCFM prediction for top pT in our binning from Lorenzo:
  • cd macros/theory/ ; ./MakeLinks.sh ; root -l MakeTheoryHistoMCFM.C+ ...and see the resulting root file;-)
  • Rebinning the approx. NNLO theory from Kidonakis, parsed by Kelsey:
  • cd macros/theory/ ; ./RunTheory.sh ...and see the resulting root files;-)
  • Standard MC@NLO prediction can be obtained from the merged nominal standard signal files MC_??_T1_McAtNlo_Jimmy_nominal.root
    Example histogram is Kinem/TopFit_NTag1/GeneratedTop1/GeneratedTop1PtSpectrum
    Divide by bin width, and scale by Eff/Lumi/BR = 0.0371134 / 4713.11 / 0.543 = 1.7230860497428916e-05
    where Eff is the MC lumi weight factor = lumiData / lumiMC. LumiMC can be obtained from Root/MCLum.cxx
  • Script to plot and compare predictions side-by-side:
    cd macros/ ; root -l 'CompareAnyHisto.C+("config/CmpTheories.cfg")'

Site-dependent setup and Job Manager

DEVELOPMENTAL

In order to customize the setup and environmental variables, you need to perform a few steps.

First, define your configuration file. Its name must look like environment.SITENAME.cfg, where SITENAME could be for instance T3_BO. The site name Prague at the moment is reserved.

An example of an environment config file can be found in configs/environment.T3_BO.cfg, and it is shown here for your convenience:

 
# define here directories and other options

[General]
top_site               = T3_BO
root_version           = 5.32.00-r2
root_build             = x86_64
gcc_version            = 4.3.2
gcc_build              = x86_64-slc5

[User]
grid_user              = RiccardoDiSipio
queue                  = T3_BO

[Directories]
atlastools_dir         = /home/ATLAS/disipio/development/AtlasTools-00-00-14/
tna_install_dir        = /home/ATLAS/disipio/development/AtlasTools-00-00-14/TopNtupleAnalysis

base_input_ntuple_dir  = /gpfs_data/local/atlas/mromano/mc11c/
base_histo_dir         = /home/ATLAS/disipio/LOCAL_DISK/histograms/TopNtupleAnalysis
base_output_ntuple_dir = /home/ATLAS/disipio/development/AtlasTools-00-00-14/TopNtupleAnalysis/output/
base_log_dir           = /home/ATLAS/disipio/LOCAL_DISK/logs/TopNtupleAnalysis

Once you are confident with your config file, you can launch the setup script this way:

export TOP_INST_PATH=/path/of/installation
source $TOP_INST_PATH/TopNtupleAnalysis/scripts/setup.sh SITENAME

If everything goes well, you should see this:

>>> TOP_INST_PATH =  /home/ATLAS/disipio/development/AtlasTools-00-00-14/
Top site T3_BO
file /home/ATLAS/disipio/development/AtlasTools-00-00-14//RootCore/scripts/setup.sh exists

At this point, all the directories are set. Since the system already knows where to find the ntuples and where to put the log and histogram files, you don't need to modify any script.

The environment config files are interpreted by scripts/configReader.py and readConfig.sh, depending on whether you are using a Python or a bash script. In Python, the configuration reader produces a dictionary with all the relevant information.

from configReader import Configuration 
configuration =  Configuration()
...
site = configuration['site']
atlasToolsPath = configuration['install']
configsDir = configuration['configs']
...

In BASH, there are two useful functions:

atlasToolsPath=$(readValue "atlastools_dir")
atlasToolsPath=$(readValueAndExport "atlastools_dir")
The latter function (use with care!) exports an environment variable (all uppercase) like this:
#> echo $ATLASTOOLS_DIR
#> /home/ATLAS/disipio/development/AtlasTools-00-00-14/

To launch the analysis on a PBS cluster, there is now an all-in-one script called JobManager.py:

Usage: JobManager.py [options] MCSig | MCBG | MCDilepton | Data | QCD 

Options:
  -h, --help            show this help message and exit
  -s SYST, --syst=SYST  Systematic [nominal]
  -d, --dry             Dry run (do not really submit jobs) [False]
  -o OUTPUT, --output-dir=OUTPUT
                        Output histograms file directory []
  -j JOBDIR, --job-dir=JOBDIR
                        Where to store job files [/tmp/disipio/jobs/]

By default, the MC signal (sample T1_McAtNlo 105200) is split into 50 subsamples, while for data and MC backgrounds there is only one job per sample. You can submit the analyses this way:

./scripts/JobManager.py --job-dir scripts/submit -s nominal MCSig
./scripts/JobManager.py --job-dir scripts/submit -s nominal MCBG
./scripts/JobManager.py --job-dir scripts/submit -s nominal Data

If you omit the --job-dir option, the job files will be stored in /tmp/$USER/jobs. If you don't need to debug the job scripts, just ignore this option.

The job scripts are generated from a template placed in share/template.analysis_job.sh. You can adapt it to your needs. Tags like @REPLACE_ME@ are replaced at generation time with the relevant parameters.
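For illustration, the substitution amounts to something like the following (the @REPLACE_ME@ tag is from the template; the sed command and the output file name are only an assumption about how the generation works, and JobManager.py does this for you):

sed -e "s|@REPLACE_ME@|nominal|g" share/template.analysis_job.sh > /tmp/$USER/jobs/job_nominal.sh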

How to merge output histograms

In order to merge all the output histograms and to maintain the naming convention, we provide a useful script called scripts/MergeHistograms.py. Issuing the command, one can get its online help:
*** Histogram Merger Tool ***
Usage: MergeHistograms.py [options] MCBG|Data|QCD|PartialBG|AllBkg 

Options:
  -h, --help            show this help message and exit
  -s SYST, --syst=SYST  Systematic [nominal]
  -d, --dry             Dry run (do not really submit jobs) [False]
  -l LJETS, --lchannel=LJETS
                        Lepton channel [el,mu]

In the typical workflow you'll be doing something like this:

#> ./scripts/MergeHistograms.py MCBG
#> ./scripts/MergeHistograms.py Data
#> ./scripts/MergeHistograms.py QCD
#> ./scripts/MergeHistograms.py AllBkg
#> ./scripts/MergeHistograms.py PartialBG # Optional for WJets, ZJets and Diboson
You can use the final Data_*_AllPeriods_nominal.root and AllBkg_*_nominal.root files for the unfolding (* = el, mu). CAVEAT: this script uses the Python multiprocessing package. On machines with a large number of cores (>4) the performance is greatly enhanced over the usual ROOT hadd; our benchmarks on a 16-core machine showed that MCBG, Data and QCD can each be merged in 7 minutes.

However, we found one issue: if you stop the execution by pressing CTRL+C, you have to kill the background processes manually (see their PIDs with ps aux | grep $USER). This will be fixed asap.
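A minimal cleanup sketch for that case (pkill matches on the full command line; double-check what you are about to kill):

ps aux | grep $USER | grep MergeHistograms
pkill -u $USER -f MergeHistograms.py   # use with care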

Useful scripts for work with output root files

  • Merging: use scripts/Hadd.py in the dir with the root files output of top_x
  • Merging MC signal subsamples to halves and to full MC signal (useful for unfolding):
cd $TOP_INST_PATH/TopNtupleAnalysis
./scripts/MergeMCToHalves.py        # NOT PREFERRED: runs locally
./scripts/goliMergeMCToHalves.py  # PREFERRED: prepares a submit script and sends it as a PBS job. On the Prague farm, this is better.

The structure of the output root files, which contain histograms:

The directory structure of the output root files is /"type"/"btag_mode"/"further_subdirectory", where
  • "type" can be:
    • JetDistributions - not filled yet
    • BtagDistributions - not filled yet
    • Topological - not filled yet
    • Kinem - contains all control plots, migration matrices, truth and reco spectra and some other histos. Some subdirectories are not filled yet
    • Kinem_topMassCut - contains the same histos as Kinem, but with a top mass cut applied after the KLFitter found a solution. Top mass cut: an event passes if both the leptonic and the hadronic reconstructed top masses are in [169.5, 175.5] GeV.
    • Kinem_likelihoodCut - contains the same histos as Kinem, but with a likelihood cut applied after the KLFitter found a solution. Likelihood cut: an event passes if the log likelihood > -50.
(the Kinem_topMassCut and Kinem_likelihoodCut directories are created only if the options CreateTopMassCutHistos or CreateLikelihoodHistos in configs/analysis_cuts.cfg are true)

  • "btag_mode" can be: NTag0 - pretag (all events without any btag cut), NTag1 - at least one b-tag (events have at least one b-tagged jet), NTag2 - at least two b-tag, NTag1Excl - exactly one b-tag, NTag2Excl - exactly two b-tag

  • "further_subdirectory" can contain one of these tags:
    • Top1: leptonic top
    • Top2: hadronic top
    • System = Top1 + Top2
    • Perm0: best-likelihood KLFitter permutation/solution
    • Generated: the truth top and antitop four-vectors are taken from ALL events; only MC@NLO weights are used. These histograms are filled if the option LoopOverAllSignalEvents in configs/analysis_cuts.cfg is true (it then runs over all MC signal events from the SLP TTree without any cuts, which means about +4 hours of running). You need to run on the full MC (not subsamples)!!!
    • Parton: the truth top and antitop four-vectors are taken from selected events; the same weights are used as for the Perm0 histograms (MC@NLO weight, cross section weight, StandardTTCorrections weight).
    • Parton_WithoutCrossSectionWeight: the same as Parton, but only the MC@NLO weight and the StandardTTCorrections weight are used (no weighting with luminosity, cross section or number of events).

The most important histograms for top pt:

  • Reco Top pt spectrum: Kinem/TopFit_NTag*/Perm0TopLepHad/Perm0TopLepHadPtSpectrum
  • Migration matrix for Top pt: Kinem/TopFit_NTag*/Perm0TopLepHadRecoVsPerm0TopLepHadTruth/Perm0TopLepHadRecoVsPerm0TopLepHadTruthPt2VsPt1
  • Truth Top Pt spectrum after same cuts as reco: Kinem/TopFit_NTag*/PartonTopLepHad/PartonTopLepHadPtSpectrum
Or with the top mass cut:
  • Reco Top pt spectrum: Kinem_topMassCut/TopFit_NTag*/Perm0TopLepHad/Perm0TopLepHadPtSpectrum
  • Migration matrix for Top pt: Kinem_topMassCut/TopFit_NTag*/Perm0TopLepHadRecoVsPerm0TopLepHadTruth/Perm0TopLepHadRecoVsPerm0TopLepHadTruthPt2VsPt1
  • Truth Top Pt spectrum after same cuts as reco: Kinem_topMassCut/TopFit_NTag*/PartonTopLepHad/PartonTopLepHadPtSpectrum

For Mtt:

  • Reco Mtt spectrum: Kinem/TopFit_NTag*/Perm0System/Perm0SystemMassSpectrum
  • Migration matrix for Mtt: Kinem/TopFit_NTag*/Perm0TopLepHadRecoVsPerm0TopLepHadTruth/Perm0SystemRecoVsPerm0SystemTruthMass2VsMass1
  • Truth Mtt spectrum after same cuts as reco: Kinem/TopFit_NTag*/PartonSystem/PartonSystemMassSpectrum
Or with the top mass cut:
  • Reco Mtt spectrum: Kinem_topMassCut/TopFit_NTag*/Perm0System/Perm0SystemMassSpectrum
  • Migration matrix for Mtt: Kinem_topMassCut/TopFit_NTag*/Perm0TopLepHadRecoVsPerm0TopLepHadTruth/Perm0SystemRecoVsPerm0SystemTruthMass2VsMass1
  • Truth Mtt spectrum after same cuts as reco: Kinem_topMassCut/TopFit_NTag*/PartonSystem/PartonSystemMassSpectrum

Control Plots: Plotting stacked data/MC

cd macros/TopPlot/ and change TString MainDir in RunTopPlot.C to point to your merged directory. Then run RunTopPlots.py and see what parameters, plots etc. you can control in there. The output is automatically moved to /raid7_atlas2/$USER/root/Eps/. You can also run some chi2 and KS summary plotting by Kelsey with ./RunGraphStats.sh

Scripts to create LaTeX beamer slides for selected spectra:

scripts/GenNjetsSpectrumCmpSlides.py
scripts/GenNjetsSpectrumCmpSlides_ttbar.py  
scripts/GenSystSlidesDataMCOverView.py

Control Plots: chi2 comparison

cd macros/TopPlot/ ; ./RunGraphStats.sh
cd config/ ; ./ParseAndAddTopOnlyCfg.py ; cd ../

Quickly compiling just our package when developing:

cd $TOP_INST_PATH/TopNtupleAnalysis/cmt/
make -f Makefile.RootCore

Old version branch 00-01-00:

  ./LsData.sh
  ./MakeFilelists.sh
  ./ParseFilelists.sh
  ./SplitEPS.sh
cd $TOP_INST_PATH/TopNtupleAnalysis
./scripts/SendGoliJobs.py
Then use scripts/Qsub.py to submit.

How to add packages and compile in original branch 01-00-00, running over TopD3PDs:

less doc/README.txt or some more doc in ps/pdf:
cd doc/
./doall.sh
gv documentation.ps

Useful Tips

  • Some useful tips on how to quickly visualize some output histos:
    [] .L macros/SpecialDrawHistoFromFile.C+
    [] SpecialDrawHistoFromFile("yourRootFile.root")

Problems to solve:

  • problem with sending PBS jobs on the Prague farm: if you have installed the package twice in two different directories and you have in ~/.bash_profile
export TOP_INST_PATH=/path/of/installation
then you always have to change /path/of/installation when you want to use the other installation. This is because both installations use the same variable name, TOP_INST_PATH. It is not enough to type export TOP_INST_PATH=/path/of/newinstallation, because after a job is submitted, ~/.bash_profile is sourced again on a different computer.

SVN commands

  • update your svn: svn update
  • Check the status of your local copy: svn status | grep -v \?
  • add a newly created file to svn (you still need to check it in later as well): svn add [file]
  • check in your new version of files:
# ALWAYS do one svn update immediately before you check anything in! See the changes, resolve conflicts, make sure the package compiles, etc.!
svn update
svn ci -m "comment" [files]

Prague HEP Farm Basics

  • Prague HEP FzU Farm account request.
  • How to login: ssh -X $USER@ui5.farm.particle.cz
  • NOTE: the ui5 machine is intended for all users to log in and send jobs from. It is not intended for memory/CPU-intensive tasks. Send a PBS job for those (see some inspiration on qsub in scripts/Qsub.sh), or, if you need an interactive shell or just need something quickly, log in from ui5 to iberis01 (ssh iberis01), where you will see the same home area, raid disks etc., while profiting from a machine with some power that is not in the PBS queue (so no batch jobs running on it). There is no X server running, so saving batch ROOT plots as png is not possible; eps works. Enjoy!
  • SVN howto
  • How to access without ssh password - ssh keygen.
  • Use screen.

-- PeterBerta - 07-Feb-2012

Topic attachments
| Attachment | Size | Date | Who | Comment |
| README | 1.7 K | 2012-02-23 | PeterBerta | Ian's comments for mc11c |
| packages.txt | 2.7 K | 2012-02-28 | PeterBerta | the newest version numbers for the additional packages. Copy to TopRootCoreRelease/share/ |