Dilepton Differential Cross-Section Code

Introduction

All of the code used in the dilepton 13 TeV differential cross-section measurements (at least, all the really cool ones) can be found here:

Phase 2 (current): svn+ssh://<username>@svn.cern.ch/reps/atlasphys-top/Physics/Top/Summer17/INT_XS_dilepdiff/
Phase 1 (previous): svn+ssh://svn.cern.ch/reps/atlasphys-top/Physics/Top/Winter1516/INT_XS_dilepdiff_13TeV/

Below are some descriptions of what the code does. The section headings mimic the structure of the code directory (i.e. all the event saver related information is detailed in Top Dilepton Analysis and everything related to unfolding is detailed in, you guessed it, the Unfolding section).

Top Dilepton Analysis

The TopDileptonAnalysis package is basically just a normal AnalysisTop event saver with a few extra features. The package uses the standard overlap removal in AnalysisTop, as well as the standard object definitions (we don't overwrite the overlap removal code or the objects, just the event saver). The code structure looks something like this:

CodeStructure.png

What information is provided?

Four-vector information is written for leptons, jets, b-jets, MET, top, tbar, ttbar, etc. In addition, the b-jet multiplicities are stored for all defined working points, as well as some dilepton-reconstruction-specific information (such as which jets were used to reconstruct the ttbar system). Some of these branches duplicate information that is always written out by AnalysisTop (e.g. AnalysisTop writes out el_ and mu_ branches, whereas TopDileptonAnalysis adds the d_lep_X branches). This duplication is a little annoying, but as it isn't currently limiting our NTuple sizes, the information is kept for sanity-check purposes. The following branches are probably of most interest to users:

tree_name = nominal / particleLevel / truth:
  * d_top_     pt, eta, phi, m, e, y: four-vector information for the highest-weight top quark from NuW
  * d_tbar_    pt, eta, phi, m, e, y: as above but for the tbar
  * d_ttbar_   pt, eta, phi, m, e, y: as above but for the ttbar
  * d_nu_      pt, eta, phi, m, e, y: as above but for the neutrino
  * d_nubar_   pt, eta, phi, m, e, y: as above but for the anti-neutrino
  * d_Wp_      pt, eta, phi, you get the idea: Positive W boson information (p stands for plus)
  * d_Wm_      pt, eta, phi, you get the idea: Negative W boson information (m stands for minus)

  * d_top(tbar)_match: true/false flag indicating whether the reco-level top (anti-top) matched the particle-level top (anti-top) within dR < 0.4 (pT, m, and e differences are not considered)
  * d_weight_max: weight returned by NuW. As a general rule, you want to cut on this to remove poorly reconstructed, low-weight solutions
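The d_top(tbar)_match flag boils down to a pure dR comparison. A minimal sketch of that kind of geometric matching (the helper functions below are illustrative, not the actual EventSaver code):

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation dR = sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi]."""
    d_eta = eta1 - eta2
    d_phi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)

def is_matched(reco, particle, max_dr=0.4):
    """Match a reco-level top to the particle-level top purely by dR
    (no pT, m, or e check, mirroring the flag described above)."""
    return delta_r(reco["eta"], reco["phi"], particle["eta"], particle["phi"]) < max_dr
```

Note the phi wrapping: without it, two objects either side of the phi = +/-pi boundary would look far apart when they are actually adjacent.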

The current version of this event saver also recalculates the JVT scale factors. You shouldn't use the AnalysisTop version of this scale factor until it has been fixed (still not done as of 2.4.25).

Neutrino Weighting

Neutrino weighting is now available as a standalone C++ class (a python version exists as well, but it is much slower and therefore not recommended). You can find the necessary code in the NeutrinoWeighter.cxx file (if you want to change the top-mass / W-mass smearing, you'll need to edit the code by hand). Examples of how to instantiate the class and run the reconstruction can be found in the saveEvent function in TopDileptonEventSaver.cxx.
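For orientation, the core idea of neutrino weighting: for each hypothesis of the two neutrino pseudorapidities, the neutrino momenta are solved from the W and top mass constraints, and each hypothesis is weighted by how well the summed neutrino pT agrees with the observed MET. The toy below sketches only the weighting step; the function names and the MET resolution are assumptions, not taken from NeutrinoWeighter.cxx:

```python
import math

def met_weight(nu_px, nu_py, nubar_px, nubar_py, met_x, met_y, sigma=20.0):
    """Weight a neutrino-pair hypothesis by the Gaussian agreement between
    the summed neutrino transverse momentum and the observed MET.
    sigma is an assumed MET resolution in GeV."""
    dx = met_x - (nu_px + nubar_px)
    dy = met_y - (nu_py + nubar_py)
    return math.exp(-dx * dx / (2.0 * sigma ** 2)) * math.exp(-dy * dy / (2.0 * sigma ** 2))

def best_weight(hypotheses, met_x, met_y):
    """Return the highest weight over an iterable of
    (nu_px, nu_py, nubar_px, nubar_py) hypotheses (cf. d_weight_max)."""
    return max(met_weight(*h, met_x, met_y) for h in hypotheses)
```

A perfect-agreement hypothesis gets weight 1; hypotheses far from the observed MET fall off exponentially, which is why cutting on d_weight_max removes poorly reconstructed solutions.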

Unfolding

Most of the unfolding scripts assume that you have already produced the necessary NTuples using the TopDileptonAnalysis EventSaver, or have other suitably similar NTuples. You can find a list of files that the unfolding scripts assume exist in files.py (these are the files needed for an emu analysis at 13 TeV). Other scripts (such as those creating reco-level plots or migration matrices) only require the output of the generate_unfolding_input script.

In general, all of the scripts rely on a python Observable class (see observable.py), which provides the binning, tree projection, LaTeX/ROOT histogram labels, and metadata for single- and double-differential observables. Most scripts should run identically on single-differential and double-differential distributions without the user needing to do anything beyond configuring the observable class. At the moment, all default observables are stored in unfold_config.py, but many more can be added as long as the branches exist in the TTrees.
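To give a feel for what such a class carries, here is a minimal sketch; the field names are illustrative assumptions, so see observable.py for the real interface:

```python
class Observable:
    """Container for one (possibly double-differential) observable.
    Field names are illustrative; see observable.py for the real thing."""

    def __init__(self, name, branch, bins, label, secondary=None):
        self.name = name            # e.g. "top_pt" or "dphi_ll_vs_ttbar_m"
        self.branch = branch        # TTree branch to project, e.g. "d_top_pt"
        self.bins = bins            # bin edges for the primary variable
        self.label = label          # LaTeX/ROOT axis label
        self.secondary = secondary  # another Observable for double-differential use

    def is_double_differential(self):
        return self.secondary is not None

    def n_bins(self):
        """Total number of measurement bins (primary x secondary)."""
        n = len(self.bins) - 1
        if self.secondary:
            n *= self.secondary.n_bins()
        return n
```

Because a double-differential observable is just an observable with a secondary Observable attached, downstream scripts can loop over n_bins() without caring which case they are in.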

Generating inputs for unfolding

In order to generate the unfolding inputs, we assume you have run AnalysisTop and have some outputs for data, signal, and backgrounds. The idea is that you get all the histograms you need by projecting them from the Trees. This is a very flexible way of generating inputs without relying on the grid/batch systems (though you can run the scripts on a batch system if you like). You can specify what types of output you want with arguments to the script. Running the unfolding input is as simple as:

> python generate_unfolding_input.py do_parton do_particle do_systematics obs: top_pt el_pt

The flags do_parton and do_particle determine whether the parton-level and particle-level plots, as well as their migration matrices, are included in the output. If you leave these out, the code will simply produce reco-level output. The do_systematics flag turns the systematics on or off (currently it's not possible to define a subset, but a simple tweak to the argument parser could fix this). Without this flag, only the nominal is used. Finally, any arguments following obs: are treated as observables to be added; these must go at the end of the argument list (because I wrote a lazy arg parser). Without this flag, the default 5 top/ttbar differential observables are used. If you want to add your own default observables, just add them to the list of default observables in unfold_config.py.
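The argument convention above (bare flags, then everything after obs: taken as observable names) can be parsed in a few lines. This is an illustrative reimplementation, not the actual parser in the scripts:

```python
def parse_args(argv, known_flags=("do_parton", "do_particle", "do_systematics")):
    """Split command-line arguments into boolean flags and observable names.
    Everything after the literal token 'obs:' is treated as an observable,
    which is why obs: must come last."""
    flags = {f: False for f in known_flags}
    observables = []
    in_obs = False
    for arg in argv:
        if arg == "obs:":
            in_obs = True
        elif in_obs:
            observables.append(arg)
        elif arg in flags:
            flags[arg] = True
        else:
            raise ValueError("unknown argument: %s" % arg)
    return flags, observables
```

In a real script you would call parse_args(sys.argv[1:]); the simple state machine explains why flags placed after obs: would be silently swallowed as observable names.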

Creating reco-level plots

Creating reco-level plots is probably the quickest thing you can do once you have your NTuples. Here is an example of how one might create some default control plots with just MC stat. uncertainties:

> python generate_unfolding_input.py obs: el_pt mu_pt jet_n bjet_n dphi_ll_vs_ttbar_m
> python do_reco_plots.py obs: el_pt mu_pt jet_n bjet_n dphi_ll_vs_ttbar_m

If you now look in the Unfolding/Plots directory, you should have .pdf files for each of these observables. In the case of double differential observables, you will have one plot per secondary differential bin (in the example above this translates to one dphi_ll figure per bin in ttbar_m). Adding systematic uncertainties to these plots increases the execution time from seconds to minutes but shouldn't cause too many problems (NOTE: Though coded, this option is not yet tested).
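The "one plot per secondary bin" behaviour amounts to slicing the flattened double-differential histogram into per-bin pieces. A sketch, under the assumption that primary bins run fastest in the flattened layout:

```python
def slice_secondary(flat_contents, n_primary):
    """Split a flattened double-differential histogram (primary bins running
    fastest) into one list of bin contents per secondary bin, e.g. one
    dphi_ll spectrum per ttbar_m bin."""
    assert len(flat_contents) % n_primary == 0
    return [flat_contents[i:i + n_primary]
            for i in range(0, len(flat_contents), n_primary)]
```

Each returned slice is then drawn as its own figure.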

Drawing migration matrices

The migration matrices will be drawn automatically when you run the unfolding. However, if you would like to run the code stand-alone for some reason, you must have first run the generate_unfolding_input.py script. The migration matrices can then be constructed using:

> python do_migration_matrices.py obs: top_pt dphi_ll_vs_ttbar_m

For single-differential observables, you will get a migration matrix that has been normalised by row (i.e. the entries represent the fraction of events and sum to 1 along each row). For double-differential observables, you will get one large migration matrix, sub-divided according to the number of secondary differential bins. The code currently still normalises the rows to 1. If the colour range is not to your liking, simply disable the set_palette command at the start of the do_migration_matrices script, or edit the atlas_labels.py code where the palette is defined.
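The row normalisation described above can be sketched as:

```python
def normalise_rows(matrix):
    """Normalise each row of a migration matrix so its entries sum to 1
    (each entry then reads as the fraction of events in that truth bin
    migrating to a given reco bin). Empty rows are left as zeros."""
    out = []
    for row in matrix:
        total = float(sum(row))
        out.append([x / total for x in row] if total > 0 else list(row))
    return out
```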

Running the unfolding

The unfolding will only run if you have already generated the unfolding inputs (in the previous step) and set the correct path in unfold_config for where they are stored. When testing, it is advisable to turn off the PDF systematics (as these add hundreds of new systematic variations and will increase your run time to minutes, rather than seconds). Run the unfolding like so:

> python unfold.py
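For orientation, here is what an unfolding step does, reduced to the simplest possible form. This is a bin-by-bin correction for illustration only; it is not necessarily the method implemented in unfold.py:

```python
def bin_by_bin_unfold(reco_data, reco_mc, truth_mc):
    """Correct each reco bin by the MC-derived truth/reco ratio.
    A deliberately simple stand-in for a full unfolding: it ignores
    bin-to-bin migrations, which the real code handles via the
    migration matrices."""
    return [d * (t / r) if r > 0 else 0.0
            for d, r, t in zip(reco_data, reco_mc, truth_mc)]
```

The limitation of this toy (no treatment of migrations) is exactly why the migration matrices of the previous section matter.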

This code will unfold all of your reco spectra (be they data or MC) and store them in a ROOT file in the Plots directory. The final script you need to run will take these unfolded spectra and compare them to the relevant particle-level spectra, thus calculating your results and systematics. It is run similarly to the unfolding:

> python unfold_results.py

Generating pseudo data

Many scripts require pseudo experiments, and to avoid making these on the fly each time a script is run, we generate them separately and simply reuse them as needed. Pseudo experiments are generated with statistically independent testing and training samples (called even and odd in the code). The script can take quite a long time, so it is recommended to run only 100 experiments and one observable at a time. The experiments are generated using:

> python generate_pseudo_experiments.py do_particle do_parton obs: dphi_ll_vs_ttbar_m

If you don't specify do_parton or do_particle the code will return a warning.

Pull tests

There are currently two ways to run the pull tests: using statistically independent testing and training samples for each pseudo experiment, or taking a single pseudo experiment and Poisson-fluctuating the bin contents. The latter is referred to as the "simple" option and is the one we recommend for the time being. Both scripts run in the same way:

> python do_pull_tests(_simple).py do_particle do_parton obs: dphi_ll_vs_ttbar_m

If you don't specify do_parton or do_particle the code will return a warning.
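To make the "simple" option concrete: each toy Poisson-fluctuates the bin contents, and the pull per bin is (observed - truth) / error; an unbiased procedure gives pulls with mean ~0 and width ~1. The toy below sketches just this fluctuate-and-pull logic, with the unfolding step deliberately omitted (sampler and error model are assumptions, not the script's code):

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplicative algorithm for Poisson sampling
    (fine for the modest per-bin counts of a toy study)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def pull_distribution(truth_bins, n_toys=1000, seed=42):
    """For each toy, Poisson-fluctuate the truth bins and compute the
    per-bin pull (observed - truth) / sqrt(truth); returns all pulls pooled."""
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        for t in truth_bins:
            obs = poisson(t, rng)
            pulls.append((obs - t) / t ** 0.5)
    return pulls
```

In the real test, the fluctuated spectrum is unfolded before the comparison, so a biased unfolding shows up as a shifted pull mean or a width away from 1.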

A note on systematics

If you look in unfold_config.py you will see many lists of systematic names. Some are used for the unfolding input generation (those starting SYSTEMATICS_) and some are used for the unfolding (LIST_OF_SYSTEMATICS). The reason for this difference is that some systematics don't require special inputs from generate_unfolding_input. For example, background cross-section shifts simply use the nominal background estimate, scaled up or down by a certain amount, so there is no need to include them when generating the unfolding inputs.
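For the background cross-section shifts mentioned above, the variation really is just a flat scale factor applied to the nominal background. Schematically (the function name and the 10% shift are assumed values for illustration):

```python
def scaled_background(nominal_bins, shift=0.1, direction=+1):
    """Build an up/down cross-section variation by scaling the nominal
    background histogram by a flat fraction. No dedicated input histogram
    is needed, which is why such systematics appear only in
    LIST_OF_SYSTEMATICS and not in the SYSTEMATICS_ input lists."""
    return [b * (1.0 + direction * shift) for b in nominal_bins]
```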

Other useful scripts

Scripts used to perform various studies haven't been refactored for 2016 yet (you will notice they look a little messier and don't rely on the same helper python scripts). We are working to update them as we test the 2016 data.

Topic revision: r10 - 2017-01-25 - JayHowarth
 