Jet substructure performance at high luminosity
BOOST 2012 working group

Conveners: Gregory Soyez and Ariel Schwartzman



The purpose of this working group is to study the performance of jet substructure algorithms at very high luminosity, and to investigate the use of new techniques for pile-up subtraction and suppression presented at BOOST 2012.


Below is a list of proposed projects. The goal is to establish the performance of jet substructure techniques for four different luminosity scenarios: mu = 30 (2012 LHC conditions), 60, 100, 200, and for three signal samples: dijets, boosted tops, and boosted W(lnu)H(bb).

Jet substructure performance at high luminosity

Coordinators: Gregory Soyez and Ariel Schwartzman

  • Jet mass response and resolution
  • Jet substructure observables
  • S/sqrt(B)

Pile-up subtraction plus grooming

Coordinator: Kalanand Mishra

  • Study the application of jet-areas pile-up subtraction during grooming
  • Compare jet substructure performance with the use of pile-up subtraction after grooming (CMS approach)
  • Consider all luminosity scenarios separately
  • Figures of merit include: mass vs. number of vertices, mass resolution, S/sqrt(B), etc.


Pile-up subtraction for jet shapes

Coordinator: Zhenyu Han

  • Study the performance of the proposed jet-areas corrections for jet shapes in all four high luminosity scenarios
  • Focus on n-subjettiness and jet width in all four luminosity scenarios


Pile-up suppression using jet substructure

Coordinators: John Backus Mayes, Lene Bryngemark

Local pile-up fluctuations within a single event can lead to fake pile-up jets that need to be tagged and rejected. Fake pile-up jets are made of a roughly uniform distribution of particles from multiple interactions, leading to jets with anomalous structure and no high-pT core.

  • Understand the relative contribution of pile-up hard jets vs. combinatorial background (fake) jets from overlapping pile-up particles
  • Study jet substructure techniques to identify 0-core (pile-up) jets using minimum-bias data only (no signal Monte Carlo). Potentially interesting methods include:
    • ACF (Angular Correlation Function)
    • Jet width using R2 weighting
    • groomed pT fraction
    • QJets
    • n-subjettiness beta=1 vs. beta=2


Samples and analysis software

Coordinators: Peter Loch and Miguel Villaplana

A common set of 10^8 minimum bias events, generated with Pythia8 Tune 4C, is available at the public server

Signal and background samples from previous BOOST conferences may be found here:

Organization of minBias samples

These are single event minimum bias samples (mu = 1). The events are organized as follows:

Number of events per run ...   2,000,000
Number of runs .............          50
Total number of events ..... 100,000,000

Total data size on disk is about 365 GBytes in 5000 files. A run is a single production job with a new random number generator seed. Each run produces 100 files with 20,000 single vertex events each. On the server file system, groups of 10 runs are stored in a directory under the path given above (directory names 00_09...40_49). The average file size is 75MBytes, meaning about 3.8 kBytes/event.

Data format

The events are stored in tuples using ROOT. The ROOT tree name is MB_Py8. The tuple branches are (_FLOAT_T_ = float to optimize disk space):

   int Nentry;                  // total # entries
   int Npartons;                // # partons
   int Nparticles;              // # particles
   int ID[Nentry];              // PDG Id
   int Stat[Nentry];            // internal status word (-1,-2 for partons, 2 for particles - can be dropped)
   _FLOAT_T_ Charge[Nentry];    // charge
   _FLOAT_T_ Px[Nentry];        // momentum Px
   _FLOAT_T_ Py[Nentry];        // momentum Py
   _FLOAT_T_ Pz[Nentry];        // momentum Pz
   _FLOAT_T_ P0[Nentry];        // momentum P
   _FLOAT_T_ Pm[Nentry];        // mass m
   _FLOAT_T_ Pt[Nentry];        // transverse momentum
   _FLOAT_T_ Rap[Nentry];       // rapidity y
   _FLOAT_T_ Phi[Nentry];       // azimuth phi
   _FLOAT_T_ Eta[Nentry];       // pseudo-rapidity eta
For partons we save only those with Pythia8 status=-21 and -23, thus preserving minimal information on the scattering process. The first Npartons entries in the ROOT tuple are these partons. The following Nparticles entries are the "stable" particles as defined by Pythia8 (based on the default cτ cut). The total number of entries is then Nentry = Npartons + Nparticles. Further reduction of the disk space is possible by omitting the derived kinematic quantities and reducing the stored data to something like (Px,Py,Pz,m).

ROOT codes for unpacking and analysis

The ROOT tuples are stored in the MB_Py8 tree, with the following leaves matching the data structure described above (code snippet):

fChain->SetBranchAddress("Nentry", &Nentry, &b_Nentry);
fChain->SetBranchAddress("Npartons", &Npartons, &b_Npartons);
fChain->SetBranchAddress("Nparticles", &Nparticles, &b_Nparticles);
fChain->SetBranchAddress("ID", ID, &b_ID);
fChain->SetBranchAddress("Stat", Stat, &b_Stat);
fChain->SetBranchAddress("Charge", Charge, &b_Charge);
fChain->SetBranchAddress("Px", Px, &b_Px);
fChain->SetBranchAddress("Py", Py, &b_Py);
fChain->SetBranchAddress("Pz", Pz, &b_Pz);
fChain->SetBranchAddress("P0", P0, &b_P0);
fChain->SetBranchAddress("Pm", Pm, &b_Pm);
fChain->SetBranchAddress("Pt", Pt, &b_Pt);
fChain->SetBranchAddress("Rap", Rap, &b_Rap);
fChain->SetBranchAddress("Phi", Phi, &b_Phi);
fChain->SetBranchAddress("Eta", Eta, &b_Eta);
The auto-generated code templates useful to analyze the tuples (using MB_Py8->MakeClass()) can be found in MB_Py8.h and MB_Py8.C.

In addition to the 8 TeV samples described above, samples at 7 TeV and 13 TeV are available in the same format, but with lower statistics (an estimated few times 10^7 events for each center-of-mass energy). Please contact Peter L. for information on how to access these files.

Additional production: pile-up on a grid

In addition to the full spectrum of stable particles available in the datasets described above, single-interaction events have been combined into complete pile-up events with μ = 30, 60, 100 and 200 minimum-bias events overlaid. The number of individual collisions in each of these pile-up events is Poisson distributed. The overlaid particles are projected onto grids of Δη × Δφ = 0.1 × 0.1, with -5.0 < η < 5.0 and -π < φ < π. There are two grids for each pile-up event, one filled from all stable particles and one filled only from charged particles. Presently there is no further selection of the particles entered into the grid.

The data structure of the ROOT tree MBGridEvent is as follows. The common event information is stored in

Int_t    NEntries;              // entries in arrays below (6400)
Int_t    NPartTotal;            // total number of all particles in event
Int_t    NChargedPartTotal;     // total number of charged particles in event
Int_t    NInteractions;         // number of pp interactions in this event
The kinematic information in each grid bin is the momentum p = sqrt((Σpx)^2 + (Σpy)^2 + (Σpz)^2), the transverse momentum pT = sqrt((Σpx)^2 + (Σpy)^2), the mass m = sqrt((ΣE)^2 - p^2), the "kinematic" η = log((ΣE + Σpz)/(ΣE - Σpz)), and the "kinematic" φ = cos^-1((Σpx)/pT). Here Σ indicates the sum over all particles projected into a given grid bin.

Note that the central coordinates (η_i, φ_i) of any grid bin i are not stored in the data structure, but can be calculated from the bin index (array index) i = 0, ..., 6399 itself, as implemented in the example MBGridEvent.h and MBGridEvent.C.

The data structure in the MBGridEvent tree is (for all and charged particles):

Int_t      NParticles[6400];       // number of particles in grid bin
Float_t  P[6400];                  // momentum
Float_t  Pt[6400];                 // transverse momentum
Float_t  M[6400];                  // mass
Float_t  EtaKine[6400];            // eta from particle kinematics
Float_t  PhiKine[6400];            // phi from particle kinematics
Int_t    NChargedParticles[6400];  // number of charged particles in bin
Float_t  ChargedP[6400];           // momentum of charged particles
Float_t  ChargedPt[6400];          // transverse momentum of charged particles
Float_t  ChargedM[6400];           // mass of charged particles
Float_t  ChargedEtaKine[6400];     // eta from charged particle kinematics
Float_t  ChargedPhiKine[6400];     // phi from charged particle kinematics

The overall grid description is stored in a separate tree GridGeometry, which is used to reconstruct the grid center η and φ. Details on this data structure, which is stored only once per file, can be found in MBGridEvent.h, together with some code performing the calculations. Also check MBGridEvent.C for an example on how to use this implementation. The branches are:

fChain->SetBranchAddress("NEntries", &NEntries, &b_NEntries);
fChain->SetBranchAddress("NPartTotal", &NPartTotal, &b_NPartTotal);
fChain->SetBranchAddress("NChargedPartTotal", &NChargedPartTotal, &b_NChargedPartTotal);
fChain->SetBranchAddress("NInteractions", &NInteractions, &b_NInteractions);
fChain->SetBranchAddress("NParticles", NParticles, &b_NParticles);
fChain->SetBranchAddress("P", P, &b_P);
fChain->SetBranchAddress("Pt", Pt, &b_Pt);
fChain->SetBranchAddress("M", M, &b_M);
fChain->SetBranchAddress("EtaKine", EtaKine, &b_EtaKine);
fChain->SetBranchAddress("PhiKine", PhiKine, &b_PhiKine);
fChain->SetBranchAddress("NChargedParticles", NChargedParticles, &b_NChargedParticles);
fChain->SetBranchAddress("ChargedP", ChargedP, &b_ChargedP);
fChain->SetBranchAddress("ChargedPt", ChargedPt, &b_ChargedPt);
fChain->SetBranchAddress("ChargedM", ChargedM, &b_ChargedM);
fChain->SetBranchAddress("ChargedEtaKine", ChargedEtaKine, &b_ChargedEtaKine);
fChain->SetBranchAddress("ChargedPhiKine", ChargedPhiKine, &b_ChargedPhiKine);

Grid events are presently available for 7 and 13 TeV center-of-mass energies. Please contact Peter L. for information on how to access the files.

Analysis software

SpartyJet setup

SpartyJet can be retrieved from the SpartyJet HEPForge site.

Full instructions for your initial SpartyJet setup can be accessed here:

tar xf spartyjet-4.0.2.tar
cd spartyjet-4.0.2/
sed -i -e 's/if (m_tree) delete m_tree;/\/\/if (m_tree) delete m_tree;/g' IO/
make fastjet
cd examples_py/

Note: If you are using a Mac, you need to set: export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:$SPARTYJETDIR/lib
On some versions of Linux (e.g., slc5_amd64_gcc462), you need to set: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SPARTYJETDIR/lib

Analysis setup

A testing macro is provided in the Attachments on this TWiki page. To use this macro:

  1. Download both a pileup sample file and a signal sample file
cd data/
gunzip herwig65-lhc7-ttbar2hadrons-pt0500-0600.UW.gz
  2. Download the analysis macro from this TWiki page
  3. Set the primary options inside the macro to define the output files and paths
    • The current defaults should be appropriate for most people but please check
  4. Run the analysis macro


Please upload your contributions and results to the following live indico page

-- ArielSchwartzman - 03-Aug-2012 -- PeterL - 14-Aug-2012 -- PeterL - 24-Jan-2013

Topic revision: r7 - 2013-01-24 - PeterL