Physics Analysis Ntuples

This page describes the Physics Analysis Ntuples, which are produced from the reconstructed xAODs and are intended to make it easier to get started with a physics analysis.

Overview

Calypso code that produces the ntuples can be found here.

Ntuples are currently (Feb 5, 2023) being stored at:

  • TI12 = /eos/experiment/faser/phys/2022/r0013/
  • MC = /eos/experiment/faser/sim/mc22/[MC-type]/[run-number]/phy/r0013/
    • 'MC-type' can be foresee, genie, or particle_gun
    • See the FaserMC TWiki for more information on the MC samples

Pros of using the ntuples:

  • The ntuples can be analyzed without a Calypso build; all you need is ROOT.
  • They are much smaller than the xAOD files because only interesting events are stored, and are thus quicker to analyze.
  • You can use RDataFrame to analyze the ntuple events in parallel, which greatly reduces the analysis time. This cannot be done on the xAOD files, as the tracker class we use is not supported in RDataFrame.
  • The ntuples are already blinded.

Cons of using the ntuples:

  • Not all events are stored in the ntuples.
  • Not all xAOD variables are stored in the ntuples.
  • Calypso tools such as track extrapolation cannot be used.

Ntuple Event Filtering

There are three event filters that prevent TI12 events from being stored in the ntuples:
  1. Stable Beams Filter
    • Events that are not labelled as stable beams are not stored in the ntuple.
  2. Blinding Filter
    • Events that miss all vetoes and have a calo signal greater than that of a 10 GeV electron are not stored in the ntuple.
    • Updated in r0013: Events that miss all vetoes, have a calo signal greater than that of a 25 GeV electron, and are not within +/- 1 BCID of a collision are not stored in the ntuple.
  3. Scintillator Coincidence Filter
    • Events that do not have coincidence between scintillator triggers are not stored in the ntuple.
      • Needed to pass filter: (Trig0 & Trig1) or (Trig0 & Trig2) or (Trig1 & Trig2) or Trig3
        • Trig0: "CalorimeterBottom|CalorimeterTop"
        • Trig1: "FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer"
        • Trig2: "TimingLayerBottom|TimingLayerTop"
        • Trig3: "(FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer"
    • Updated in r0013: Trig0 (alone) is added, so to pass filter: Trig0 or (Trig1 & Trig2) or Trig3
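
As an illustration, the two versions of the coincidence logic above can be written as bit tests on a trigger word. This is only a sketch: it assumes that Trig0 to Trig3 correspond to the bit values 1, 2, 4 and 8 listed for TBP/TAP under "Trigger Words", and the function names are hypothetical. The actual Calypso implementation may differ.

```python
# Bit values assumed to match the TBP/TAP trigger-word list on this page.
TRIG0 = 1  # CalorimeterBottom|CalorimeterTop
TRIG1 = 2  # FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer
TRIG2 = 4  # TimingLayerBottom|TimingLayerTop
TRIG3 = 8  # (FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer


def _trigger_flags(word):
    """Return (t0, t1, t2, t3) booleans for the four scintillator triggers."""
    return tuple(bool(word & bit) for bit in (TRIG0, TRIG1, TRIG2, TRIG3))


def passes_coincidence_before_r0013(word):
    """(Trig0 & Trig1) or (Trig0 & Trig2) or (Trig1 & Trig2) or Trig3."""
    t0, t1, t2, t3 = _trigger_flags(word)
    return (t0 and t1) or (t0 and t2) or (t1 and t2) or t3


def passes_coincidence_r0013(word):
    """Trig0 or (Trig1 & Trig2) or Trig3."""
    t0, t1, t2, t3 = _trigger_flags(word)
    return t0 or (t1 and t2) or t3
```

For example, a word with only the calorimeter bit set (1) fails the older condition but passes the r0013 one.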

Note: MC ntuples have no event filtering; all events are stored.

Ntuple Variables

Event Info:

Variable Name Type
run Int_t
eventID Int_t
eventTime Int_t
BCID Int_t

LHC Info:

Variable Name Type
fillNumber Int_t
betaStar Float_t
crossingAngle Float_t
distanceToCollidingBCID Int_t
distanceToUnpairedB1 Int_t
distanceToUnpairedB2 Int_t
distanceToInboundB1 Int_t
distanceToTrainStart Int_t
distanceToPreviousColliding Int_t

Notes:

  • 'distanceToPreviousColliding' is always positive, whereas 'distanceToCollidingBCID' can be positive or negative depending on whether the closest colliding BCID is ahead of (+) or behind (-) the event's BCID
  • 'distanceToInboundB1' == 0 is used to pick out B1 background events seen in FASER
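
The two notes above translate into simple per-event selections. The sketch below is illustrative only: the helper names are hypothetical, and treating |distanceToCollidingBCID| <= 1 as "near a collision" is an assumption borrowed from the +/- 1 BCID window quoted for the blinding filter.

```python
def is_b1_background_candidate(distance_to_inbound_b1):
    """True when distanceToInboundB1 == 0, the condition quoted in the notes."""
    return distance_to_inbound_b1 == 0


def is_near_collision(distance_to_colliding_bcid, window=1):
    """True when the closest colliding BCID is within `window` BCIDs.

    The sign of distanceToCollidingBCID only encodes ahead (+) or behind (-),
    so the absolute value is compared. The default window is an assumption.
    """
    return abs(distance_to_colliding_bcid) <= window
```

In an RDataFrame analysis the same cuts could be expressed as the filter strings "distanceToInboundB1 == 0" and "abs(distanceToCollidingBCID) <= 1".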

Trigger Words:

Variable Name Type
TBP Int_t
TAP Int_t
inputBits Int_t
inputBitsNext Int_t

Notes:

  • TAP (Trigger After Prescale) and TBP (Trigger Before Prescale) are trigger words with the following meanings:
    • 1 = "CalorimeterBottom|CalorimeterTop"
    • 2 = "FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer"
    • 4 = "TimingLayerBottom|TimingLayerTop"
    • 8 = "(FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer"
    • 16 = Random
    • 32 = LED
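
Since TBP and TAP are bit words, several trigger items can be set at once. A minimal sketch of decoding such a word into the item names listed above (the helper name `decode_trigger_word` is hypothetical):

```python
# Bit value -> trigger item, copied from the list above.
TRIGGER_BITS = {
    1: "CalorimeterBottom|CalorimeterTop",
    2: "FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer",
    4: "TimingLayerBottom|TimingLayerTop",
    8: "(FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer",
    16: "Random",
    32: "LED",
}


def decode_trigger_word(word):
    """Return the names of all trigger items whose bit is set in `word`."""
    return [name for bit, name in TRIGGER_BITS.items() if word & bit]
```

For instance, `decode_trigger_word(event.TAP)` inside a PyROOT event loop would list the items that fired after prescale.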

Scintillator and Calorimeter waveform variables:

Variable Name Type
VetoNu0_time Float_t
VetoNu0_peak Float_t
VetoNu0_width Float_t
VetoNu0_charge Float_t
VetoNu0_raw_peak Float_t
VetoNu0_raw_charge Float_t
VetoNu0_baseline Float_t
VetoNu0_baseline_rms Float_t
VetoNu0_status Int_t
VetoNu1_time Float_t
VetoNu1_peak Float_t
VetoNu1_width Float_t
VetoNu1_charge Float_t
VetoNu1_raw_peak Float_t
VetoNu1_raw_charge Float_t
VetoNu1_baseline Float_t
VetoNu1_baseline_rms Float_t
VetoNu1_status Int_t
VetoSt10_time Float_t
VetoSt10_peak Float_t
VetoSt10_width Float_t
VetoSt10_charge Float_t
VetoSt10_raw_peak Float_t
VetoSt10_raw_charge Float_t
VetoSt10_baseline Float_t
VetoSt10_baseline_rms Float_t
VetoSt10_status Int_t
VetoSt11_time Float_t
VetoSt11_peak Float_t
VetoSt11_width Float_t
VetoSt11_charge Float_t
VetoSt11_raw_peak Float_t
VetoSt11_raw_charge Float_t
VetoSt11_baseline Float_t
VetoSt11_baseline_rms Float_t
VetoSt11_status Int_t
VetoSt20_time Float_t
VetoSt20_peak Float_t
VetoSt20_width Float_t
VetoSt20_charge Float_t
VetoSt20_raw_peak Float_t
VetoSt20_raw_charge Float_t
VetoSt20_baseline Float_t
VetoSt20_baseline_rms Float_t
VetoSt20_status Int_t
Timing0_time Float_t
Timing0_peak Float_t
Timing0_width Float_t
Timing0_charge Float_t
Timing0_raw_peak Float_t
Timing0_raw_charge Float_t
Timing0_baseline Float_t
Timing0_baseline_rms Float_t
Timing0_status Int_t
Timing1_time Float_t
Timing1_peak Float_t
Timing1_width Float_t
Timing1_charge Float_t
Timing1_raw_peak Float_t
Timing1_raw_charge Float_t
Timing1_baseline Float_t
Timing1_baseline_rms Float_t
Timing1_status Int_t
Timing2_time Float_t
Timing2_peak Float_t
Timing2_width Float_t
Timing2_charge Float_t
Timing2_raw_peak Float_t
Timing2_raw_charge Float_t
Timing2_baseline Float_t
Timing2_baseline_rms Float_t
Timing2_status Int_t
Timing3_time Float_t
Timing3_peak Float_t
Timing3_width Float_t
Timing3_charge Float_t
Timing3_raw_peak Float_t
Timing3_raw_charge Float_t
Timing3_baseline Float_t
Timing3_baseline_rms Float_t
Timing3_status Int_t
Preshower0_time Float_t
Preshower0_peak Float_t
Preshower0_width Float_t
Preshower0_charge Float_t
Preshower0_raw_peak Float_t
Preshower0_raw_charge Float_t
Preshower0_baseline Float_t
Preshower0_baseline_rms Float_t
Preshower0_status Int_t
Preshower1_time Float_t
Preshower1_peak Float_t
Preshower1_width Float_t
Preshower1_charge Float_t
Preshower1_raw_peak Float_t
Preshower1_raw_charge Float_t
Preshower1_baseline Float_t
Preshower1_baseline_rms Float_t
Preshower1_status Int_t
Calo0_time Float_t
Calo0_peak Float_t
Calo0_width Float_t
Calo0_charge Float_t
Calo0_raw_peak Float_t
Calo0_raw_charge Float_t
Calo0_baseline Float_t
Calo0_baseline_rms Float_t
Calo0_status Int_t
Calo1_time Float_t
Calo1_peak Float_t
Calo1_width Float_t
Calo1_charge Float_t
Calo1_raw_peak Float_t
Calo1_raw_charge Float_t
Calo1_baseline Float_t
Calo1_baseline_rms Float_t
Calo1_status Int_t
Calo2_time Float_t
Calo2_peak Float_t
Calo2_width Float_t
Calo2_charge Float_t
Calo2_raw_peak Float_t
Calo2_raw_charge Float_t
Calo2_baseline Float_t
Calo2_baseline_rms Float_t
Calo2_status Int_t
Calo3_time Float_t
Calo3_peak Float_t
Calo3_width Float_t
Calo3_charge Float_t
Calo3_raw_peak Float_t
Calo3_raw_charge Float_t
Calo3_baseline Float_t
Calo3_baseline_rms Float_t
Calo3_status Int_t

Notes:

  • the 'time' variables are corrected by the clock phase so as to remove the 16 ns digitizer jitter
  • the 'status' variables are bit words that have the following meanings:
    • 0 = good hit
    • 1 = below threshold
    • 2 = secondary hit
    • 4 = amplitude overflow
    • 8 = find baseline failed
    • 16 = Gaussian fit failed
    • 32 = Crystal Ball fit failed
    • 64 = invalid clock
    • 128 = waveform missing
    • 256 = waveform invalid
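
Because the status is a bit word, a single value can carry several of these flags. A minimal sketch of unpacking it (the helper names are hypothetical; the flag meanings are taken from the list above):

```python
# Bit value -> meaning, following the list above (0 means a good hit).
STATUS_FLAGS = {
    1: "below threshold",
    2: "secondary hit",
    4: "amplitude overflow",
    8: "find baseline failed",
    16: "Gaussian fit failed",
    32: "Crystal Ball fit failed",
    64: "invalid clock",
    128: "waveform missing",
    256: "waveform invalid",
}


def decode_status(status):
    """Return the list of flag names set in a waveform status word."""
    if status == 0:
        return ["good hit"]
    return [name for bit, name in STATUS_FLAGS.items() if status & bit]


def is_good_hit(status):
    """True only when no status bit is set."""
    return status == 0
```

A typical use would be to require `is_good_hit(event.Calo0_status)` before using the corresponding charge or peak.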

Calibrated Energy variables for Calorimeter and Preshower channels:

Variable Name Type
Calo0_nMIP Float_t
Calo0_E_dep Float_t
Calo0_E_EM Float_t
Calo1_nMIP Float_t
Calo1_E_dep Float_t
Calo1_E_EM Float_t
Calo2_nMIP Float_t
Calo2_E_dep Float_t
Calo2_E_EM Float_t
Calo3_nMIP Float_t
Calo3_E_dep Float_t
Calo3_E_EM Float_t
Calo_total_nMIP Float_t
Calo_total_E_dep Float_t
Calo_total_E_EM Float_t
Preshower0_nMIP Float_t
Preshower0_E_dep Float_t
Preshower0_E_EM Float_t
Preshower1_nMIP Float_t
Preshower1_E_dep Float_t
Preshower1_E_EM Float_t
Preshower_total_nMIP Float_t
Preshower_total_E_dep Float_t

Notes:

  • Energies are in MeV
  • 'Preshower0_E_EM' and 'Preshower1_E_EM' are dummy variables and filled with 'NaN' as we do not estimate the EM energy from the preshower layers

Cluster Counts in Tracking Stations:

Variable Name Type
nClusters0 Int_t
nClusters1 Int_t
nClusters2 Int_t
nClusters3 Int_t

SpacePoint positions:

Variable Name Type
SpacePoints Int_t
SpacePoint_x vector
SpacePoint_y vector
SpacePoint_z vector

TrackSegment Fit Parameters:

Variable Name Type
TrackSegments Int_t
TrackSegment_Chi2 vector
TrackSegment_nDoF vector
TrackSegment_x vector
TrackSegment_y vector
TrackSegment_z vector
TrackSegment_px vector
TrackSegment_py vector
TrackSegment_pz vector

Reconstructed Track Parameters:

Variable Name Type
longTracks Int_t
Track_PropagationError Int_t
Track_Chi2 vector
Track_nDoF vector
Track_x0 vector
Track_y0 vector
Track_z0 vector
Track_px0 vector
Track_py0 vector
Track_pz0 vector
Track_p0 vector
Track_x1 vector
Track_y1 vector
Track_z1 vector
Track_px1 vector
Track_py1 vector
Track_pz1 vector
Track_p1 vector
Track_charge vector
Track_nLayers vector
Track_InStation0 vector
Track_InStation1 vector
Track_InStation2 vector
Track_InStation3 vector
Track_X_atVetoNu vector
Track_Y_atVetoNu vector
Track_ThetaX_atVetoNu vector
Track_ThetaY_atVetoNu vector
Track_X_atVetoStation1 vector
Track_Y_atVetoStation1 vector
Track_ThetaX_atVetoStation1 vector
Track_ThetaY_atVetoStation1 vector
Track_X_atVetoStation2 vector
Track_Y_atVetoStation2 vector
Track_ThetaX_atVetoStation2 vector
Track_ThetaY_atVetoStation2 vector
Track_X_atTrig vector
Track_Y_atTrig vector
Track_ThetaX_atTrig vector
Track_ThetaY_atTrig vector
Track_X_atPreshower1 vector
Track_Y_atPreshower1 vector
Track_ThetaX_atPreshower1 vector
Track_ThetaY_atPreshower1 vector
Track_X_atPreshower2 vector
Track_Y_atPreshower2 vector
Track_ThetaX_atPreshower2 vector
Track_ThetaY_atPreshower2 vector
Track_X_atCalo vector
Track_Y_atCalo vector
Track_ThetaX_atCalo vector
Track_ThetaY_atCalo vector
Track_x_atMaxRadius vector
Track_y_atMaxRadius vector
Track_z_atMaxRadius vector
Track_r_atMaxRadius vector

Notes:

  • Only long tracks that have hits in tracking stations 1, 2, and 3 are saved (using track collection without IFT)
  • the size of each vector is equal to the value of 'longTracks'
  • (x0,y0,z0,px0,...) are taken from the most upstream tracker measurement
  • (x1,y1,z1,px1,...) are taken from the most downstream tracker measurement
  • track parameters at the scintillators are obtained via track extrapolation
  • 'Track_PropagationError' flags the Acts::PropagatorError::StepCountLimitReached and Acts::CombinatorialKalmanFilterError::PropagationReachesMaxSteps errors; do not use events in which it is set
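
The per-track vectors are index-aligned, so track quantities are built by looping over the entries. The sketch below computes chi2/nDoF per long track and drops events flagged with a propagation error. It assumes that a nonzero 'Track_PropagationError' means the error is set, which is not stated explicitly on this page; the function name is hypothetical.

```python
def chi2_per_dof(track_chi2, track_ndof, propagation_error=0):
    """Return chi2/nDoF for each long track in one event.

    track_chi2, track_ndof: per-track sequences (Track_Chi2, Track_nDoF).
    propagation_error: the event's Track_PropagationError value; a nonzero
    value is assumed to flag an event that should not be used.
    """
    if propagation_error != 0:
        return []  # skip the whole event
    # Tracks with nDoF == 0 are skipped to avoid a division by zero.
    return [chi2 / ndof for chi2, ndof in zip(track_chi2, track_ndof) if ndof != 0]
```

In a PyROOT loop this could be called as `chi2_per_dof(event.Track_Chi2, event.Track_nDoF, event.Track_PropagationError)`.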

MC truth info for particles matched to reco track:

Variable Name Type
t_pdg vector
t_barcode vector
t_truthHitRatio vector
t_prodVtx_x vector
t_prodVtx_y vector
t_prodVtx_z vector
t_decayVtx_x vector
t_decayVtx_y vector
t_decayVtx_z vector
t_px vector
t_py vector
t_pz vector
t_theta vector
t_phi vector
t_p vector
t_pT vector
t_eta vector
t_st0_x vector
t_st0_y vector
t_st0_z vector
t_st1_x vector
t_st1_y vector
t_st1_z vector
t_st2_x vector
t_st2_y vector
t_st2_z vector
t_st3_x vector
t_st3_y vector
t_st3_z vector
isFiducial vector

Notes:

  • 't_pdg' will be zero if we failed to find a truth particle for the reconstructed track
  • 't_truthHitRatio' tells you how many clusters on the track are from this truth particle
  • 'isFiducial' tells you if the truth particle is within 100 mm radius for all 3 tracking stations
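
A common use of these variables is to keep only reconstructed tracks that were matched to a truth particle inside the fiducial region. The sketch below assumes that 'isFiducial' has one entry per reconstructed track, aligned with 't_pdg'; the function name is hypothetical.

```python
def matched_fiducial_track_indices(t_pdg, is_fiducial):
    """Return indices of tracks with a truth match (t_pdg != 0) that are fiducial.

    t_pdg, is_fiducial: per-track sequences, assumed to be index-aligned.
    """
    return [
        j
        for j, (pdg, fiducial) in enumerate(zip(t_pdg, is_fiducial))
        if pdg != 0 and fiducial
    ]
```

The returned indices can then be used to read the other 't_' vectors, or the 'Track_' vectors, for the selected tracks.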

MC truth parameters of first 10 truth particles:

Variable Name Type
truth_P vector
truth_px vector
truth_py vector
truth_pz vector
truth_m vector
truth_pdg vector
truth_prod_x vector
truth_prod_y vector
truth_prod_z vector
truth_dec_x vector
truth_dec_y vector
truth_dec_z vector

Notes:

  • the first entry of the vector will be the initial particle from the generator or particle gun
  • the 'prod' variables are the position of the production vertex and the 'dec' variables are the position of the decay vertex

MC initial truth parameters of Dark Photon and e+/e- daughter particles:

Variable Name Type
truthM_P vector
truthM_px vector
truthM_py vector
truthM_pz vector
truthM_x vector
truthM_y vector
truthM_z vector
truthd0_P vector
truthd0_px vector
truthd0_py vector
truthd0_pz vector
truthd0_x vector
truthd0_y vector
truthd0_z vector
truthd1_P vector
truthd1_px vector
truthd1_py vector
truthd1_pz vector
truthd1_x vector
truthd1_y vector
truthd1_z vector

Notes:

  • dark photon is mother (M)
  • e+ is daughter 0 (d0)
  • e- is daughter 1 (d1)

Ntuple Noise Histograms

For TI12 data, randomly triggered events are not stored in the ntuple, but the scintillator and calorimeter charges for such events are stored in histograms that are saved to the same ROOT file as the ntuple. The histograms thus contain the noise distributions for each calorimeter and scintillator channel. The noise histograms are named 'hRandomCharge0', 'hRandomCharge1', ..., 'hRandomCharge14', where the number at the end of the name is the digitizer channel.
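
A minimal sketch of collecting these histograms from an ntuple file is given below. The helper names are hypothetical, and `root_file` is assumed to be an already opened `ROOT.TFile`, of which only the `Get(name)` method is used.

```python
N_DIGITIZER_CHANNELS = 15  # hRandomCharge0 ... hRandomCharge14


def noise_histogram_names(n_channels=N_DIGITIZER_CHANNELS):
    """Return the noise histogram names, one per digitizer channel."""
    return [f"hRandomCharge{channel}" for channel in range(n_channels)]


def load_noise_histograms(root_file):
    """Return {name: histogram} for all noise histograms in an open file.

    root_file: an open ROOT.TFile (any object with a Get(name) method works).
    """
    return {name: root_file.Get(name) for name in noise_histogram_names()}
```

With PyROOT this might be used as `load_noise_histograms(ROOT.TFile.Open(path))`, where `path` points to one of the ntuple files.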

Example PyROOT Analysis Code

#!/usr/bin/env python

# Set up (Py)ROOT.
import ROOT

t = ROOT.TChain("nt")
nfiles = 0
nfiles += t.Add("/eos/experiment/faser/phys/2022/r0011/009148/*.root") # chain all ntuples from run 9148

# define histogram
hTrackChi2_over_nDOF = ROOT.TH1F("hTrackChi2_over_nDOF", "Track #chi^{2}/nDOF;Track #chi^{2} / nDOF;# of tracks",100,0,11)
hCaloE_EM = ROOT.TH1F("hCaloE_EM", "EM E in Calo;EM E (GeV);# of events",100,0.0,3000.0)

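nentries = t.GetEntries() # count once up front; repeating this call inside the loop is wasteful for a TChain
i = 0
for event in t:
    i += 1

    if i%1000 == 0:
        print( "Processing event #%i of %i" % (i, nentries ) )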

    if event.longTracks == 0: continue # only use events with at least 1 track

    for j in range(event.longTracks): # loop over all long tracks in the event (long = has hits in last 3 tracking stations)
        if event.Track_nDoF[j] != 0: # avoid division by zero error
            hTrackChi2_over_nDOF.Fill(event.Track_Chi2[j]/event.Track_nDoF[j]) # fill histogram

    hCaloE_EM.Fill(event.Calo_total_E_EM / 1000.0) # fill histogram

    if i >= 100000: # only look at the first 100k events
        break

# Now save plots to pdf
filename = "Ntuple-9148-PhysicsAnalysis.pdf"

c = ROOT.TCanvas()
c.Print(filename+'[')
hTrackChi2_over_nDOF.Draw()
ROOT.gPad.SetLogy()
c.Print(filename)

c = ROOT.TCanvas()
hCaloE_EM.Draw()
ROOT.gPad.SetLogy()
c.Print(filename)

# Must close file at the end
c.Print(filename+']')

Example RDataFrame Analysis Code

A more efficient way of running is to use RDataFrame. A simple example of this is:

import ROOT as R

# Utilise multiple cores
R.EnableImplicitMT()

# Run in batch mode (no graphics windows)
R.gROOT.SetBatch(True)

# Create Dataframe
df = R.RDataFrame("nt", "/eos/experiment/faser/phys/2022/r0011/009148/Faser-Physics-009148-*PHYS.root")

# Require at least one track
df = df.Filter("longTracks > 0", ">= 1 Track")

# Define new variables
df = df.Define("Track_Chi2PerDof", "Track_Chi2/Track_nDoF")
df = df.Define("Calo_EEM_GeV", "Calo_total_E_EM/1000.")

# Define hists
htrack = df.Histo1D(("Chi2PerDof", "; #chi^{2}/Dof; NTracks", 100, 0, 11), "Track_Chi2PerDof")
hcalo = df.Histo1D(("CaloE", "; EM E_{calo} [GeV]; NEvents", 100, 0, 3000), "Calo_EEM_GeV")

# Draw hists and print to file
# NB. The first call to draw triggers a single event loop to fill all histograms
#     To avoid having multiple eventloops you must define all hists before accessing any of them
filename = "Ntuple-9148-PhysicsAnalysis.pdf"
c = R.TCanvas()
c.Print(f"{filename}[")

htrack.Draw()
R.gPad.SetLogy()
c.Print(filename)

c.Clear()
hcalo.Draw()
R.gPad.SetLogy()
c.Print(filename)

c.Print(f"{filename}]")
print(f"Ran eventloop {df.GetNRuns()} times")

This was tested with the ROOT version obtained via

source /cvmfs/sft.cern.ch/lcg/views/LCG_101/x86_64-centos7-gcc10-opt/setup.sh

A more complex framework based on this can be found here.

-- DeionElginFellers - 2023-01-18

Topic revision: r9 - 2023-03-02 - TobiasBockh
 