Physics Analysis Ntuples
This page describes the Physics Analysis Ntuples that are produced from the reconstructed xAODs, and are intended to make it easier to get started with a physics analysis.
Overview
The Calypso code that produces the ntuples can be found here.
As of Feb 5, 2023, the ntuples are stored at:
- TI12 = /eos/experiment/faser/phys/2022/r0013/
- MC = /eos/experiment/faser/sim/mc22/[MC-type]/[run-number]/phy/r0013/
  - 'MC-type' can be forsee, genie, or particle_gun
  - see the FaserMC twiki for more info on the MC samples
Pros of using the ntuples:
- The ntuples can be analyzed without a Calypso build; all you need is ROOT.
- They are much smaller than the xAOD files because only interesting events are stored, and are thus quicker to analyze.
- You can use RDataFrame to analyze the ntuple events in parallel, which greatly reduces the analysis time. This cannot be done on the xAOD files, as the tracker class we use is not supported in RDataFrame.
- The ntuples are already blinded.
Cons of using the ntuples:
- Not all events are stored in the Ntuples.
- Not all xAOD variables are stored in the Ntuples.
- Calypso tools such as track extrapolation cannot be used.
Ntuple Event Filtering
There are three event filters that prevent TI12 events from being stored in the ntuples:
- Stable Beams Filter
  - Events that are not labelled as stable beams are not stored in the ntuple.
- Blinding Filter
  - Events that miss all vetoes and have a calo signal greater than that of a 10 GeV electron are not stored in the ntuple.
  - Updated in r0013: Events that miss all vetoes, have a calo signal greater than that of a 25 GeV electron, and are not within +/- 1 BCID of a collision are not stored in the ntuple.
- Scintillator Coincidence Filter
  - Events that do not have coincidence between scintillator triggers are not stored in the ntuple.
  - Needed to pass filter: (Trig0 & Trig1) or (Trig0 & Trig2) or (Trig1 & Trig2) or Trig3
    - Trig0: "CalorimeterBottom|CalorimeterTop"
    - Trig1: "FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer"
    - Trig2: "TimingLayerBottom|TimingLayerTop"
    - Trig3: "(FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer"
  - Updated in r0013: Trig0 (alone) is added, so to pass filter: Trig0 or (Trig1 & Trig2) or Trig3
Note: MC ntuples have no event filtering; all events are stored.
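The coincidence logic above can be re-evaluated offline from a trigger word. The following is only a minimal sketch: the helper name `passes_coincidence` is hypothetical, the bit values are taken from the trigger-word notes further down this page, and whether the production filter is evaluated on TBP or TAP is not stated here, so that choice is left to the caller.

```python
# Bit values of the trigger word, as listed in the trigger-word notes on this page.
TRIG0 = 1  # CalorimeterBottom|CalorimeterTop
TRIG1 = 2  # FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer
TRIG2 = 4  # TimingLayerBottom|TimingLayerTop
TRIG3 = 8  # (FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer


def passes_coincidence(trigger_word, r0013=True):
    """Hypothetical helper: apply the scintillator coincidence logic to a trigger word."""
    t0 = bool(trigger_word & TRIG0)
    t1 = bool(trigger_word & TRIG1)
    t2 = bool(trigger_word & TRIG2)
    t3 = bool(trigger_word & TRIG3)
    if r0013:
        # r0013 logic: Trig0 alone is sufficient.
        return t0 or (t1 and t2) or t3
    # Pre-r0013 logic: any pair among Trig0/1/2, or Trig3.
    return (t0 and t1) or (t0 and t2) or (t1 and t2) or t3
```

For example, a word with only the calorimeter bit set (value 1) passes the r0013 logic but not the earlier one.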
Ntuple Variables
Event Info:
LHC Info:
| Variable Name | Type |
| fillNumber | Int_t |
| betaStar | Float_t |
| crossingAngle | Float_t |
| distanceToCollidingBCID | Int_t |
| distanceToUnpairedB1 | Int_t |
| distanceToUnpairedB2 | Int_t |
| distanceToInboundB1 | Int_t |
| distanceToTrainStart | Int_t |
| distanceToPreviousColliding | Int_t |
Notes:
- 'distanceToPreviousColliding' is always positive, whereas 'distanceToCollidingBCID' can be positive or negative depending on whether the closest colliding BCID is ahead of (+) or behind (-) the event
- 'distanceToInboundB1' == 0 is used to pick out B1 background events seen in FASER
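As a rough illustration of how these variables might enter a selection, the sketch below flags events near a colliding bunch and events consistent with B1 background. The function names and the default window of 1 BCID are assumptions made for this example (the window is borrowed from the blinding filter description), not part of the ntuple code.

```python
def is_near_collision(distance_to_colliding_bcid, window=1):
    """True if the closest colliding BCID is within +/- `window` of the event.

    The sign of distanceToCollidingBCID only says whether that BCID is
    ahead of (+) or behind (-) the event, so the absolute value is used.
    """
    return abs(distance_to_colliding_bcid) <= window


def is_b1_background(distance_to_inbound_b1):
    """True for events coincident with an inbound beam-1 bunch."""
    return distance_to_inbound_b1 == 0
```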
Trigger Words:
Notes:
- TAP (Trigger After Prescale) and TBP (Trigger Before Prescale) are trigger bit words; each bit has the following meaning:
  - 1 = "CalorimeterBottom|CalorimeterTop"
  - 2 = "FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer"
  - 4 = "TimingLayerBottom|TimingLayerTop"
  - 8 = "(FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer"
  - 16 = Random
  - 32 = LED
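Since several bits can be set at once, it can help to unpack a trigger word into the names of the triggers that fired. A minimal sketch, where `decode_trigger_word` is a hypothetical helper and the bit-to-name mapping is copied from the list above:

```python
# Mapping from bit value to trigger definition, copied from the notes above.
TRIGGER_BITS = {
    1: "CalorimeterBottom|CalorimeterTop",
    2: "FaserNuVetoLayer|FirstVetoLayer|SecondVetoLayer|PreshowerLayer",
    4: "TimingLayerBottom|TimingLayerTop",
    8: "(FaserNuVetoLayer|SecondVetoLayer)&PreshowerLayer",
    16: "Random",
    32: "LED",
}


def decode_trigger_word(word):
    """Return the names of all trigger bits set in `word`, lowest bit first."""
    return [name for bit, name in TRIGGER_BITS.items() if word & bit]
```

A word of 5, for instance, decodes to the calorimeter and timing-layer triggers.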
Scintillator and Calorimeter waveform variables:
| Variable Name | Type |
| VetoNu0_time | Float_t |
| VetoNu0_peak | Float_t |
| VetoNu0_width | Float_t |
| VetoNu0_charge | Float_t |
| VetoNu0_raw_peak | Float_t |
| VetoNu0_raw_charge | Float_t |
| VetoNu0_baseline | Float_t |
| VetoNu0_baseline_rms | Float_t |
| VetoNu0_status | Int_t |
| VetoNu1_time | Float_t |
| VetoNu1_peak | Float_t |
| VetoNu1_width | Float_t |
| VetoNu1_charge | Float_t |
| VetoNu1_raw_peak | Float_t |
| VetoNu1_raw_charge | Float_t |
| VetoNu1_baseline | Float_t |
| VetoNu1_baseline_rms | Float_t |
| VetoNu1_status | Int_t |
| VetoSt10_time | Float_t |
| VetoSt10_peak | Float_t |
| VetoSt10_width | Float_t |
| VetoSt10_charge | Float_t |
| VetoSt10_raw_peak | Float_t |
| VetoSt10_raw_charge | Float_t |
| VetoSt10_baseline | Float_t |
| VetoSt10_baseline_rms | Float_t |
| VetoSt10_status | Int_t |
| VetoSt11_time | Float_t |
| VetoSt11_peak | Float_t |
| VetoSt11_width | Float_t |
| VetoSt11_charge | Float_t |
| VetoSt11_raw_peak | Float_t |
| VetoSt11_raw_charge | Float_t |
| VetoSt11_baseline | Float_t |
| VetoSt11_baseline_rms | Float_t |
| VetoSt11_status | Int_t |
| VetoSt20_time | Float_t |
| VetoSt20_peak | Float_t |
| VetoSt20_width | Float_t |
| VetoSt20_charge | Float_t |
| VetoSt20_raw_peak | Float_t |
| VetoSt20_raw_charge | Float_t |
| VetoSt20_baseline | Float_t |
| VetoSt20_baseline_rms | Float_t |
| VetoSt20_status | Int_t |
| Timing0_time | Float_t |
| Timing0_peak | Float_t |
| Timing0_width | Float_t |
| Timing0_charge | Float_t |
| Timing0_raw_peak | Float_t |
| Timing0_raw_charge | Float_t |
| Timing0_baseline | Float_t |
| Timing0_baseline_rms | Float_t |
| Timing0_status | Int_t |
| Timing1_time | Float_t |
| Timing1_peak | Float_t |
| Timing1_width | Float_t |
| Timing1_charge | Float_t |
| Timing1_raw_peak | Float_t |
| Timing1_raw_charge | Float_t |
| Timing1_baseline | Float_t |
| Timing1_baseline_rms | Float_t |
| Timing1_status | Int_t |
| Timing2_time | Float_t |
| Timing2_peak | Float_t |
| Timing2_width | Float_t |
| Timing2_charge | Float_t |
| Timing2_raw_peak | Float_t |
| Timing2_raw_charge | Float_t |
| Timing2_baseline | Float_t |
| Timing2_baseline_rms | Float_t |
| Timing2_status | Int_t |
| Timing3_time | Float_t |
| Timing3_peak | Float_t |
| Timing3_width | Float_t |
| Timing3_charge | Float_t |
| Timing3_raw_peak | Float_t |
| Timing3_raw_charge | Float_t |
| Timing3_baseline | Float_t |
| Timing3_baseline_rms | Float_t |
| Timing3_status | Int_t |
| Preshower0_time | Float_t |
| Preshower0_peak | Float_t |
| Preshower0_width | Float_t |
| Preshower0_charge | Float_t |
| Preshower0_raw_peak | Float_t |
| Preshower0_raw_charge | Float_t |
| Preshower0_baseline | Float_t |
| Preshower0_baseline_rms | Float_t |
| Preshower0_status | Int_t |
| Preshower1_time | Float_t |
| Preshower1_peak | Float_t |
| Preshower1_width | Float_t |
| Preshower1_charge | Float_t |
| Preshower1_raw_peak | Float_t |
| Preshower1_raw_charge | Float_t |
| Preshower1_baseline | Float_t |
| Preshower1_baseline_rms | Float_t |
| Preshower1_status | Int_t |
| Calo0_time | Float_t |
| Calo0_peak | Float_t |
| Calo0_width | Float_t |
| Calo0_charge | Float_t |
| Calo0_raw_peak | Float_t |
| Calo0_raw_charge | Float_t |
| Calo0_baseline | Float_t |
| Calo0_baseline_rms | Float_t |
| Calo0_status | Int_t |
| Calo1_time | Float_t |
| Calo1_peak | Float_t |
| Calo1_width | Float_t |
| Calo1_charge | Float_t |
| Calo1_raw_peak | Float_t |
| Calo1_raw_charge | Float_t |
| Calo1_baseline | Float_t |
| Calo1_baseline_rms | Float_t |
| Calo1_status | Int_t |
| Calo2_time | Float_t |
| Calo2_peak | Float_t |
| Calo2_width | Float_t |
| Calo2_charge | Float_t |
| Calo2_raw_peak | Float_t |
| Calo2_raw_charge | Float_t |
| Calo2_baseline | Float_t |
| Calo2_baseline_rms | Float_t |
| Calo2_status | Int_t |
| Calo3_time | Float_t |
| Calo3_peak | Float_t |
| Calo3_width | Float_t |
| Calo3_charge | Float_t |
| Calo3_raw_peak | Float_t |
| Calo3_raw_charge | Float_t |
| Calo3_baseline | Float_t |
| Calo3_baseline_rms | Float_t |
| Calo3_status | Int_t |
Notes:
- the 'time' variables are corrected by the clock phase so as to remove the 16 ns digitizer jitter
- the 'status' variables are bit words whose bits have the following meanings:
  - 0 = good hit
  - 1 = below threshold
  - 2 = secondary hit
  - 4 = amplitude overflow
  - 8 = find baseline failed
  - 16 = Gaussian fit failed
  - 32 = Crystal Ball fit failed
  - 64 = invalid clock
  - 128 = waveform missing
  - 256 = waveform invalid
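Because a status word can carry several flags at once, a small decoder makes it easier to see why a hit was not clean. This is a sketch only; `decode_status` is a hypothetical helper and its labels simply restate the list above.

```python
# Bit value -> meaning, restating the status notes above.
STATUS_BITS = {
    1: "below threshold",
    2: "secondary hit",
    4: "amplitude overflow",
    8: "find baseline failed",
    16: "Gaussian fit failed",
    32: "Crystal Ball fit failed",
    64: "invalid clock",
    128: "waveform missing",
    256: "waveform invalid",
}


def decode_status(status):
    """Return the list of flags set in a waveform status word."""
    if status == 0:
        # No bits set means the hit is good.
        return ["good hit"]
    return [label for bit, label in STATUS_BITS.items() if status & bit]
```

A simple quality cut is then `status == 0`, while `decode_status` is useful when debugging rejected hits.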
Calibrated Energy variables for Calorimeter and Preshower channels:
| Variable Name | Type |
| Calo0_nMIP | Float_t |
| Calo0_E_dep | Float_t |
| Calo0_E_EM | Float_t |
| Calo1_nMIP | Float_t |
| Calo1_E_dep | Float_t |
| Calo1_E_EM | Float_t |
| Calo2_nMIP | Float_t |
| Calo2_E_dep | Float_t |
| Calo2_E_EM | Float_t |
| Calo3_nMIP | Float_t |
| Calo3_E_dep | Float_t |
| Calo3_E_EM | Float_t |
| Calo_total_nMIP | Float_t |
| Calo_total_E_dep | Float_t |
| Calo_total_E_EM | Float_t |
| Preshower0_nMIP | Float_t |
| Preshower0_E_dep | Float_t |
| Preshower0_E_EM | Float_t |
| Preshower1_nMIP | Float_t |
| Preshower1_E_dep | Float_t |
| Preshower1_E_EM | Float_t |
| Preshower_total_nMIP | Float_t |
| Preshower_total_E_dep | Float_t |
Notes:
- Energies are in MeV
- 'Preshower0_E_EM' and 'Preshower1_E_EM' are dummy variables and filled with 'NaN' as we do not estimate the EM energy from the preshower layers
Cluster Counts in Tracking Stations:
| Variable Name | Type |
| nClusters0 | Int_t |
| nClusters1 | Int_t |
| nClusters2 | Int_t |
| nClusters3 | Int_t |
SpacePoint positions:
| Variable Name | Type |
| SpacePoints | Int_t |
| SpacePoint_x | vector |
| SpacePoint_y | vector |
| SpacePoint_z | vector |
TrackSegment Fit Parameters:
| Variable Name | Type |
| TrackSegments | Int_t |
| TrackSegment_Chi2 | vector |
| TrackSegment_nDoF | vector |
| TrackSegment_x | vector |
| TrackSegment_y | vector |
| TrackSegment_z | vector |
| TrackSegment_px | vector |
| TrackSegment_py | vector |
| TrackSegment_pz | vector |
Reconstructed Track Parameters:
| Variable Name | Type |
| longTracks | Int_t |
| Track_PropagationError | Int_t |
| Track_Chi2 | vector |
| Track_nDoF | vector |
| Track_x0 | vector |
| Track_y0 | vector |
| Track_z0 | vector |
| Track_px0 | vector |
| Track_py0 | vector |
| Track_pz0 | vector |
| Track_p0 | vector |
| Track_x1 | vector |
| Track_y1 | vector |
| Track_z1 | vector |
| Track_px1 | vector |
| Track_py1 | vector |
| Track_pz1 | vector |
| Track_p1 | vector |
| Track_charge | vector |
| Track_nLayers | vector |
| Track_InStation0 | vector |
| Track_InStation1 | vector |
| Track_InStation2 | vector |
| Track_InStation3 | vector |
| Track_X_atVetoNu | vector |
| Track_Y_atVetoNu | vector |
| Track_ThetaX_atVetoNu | vector |
| Track_ThetaY_atVetoNu | vector |
| Track_X_atVetoStation1 | vector |
| Track_Y_atVetoStation1 | vector |
| Track_ThetaX_atVetoStation1 | vector |
| Track_ThetaY_atVetoStation1 | vector |
| Track_X_atVetoStation2 | vector |
| Track_Y_atVetoStation2 | vector |
| Track_ThetaX_atVetoStation2 | vector |
| Track_ThetaY_atVetoStation2 | vector |
| Track_X_atTrig | vector |
| Track_Y_atTrig | vector |
| Track_ThetaX_atTrig | vector |
| Track_ThetaY_atTrig | vector |
| Track_X_atPreshower1 | vector |
| Track_Y_atPreshower1 | vector |
| Track_ThetaX_atPreshower1 | vector |
| Track_ThetaY_atPreshower1 | vector |
| Track_X_atPreshower2 | vector |
| Track_Y_atPreshower2 | vector |
| Track_ThetaX_atPreshower2 | vector |
| Track_ThetaY_atPreshower2 | vector |
| Track_X_atCalo | vector |
| Track_Y_atCalo | vector |
| Track_ThetaX_atCalo | vector |
| Track_ThetaY_atCalo | vector |
| Track_x_atMaxRadius | vector |
| Track_y_atMaxRadius | vector |
| Track_z_atMaxRadius | vector |
| Track_r_atMaxRadius | vector |
Notes:
- Only long tracks that have hits in tracking stations 1, 2, and 3 are saved (using track collection without IFT)
- the size of each vector is equal to the value of 'longTracks'
- (x0,y0,z0,px0,...) are taken from the most upstream tracker measurement
- (x1,y1,z1,px1,...) are taken from the most downstream tracker measurement
- track parameters at the scintillators are obtained via track extrapolation
- 'Track_PropagationError' flags Acts::PropagatorError::StepCountLimitReached and Acts::CombinatorialKalmanFilterError::PropagationReachesMaxSteps; do not use events where it is set
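The notes above translate into a small per-event selection. The sketch below is illustrative only: `chi2_per_dof` is a hypothetical helper, the event is represented as a plain dict with lists standing in for the vector branches, and it assumes that a non-zero 'Track_PropagationError' is what marks an event to be skipped.

```python
def chi2_per_dof(event):
    """Return chi2/nDoF for every usable long track in one event.

    `event` is a stand-in for one ntuple entry: a mapping from branch
    name to value, with plain lists replacing the vector branches.
    """
    # Assumption: a non-zero Track_PropagationError marks an event to skip.
    if event["Track_PropagationError"] != 0:
        return []
    values = []
    # Each track vector has exactly 'longTracks' entries.
    for j in range(event["longTracks"]):
        ndof = event["Track_nDoF"][j]
        if ndof != 0:  # avoid division by zero
            values.append(event["Track_Chi2"][j] / ndof)
    return values
```

With PyROOT the same logic applies to the `event` object yielded by a TChain loop, using attribute access (`event.Track_Chi2[j]`) instead of dict lookups.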
MC truth info for particles matched to reco track:
| Variable Name | Type |
| t_pdg | vector |
| t_barcode | vector |
| t_truthHitRatio | vector |
| t_prodVtx_x | vector |
| t_prodVtx_y | vector |
| t_prodVtx_z | vector |
| t_decayVtx_x | vector |
| t_decayVtx_y | vector |
| t_decayVtx_z | vector |
| t_px | vector |
| t_py | vector |
| t_pz | vector |
| t_theta | vector |
| t_phi | vector |
| t_p | vector |
| t_pT | vector |
| t_eta | vector |
| t_st0_x | vector |
| t_st0_y | vector |
| t_st0_z | vector |
| t_st1_x | vector |
| t_st1_y | vector |
| t_st1_z | vector |
| t_st2_x | vector |
| t_st2_y | vector |
| t_st2_z | vector |
| t_st3_x | vector |
| t_st3_y | vector |
| t_st3_z | vector |
| isFiducial | vector |
Notes:
- 't_pdg' will be zero if we failed to find a truth particle for the reconstructed track
- 't_truthHitRatio' is the fraction of clusters on the track that come from this truth particle
- 'isFiducial' tells you whether the truth particle is within a 100 mm radius in all 3 tracking stations
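For orientation, a fiducial check of this kind can be written as a radius test on the truth positions. The exact definition used in the ntuple code is not given on this page; the sketch below assumes positions in mm, a strict inequality, and that the caller passes the (x, y) truth positions at the three stations of interest (for example from the 't_st1', 't_st2', and 't_st3' variables). The name `is_fiducial` is hypothetical.

```python
import math


def is_fiducial(station_xy, max_radius_mm=100.0):
    """True if every (x, y) position lies within `max_radius_mm` of the axis.

    `station_xy` is an iterable of (x, y) pairs in mm, one per station.
    """
    return all(math.hypot(x, y) < max_radius_mm for x, y in station_xy)
```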
MC truth parameters of first 10 truth particles:
| Variable Name | Type |
| truth_P | vector |
| truth_px | vector |
| truth_py | vector |
| truth_pz | vector |
| truth_m | vector |
| truth_pdg | vector |
| truth_prod_x | vector |
| truth_prod_y | vector |
| truth_prod_z | vector |
| truth_dec_x | vector |
| truth_dec_y | vector |
| truth_dec_z | vector |
Notes:
- the first entry of the vector will be the initial particle from the generator or particle gun
- the 'prod' variables are the position of the production vertex and the 'dec' variables are the position of the decay vertex
MC initial truth parameters of Dark Photon and e+/e- daughter particles:
| Variable Name | Type |
| truthM_P | vector |
| truthM_px | vector |
| truthM_py | vector |
| truthM_pz | vector |
| truthM_x | vector |
| truthM_y | vector |
| truthM_z | vector |
| truthd0_P | vector |
| truthd0_px | vector |
| truthd0_py | vector |
| truthd0_pz | vector |
| truthd0_x | vector |
| truthd0_y | vector |
| truthd0_z | vector |
| truthd1_P | vector |
| truthd1_px | vector |
| truthd1_py | vector |
| truthd1_pz | vector |
| truthd1_x | vector |
| truthd1_y | vector |
| truthd1_z | vector |
Notes:
- dark photon is mother (M)
- e+ is daughter 0 (d0)
- e- is daughter 1 (d1)
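One quantity that is often useful with these variables is the truth opening angle between the two daughters. The sketch below computes it from two momentum 3-vectors; `opening_angle` is a hypothetical helper, and how the momentum components are read from the vector branches (for example taking the first entry of 'truthd0_px') is an assumption left to the caller.

```python
import math


def opening_angle(p0, p1):
    """Angle in radians between two momentum vectors given as (px, py, pz)."""
    dot = sum(a * b for a, b in zip(p0, p1))
    norm0 = math.sqrt(sum(a * a for a in p0))
    norm1 = math.sqrt(sum(b * b for b in p1))
    # Clamp to [-1, 1] to protect acos against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    return math.acos(cos_angle)
```

The result does not depend on the momentum units, since only the directions enter.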
Ntuple Noise Histograms
For TI12 data, randomly triggered events are not stored in the ntuple, but the scintillator charges for such events are stored in histograms that are saved to the same ROOT file as the ntuple. The histograms thus contain the noise distributions for each calorimeter and scintillator channel. The noise histograms are named 'hRandomCharge0', 'hRandomCharge1', ..., 'hRandomCharge14', where the number at the end of the name is the digitizer channel.
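A minimal sketch of how these histograms could be retrieved with PyROOT is shown below. The helper names are hypothetical, the file path is supplied by the caller, and the channel count of 15 follows from the histogram names quoted above.

```python
def noise_histogram_names(n_channels=15):
    """Names of the noise histograms, one per digitizer channel (0..14)."""
    return [f"hRandomCharge{ch}" for ch in range(n_channels)]


def load_noise_histograms(path):
    """Read the noise histograms from one ntuple file into a dict."""
    import ROOT  # imported lazily so the name helper works without ROOT

    root_file = ROOT.TFile.Open(path)
    hists = {}
    for name in noise_histogram_names():
        hist = root_file.Get(name)
        if hist:  # a missing object comes back as a null pointer
            hist.SetDirectory(0)  # detach so it survives closing the file
            hists[name] = hist
    root_file.Close()
    return hists
```

Calling `load_noise_histograms` on an ntuple file then gives a dict keyed by histogram name, which can be drawn or fitted like any other ROOT histogram.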
Example pyROOT Analysis Code
#!/usr/bin/env python
# Set up (Py)ROOT.
import ROOT
t = ROOT.TChain("nt")
nfiles = 0
nfiles += t.Add("/eos/experiment/faser/phys/2022/r0011/009148/*.root") # chain all ntuples from run 9148
# define histogram
hTrackChi2_over_nDOF = ROOT.TH1F("hTrackChi2_over_nDOF", "Track #chi^{2}/nDOF;Track #chi^{2} / nDOF;# of tracks",100,0,11)
hCaloE_EM = ROOT.TH1F("hCaloE_EM", "EM E in Calo;EM E (GeV);# of events",100,0.0,3000.0)
i = 0
for event in t:
    i += 1
    if i%1000 == 0:
        print( "Processing event #%i of %i" % (i, t.GetEntries() ) )
    if event.longTracks == 0: continue # only use events with at least 1 track
    for j in range(event.longTracks): # loop over all long tracks in the event (long = has hits in last 3 tracking stations)
        if event.Track_nDoF[j] != 0: # avoid division by zero error
            hTrackChi2_over_nDOF.Fill(event.Track_Chi2[j]/event.Track_nDoF[j]) # fill histogram
    hCaloE_EM.Fill(event.Calo_total_E_EM / 1000.0) # fill histogram once per event, converting MeV to GeV
    if i > 100000: # only look at the first 100k events
        break
# Now save plots to pdf
filename = "Ntuple-9148-PhysicsAnalysis.pdf"
c = ROOT.TCanvas()
c.Print(filename+'[')
hTrackChi2_over_nDOF.Draw()
ROOT.gPad.SetLogy()
c.Print(filename)
c = ROOT.TCanvas()
hCaloE_EM.Draw()
ROOT.gPad.SetLogy()
c.Print(filename)
# Must close file at the end
c.Print(filename+']')
Example RDataFrame Analysis Code
A more efficient way of running is to use RDataFrame. A simple example of this is:
import ROOT as R
# Utilise multiple cores
R.EnableImplicitMT()
R.gROOT.SetBatch(True)
# Create Dataframe
df = R.RDataFrame("nt", "/eos/experiment/faser/phys/2022/r0011/009148/Faser-Physics-009148-*PHYS.root")
# Require at least one track
df = df.Filter("longTracks > 0", ">= 1 Track")
# Define new variables
df = df.Define("Track_Chi2PerDof", "Track_Chi2/Track_nDoF")
df = df.Define("Calo_EEM_GeV", "Calo_total_E_EM/1000.")
# Define hists
htrack = df.Histo1D(("Chi2PerDof", "; #chi^{2}/Dof; NTracks", 100, 0, 11), "Track_Chi2PerDof")
hcalo = df.Histo1D(("CaloE", "; EM E_{calo} [GeV]; NEvents", 100, 0, 11), "Calo_EEM_GeV")
# Draw hists and print to file
# NB. The first call to draw triggers a single event loop to fill all histograms
# To avoid having multiple eventloops you must define all hists before accessing any of them
filename = "Ntuple-9148-PhysicsAnalysis.pdf"
c = R.TCanvas()
c.Print(f"{filename}[")
htrack.Draw()
R.gPad.SetLogy()
c.Print(filename)
c.Clear()
hcalo.Draw()
R.gPad.SetLogy()
c.Print(filename)
c.Print(f"{filename}]")
print (f"Ran eventloop {df.GetNRuns()} times")
This example was tested with the ROOT version obtained via
source /cvmfs/sft.cern.ch/lcg/views/LCG_101/x86_64-centos7-gcc10-opt/setup.sh
A more complex framework based on this can be found here.
-- DeionElginFellers - 2023-01-18