Instructions on Preparing PIDCalib Samples

This page provides information on how to create PIDCalib samples. It is for experts only.

Latest PIDCalib setup instructions

Please follow the instructions on the PIDCalib Package webpage. Besides the above, you also need to
getpack PIDCalib/CalibDataSel head

CalibDataSel Package: Produce tuples from DST

The files you need now boil down to: Src/TupleToolPIDCalib.cpp, TupleToolPIDCalib.h, EvtTupleToolPIDCalib.cpp and EvtTupleToolPIDCalib.h. These are the places to put Tuple variables.

You also need dev/makePIDCalibNtuples.ganga.py, makePIDCalibNtuples_Run2.py and makePIDCalibNtuples.py:

makePIDCalibNtuples.py is for Run 1, where the input is various stripping lines.

makePIDCalibNtuples_Run2.py is for Run 2, where the input is various trigger lines, so the matching to those lines needs to be done as well.

makePIDCalibNtuples.ganga.py simply submits the jobs (you will probably need to update the DaVinci version listed).

In dev/makePIDCalibNtuples.ganga.py, add

S5TeVdn = PIDCalibJob(
                    year           = "2015"
                 ,  stripVersion   = "5TeV"
                 ,  magPol         = "MagDown"
                 ,  maxFiles       = -1
                 ,  filesPerJob    = 1
                 ,  simulation     = False
                 ,  EvtMax         = -1
                 ,  bkkQuery       = "LHCb/Collision15/Beam2510GeV-VeloClosed-MagDown/Real Data/Reco15a/Turbo01aEM/95100000/FULLTURBO.DST"
                 ,  bkkFlag        = "OK"
                 ,  stream         = "Turbo"
                 ,  backend        = Dirac()
  )
Then execute the file inside ganga:
In [5]:execfile('makePIDCalibNtuples.ganga.py')
Preconfigured jobs you can just submit:

PIDCalib.up11.submit()
PIDCalib.validation2015.submit()
PIDCalib.up12.submit()
PIDCalib.down11.submit()
PIDCalib.S23r1Up.submit()
PIDCalib.down12.submit()
PIDCalib.S5TeVdn.submit()
PIDCalib.test.submit()
PIDCalib.S23r1Dn.submit()

In [6]:PIDCalib.S5TeVdn.submit()

After all the jobs have finished, you may need to download the output to a local directory (this may not be needed, depending on the backend):

In [7]: for js in PIDCalib.S5TeVdn.subjobs:
   ...:     if js.status == 'completed':
   ...:         js.backend.getOutputData()

CalibDataScripts: Produce tuples for each particle species

The RICH performance changes as a function of time (it depends on running conditions and on alignment changes). In addition, a RooDataSet can only hold so many events and variables before it becomes too large and no longer saves correctly. These two facts lead us to have (a) more than one file per decay channel and (b) a numerical file index that ascends with run number. This is useful because anyone who wants to run over a specific run period can select just the few relevant files.
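
For example, picking out the file indices that cover a given run period only needs the (first run, last run) boundaries of each file; the numbers below are invented purely for illustration:

# Hedged illustration: run_ranges holds the (first run, last run) covered by
# each calibration file index; the boundaries and the period of interest
# are made-up numbers.
run_ranges = [(111000, 112500), (112501, 114000), (114001, 115800)]
period = (112000, 113000)   # run period of interest

wanted = [i for i, (lo, hi) in enumerate(run_ranges)
          if lo <= period[1] and hi >= period[0]]
print(wanted)   # -> [0, 1]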

These constraints lead to the following workflow:

ntuples finish being made --> run ranges are defined --> the data are split into those run ranges --> any additional selection is applied --> a mass fit is performed in each run range for each charge --> the data are sWeighted --> spectator variables are added to the dataset --> the two charge datasets are merged.

The same set of steps is repeated for each decay channel. For the protons, since there is more than one momentum range, the fit is done separately in each range and the results are then merged. For D*->D(Kpi)pi, the definition of the run ranges and the splitting of the data are a common step for both K and pi, but the mass fits are done twice (even though it is the same fit).
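
Schematically, the per-channel loop looks like the sketch below; every helper is a hypothetical placeholder for the corresponding CalibDataScripts step, not the real implementation.

# Schematic sketch only: all helpers are placeholders standing in for the
# corresponding CalibDataScripts steps.
def chop_by_run_range(cands, rng): return [c for c in cands if rng[0] <= c["run"] <= rng[1]]
def apply_selection(cands):        return cands        # any additional offline cuts
def mass_fit(cands, charge):       return cands        # mass fit in this run range and charge
def add_sweights(fitted):          return fitted       # sWeights from the fit
def add_spectator_vars(ds):        return ds           # extra spectator variables
def merge(datasets):               return sum(datasets, [])   # merge the two charges

def process_channel(candidates, run_ranges):
    calib_files = []
    for rng in run_ranges:                              # one output dataset per run range
        in_range = apply_selection(chop_by_run_range(candidates, rng))
        per_charge = []
        for charge in ("positive", "negative"):
            fitted = mass_fit(in_range, charge)
            per_charge.append(add_spectator_vars(add_sweights(fitted)))
        calib_files.append(merge(per_charge))
    return calib_files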

First, get the package:

getpack PIDCalib/CalibDataScripts head

Inside, there are three source directories:

Src       for S20/S20r1 data
Src_S21   for S21
Src_Run2  for S22/S23

The reason for the different directories is changes in the ntuple format and naming conventions, as well as changes in the stripping cuts, which in turn changed the selection cuts applied afterwards. The variables stored in the calibration datasets have also changed over time; e.g. for Run 2 we save both online and offline variables.

In cmt/requirements, the first step is to choose the correct source directory to compile. This is done simply by changing the src directory to the corresponding one (except for that of SetSpectatorVars.cpp). Then build as usual:

cmt br cmt make

Then go to jobs/Stripping23 and modify configureGangaJobs.sh

Before submitting jobs to PBS, add the following lines to ~/.gangarc so that the jobs run in the correct scratch directory on the batch nodes (the continuation lines must be indented):

preexecute =
    import os
    env = os.environ
    jobid = env["PBS_JOBID"]
    tmpdir = None
    if "TMPDIR" in env: tmpdir = env["TMPDIR"].rstrip("/")
    else: tmpdir = "/scratch/{0}".format(jobid)
    os.chdir(tmpdir)
    os.environ["PATH"] += ":{0}".format(tmpdir)
postexecute =
    import os
    env = os.environ
    jobid = env["PBS_JOBID"]
    tmpdir = None
    if "TMPDIR" in env: tmpdir = env["TMPDIR"].rstrip("/")
    else: tmpdir = "/scratch/{0}".format(jobid)
    os.chdir(tmpdir)

Make sure you have these lines in place every time you run jobs.

Then go to GetRunRanges/.

Here you will see a set of scripts, one per particle species and polarity. You shouldn't need to change anything: the scripts look inside your .gangarc file to find your gangadir location etc. The output of these jobs is sent to the jobs/Stripping23/ChopTrees directory as a .pkl file, which contains the run-number ranges defined by the script that the ganga job calls. The file actually run by the ganga job is $CALIBDATASCRIPTSROOT/scripts/sh/GetRunRanges.sh, which in turn calls $CALIBDATASCRIPTSROOT/scripts/python/getRunRanges.py for Dst and Jpsi. All this script does is look at your tuples, see how the candidates are distributed in run number, and split them into a number of ranges such that each range contains about a million candidates, while avoiding a last range with too few.
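
For illustration only (the real logic lives in getRunRanges.py and may differ in detail), the splitting could be sketched like this:

# Hedged sketch of the run-range definition: group candidates, ordered by run
# number, into ranges of roughly one million candidates each, folding a small
# leftover range into the previous one so the last file is not too small.
from collections import Counter

def define_run_ranges(run_numbers, target=1000000):
    counts = sorted(Counter(run_numbers).items())       # (run, n_candidates), ascending
    ranges, first, acc = [], None, 0
    for run, n in counts:
        if first is None:
            first = run
        acc += n
        if acc >= target:
            ranges.append((first, run))
            first, acc = None, 0
    if first is not None:                                # leftover candidates at the end
        if ranges and acc < target / 2:                  # too few: extend the previous range
            ranges[-1] = (ranges[-1][0], counts[-1][0])
        else:
            ranges.append((first, counts[-1][0]))
    return ranges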

do

ganga ganga_gp_getRunRanges_Dst_MagDown.py 
Please change the Stripping version accordingly.

This is just a one-subjob ganga job.

The ganga version known to work is v600r44; please check whether other versions also work.

After that, go to ../ChopTrees and do

ganga ganga_gp_chopTrees_Dst_MagDown.py 
Please also change the Stripping version if needed.

This creates a set of jobs on the batch system, which can be viewed with qstat.

All this script does is call $CALIBDATASCRIPTSROOT/scripts/sh/ChopTrees.sh, which in turn calls $CALIBDATASCRIPTSROOT/scripts/python/ChopTrees.py. This looks at the .pkl file in ChopTrees from the previous step, goes into your gangadir job directory and loops over all the subjobs. In each subjob it creates a separate file for each run range. So, for example, before this step the only ROOT file in a subjob directory would be PIDCalib.root; once this stage has finished, it will look more like PID_0_dst_k_and_pi.root etc.

You'll notice that most of the files are empty, because a given ganga subjob did not contain any runs falling into run range x, etc. That's not a problem, but it is why you need to run the job at Oxford, where there is a lot of disk space.
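
Conceptually, the chopping of one subjob's file can be sketched with PyROOT as below; the file, tree and branch names are assumptions for illustration, not necessarily those used by ChopTrees.py.

# Hedged PyROOT sketch of the chopping step; the real work is done by
# ChopTrees.py. File, tree and branch names here are assumptions.
import ROOT

run_ranges = [(111000, 112500), (112501, 114000)]    # from the .pkl of the previous step

fin = ROOT.TFile.Open("PIDCalib.root")
tree = fin.Get("DecayTree")                          # example tree name

for i, (lo, hi) in enumerate(run_ranges):
    fout = ROOT.TFile("PID_{0}_dst_k_and_pi.root".format(i), "RECREATE")
    cut = "runNumber >= {0} && runNumber <= {1}".format(lo, hi)
    chopped = tree.CopyTree(cut)                     # may well be empty for this subjob
    chopped.Write()
    fout.Close()
fin.Close()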

Fit, sWeight and final tuples

In CalibDataScripts/jobs/Stripping5TeV there are directories Dst, Lam0 and Jpsi; run the corresponding scripts in each of them to get the calibration samples.

The produced files are in the location set in configureGangaJobs.sh and look like CalibData_2015/MagDown/ with subdirectories for K, Pi, Mu, P, etc.
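
For orientation, a minimal RooFit/RooStats version of the fit-and-sWeight step for a single run range and charge might look like the sketch below; the shapes, ranges and variable names are placeholders (the real fits in CalibDataScripts are more involved), and a toy dataset stands in for one chopped run range.

# Hedged sketch of the mass fit + sWeighting for one run range / one charge.
# All shapes, ranges and names are placeholders.
import ROOT

mass  = ROOT.RooRealVar("Dst_M", "D* mass [MeV]", 1940.0, 2080.0)
mean  = ROOT.RooRealVar("mean", "mean", 2010.0, 2000.0, 2020.0)
sigma = ROOT.RooRealVar("sigma", "sigma", 8.0, 1.0, 20.0)
sig   = ROOT.RooGaussian("sig", "signal", mass, mean, sigma)
slope = ROOT.RooRealVar("slope", "slope", -0.001, -1.0, 0.0)
bkg   = ROOT.RooExponential("bkg", "background", mass, slope)
nsig  = ROOT.RooRealVar("nsig", "signal yield", 8000.0, 0.0, 1e7)
nbkg  = ROOT.RooRealVar("nbkg", "background yield", 2000.0, 0.0, 1e7)
model = ROOT.RooAddPdf("model", "sig+bkg", ROOT.RooArgList(sig, bkg),
                       ROOT.RooArgList(nsig, nbkg))

data = model.generate(ROOT.RooArgSet(mass), 10000)   # toy data for illustration

model.fitTo(data, ROOT.RooFit.Extended(True))

# Fix the shape parameters so that only the yields float when the sWeights
# are computed, then sWeight the dataset: SPlot adds per-event columns
# nsig_sw and nbkg_sw, which downstream code stores in the calibration tuple.
for par in (mean, sigma, slope):
    par.setConstant(True)
splot = ROOT.RooStats.SPlot("sData", "sPlot", data, model,
                            ROOT.RooArgList(nsig, nbkg))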

Upload files

The script used is $CALIBDATASCRIPTSROOT/scripts/python/uploadData.py. The following things need to be changed: prefix,

test

change to user scripts

-- WenbinQian - 2016-03-17
