The calibration uses single-particle samples in lcio format: gamma (10 GeV photons), mu- (10 GeV muons) and K0L (50 GeV neutral kaons). More kaon energies can be used to determine corrections for non-linearity, but for the time being this is not done.
Marlin (including digitization and PandoraPFA reconstruction) is run iteratively and in several steps so as to converge to a set of calibration constants within a required precision. In general, the steps have to be performed in a particular order, since some constants depend on the proper calibration and setting of constants determined in an earlier step. Typically the total PFO energy is collected per event and a histogram is filled, from which correction factors are extracted. The necessary information is collected in ROOT files using the PfoAnalysis Marlin processor, run after the PFO creation, with CollectCalibrationDetails turned on in the processor parameters. Crucially, Steve has written several applications that can be run over these ROOT files and perform the histogram fills, fits and calibration constant extraction. The Marlin processor and the code/applications to run the extraction of every constant are contained in the PandoraAnalysis package (link on github). The original procedure was based on GEAR for the geometry (and used events simulated with Mokka). The procedure was initially modified to run at CERN with a version of the ILD detector tweaked to incorporate the CLIC detector aspects (namely a deeper HCal, a longer detector and a different field), always using the old reconstruction framework. The CLIC detector model diverged more and more from the ILD detector model, not least through the use of an all-silicon tracker and the associated track reconstruction software. Coupled with the introduction of the new software framework based on DD4hep and the development of the new detector simulation model, it became clear that we needed a new calibration procedure.
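To make the iterative idea concrete, here is a minimal, schematic bash sketch of the kind of loop performed for a single constant (here the ECal digitization constant with 10 GeV photons). The helper names run_marlin_with_constant and extract_fitted_mean, as well as the tolerance value, are hypothetical placeholders, not the actual internals of the calibration scripts or of the PandoraAnalysis binaries.

#!/bin/bash
# Schematic sketch only: helper names are placeholders, not the real calibration script logic.
TARGET=10            # target reconstructed energy in GeV (10 GeV photons for the ECal step)
CALIBR_ECAL=42.0     # starting guess for the digitization constant
TOL=0.05             # required precision in GeV (assumed value)

for ITER in 1 2 3 4 5; do
  # Run Marlin (digitization + PandoraPFA + PfoAnalysis) with the current constant,
  # then run the extraction application on the resulting ROOT files to fit the mean energy.
  run_marlin_with_constant "$CALIBR_ECAL"                  # hypothetical helper
  MEAN=$(extract_fitted_mean pfoAnalysis_merged.root)      # hypothetical helper
  echo "iteration $ITER: fitted mean = $MEAN GeV"
  # Stop once the fitted mean is within the requested precision of the target ...
  if awk -v m="$MEAN" -v t="$TARGET" -v tol="$TOL" 'BEGIN{d=m-t; if(d<0)d=-d; exit !(d<tol)}'; then
    break
  fi
  # ... otherwise rescale the constant towards the target and iterate again.
  CALIBR_ECAL=$(awk -v c="$CALIBR_ECAL" -v m="$MEAN" -v t="$TARGET" 'BEGIN{print c*t/m}')
done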
To maximize backwards compatibility and to minimize work, the constant extraction applications are used exactly as provided in PandoraAnalysis, but the calibration scripts were heavily modified to accommodate the new DD4hep-based software: the geometry is initialized using DD4hep, the DDCaloDigi digitizers are used and, most importantly, the DDMarlinPandora package is used to interface with the PandoraPFA reconstruction. Further modifications include the addition of comments and printout statements in the calibration procedure, checks at the various steps and an attempt to make the whole procedure more robust and runnable by anyone. With the current status of the code, anybody can check out the calibration package. Assuming a condor cluster is set up in the environment, the calibration can be run with minimal modification.
svn co svn+ssh://svn.cern.ch/reps/clicdet/trunk/Users/nikiforo/DD4hepDetCalibration

No compilation is needed, since the package is merely a collection of bash and python scripts as well as some configuration templates. As mentioned above, it runs the full reconstruction with Marlin and uses PandoraAnalysis, therefore a recent installation of ILCSOFT is needed. Generally one should use the same ILCSOFT installation/version as the one used to produce the simulated single particle files used as input to the procedure.
The package contains all the scripts, configuration files and templates needed to run the calibration, assuming that the input simulation lcio files are accessible and organized appropriately. Here is a list of the most important files in the package:

Calibrate.sh: This is THE main script file that configures and launches the calibration. Within this file, between the lines labelled ##### CONFIGURATION ##### and ##### END OF MAJOR CONFIGURATION #####, lie the major configuration variables, in the form of local shell environment variables. The main parameters to be configured are (please consult the file for more details and more options):

ilcsoftPath: The path to the ILCSOFT installation to be used.

geometryModelName: The detector model geometry name. Together with ilcsoftPath (which sets the version of lcgeo), it determines the location of the DD4hep compact xml file for the detector geometry. Override the dd4hep_compact_xml variable to use an xml file in a non-standard location.

slcioPath: By default set to /scratch1/CalibDataForDD4hep/${geometryModelName}/ and assumes a certain structure for the directories. For example:

/scratch1/CalibDataForDD4hep/CLIC_o2_v04/gamma/10/
/scratch1/CalibDataForDD4hep/CLIC_o2_v04/mu-/10/
/scratch1/CalibDataForDD4hep/CLIC_o2_v04/K0L/50/

The script checks that the files exist and are accessible, preferably with a sequential set of indices contained in their filenames.

slcioFormat: By default set to ".*_([0-9]+).slcio". Used to filter the lcio files contained in the directories (not needed in this example with a well-defined directory structure), but also to determine the index of each file (as defined by the pattern ([0-9]+)). There is no need to modify this if you adhere to the file naming convention (numerical file index before the .slcio extension) and use the directory structure defined above (see the sketch after this list of parameters).

outputPath: Absolute path to a directory in which to store the output results: the constant extraction log file, the calibration constants summary, plots and a Marlin steering xml file template with the updated calibration constants set.
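The following is a small illustrative bash sketch (not part of the package) showing how the slcioPath and slcioFormat conventions fit together: it lists the photon files for one energy point and extracts the numerical index from each filename, similar in spirit to the checks performed by Calibrate.sh.

#!/bin/bash
# Illustration only: enumerate input files and extract their numerical index.
slcioPath=/scratch1/CalibDataForDD4hep/CLIC_o2_v04
slcioFormat='.*_([0-9]+).slcio'

for f in "$slcioPath"/gamma/10/*.slcio; do
  name=$(basename "$f")
  if [[ $name =~ $slcioFormat ]]; then
    echo "index ${BASH_REMATCH[1]} -> $name"
  else
    echo "WARNING: $name does not match slcioFormat" >&2
  fi
done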
clean.sh: Auxiliary script to clean the directory of all results, intermediate files and root files. Useful when you want to run a new calibration.
Xml_Generation/CLIC_PfoAnalysis_AAAA_SN_BBBB.xml: The template Marlin XML steering file used to run the reconstruction. It defines and configures the processors run during the reconstruction of the calibration events. You can replace the template with your own, or modify it to include additional processors that you need in order to run the reconstruction for your detector. For example, you may want to change the track finding and fitting algorithms (currently this uses the track cheater) or move to a different digitizer. The calibration script copies this file and modifies it for each input file (using sed) by replacing the following dummy variables (identifiable by the suffix _XXXX); a simplified sketch of this substitution is given after the Xml_Generate.py description below:

slcio_XXXX: Replaced by the input lcio file.

Gear_XXXX: Replaced by the GearOutput.xml file for the detector. Kept for backwards compatibility; the file is generated automatically using convertToGear.

COMPACTXML_XXXX: Replaced by the DD4hep compact xml file describing the geometry of the detector.

CALIBR_ECAL_XXXX, CALIBR_HCAL_BARREL_XXXX, CALIBR_HCAL_ENDCAP_XXXX, CALIBR_HCAL_OTHER_XXXX: Replaced by the latest digitization constants for the current step.

ECALBarrelTimeWindowMax_XXXX, HCALBarrelTimeWindowMax_XXXX, ECALEndcapTimeWindowMax_XXXX, HCALEndcapTimeWindowMax_XXXX, MHHHE_XXXX: Replaced by the chosen configuration values.

PSF_XXXX: Replaced by the name of the chosen Pandora Settings xml configuration file (by default PandoraSettings.xml, which is copied over from MarlinPandora along with the photon likelihood data file).

ECalGeVToMIP_XXXX, HCalGeVToMIP_XXXX, EMT_XXXX, HMT_XXXX, ECALTOEM_XXXX, HCALTOEM_XXXX, ECALTOHAD_XXXX, HCALTOHAD_XXXX, MuonGeVToMIP_XXXX: Replaced by the latest Pandora calibration constants for the current step.

pfoAnalysis_XXXX.root: Replaced by the output root file name associated with the current input slcio file.
Consult Xml_Generation/Xml_Generate.py to see how and which variables are replaced.

Xml_Generation/Xml_Generate.py: Python script responsible for modifying the Marlin xml steering file template and creating one for every job with the appropriate settings (slcio file, constants, etc.). The tokens AAAA and BBBB in the template filename are replaced by the job name and the input lcio file index, respectively, for every job xml file.
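For illustration, here is a simplified, hand-written bash equivalent of the substitution step (the real Xml_Generate.py covers all of the _XXXX variables listed above; the input file path and constants below are just example values).

#!/bin/bash
# Simplified illustration of the template substitution, not the actual Xml_Generate.py.
TEMPLATE=Xml_Generation/CLIC_PfoAnalysis_AAAA_SN_BBBB.xml
JOBNAME=10_GeV_Energy_gamma                    # example job name (replaces AAAA)
INDEX=42                                       # example lcio file index (replaces BBBB)
OUTXML="CLIC_PfoAnalysis_${JOBNAME}_SN_${INDEX}.xml"

sed -e "s|slcio_XXXX|/scratch1/CalibDataForDD4hep/CLIC_o2_v04/gamma/10/DDSim_gamma_${INDEX}.slcio|" \
    -e "s|COMPACTXML_XXXX|/path/to/CLIC_o2_v04/CLIC_o2_v04.xml|" \
    -e "s|CALIBR_ECAL_XXXX|40.3366 80.6732|" \
    -e "s|pfoAnalysis_XXXX.root|pfoAnalysis_${INDEX}.root|" \
    "$TEMPLATE" > "$OUTXML"
echo "wrote $OUTXML"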
Root_File_Generation/condorSupervisor_Calibration.sh: Script responsible for generating temporary job scripts for each input file, copying files to the worker nodes, submitting the jobs and waiting for them to finish (see the sketch after this list of files). It only works with condor. This particular version was developed for use at CERN over afs, but should be usable on any condor system. If AFS or another network file system is not present, it can still run, provided that the ILCSOFT installation files are accessible on every worker node. The input, output and log files can be transferred using the condor file transfer mechanism.
Root_File_Generation/batchSupervisor_Calibration.sh: Not working yet, but should be fixed to run over the LSF batch system or any other job scheduling system.
Root_File_Generation/DummyCondorSupervisor.sh: Allows running on a local machine (slow).
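As a rough, hedged illustration of what the condor supervisor does for each input file, the sketch below generates one submit description with condor file transfer and submits it. The real condorSupervisor_Calibration.sh is more involved (temporary wrapper scripts, bookkeeping of finished jobs, merging of outputs), and the wrapper script run_marlin.sh shown here is hypothetical.

#!/bin/bash
# Hedged sketch of a per-file condor submission using file transfer (not the actual supervisor script).
INDEX=42
XML="CLIC_PfoAnalysis_10_GeV_Energy_gamma_SN_${INDEX}.xml"    # steering file generated for this input file

cat > "job_${INDEX}.sub" <<EOF
universe                = vanilla
executable              = run_marlin.sh
arguments               = $XML
transfer_input_files    = $XML, GearOutput.xml, PandoraSettings.xml
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = job_${INDEX}.out
error                   = job_${INDEX}.err
log                     = job_${INDEX}.log
queue
EOF

condor_submit "job_${INDEX}.sub"
condor_wait "job_${INDEX}.log"    # block until this job finishes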
For the setup described here, you can log in as clicdpsw to a virtual machine (clicdpcalwn00) set up to perform the calibration. Before the calibration procedure is attempted, single particle events (photons, muons and K-longs) need to be simulated as described above. The procedure described here can be used, though it is preferable that the events be generated on a batch farm or the grid for efficiency. To be able to parallelize the reconstruction at the calibration step, it is advised to split the events (1000, 10000 events or even more per point) over many output files (typically 100), according to the capacity of your computing cluster. A current limitation of the calibration script is that it can only launch one job per input file (no grouping of files in one job or splitting of files over several jobs).
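If you need to produce such split samples yourself, one hedged way is to run ddsim repeatedly with --skipNEvents, as sketched below; the flags mirror the command line recorded in the lcio header shown further down, but the generator input file and the output naming are placeholders, and in practice this should be run on a batch farm or the grid.

#!/bin/bash
# Sketch: 100 files of 100 events each for one energy point (placeholder file names).
COMPACT=/afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/lcgeo/HEAD/CLIC/compact/CLIC_o2_v04/CLIC_o2_v04.xml
for i in $(seq 0 99); do
  ddsim --compactFile="$COMPACT" --runType=batch \
        -I=GEN_gamma_10GeV.slcio \
        -O="DDSim_CLIC_o2_v04_gamma_10GeV_${i}.slcio" \
        -N=100 --skipNEvents=$(( i * 100 )) \
        --physicsList=QGSP_BERT_HP --enableDetailedShowerMode
done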
We have collected the files (100x100 events for each point) under:
[clicdpsw@clicdpcalwn00 ~]$ ls /scratch1/CalibDataForDD4hep/CLIC_o2_v04/
gamma  K0L  mu-

/scratch1 is a mountpoint for a CERN cloud infrastructure volume (0.5 TB) attached to this machine. Notice that this directory (or the entire /scratch1) is only accessible locally on the machine (i.e. not on AFS). The files, however, will be automatically transferred to the execution nodes.
We have also checked out a version of the DD4hepDetCalibration calibration package under /scratch1. Note that this directory is also not on AFS. At least for this temporary calibration pool setup, due to a limitation of condor, the directory from which you launch the calibration and submit the jobs (i.e. /scratch1/DD4hepDetCalibration) cannot be on AFS, since the condor daemons that transfer back the output files (and the out, err and log files) do not authenticate on AFS and have no access to it. This limitation does not, however, affect the ILCSOFT installation or the input files, which are accessed by the Marlin job itself, running under the authenticated user. The input lcio files described above could in principle be hosted somewhere on AFS, provided that the authenticated user (in this case clicdpsw) has access.
In principle, you can already launch the calibration for the CLIC_o2_v04 model, as implemented in lcgeo for the ILCSOFT installation under /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/. You should verify that your files were indeed simulated with this detector model and ILCSOFT/lcgeo version by looking at the lcio header information. For example, set up ILCSOFT and look at a photon file with anajob:
[clicdpsw@clicdpcalwn00 DD4hepDetCalibration ]$ source /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/init_ilcsoft.sh
[clicdpsw@clicdpcalwn00 DD4hepDetCalibration ]$ anajob /scratch1/CalibDataForDD4hep/CLIC_o2_v04/gamma/10/DDSim_CLIC_o2_v04_gamma_10GeV0_0_250116102012_0.slcio |less
anajob: will open and read from files:
/scratch1/CalibDataForDD4hep/CLIC_o2_v04/gamma/10/DDSim_CLIC_o2_v04_gamma_10GeV0_0_250116102012_0.slcio
[ number of runs: 1, number of events: 100 ]
Run : 0 - CLIC_o2_v04:
parameter CommandLine [string]: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/lcgeo/HEAD/bin/ddsim --compactFile=/afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/lcgeo/HEAD/CLIC/compact/CLIC_o2_v04/CLIC_o2_v04.xml --runType=batch -I=GEN_gamma_10GeV.slcio -O=DDSim_CLIC_o2_v04_gamma_10GeV0_0_250116102012_0.slcio -N=100 --action.tracker Geant4TrackerWeightedAction --skipNEvents=0 --physicsList=QGSP_BERT_HP --enableDetailedShowerMode, ...
parameter compactFile [string]: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/lcgeo/HEAD/CLIC/compact/CLIC_o2_v04/CLIC_o2_v04.xml, ...

We can verify the ILCSOFT installation, DDSim version and simulation settings, lcgeo version and detector model from the relevant lines.
Before starting the calibration, it is usually a good idea to clean the directory of previous output, log files, root files, xml files and other results. Make sure you have kept the information you need! To clean, just do:

[clicdpsw@clicdpcalwn00 DD4hepDetCalibration]$ ./clean.sh

The scripts are already configured for the appropriate file names, ILCSOFT version and detector geometry. Try to run the calibration:
[clicdpsw@clicdpcalwn00 DD4hepDetCalibration]$ ./Calibrate.sh
Will perform calibration with the following settings:
ILCSOFT: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19
AnalysePerformance: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/PandoraAnalysis/HEAD/bin/AnalysePerformance
dd4hep_compact_xml: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-01-19/lcgeo/HEAD/CLIC/compact/CLIC_o2_v04/CLIC_o2_v04.xml
LCIO File path: /scratch1/CalibDataForDD4hep/CLIC_o2_v04/
Photon path: /scratch1/CalibDataForDD4hep/CLIC_o2_v04//gamma/10/ [ 100 ( 0 ... 99 ) files found ]
Muon path: /scratch1/CalibDataForDD4hep/CLIC_o2_v04//gamma/10/ [ 100 ( 0 ... 99 ) files found ]
Kaon path: /scratch1/CalibDataForDD4hep/CLIC_o2_v04//K0L/50/ [ 100 ( 0 ... 99 ) files found ]
HCALEndcapTimeWindowMax: 10
HCALBarrelTimeWindowMax: 10
ECALEndcapTimeWindowMax: 20
ECALBarrelTimeWindowMax: 20
MHHHE: 1
Run on batch system: Condor
Strike any key to continue...

If everything is properly set up and accessible, you should see the proper versions of the relevant software, as well as the file counts for the input lcio files (with a printout of the first and last index of the files as a cross-check). Press any key to launch the first set of 100 jobs to reconstruct the photon events. After some initial work (including loading the geometry and creating a Gear file), the jobs will be submitted and you should see several lines of printout ending with:
/tmp/clicdpsw/jobs.Marlin_Runfile_10_GeV_Energy_gamma.txt.tmp is empty. All jobs submitted.
Checking whether the jobs have finished...
Not finished yet, come back later.

The script will now wait for the jobs to finish so it can collect the output root files and calculate the ECal digitization constants. After a few minutes, the jobs are done, the files are copied back to the submission directory and the script continues with the calculation of the constants:
Jobs finished.
ECAL Digitisation: Update CalibrECAL
CaloHitEnergyECal (7324 entries), rawrms: 0.647749, rmsx: 0.545863 (9.83505-11.9056), low_fit and high_fit (10.0939-12.1644), << mean: 10.8111
Info in <TCanvas::MakeDefCanvas>: created default TCanvas with name c1
 FCN=0.26993 FROM HESSE     STATUS=OK             16 CALLS         339 TOTAL
                     EDM=7.71968e-08    STRATEGY= 1      ERROR MATRIX ACCURATE
  EXT PARAMETER                                   STEP         FIRST
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  p0           2.34650e+03   3.99794e+01   3.20368e-03   8.73356e-06
   2  p1           1.06986e+01   1.14501e-02   2.55075e-06  -2.52034e-02
   3  p2           2.42183e+00   9.87843e-02   4.74486e-07   9.73710e-03
Info in <TCanvas::Print>: png file /scratch1/DD4hepDetCalibration/output/Calorimeter_Hit_Energies_ECal_Digitisation.png has been created
Info in <TCanvas::SaveSource>: C++ Macro file: /scratch1/DD4hepDetCalibration/output/Calorimeter_Hit_Energies_ECal_Digitisation.C has been generated
_____________________________________________________________________________________
Digitisation of the ECal. :
For Photon energy : 10 : /GeV
Gaussian fit to ECal calorimeter hit energy has :
the following parameters, fit uses 90% data with :
min RMS: :
Amplitude : 2346.5 :
ECal Digi Mean : 10.6986 :
Standard Deviation : 0.642581 :
The total number of events considered was : 7324 :

The procedure iterates a few more times to converge on digitization constants that give a reconstructed energy as close as possible to 10 GeV. The procedure will continue similarly with the muon and kaon files to obtain more digitization constants. Then it will loop through the points, performing more iterations to obtain the Pandora calibration constants. You can monitor the progress of each step by watching the output for the fitting of the parameters and the calculation of the updated constants, as in the example for the first step above. In addition, the histograms and fits for each step are saved in png and macro formats.
You can open a new session on the machine from which to monitor the output plots. You can also use condor_q to monitor the status of the submitted jobs. Depending on the availability of the nodes, all your jobs should quickly go into the R (running) state. If they are consistently marked as I (Idle), it suggests that either all nodes are busy or they are unable to accept the jobs due to a misconfiguration of the cluster. In case the jobs stay in the Idle, Held, or another erroneous state, you can remove (i.e. kill) them with condor_rm. Further, if you observe that your jobs are being killed or are not producing the output root files (you see messages from the calibration script saying that the root files cannot be found), you can consult the condor log files.
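A few standard condor commands that are useful while the calibration jobs are running (a hedged example; the output path below matches this particular setup):

condor_q                      # list your jobs and their states (R = running, I = idle, H = held)
condor_q -hold                # show held jobs together with the hold reason
condor_rm 1234                # remove (kill) job cluster 1234; 'condor_rm $USER' removes all of your jobs
watch -n 30 'ls -lt /scratch1/DD4hepDetCalibration/output/*.png | head'   # watch new plots appear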
The last step uses the K0L files to set the hadronic energy scale (chi-squared method) and ends with the collection of the final numbers. You should see this output:
Jobs finished.
Info in <TCanvas::Print>: png file /scratch1/DD4hepDetCalibration/output/PandoraPFA_Calibration_Hadronic_Energy_Scale_Chi_Sqaured_Method_50_GeV_KaonL.png has been created
Info in <TCanvas::SaveSource>: C++ Macro file: /scratch1/DD4hepDetCalibration/output/PandoraPFA_Calibration_Hadronic_Energy_Scale_Chi_Sqaured_Method_50_GeV_KaonL.C has been generated
_____________________________________________________________________________________
For kaon energy : 50 :
m_eCalToHadInterceptMinChi2 : 49.5012 :
m_hCalToHadInterceptMinChi2 : 49.5012 :
m_minChi2 : 4199.9 :
The total number of events considered was : 4163 :
['Xml_Generate_Output.py', '1.00528117089', '1.00528117089', '0.958466960008', '1.05876999033', '40.148143635', '46.6732165399', '49.2926519882', '58.1907834102', '1', '163.934', '42.9185', '1000', '10', '10', '20', '20', '/scratch1/DD4hepDetCalibration/output/']

The output directory will contain a text file that logs each step (Calibration.txt), a text file that collects all the computed constants in the form of shell variables (calib.CLIC_o2_v04.txt), a Marlin xml steering file template with the new constants (CLIC_PfoAnalysis_AAAA_SN_BBBB.xml), as well as a set of plots for each step.
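Since calib.CLIC_o2_v04.txt stores the constants as shell variable assignments, one convenient way to reuse them is to source the file. This is a hedged sketch; the variable name in the echo line is hypothetical, so check the file for the actual names.

# Inspect and reuse the extracted constants (variable names in the file may differ).
cat /scratch1/DD4hepDetCalibration/output/calib.CLIC_o2_v04.txt
source /scratch1/DD4hepDetCalibration/output/calib.CLIC_o2_v04.txt
echo "CalibrECAL = ${CalibrECAL:-unset}"    # hypothetical variable name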
It is advisable to run ./Calibrate.sh within a screen session if you are running remotely, to avoid the ssh session dropping and having to start from scratch. However, you should make sure the screen environment is set up properly. For example, we identified the following two issues:

Add shell -$SHELL to your $HOME/.screenrc file so that screen starts your login shell.

If you run into authentication problems inside the screen session, you can renew your tokens with kinit and then aklog. However, such an error will most likely appear because $TMPDIR is not set in your screen session. Add this line to your $HOME/.screenrc file: export TMPDIR=/tmp/username, or whatever is appropriate for your case.
Some modifications were needed in PandoraAnalysis/CalibrationHelper.cc. Update: the PandoraAnalysis package has been modified privately and the merge to production is pending.
The LumiCal is not yet treated on its own (it enters through the MyDDCaloDigi configuration); the calibration procedure should eventually be modified to handle the LumiCal separately.
The different ECal absorber thicknesses are handled in the DDCaloDigi processor configuration, which accepts a vector of floats for CalibrECAL:

<parameter name="CalibrECAL" type="FloatVec">40.3366 80.6732</parameter>
<parameter name="ECALLayers" type="IntVec">17 100 </parameter>

The digitizer applies the first CalibrECAL constant (40.3366) to the first 17 layers and the second (80.6732) to the rest (the next switch of layer "family" would occur at the unrealistically large 100th layer). The second number is the result of manually multiplying the first number by a factor of 2. Notice further down in your Marlin steering file that the HCal digitization configuration has only one family of layers (the switching index is set to 100). Things that can be done (a small sketch of the factor-of-two relation follows this list):

Modify DDCaloDigi to get the layer info from the geometry and perhaps apply the factor of two automatically (however, this removes the flexibility of specifying arbitrary constants for each layer family).

Modify the template Marlin steering file (Xml_Generation/CLIC_PfoAnalysis_AAAA_SN_BBBB.xml) in the line mentioned above. For example, change 17 100 to 20 100 for an ECal whose layer thickness changes after the 20th layer. If you only have one type of layer, say 40 layers, you should change the line to 100. Especially in the last case, you should perhaps modify the calibration scripts (which assume the back family of layers has twice the absorber thickness) to either stop adding the extra calibration constant, or at least fill it with the same value as the first layer family.
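For reference, the factor-of-two relation between the two CalibrECAL values shown above can be reproduced with a short shell snippet (a sketch of the manual multiplication described here):

CALIBR_ECAL_FRONT=40.3366
CALIBR_ECAL_BACK=$(awk -v c="$CALIBR_ECAL_FRONT" 'BEGIN{printf "%.4f", 2*c}')   # gives 80.6732
echo "<parameter name=\"CalibrECAL\" type=\"FloatVec\">$CALIBR_ECAL_FRONT $CALIBR_ECAL_BACK</parameter>"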
Note that the muon hit collection names used to be MuonBarrelCollection MuonEndcapCollection, whereas they are now set to YokeBarrelCollection YokeEndcapCollection for the CLIC model.
There was also an issue in the DDSimpleMuonDigi configuration: MUONThreshold was being used instead of the correct MuonThreshold.
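If you hit this with an older steering template, a hedged quick fix is to patch the parameter name in place (a backup copy is kept with the .bak suffix):

sed -i.bak 's/MUONThreshold/MuonThreshold/g' Xml_Generation/CLIC_PfoAnalysis_AAAA_SN_BBBB.xml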
During the reconstruction you may see verbose warnings such as:

[ VERBOSE "MyDDMarlinPandora"] PhotonReconstructionAlgorithm::CreateRegionsOfInterests no photon candidates avaiable, no regions of interests are created
[ VERBOSE "MyDDMarlinPandora"] this->CreateClustersOfInterest(clusterVector) return STATUS_CODE_INVALID_PARAMETER
[ VERBOSE "MyDDMarlinPandora"] in function: Run
[ VERBOSE "MyDDMarlinPandora"] in file: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-02-15/PandoraPFANew/HEAD/LCContent-master/src/LCParticleId/PhotonReconstructionAlgorithm.cc line#: 77
[ VERBOSE "MyDDMarlinPandora"] iter->second->Run() throw STATUS_CODE_INVALID_PARAMETER
[ VERBOSE "MyDDMarlinPandora"] in function: RunAlgorithm
[ VERBOSE "MyDDMarlinPandora"] in file: /afs/cern.ch/eng/clic/work/ilcsoft/HEAD-2016-02-15/PandoraPFANew/HEAD/PandoraSDK-master/src/Api/PandoraContentApiImpl.cc line#: 175
[ VERBOSE "MyDDMarlinPandora"] Failure in algorithm 0x7df5420, PhotonReconstruction, STATUS_CODE_INVALID_PARAMETER
[ VERBOSE "MyDDMarlinPandora"] HighEnergyPhotonRecoveryAlgorithm: Failed to obtain cluster list PhotonClusters
[ VERBOSE "MyDDMarlinPandora"] SoftClusterMergingAlgorithm: Failed to obtain cluster list PhotonClusters
[ VERBOSE "MyDDMarlinPandora"] IsolatedHitMergingAlgorithm: Failed to obtain cluster list PhotonClusters

This needs to be investigated further, but it does not appear to affect the results much. It can probably be narrowed down to a particular detector region.
Another warning you may see is:

[ VERBOSE "MyPfoAnalysis"] CalibrationHelper::GetMinNHCalLayersFromEdge: Unknown exception.

This is probably related to the access of Gear information; the error is probably emitted from CalibrationHelper.cc.
The photon likelihood can be retrained using the PandoraSettingsPhotonTraining.xml settings file. After the retraining, a new photon likelihood xml file is obtained, which can be fed back into the settings of the Pandora calibration procedure so it can be repeated.
Finally, compare the steering file provided in the ClicPerformance package under examples with the Xml_Generation/CLIC_PfoAnalysis_AAAA_SN_BBBB.xml file and make sure that all other parameters, except the ones set by this calibration procedure (typically ending in _XXXX), are the same.
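One hedged way to perform this comparison is to diff the parameter lines of the two steering files while ignoring the calibration placeholders; the path to the ClicPerformance example steering file below is a placeholder and should be adapted.

CLICPERF_XML=ClicPerformance/examples/your_steering_file.xml    # placeholder path
diff <(grep '<parameter' "$CLICPERF_XML" | grep -v '_XXXX' | sort) \
     <(grep '<parameter' Xml_Generation/CLIC_PfoAnalysis_AAAA_SN_BBBB.xml | grep -v '_XXXX' | sort)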
-- NikiforosNikiforou - 2016-01-27