Alignment Position Error (APE) Estimator


Goal of this page

Description of the configuration and operation of the tool for estimating the Alignment Position Error (APE).

Introduction

The tool is called ApeEstimator and is located in the UserCode area of the CVS repository. Besides the module which actually calculates the APE parameters, it contains several auxiliary subtools, as well as scripts to run the procedure and to produce validation plots.

Setting up the tool

The current setup is tested and used with CMSSW_4_2_5. The most recent tag is V02-01-00. After setting up the CMSSW area, do the following:

  • cvs co -r V02-01-00 -d Alignment UserCode/JohannesHauk/ApeAndCpeStudies/FullVersion/Alignment
  • cvs co -r V02-01-00 -d ApeEstimator UserCode/JohannesHauk/ApeAndCpeStudies/FullVersion/ApeEstimator
  • scram b
  • bash ApeEstimator/ApeEstimator/scripts/initialise.bash

The last command creates all folders needed for the outputs, and thus for the automated procedures in the scripts. It also copies some scripts for easier handling.

Now, it is necessary to create a .root-file containing the TTree with all relevant information about the silicon modules of the tracker. This is done by a simple standalone tool, placed in Alignment/TrackerTreeGenerator, which uses the ideal geometry. The ideal geometry is chosen because it guarantees that selections of modules via their spatial coordinates always choose the same modules. For example, TOB modules on the same rod have the same design position in phi, but with a misaligned geometry a cut that happens to be placed at the nominal position could select only some of them. The file is created with:

  • cmsRun Alignment/TrackerTreeGenerator/test/trackerTreeGenerator_cfg.py

The .root-file containing the TTree can be found and browsed in Alignment/TrackerTreeGenerator/hists/TrackerTree.root, and the APE calculation reads it from there.
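
The reason for using the ideal geometry can be illustrated with a small sketch. Everything in it is hypothetical: the module names, the phi values and the cut window are invented for illustration and are not read from TrackerTree.root. It only shows how a cut edge placed at the nominal position splits a rod once small misalignment shifts are applied.

```python
# Hypothetical illustration: module names, phi values and the cut window are
# invented; they are not taken from TrackerTree.root.

# Three modules on one rod share the same design (ideal) phi position.
IDEAL_PHI = {"mod_a": 0.5, "mod_b": 0.5, "mod_c": 0.5}

# Misalignment shifts the positions slightly in either direction.
ALIGNED_PHI = {"mod_a": 0.4998, "mod_b": 0.5001, "mod_c": 0.5003}


def select_modules(phi_by_module, phi_min, phi_max):
    """Return the sorted names of modules with phi_min <= phi < phi_max."""
    return sorted(
        name for name, phi in phi_by_module.items() if phi_min <= phi < phi_max
    )


# A cut whose lower edge happens to sit exactly at the nominal position.
ideal_selection = select_modules(IDEAL_PHI, 0.5, 0.6)
aligned_selection = select_modules(ALIGNED_PHI, 0.5, 0.6)

print(ideal_selection)    # all three modules of the rod
print(aligned_selection)  # only the modules shifted to or above the cut edge
```

With the ideal positions the rod is selected as a whole; with the shifted positions the same cut keeps only part of it.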

Now, the tool is set up and the procedure for calculating APEs can be configured and started. However, in order to allow fast iterations and parallelisation, a private skim of the files which should be used is created, as explained in the following step.

Creation of Private Skim

Default Creation

In order to allow parallelisation and fast iterations, a private skim is created from the AlCaReco files. The event content is reduced to what the ApeEstimator needs, a tighter preselection is applied, and the output is split into files whose size is a compromise between the number of files and the number of events per file. To do so, the script ApeEstimator/ApeEstimator/test/batch/skimProducer.bash is run on the batch farm. It uses the configuration in ApeEstimator/ApeEstimator/test/SkimProducer/skimProducer_cfg.py. The track selection used there is defined in ApeEstimator/ApeEstimator/python/AlignmentTrackSelector_cff.py, and the trigger selection in ApeEstimator/ApeEstimator/python/TriggerSelection_cff.py. The event content is defined in ApeEstimator/ApeEstimator/python/PrivateSkim_EventContent_cff.py.

Which dataset to process is steered via a configurable parameter using VarParsing, which also allows a local test run.

In order to have the correct file names for the automated workflow, the script ApeEstimator/ApeEstimator/test/SkimProducer/cmsRename.sh must be run after the skim production.

After having run those two scripts for each dataset to be processed, the skim is ready for the automated workflow of the APE estimation. However, the folder on the CAF user diskpool where the output is stored, and from which it is later read, is not set automatically; it needs to be adjusted by the user.

Specific Creation using in addition a List of Good Tracks

In order to allow a different event and track selection based on event content which is already excluded in the AlCaReco files, the configuration defined in ApeEstimator/ApeEstimator/test/SkimProducer/skimProducer_cfg.py allows reading in an event-and-track list in a specific format. Thus, one can apply an event and track selection on the corresponding RECO file (if available), or on the AOD file (which should always be available) if its content is sufficient for the selection. This can of course be done via CRAB.

The list is a TTree in a .root-file; in case of parallel processing, the several outputs can simply be merged using hadd. The tools for generating the list, and for reading it in and selecting only the chosen tracks, are placed in ApeEstimator/Utils.

In fact, this was used to produce the most recent private skims: on 2011 data, placed in /store/caf/user/hauk/data/DoubleMu/ (the folder name is misleading: the dataset was obtained using SingleMu triggers, not DoubleMu triggers), and on 2011 MC, placed in /store/caf/user/hauk/mc/Summer11/. The old AlCaReco selection was not optimal (especially the selection on "isolated muons", which had a high fake rate), so the new selection, which was later applied to the official streams, is applied using this workaround.

The APE Estimation Tool

In order to allow parallel processing, the tool is based on two different modules. The first one (ApeEstimator) reads the events and gathers all relevant information in several .root-files. The second one (ApeEstimatorSummary) then calculates the APE values, which requires the files from the first step to be merged beforehand. The workflow is automated and based on four scripts, which need to be run sequentially, starting the next one only after all actions initiated by the previous one have finished successfully. Since the method is a local method, iterations are necessary, so the chain needs to be repeated. The configuration of the two modules is explained in the following; the scripts to run are explained in a later section.

In general, all configuration changes should be made in your final blah_cfg.py. The files blah_cfi.py define the configurable parameters, mostly without applying any selection; they are only templates and should never be changed. The files blah_cff.py give the default settings and should only be changed in exceptional cases.
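
The three-level layering can be sketched in plain Python, without CMSSW. This is only an analogy under stated assumptions: the parameter names are hypothetical, and the helper clone() is written here for illustration, loosely mimicking how a configured module is copied and modified in a CMSSW configuration.

```python
import copy

# Hypothetical parameter names, chosen only for illustration.
# "_cfi" level: template declaring every parameter without selecting anything.
TEMPLATE_CFI = {"applyTrackCuts": False, "chargePixel": [], "widthStrip": []}


def clone(params, **overrides):
    """Return an independent copy with some parameters replaced.

    Names which the template does not declare are refused, so typos in an
    override are caught instead of silently creating a new parameter.
    """
    unknown = set(overrides) - set(params)
    if unknown:
        raise KeyError(f"undeclared parameters: {sorted(unknown)}")
    result = copy.deepcopy(params)
    result.update(overrides)
    return result


# "_cff" level: default settings derived from the template.
DEFAULTS_CFF = clone(TEMPLATE_CFI, applyTrackCuts=True)

# "_cfg" level: the user's own settings, derived from the defaults.
my_cfg = clone(DEFAULTS_CFF, widthStrip=[1, 3])

print(my_cfg)
```

Because every level works on a copy, a change in the final configuration never alters the template or the defaults.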

Configuration of ApeEstimator

The ApeEstimator module is coded in ApeEstimator/ApeEstimator/plugins/ApeEstimator.cc, having the configuration template in ApeEstimator/ApeEstimator/python/ApeEstimator_cfi.py with documentation of the configurable parameters, and the default settings in ApeEstimator/ApeEstimator/python/ApeEstimator_cff.py.

For testing purposes there is one configuration file ApeEstimator/ApeEstimator/test/testApeestimator_cfg.py, but the general configuration used in the automated workflow can be found elsewhere as explained later.

Event, Track and Hit Selections

The module offers a dedicated hit and cluster selection. However, the cluster selection is common to all pixel modules, and likewise common to all strip modules. Some selections are applied to both pixel and strip hits. The selections are based on intervals: numbers must always be given in pairs, each pair defining one interval, e.g. (0.3,0.4) for one interval or (0.3,0.4, 1.8,1.9, -1.7,-1.5) for three intervals. For integer quantities, a single value can be selected by e.g. (3,3). If no number is given, no selection is applied.

The track selection is hardcoded and can only be switched on or off. This has historical reasons; it should probably be removed, with the official AlignmentTrackSelector applied before the refit instead. This would also guarantee that the track selection is identical in all iterations, whereas at present a change of the APE values can lead, via the refit, to small migrations of track parameters into or out of the selection window.
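
The interval convention described above can be sketched as follows. This is a minimal illustration and not the code of the module: the function names are hypothetical, and treating both interval bounds as inclusive is an assumption, made here so that (3,3) selects exactly the value 3.

```python
def parse_intervals(bounds):
    """Turn a flat sequence (lo1, hi1, lo2, hi2, ...) into (lo, hi) pairs."""
    if len(bounds) % 2 != 0:
        raise ValueError("interval bounds must be given in pairs")
    pairs = list(zip(bounds[0::2], bounds[1::2]))
    for lo, hi in pairs:
        if lo > hi:
            raise ValueError(f"lower bound {lo} exceeds upper bound {hi}")
    return pairs


def passes_selection(value, bounds):
    """True if value lies in any interval; an empty list applies no cut.

    Assumption: both bounds are inclusive.
    """
    pairs = parse_intervals(bounds)
    if not pairs:
        return True
    return any(lo <= value <= hi for lo, hi in pairs)


three_intervals = [0.3, 0.4, 1.8, 1.9, -1.7, -1.5]
print(passes_selection(-1.6, three_intervals))  # inside the third interval
print(passes_selection(1.0, three_intervals))   # between the intervals
print(passes_selection(3, [3, 3]))              # single integer value
print(passes_selection(42.0, []))               # no numbers: no selection
```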

Some additional selections and configurations can be applied.

Choose between APE Calculation and Control Plots

Furthermore, there are two important switches, since the module can be used for the calculation of APE values, but also purely as an analyzer, producing zillions of control plots. The APE calculation is defined in the cff.py as the module ApeEstimator, with the switch calculateApe = True. The analyzer is defined in the cff.py as the module ApeAnalyzer, with the switch analyzerMode = True. In principle, one could use both simultaneously in one module, but this rarely makes sense because of the sector definitions explained later: the APEs should be calculated for the whole tracker, while the huge amount of detailed validation plots should be produced only for some exemplary regions. The APE calculation also produces some validation plots which are in principle not necessary for the calculation; since these are the most important basic validation plots, they are implemented there in order to judge the general quality of the estimated APEs.

Granularity of APEs: Sector Definition

A group of modules which is analysed together, and for which one common APE value is calculated and assigned, is called a "sector". Sectors can be defined using any of the module information stored in the TTree produced by the standalone tool mentioned above. The sector definitions are passed to the ApeEstimator via the parameter Sectors, which is a VPSet. An empty template, which selects nothing but declares all selection parameters and explains the possible arguments, is in ApeEstimator/ApeEstimator/python/SectorBuilder_cfi.py. Each sector is defined as a PSet. Predefined sectors for the subdetectors are given in:

  • ApeEstimator/ApeEstimator/python/SectorBuilder_Bpix_cff.py
  • ApeEstimator/ApeEstimator/python/SectorBuilder_Fpix_cff.py
  • ApeEstimator/ApeEstimator/python/SectorBuilder_Tib_cff.py
  • ApeEstimator/ApeEstimator/python/SectorBuilder_Tid_cff.py
  • ApeEstimator/ApeEstimator/python/SectorBuilder_Tob_cff.py
  • ApeEstimator/ApeEstimator/python/SectorBuilder_Tec_cff.py

Further subdefinitions should be built in the same way as shown there. It is important to give each sector a name which clearly reflects its exact definition, because the name appears in all .root-files, in printouts and in histogram names, and is the only way to see which sector the results belong to. All sector definitions are then gathered in the only file that needs to be included, ApeEstimator/ApeEstimator/python/SectorBuilder_cff.py. There, the two important sector definitions (VPSets) used at present can be found. ValidationSectors is used for the tool in analyzer mode with the full set of validation plots, and should contain only those sectors one wants to look at more closely. RecentSectors defines the granularity of the APE calculation, and should span the whole tracker.
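
The idea of a sector as a named selection on module properties can be sketched in plain Python. The module records, field names and sector names are hypothetical and far simpler than the parameters declared in SectorBuilder_cfi.py; the sketch only shows how modules are grouped by sector name and how one can check that a set of sectors spans all modules.

```python
# Hypothetical module records; the real information comes from the TTree.
MODULES = [
    {"rawId": 1, "subdet": "TIB", "layer": 1},
    {"rawId": 2, "subdet": "TIB", "layer": 2},
    {"rawId": 3, "subdet": "TOB", "layer": 1},
    {"rawId": 4, "subdet": "TOB", "layer": 2},
]

# Hypothetical sectors: an empty list for a field means "no selection".
SECTORS = [
    {"name": "TibLayer1", "selection": {"subdet": ["TIB"], "layer": [1]}},
    {"name": "TibLayer2", "selection": {"subdet": ["TIB"], "layer": [2]}},
    {"name": "TobAllLayers", "selection": {"subdet": ["TOB"], "layer": []}},
]


def in_sector(module, sector):
    """True if the module passes every non-empty selection of the sector."""
    return all(
        not allowed or module[field] in allowed
        for field, allowed in sector["selection"].items()
    )


def assign_sectors(modules, sectors):
    """Map each sector name to the raw ids of the modules it contains."""
    return {
        s["name"]: [m["rawId"] for m in modules if in_sector(m, s)]
        for s in sectors
    }


def uncovered_modules(modules, sectors):
    """Raw ids of modules which belong to no sector at all."""
    return [
        m["rawId"] for m in modules
        if not any(in_sector(m, s) for s in sectors)
    ]


assignment = assign_sectors(MODULES, SECTORS)
print(assignment)
print(uncovered_modules(MODULES, SECTORS))      # empty: every module is covered
print(uncovered_modules(MODULES, SECTORS[:2]))  # TOB modules are left out
```

A check like uncovered_modules is useful for a granularity definition that is meant to span the whole tracker, whereas a set of validation sectors would deliberately leave most modules uncovered.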

Configuration of the Cluster Parameter Estimator (CPE)

The configuration of the CPE to be used in the refit is given in ApeEstimator/ApeEstimator/python/TrackRefitter_38T_cff.py, where the PixelCPE and the StripCPE are chosen. The one currently in use is called TTRHBuilderGeometricAndTemplate. The parameters can of course also be changed in your specific cfg.py, but you then need to ensure that the CPE is also included in the refit definition, see below.

Configuration of the Refit

The refit itself is also defined in ApeEstimator/ApeEstimator/python/TrackRefitter_38T_cff.py. There, the CPE has to be specified by its ComponentName, which for the one mentioned above is WithGeometricAndTemplate. Very important parameters which might influence the results are the ones steering the hit rejection (outliers and bad pixel template fits). Again, these can be overwritten in your specific configuration.

For the refitter, a sequence is defined which needs to be included in the cfg.py, since the refit also needs the offlineBeamSpot. The sequence also contains the selection of tracks flagged as highPurity, since many alignment tasks select only those. One could also apply the track selection there instead of within the ApeEstimator, but in the present configuration this is not done; only the highPurity selection is applied.

Configuration of the Geometry and the GlobalTag

The global tag and the geometry need to be specified in the cfg.py. However, never change the APE there: it always has to be the design one, with zero APE everywhere. During the iterations of the automated workflow, the APE object created in the previous iteration is picked up automatically.

Output

The final output of the ApeEstimator is one file containing the distributions needed for the second step, the ApeEstimatorSummary, and all validation plots. The output is structured in numbered folders, one per defined sector. Within each folder there is a histogram z_name, which contains only the name given to the sector and allows its identification.

Configuration of ApeEstimatorSummary

Scripts for automated workflow

Set Baseline from Design Simulation

Calculate APE Parameters

Scripts producing Validation Plots

-- JohannesHauk - 24-Jul-2012

Topic revision: r2 - 2012-07-25 - JohannesHauk
 