TQAF Layer 2


Introduction

TQAF Layer 2 is the main top-specific working layer, built on PAT (TQAF Layer 1) objects. It offers tools for top analyses that aim to resolve the topology of single-top events and top anti-top events in all possible decay channels. At the moment, development is driven by the analysis of top anti-top events in the semi-leptonic decay channel. An extension to the full leptonic decay channel is planned to be implemented soon, and an extension to the full hadronic channel, in analogy to the others, should be straightforward. Keep in mind that the TQAF subsystem, like the PAT, aims to collect tools suitable for top analyses that solve technical and programming issues, so that people can get to the physics problems as quickly as possible. It does not do the analysis, nor does it restrict the user's freedom to configure, adapt, change or extend existing implementations as needed. In some cases (such as the choice or implementation of new variables for MVA methods, or of new constraints for the kinematic fit) users are explicitly encouraged to adapt the code to their needs. Descriptions of how to do this are provided below.

The most important tools of TQAF Layer 2 are described in the following. They are primarily meant to be used, in a fully configurable way, within the full framework, with their outputs interfaced as EventHypotheses and corresponding meta information to a flexible, comprehensive structure such as the TtSemiLeptonicEvent, which may be made persistent and used within FWLite in later analysis steps. The tools are nevertheless fully modularized, so each of them may also be used standalone within the full framework. If you would like to contribute to the further development of one of the tools, or to add new ones, your help is appreciated; in that case please contact Roger Wolf.

Kinematic Fit

Contact person Sebastian Naumann

This section will describe the interface to the kinematic fit. It still has to be added. In case of questions, don't hesitate to ask Sebastian.

Description

Structure

Access

Production

Event Selection (MVA based)

Contact person Manuel Renz

Description

The TopEventSelection package allows you to separate "signal" and "background" events using the MVA package developed by Christophe Saout. In the default implementation, ttbar events (semileptonic muon channel) are separated from W+jets events using a likelihood ratio with 10 input variables. Beyond the default, the user can treat any other process as "signal" or "background", implement their own input variables, and switch to neural networks or any other TMVA-based MVA method.

Structure

There are two main classes in TopEventSelection:

Two further classes are responsible for the calculation of the input-variables and their transfer to the MVA-Trainer:

Access

To use the default implementation, run TQAFLayer2 on your input files to produce TQAFLayer2 output. In this output, the likelihood-output branch is named

double_findTtSemiLepSignalSelMVA_DiscSel_TQAF
and can easily be accessed in your analyzer. Note: the likelihood output for events that do not pass the event selection criteria is set to -1. All other events have outputs between 0 and 1, with background events accumulating near 0 and signal events near 1.
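A minimal sketch of how this discriminator convention might be handled in an analysis, assuming the value has already been read from the branch. The function name classify_disc and the working point cut=0.5 are hypothetical and not part of TQAF; a real working point would have to be tuned.

```python
def classify_disc(disc, cut=0.5):
    """Interpret the MVA selection discriminator of one event.

    disc: value read from the likelihood-output branch
    cut:  hypothetical working point, to be tuned for a real analysis
    """
    if disc < 0.0:
        # -1 marks events that did not pass the event selection criteria
        return "not selected"
    # valid outputs lie in [0, 1]: background piles up near 0, signal near 1
    return "signal-like" if disc >= cut else "background-like"
```

Checking for the -1 flag first keeps rejected events from being counted as background-like.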

Production

To set up your own MVA signal selection, first produce TQAFLayer1 output for your files. It is possible to run TQAFLayer1 and the TraintreeSaver in one step, but you may run into performance problems, so producing the TQAFLayer1 output first is recommended.

to be continued...

Jet Parton Association (MVA based)

Contact person Sebastian Naumann

Description

For analyses like differential cross section measurements, the measurement of the top mass and measurements of other top characteristics there is a special need for a full or partial event reconstruction with a proper association of reconstructed jets to the quarks of the top (anti-top) decay chain(s). For the semileptonic channel, the classes

provide an implementation of the CMS MVA package for multivariate analysis methods to find this proper jet-parton association. They are located in the plugins directory of the TopJetCombination package. The CMS MVA package is a centrally provided interface to all kinds of multivariate analysis methods, such as likelihood, neural network and others. It also provides a full interface to the ROOT TMVA package. It takes most of the technical burden off the user, such as histogram and event management and the preprocessing and decorrelation of input variables. It is steered via XML steering files.

Structure

SemiLepJetCombMVACode.png

Access

Production

To produce a new .mva file which can be used as input to calculate the MVA discriminator for jet-parton hypotheses, first run the TrainTreeSaver:

cmsRun TopQuarkAnalysis/TopJetCombination/test/ttSemiLepJetCombMVATrainTreeSaver_cfg.py

The resulting tree will be stored in a file called train_save.root. You can then perform the actual training:

mvaTreeTrainer --xslt TopQuarkAnalysis/TopJetCombination/data/TtSemiLepJetCombMVATrainer.xml TopQuarkAnalysis/TopJetCombination/data/TtSemiLepJetComb.mva train_save.root

There are two output files: train_monitoring.root and TtSemiLepJetComb.mva. To inspect the resulting train_monitoring.root with the ViewMonitoring macro, do:

ln -s $CMSSW_RELEASE_BASE/src/PhysicsTools/MVATrainer/test/ViewMonitoring.C
root -l ViewMonitoring.C
A small window pops up that lets you inspect the variables used by the MVA method (just click the button inputVariables->norm). For more information on what you are looking at, see the SWGuideMVAFrameworkTutorial.

To see what happens when the .mva file is made, have a look at the input file TtSemiLepJetCombMVATrainer.xml in data/, together with the corresponding section in the SWGuideMVATrainer. By default, TQAF uses three processors: ProcNormalize, ProcMatrix and ProcLikelihood. ProcNormalize normalizes the input variables and returns the same number of variables. ProcMatrix decorrelates the variables: taking the normalized variables as input, it checks for linear correlations and calculates and applies a rotation matrix to remove them. Its output has the same number of variables as its input, but the variables should now be decorrelated. Note that by default the output of ProcMatrix is not used as input for ProcLikelihood! ProcLikelihood takes the normalized variables provided by ProcNormalize as input and outputs a single variable, the discriminator. In a next step, this discriminator can be used to select the jet-parton association as the one with the highest discriminator value.
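The last step, picking the association with the highest discriminator value, can be sketched as follows. This is an illustration only, not the TtSemiLepJetCombMVAComputer code: the function best_association and the callable discriminator (standing in for the trained MVA) are hypothetical, and treating the two light-quark jets as interchangeable is an assumption.

```python
import itertools

def best_association(n_jets, discriminator):
    """Return (combination, value) with the highest discriminator value.

    A combination assigns jet indices to the four quark roles
    (hadQ, hadQBar, hadB, lepB) of a semi-leptonic ttbar decay.
    Returns None if there are fewer than four jets.
    """
    best = None
    for combo in itertools.permutations(range(n_jets), 4):
        had_q, had_qbar = combo[0], combo[1]
        if had_q > had_qbar:
            # assumption: swapping the two light-quark jets from the W
            # gives the same hypothesis, so it is evaluated only once
            continue
        value = discriminator(combo)
        if best is None or value > best[1]:
            best = (combo, value)
    return best
```

Under this assumption, an event with four jets yields 4!/2 = 12 combinations to evaluate.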

The variables currently implemented are:

  • the angle between the quarks from the hadronically decaying W boson (function: angleHadQQBar() )
  • the angle between the reconstructed W boson and b quark jet from the hadronic decay chain (function: angleHadWHadB() )
  • the angle between b quark jet and the lepton from the leptonic decay chain (function: angleLeptonLepB() )
  • the angle between the reconstructed top and antitop (function: angleTopTop() )
  • the mass difference between the reconstructed top and antitop (function: deltaMTopTop() )
  • the mass of the reconstructed hadronically decaying W boson (function: massHadW() )
  • the mass of the reconstructed leptonically decaying W boson (function: massLepW() )
  • ...
The variables can be found in the class TtSemiLepJetComb.
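To make the kind of quantity behind these variables concrete, here is a small standalone sketch of an opening angle (in degrees, as in angleHadQQBar()) and an invariant mass (as in massHadW()). It does not use the TtSemiLepJetComb class: four-vectors are plain (px, py, pz, E) tuples, and the helper names angle_deg and invariant_mass are hypothetical.

```python
import math

def angle_deg(a, b):
    """Opening angle in degrees between the 3-momenta of two
    (px, py, pz, E) four-vectors; assumes non-zero momenta."""
    dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    norm_a = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    norm_b = math.sqrt(b[0] ** 2 + b[1] ** 2 + b[2] ** 2)
    # clamp to [-1, 1] to guard against rounding errors
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))

def invariant_mass(*vectors):
    """Invariant mass of the sum of several (px, py, pz, E) four-vectors."""
    px, py, pz, e = (sum(v[i] for v in vectors) for i in range(4))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # negative m2 can only come from rounding
```

For example, the mass of a hadronic W candidate would be invariant_mass(had_q_jet, had_qbar_jet) for the two jets assigned to the light quarks.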


Adding new variables

If you want to use further variables just add them to the TtSemiLepJetComb class, e.g.:

class TtSemiLepJetComb {
  // common calculator class for likelihood
  // variables in semi leptonic ttbar decays
public:

  TtSemiLepJetComb();
  TtSemiLepJetComb(const std::vector<pat::Jet>&, const std::vector<int>,
      const math::XYZTLorentzVector&, const math::XYZTLorentzVector&);
  TtSemiLepJetComb(const std::vector<pat::Jet>&, const std::vector<int>, const math::XYZTLorentzVector&);
  ~TtSemiLepJetComb();

  double angleHadQQBar() const { return ROOT::Math::VectorUtil::Angle(hadQJet, hadQBarJet) * TMath::RadToDeg(); }
  ...
  // newly added variable: azimuthal angle between the neutrino and the leptonic b jet
  double deltaPhiMetLepB() const { return ROOT::Math::VectorUtil::DeltaPhi(neutrino, lepBJet); }
  ...
};
In TtSemiLepJetCombEval.h you have to add the line
 values.push_back( PhysicsTools::Variable::Value("deltaPhiMetLepB", jetComb.deltaPhiMetLepB()) );
to the evaluateTtSemiLepJetComb method, so that the variable is calculated and written to the vector values that is used by the MVATrainer.

Finally, you have to add the new variables to the XML steering file (e.g. TtSemiLepJetCombMVATrainer_Muons.xml). Take care to get the order and naming right:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<MVATrainer>
    <general>
        <option name="id">TtSemiLepJetCombMVATrainer</option>
        <option name="trainfiles">train_%1$s%2$s.%3$s</option>
    </general>
    <input id="input">
        <var name="angleHadQQBar" multiple="false" optional="false"/>
    </input>
    <processor id="norm" name="ProcNormalize">
        <input>
            <var source="input" name="angleHadQQBar"/>
        </input>
        <config>
            <pdf/>
            ...
            <pdf/>
        </config>
        <output>
            <var name="var1"/>
            ...
            <var name="var11"/>
        </output>
    </processor>
    <processor id="rot" name="ProcMatrix">
        <input>
            <var source="norm" name="var1"/>
            ...
            <var source="norm" name="var11"/>
        </input>
        ...
    </processor>
    ...
</MVATrainer>

Finally, to use the result of the training in your analysis, you have to run the TtSemiLepJetCombMVAComputer. The example config file for cmsRun is TopQuarkAnalysis/TopJetCombination/test/ttSemiLepJetCombMVAComputer_cfg.py. Make sure that the TtSemiLepJetCombMVAComputer comes before the analyzer that reads the MVA result in your path, and that you use the correct .mva file, i.e. the one produced by the training on your preferred Monte Carlo sample with your event selection. You may want to replace the path given in TtSemiLepJetCombMVAComputer_cff.py, for example by including a line like the following in your config file:

process.TtSemiLepJetCombMVAFileSource.ttSemiLepJetCombMVA = "MySubsystem/MyPackage/data/MyJetComb.mva"

Jet Parton Association (GenEvent based)

Contact person Sebastian Naumann

Description

This revised version of the jet-parton matching was introduced in a presentation. It matches the partons from top quark pair production to jets using one of four algorithms, detailed below:

Structure


totalMinDist

Main idea: successively use the jet-parton pair with the smallest ΔR in the event

Procedure:

  • calculate ΔR for all possible jet-parton pairs in the event and store the values in a vector
  • sort the vector with respect to ΔR
  • match the jet and parton corresponding to the smallest ΔR in the vector
  • remove all entries belonging to the matched jet or the matched parton from the vector
  • continue matching and removing until the vector is empty

Comments:

  • default procedure in the TQAF (TopQuarkAnalysis/TopTools/src/JetPartonMatching.cc)
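The procedure can be sketched in a few lines. This is an illustration, not the code in JetPartonMatching.cc: it assumes the distance measure is ΔR = sqrt(Δη² + Δφ²), represents partons and jets as (eta, phi) tuples, and uses the hypothetical function names delta_r and total_min_dist.

```python
import math

def delta_r(a, b):
    """Distance between two (eta, phi) pairs (assumed measure: Delta R)."""
    d_phi = math.remainder(a[1] - b[1], 2.0 * math.pi)  # wrap into [-pi, pi]
    return math.hypot(a[0] - b[0], d_phi)

def total_min_dist(partons, jets):
    """Greedy matching: repeatedly take the closest remaining pair.

    Returns a dict {parton index: jet index}.
    """
    # distances of all possible jet-parton pairs, smallest first
    pairs = sorted(
        (delta_r(p, j), ip, ij)
        for ip, p in enumerate(partons)
        for ij, j in enumerate(jets)
    )
    matches, used_jets = {}, set()
    for _, ip, ij in pairs:
        # skipping used entries is equivalent to removing them from the vector
        if ip in matches or ij in used_jets:
            continue
        matches[ip] = ij
        used_jets.add(ij)
    return matches
```

Partons that find no free jet (more partons than jets) are simply absent from the returned dict.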


minSumDist

Main idea: find the combination of jet-parton pairs with the smallest ΣΔR

Procedure:

  • successively find all possible combinations of jet-parton pairs (using a recursive approach)
  • along the way, calculate ΣΔR for each combination
  • store information about a combination if its ΣΔR is smaller than the smallest found so far
  • in the end, match jets and partons as in the combination that was found to have the smallest ΣΔR

Comments:

  • minimizing ΣΔR prevents the matching for the whole event from being spoiled by a single jet or parton
  • disadvantage: large combinatorics
  • this approach is used in the CMSSW JetMCAlgos (PhysicsTools/JetMCAlgos/plugins/CandOneToOneDeltaRMatcher.cc) as the BruteForce algorithm (not the SwitchMode)
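A brute-force sketch of this idea, under the same caveats: the distance is assumed to be ΔR, partons and jets are (eta, phi) tuples, the function names are hypothetical, and itertools.permutations replaces the recursive enumeration used in the TQAF code.

```python
import itertools
import math

def delta_r(a, b):
    """Distance between two (eta, phi) pairs (assumed measure: Delta R)."""
    d_phi = math.remainder(a[1] - b[1], 2.0 * math.pi)  # wrap into [-pi, pi]
    return math.hypot(a[0] - b[0], d_phi)

def min_sum_dist(partons, jets):
    """Return ({parton index: jet index}, summed distance) for the
    assignment with the smallest summed distance."""
    best_sum, best_combo = math.inf, ()
    # every way of assigning distinct jets to the partons
    for combo in itertools.permutations(range(len(jets)), len(partons)):
        total = sum(delta_r(p, jets[ij]) for p, ij in zip(partons, combo))
        if total < best_sum:
            best_sum, best_combo = total, combo
    return dict(enumerate(best_combo)), best_sum
```

In the test case below, a greedy closest-pair-first matching would pair the second parton with the first jet (distance 0.4) and end up with a summed distance of 1.9, whereas the minimum is 1.1. The price is the number of combinations, which grows factorially with the number of jets.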


ptOrderedMinDist

Main idea: the position is supposed to be measured with higher accuracy for harder particles than for softer ones

Procedure:

  • sort partons with respect to pT in descending order
  • starting with the hardest parton, find the jet with the smallest ΔR to this parton and match it
  • consecutively match the other partons, ignoring jets that have already been assigned to a (harder) parton

Comments:

  • it was confirmed by earlier studies that better matching is achieved for hard than for soft jets
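A sketch of the pT-ordered procedure, assuming ΔR as the distance, partons given as (pt, eta, phi) tuples and jets as (eta, phi) tuples. The function names are hypothetical and the code is not taken from the TQAF.

```python
import math

def delta_r(a, b):
    """Distance between two (eta, phi) pairs (assumed measure: Delta R)."""
    d_phi = math.remainder(a[1] - b[1], 2.0 * math.pi)  # wrap into [-pi, pi]
    return math.hypot(a[0] - b[0], d_phi)

def pt_ordered_min_dist(partons, jets):
    """Match partons to jets, hardest parton first.

    partons: list of (pt, eta, phi); jets: list of (eta, phi).
    Returns a dict {parton index: jet index}.
    """
    # parton indices sorted by descending pT
    order = sorted(range(len(partons)), key=lambda i: partons[i][0], reverse=True)
    free_jets = set(range(len(jets)))
    matches = {}
    for ip in order:
        if not free_jets:
            break  # more partons than jets: softer partons stay unmatched
        direction = partons[ip][1:]
        ij = min(free_jets, key=lambda j: delta_r(direction, jets[j]))
        matches[ip] = ij
        free_jets.remove(ij)  # a jet can be assigned only once
    return matches
```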


unambiguousOnly

Main idea: in order to avoid mismatches, do not tolerate any ambiguity within some ΔR

Procedure:

  • for each parton, find all jets that lie within this ΔR
  • if exactly one jet is found, match it to the parton
  • if no jet or more than one jet is found, dismiss the whole event

Comments:

  • only suited for clean events with well separated jet-parton pairs
  • in this case, the order of the jets and partons when looping over them is of no importance
  • very high purity, the lowest efficiency
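A sketch of this strictest variant, assuming ΔR as the distance and (eta, phi) tuples for partons and jets. The function names and the default max_dist=0.3 are hypothetical. Dismissing the event when one jet is the only candidate for two different partons is an additional assumption not spelled out in the procedure above.

```python
import math

def delta_r(a, b):
    """Distance between two (eta, phi) pairs (assumed measure: Delta R)."""
    d_phi = math.remainder(a[1] - b[1], 2.0 * math.pi)  # wrap into [-pi, pi]
    return math.hypot(a[0] - b[0], d_phi)

def unambiguous_only(partons, jets, max_dist=0.3):
    """Return {parton index: jet index}, or None if the event is dismissed.

    max_dist is a hypothetical default for the allowed distance.
    """
    matches = {}
    for ip, parton in enumerate(partons):
        close = [ij for ij, jet in enumerate(jets) if delta_r(parton, jet) < max_dist]
        if len(close) != 1:
            return None  # no jet or several jets in range: dismiss the event
        if close[0] in matches.values():
            return None  # assumption: a jet shared by two partons is ambiguous too
        matches[ip] = close[0]
    return matches
```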

The code is located in the TopTools subdirectory of the package. The configuration files to run the jet parton matching for semi-leptonic and full-hadronic top pair production can be found in the TopEventProducers subdirectory of the package.

Access

Production

Event Hypotheses

Contact person Roger Wolf

This section will describe the currently implemented event hypotheses for top quark analyses of top anti-top event topologies in the semi-leptonic decay channel. It still has to be added. In case of questions, don't hesitate to ask Roger.

Description

Structure

Access

Production

-- RogerWolf - 19 Jun 2008

Topic attachments
Attachment: 080121.pdf (r1, 151.0 K, 2012-05-21 14:11, SebastianNaumann)
Topic revision: r27 - 2012-05-21 - SebastianNaumann
 