Summary

TauAnalysisMaker is a python script that converts a list of cuts and a few options into all the PAT selectors and TauAnalysis config required to do an analysis. Cuts can then be changed or rearranged by editing a single place, instead of editing multiple locations and having to understand their different naming conventions.

It is found in TauAnalysis/Configuration/python/tools/tauAnalysisMaker.py
An example of an analysis using it can be found in TauAnalysis/Configuration/test/runbbAHtoElecTau_cfg.py

The cut definitions for TauAnalysisMaker look like:

cuts = [
    GenPhaseSpaceCut(),
    TriggerCut(triggerPaths=cms.vstring("HLT_IsoEle15_L1I")),
    ElectronCut(cut=cms.string("pt > 15.")),
    TauCut(title="My Cut", cppclass="myedfilter", params=cms.PSet(...)),
    ElecTauPairCut(title="Zero Charge", cut=cms.string("charge = 0")),
    ...
]

Concepts

Objects

An object in TauAnalysisMaker is a type of particle that starts from a pre-existing PAT collection and on which we want to do a series of chained selections. Typical objects might be "electron" (from cleanLayer1Electrons), "tau" or "elecTauPair". Every cut must name the object it is applied to.

TauAnalysisMaker has a built-in set of objects it understands, but you can add more. For instance, if you wanted to create two different electron analysis sequences, one of them with loose isolation, and then compare them, you would define an extra object called looseelectron or similar, write a series of cuts using it and then add a cut stage that compares the two.

The built-in objects understood are:

  • electron
  • tau
  • muon
  • jet
  • met
  • vertex
  • elecTauPair
  • muTauPair
  • diTauPair
  • genPhaseSpace
  • trigger

If you want to override the source of a collection (for instance, to use something other than cleanLayer1Electrons for the electron collection) or change how a composite collection is produced (eg, change the dRmin12 for elecTauPair production), you can provide your own definition to override the defaults.

Cuts

A TauAnalysis cut is defined using a python dictionary object which gives the type of object it operates on, the EDFilter class to use and any needed configuration parameters for that class. The cut definition is used to generate:

  • PAT Selector (both individual and cumulative if desired)
  • BoolEventSelector for this cut
  • GenericAnalyzer filter result loader
  • GenericAnalyzer analysis sequence filter and histogram manager entries

In the GenericAnalyzer, the cuts are carried out in the order in which they are listed in the cut list. The PAT selectors, however, are grouped into sequences by object type: all electron selections are performed together, then all tau selections, even if you mix them in the analysis sequence. This shouldn't affect the outcome, provided you don't refer to other collections in a cyclic way. You have to specify the order in which these sequences are executed in the object_order parameter. For instance, if you are making elecTauPairs, make sure that selections on these are done after selections on electrons and taus.
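
The difference between the two orderings can be sketched in plain python. This is only an illustration, not the TauAnalysisMaker source: the function name group_selectors is hypothetical, and plain strings stand in for cms.string(...) so the snippet runs without CMSSW.

```python
# Illustrative sketch only: not the TauAnalysisMaker source.
# Plain strings stand in for cms.string(...) so this runs without CMSSW.

def group_selectors(cuts, object_order):
    """Group cut strings by object type, following object_order.

    Cuts on objects missing from object_order are left out, and the
    relative order of cuts on the same object is preserved.
    """
    groups = dict((name, []) for name in object_order)
    for cut in cuts:
        if cut['object'] in groups:
            groups[cut['object']].append(cut['cut'])
    return [(name, groups[name]) for name in object_order]

cuts = [
    {'object': 'electron', 'cut': 'pt > 15'},
    {'object': 'tau', 'cut': 'pt > 20'},
    {'object': 'electron', 'cut': 'abs(eta) < 2.1'},
    {'object': 'elecTauPair', 'cut': 'charge = 0'},
]
object_order = ['electron', 'tau', 'elecTauPair']

# Order seen by the analyzer: exactly as listed.
analysis_order = [cut['cut'] for cut in cuts]
# Order of the selector sequences: grouped by object type.
selector_order = group_selectors(cuts, object_order)
```

Here the analyzer would apply the second electron cut after the tau cut, while the selector sequences would run both electron selections before any tau selection.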

Plots

(Warning: Plotting is still fairly primitive and likely to change...)

A TauAnalysis plot is one that would previously be defined manually in plotXtoYZ_drawJobs_cfi. These are stacked plots that you want made at certain points of the process (mostly at the end). A plot definition is again a python dictionary object, although it can be defined using helper functions.

You can define plots that will be made under 4 different conditions:

  • After every single cut
  • After all cuts
  • After each cut on a given object type
  • After only a specific cut

Hopefully this should allow all the combinations people want. To define a plot, the only required information is the name of the MonitorElement saved by the Histogram Manager in question.

Usage

To use TauAnalysisMaker in an analysis (see the runbbAHtoElecTau_cfg example above), you need a config file which already does PAT production (or loads PAT ntuples), and already has a process defined and a path you can insert the results into. TauAnalysisMaker ultimately returns a sequence which needs to be added to that path.

This should usually look something like:

...
from TauAnalysis.Configuration.tools.tauAnalysisMaker import *

cuts = [ ... ] # see next section
options = { ... } # see next section

maker = TauAnalysisMaker(cuts,options,process)

process.p = cms.Path(
  process.producePrePat+
  process.patDefaultSequence+
  process.producePostPat+
  maker.createObjects()+
  process.savebbAHtoElecTauPlots
)

print maker

You need to have already imported cms, set up a process, imported the TauAnalysis PAT sequences, set up a source, etc. See the example. TauAnalysisMaker has to be passed the process object as the third argument in order to add all the filters and sequences it creates to the process namespace (otherwise cmsRun will complain that they don't exist). The print maker command at the end prints out a cleaned up list of what filters have been created, their labels, titles etc.

Cuts

This is the most important part. The list of cuts is given as a plain python list, containing one python dictionary for each cut. Any of these three syntaxes works:

cuts = [
  {'object': 'electron', 'cut': cms.string('pt > 15')},
  Cut('electron', cut=cms.string('pt > 15')),
  ElectronCut(cut=cms.string('pt > 15'))
]

In the first case we write out a dictionary by hand, in the second case use the function Cut(object, **args) and in the third case use one of the special functions defined for the built-in object types. The functions just return an equivalent dictionary to the first line, but let you use the function syntax you may be more familiar with.
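
A minimal sketch of how such helper functions could be written, to show that the three forms are equivalent. It is a stand-alone re-implementation for illustration, not the code in tauAnalysisMaker.py, and it uses plain strings in place of cms.string(...) so that it runs without CMSSW.

```python
# Hypothetical re-implementation of the helpers, for illustration only.
# Plain strings stand in for cms.string(...) so this runs without CMSSW.

def Cut(obj, **args):
    """Return the cut dictionary for the given object name."""
    cut = dict(args)
    cut['object'] = obj
    return cut

def ElectronCut(**args):
    """Shorthand for Cut('electron', ...)."""
    return Cut('electron', **args)

by_hand = {'object': 'electron', 'cut': 'pt > 15'}
generic = Cut('electron', cut='pt > 15')
special = ElectronCut(cut='pt > 15')
```

All three names end up bound to equal dictionaries, so which syntax to use is purely a matter of taste.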

The possible arguments for a cut are:

  • object: The name of one of the built-in objects, or one that you have supplied a definition for. Only needed if you aren't using a specialised function like ElectronCut or DiTauPairCut.
  • title: The cut title to display in the summary table at the end of the run. This is optional if there is a cut parameter supplied (which will be used as the title).
  • label: The internal label used, for instance in selectedObjectLabelCumulative or evtSelObjectLabel. This is generated automatically from the cut or other info and shouldn't usually need to be specified.
  • cppclass: The C++ class to use (should be an EDFilter). If this isn't given, it's assumed to be PATObjectSelector (eg PATElectronSelector). (This had to be cppclass instead of class because class is a reserved word in python).
  • plots: A python list of any plots to be made after only this cut.
  • Finally, you need to add any arguments this filter class requires, using cms.* types, eg dRmin=cms.double(0.7).
  • If you need to refer to another collection, use the syntax $last(object), eg srcNotToBeFiltered=cms.VInputTag("$last(electron)","$last(muon)").
  • If you don't specify a class (ie, we use PATObjectSelector), the parameter filter=cms.bool(False) is supplied automatically.

In most cases you should be able to define a cut using just ObjectCut(cut=cms.string('cut')). The title and label are then generated automatically, and the src and filter parameters are added when the sequences are constructed.
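
The defaults described above could be filled in roughly as follows. This is a sketch under assumptions: fill_defaults is a hypothetical name, the rule used to build the selector class name from the object name is a guess based on the PATElectronSelector example, plain python values stand in for cms.* types, and label generation is left out because its rules are not described here.

```python
# Illustrative sketch only: not the TauAnalysisMaker source.
# Plain python values stand in for cms.string / cms.bool.

def fill_defaults(cut):
    """Return a copy of the cut with title, cppclass and filter filled in."""
    filled = dict(cut)
    # The cut string doubles as the title when no title is given.
    if 'title' not in filled and 'cut' in filled:
        filled['title'] = filled['cut']
    # Without an explicit class, assume the PAT selector for the object
    # (assumed naming rule: electron -> PATElectronSelector).
    if 'cppclass' not in filled:
        obj = filled['object']
        filled['cppclass'] = 'PAT' + obj[0].upper() + obj[1:] + 'Selector'
        filled.setdefault('filter', False)
    return filled

simple = fill_defaults({'object': 'electron', 'cut': 'pt > 15'})
custom = fill_defaults({'object': 'tau', 'title': 'My Cut', 'cppclass': 'myedfilter'})
```

A cut that names its own class is passed through unchanged, so any parameters that class needs have to be given explicitly.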

Plots

A plot is just a python dictionary, but you will usually want to define it using the Plot helper function.

Plot("monitorElementName",
     title="Plot of X for $title",
     xAxis="xAxis",
     name="filename"
)

All arguments except for the monitor element name are optional. title defaults to "Cut title: monitor element name". xAxis needs to be one of the axes defined in the xAxes PSet parameter of your DQMHistPlotter (an attempt to pick an appropriate one based on the MonitorElement name may be added). name is the filename for the output file (although DQMHistPlotter adds a prefix and fileformat to it), which defaults to "afterCut_monitorElement".

In all cases, plots have to be supplied as a python list, eg:
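
One possible reading of these defaults, as a sketch. Everything here is an assumption made for illustration: the 'me' dictionary key, the resolve_plot function, the cut label 'ElectronPt', the exact format of the default title and filename, and the idea that $title is replaced by the title of the cut the plot follows.

```python
# Illustrative sketch only: key names and default formats are assumptions.
from string import Template

def Plot(me_name, **args):
    """Return a plot dictionary; 'me' is a hypothetical key name."""
    plot = dict(args)
    plot['me'] = me_name
    return plot

def resolve_plot(plot, cut_title, cut_label):
    """Fill in the title and filename for a plot made after one cut."""
    resolved = dict(plot)
    # Assumed default title: "<cut title>: <monitor element name>".
    title = resolved.get('title', '$title: ' + resolved['me'])
    resolved['title'] = Template(title).safe_substitute(title=cut_title)
    # Assumed default filename: "after<cut label>_<monitor element name>".
    resolved.setdefault('name', 'after' + cut_label + '_' + resolved['me'])
    return resolved

default_plot = resolve_plot(Plot('meName1'), 'pt > 15', 'ElectronPt')
custom_plot = resolve_plot(
    Plot('meName1', title='Plot of X for $title', name='filename'),
    'pt > 15', 'ElectronPt')
```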

options['plots']=[
    Plot("meName1"),
    Plot("meName2")
]

To use plotting, you have to create histogram adding, DQM loading and a drawJobConfigurator as normal. You then create an instance of TauAnalysisMaker (see above) using the same cuts and options as for analysis, call maker.createObjects() to do internal processing (FIXME) and then call maker.generateDrawJobs(myDrawJobConfigurator) to generate all the drawJobs. This prints out a list of the plots being created.

Options

In addition to cuts, you have to supply a python dictionary containing misc options. At a minimum, this needs to consist of:

options = {
    "name" : "bbAHtoElecTau" ,
    "object_order" : [ "vertex", "electron", "tau", "jet", "elecTauPair" ]
}

name is the name that is appended to all the sequences and the analyzer title. It should usually be the descriptive name of the physics process (like "ZtoElecMu") but can be anything your twisted imagination can come up with, really.

object_order is important as it defines the order in which the PAT selector sequences are run. In the example above the order is determined because:

  • electrons are checked for compatibility with the primary vertex found
  • taus are checked for overlap with electrons
  • jets are checked for overlap with electrons and taus
  • elecTauPairs are built out of electrons and taus

Every object that results in PAT selectors needs to be in this list or it will not run. If the order is wrong it will still work, but with undesirable consequences. For example, if the taus are placed first, they will be checked for overlap against cleanLayer1Electrons (the most recent electron collection known until the electron sequence has run) instead of selectedWhateverElectronsCumulative.
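
The effect of a wrong order can be sketched by tracking the most recent collection for each object and resolving $last(object) against it. This is an illustration only: the function names, the selected...Cumulative labels and the cleanLayer1Taus source used here are assumptions, not taken from the TauAnalysisMaker source.

```python
# Illustrative sketch only: labels and function names are hypothetical.
import re

def resolve_last(text, latest):
    """Replace each $last(object) with that object's latest collection."""
    return re.sub(r'\$last\((\w+)\)',
                  lambda match: latest[match.group(1)], text)

def resolve_in_order(object_order, sources, selected, references):
    """Resolve each object's reference at the moment its sequence runs."""
    latest = dict(sources)
    resolved = {}
    for name in object_order:
        if name in references:
            resolved[name] = resolve_last(references[name], latest)
        # After this object's sequence, its cumulative collection is newest.
        latest[name] = selected[name]
    return resolved

sources = {'electron': 'cleanLayer1Electrons', 'tau': 'cleanLayer1Taus'}
selected = {'electron': 'selectedElectronsCumulative',
            'tau': 'selectedTausCumulative'}
# The tau overlap check refers to the latest electron collection.
references = {'tau': '$last(electron)'}

right_order = resolve_in_order(['electron', 'tau'], sources, selected, references)
wrong_order = resolve_in_order(['tau', 'electron'], sources, selected, references)
```

With electrons first, the tau check sees the selected electrons. With taus first, it silently falls back to cleanLayer1Electrons.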

There are also a number of optional parameters to the options dictionary.

histmanagers should be a cms.VPSet of HistManager names. If you don't specify the list of histmanagers, the appropriate one for each entry in your object_order list is added, plus GenPhaseSpaceEventInfoHistManager and TriggerHistManager.

eventDumps should be a cms.VPSet of eventDump definitions. If you don't specify this then the appropriate eventDump for each LeptonLeptonPair in your object_order list is added.

plots contains a python list of plot objects to be made after every single cut.

endplots contains a python list of plot objects to be made only at the end.

objects contains a dictionary of any custom or overridden object definitions you wish to use. This is merged with the default list, with your new definitions taking priority. An object definition looks like:

"electron": {
    "source": "cleanLayer1Electrons",
    "replace": "electronHistManager.electronSource = $last(electron)",
    "individual": True
}

This defines the source collection, how to update histmanagers when this object is filtered, and whether to create both selected...Individual and selected...Cumulative collections.
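
Merging user-supplied object definitions with the defaults might behave like the sketch below. It assumes a whole definition replaces the default of the same name (a shallow merge), which is not confirmed here. The names merge_objects, myElectrons and looseElectron are hypothetical, and cleanLayer1Taus is an assumed default source.

```python
# Illustrative sketch only: assumes whole definitions are replaced.

DEFAULT_OBJECTS = {
    'electron': {'source': 'cleanLayer1Electrons', 'individual': True},
    'tau': {'source': 'cleanLayer1Taus', 'individual': True},
}

def merge_objects(defaults, overrides):
    """Return the defaults updated with the user's definitions."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged

user_objects = {
    # Override an existing object with a different source collection.
    'electron': {'source': 'myElectrons', 'individual': False},
    # Add a completely new object.
    'looseElectron': {'source': 'cleanLayer1Electrons', 'individual': True},
}

objects = merge_objects(DEFAULT_OBJECTS, user_objects)
```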

If you want to create a produced object (one produced from other objects, like elecTauPair), the definition looks like:

"elecTauPair": {
      'source': 'allElecTauPairs',
      'replace':'diTauCandidateHistManagerForElecTau.diTauCandidateSource = $last(elecTauPair)',
      'individual':True,
      'producer':{
        'class':'PATElecTauPairProducer',
        'useLeadingTausOnly':cms.bool(False),
        'srcLeg1':cms.InputTag('$last(electron)'),
        'srcLeg2':cms.InputTag('$last(tau)'),
        'dRmin12':cms.double(0.3),
        'srcMET':cms.InputTag('$last(met)'),
        'recoMode':cms.string(''),
        'verbosity':cms.untracked.int32(0)
    }
}

The extra producer definition contains the class name and all parameters required to create the EDProducer class that makes this object. The producer class will be given the label in the source field - allElecTauPairs in this case.

You can also define a plots field, containing a python list of plots to be made after each cut involving this object.
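
Putting the four places a plot can be defined together, the plots made after a given cut could be collected as in this sketch. The function name plots_after_cut and the order in which the lists are combined are assumptions, and plain strings stand in for plot dictionaries.

```python
# Illustrative sketch only: not the TauAnalysisMaker source.
# Plain strings stand in for plot dictionaries.

def plots_after_cut(index, cuts, options):
    """Collect every plot that should be made after cuts[index]."""
    cut = cuts[index]
    plots = list(options.get('plots', []))        # after every single cut
    object_def = options.get('objects', {}).get(cut['object'], {})
    plots += object_def.get('plots', [])          # after each cut on this object
    plots += cut.get('plots', [])                 # after only this cut
    if index == len(cuts) - 1:
        plots += options.get('endplots', [])      # after all cuts
    return plots

cuts = [
    {'object': 'electron', 'cut': 'pt > 15', 'plots': ['elecPtPlot']},
    {'object': 'tau', 'cut': 'pt > 20'},
]
options = {
    'plots': ['everyCutPlot'],
    'endplots': ['finalPlot'],
    'objects': {'tau': {'plots': ['tauPlot']}},
}

after_first = plots_after_cut(0, cuts, options)
after_last = plots_after_cut(1, cuts, options)
```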

Known Issues

  • There is no way yet to use anything other than a MinEventSelector requiring at least 1 object. The cut syntax needs some way to handle using min and max selectors.
  • It hasn't really been tested outside of ElectronTau, and even then not that widely.
  • Although it has quite a few explanatory error messages, there are still many situations where you get completely unhelpful errors that only make sense to me.
  • Label and title generation methods don't handle conflicts sensibly.

-- GordonBall - 28 Jul 2009

Topic revision: r4 - 2009-07-31 - GordonBall
 