Introduction

HistFactory is a tool for building parametrized probability density functions (pdfs) in the RooFit/RooStats framework from simple ROOT histograms organized in an XML file. The pdf has a restricted form, but it is flexible enough to describe many analyses based on template histograms. The tool takes a modular approach, building complex pdfs from more primitive conceptual building blocks. The resulting pdf is stored in a RooWorkspace, which can be saved to and read from a ROOT file. The details of the model it creates are described in the HistFactory likelihood guide:

Here's a link to the HistFactory user's guide

The easiest way to create a HistFactory model is through the XML interface. To model an analysis, one describes it in a set of XML files that specify the channels considered, the samples used (both signal and background), and the systematics associated with those samples. Once these XML files are written, they are parsed and run with "hist2workspace", a simple command-line executable available through ROOT.

This page specifies the format of the input XML files: the XML tags and their possible values.

Configuration files for HistFactory consist of one central XML file, here referred to as the "Top Level" XML file, and several individual XML files, one for each channel.

NEW! C++ and Python bindings instead of XML: documentation here

Combination (top level XML file)

The top level specification
  • OutputFilePrefix (Required): Prefix of the output ROOT file to be created (inspection histograms)
  • Input: A list of XML files describing the channels to be combined (see below for a description of the channel XML files)
  • Function (Optional): Set a term in the model to be a function of one or more other terms. Specify the "Name" of the term to be set to a function, the "Expression" as a string that could possibly include other named parameters, and the "Dependents" as a list of those parameters.
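
As an illustration, a top-level file built from the tags above might look like the following minimal sketch. The file paths and the prefix are hypothetical, and the DOCTYPE line assumes the HistFactorySchema.dtd file shipped with ROOT is reachable from where the XML is read.

```xml
<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>
<!-- Hypothetical paths and prefix, shown only to illustrate the layout -->
<Combination OutputFilePrefix="./results/example">
  <!-- One Input element per channel XML file -->
  <Input>./config/channel_signal_region.xml</Input>
  <Input>./config/channel_control_region.xml</Input>
  <!-- Measurement elements (described below) go here -->
</Combination>
```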

Measurement

Configuration for a measurement.
  • Name (Required): To be used as a heading in output tables
  • Lumi (Required): Integrated luminosity of the measurement
  • LumiRelErr (Required): Relative error of the luminosity measurement
  • BinLow: The lowest bin number used for the measurement (inclusive)
  • BinHigh: The highest bin number used for the measurement (exclusive)
  • Mode: Type of the measurement. Use "comb"
  • ExportOnly: If "True", skip the initial fit and only export the model
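
A Measurement element using these attributes could be sketched as follows. The measurement name, the numerical values, and the parameter name SigXsecOverSM are illustrative assumptions; the POI child element is the one described in the POI section of this page.

```xml
<!-- Illustrative values: unit luminosity with a 10% relative uncertainty -->
<Measurement Name="example_measurement" Lumi="1." LumiRelErr="0.1" Mode="comb" ExportOnly="True">
  <!-- Parameter to be measured; the name must match a parameter defined in a channel file -->
  <POI>SigXsecOverSM</POI>
</Measurement>
```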

Conventions

The nominal and variational histograms should all have the same normalization convention. There are a few conventions possible:

  • Option 1:
    • Lumi="XXX" in the main XML's Measurement element, where XXX is in fb^-1
    • Histograms are in fb / bin
    • Some samples have NormFactors that are all relative to the prediction (e.g. 1 is the nominal prediction)
  • Option 2:
    • Lumi="1" in the main XML's Measurement element
    • Histograms are normalized to unity
    • Each sample has a NormFactor that is the expected number of events in data
  • Option 3:
    • Lumi="1" in the main XML's Measurement element
    • Histograms are in numbers of events / bin expected in data
    • Some samples have NormFactors that are all relative to the prediction (e.g. 1 is the nominal prediction)

The choice is yours. In the end, the expected number of events is the product of the three: N = Lumi * BinContent * NormFactor(s). See the PDF user's guide for more precise equations.
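
To make the product concrete, here is a sketch of Option 1 with made-up numbers, followed by the same bin worked through in the other two conventions. All names and values are hypothetical, and NormalizeByTheory="True" is set on the assumption that the sample should be scaled by the luminosity, as described in the Sample section.

```xml
<!-- Option 1, top-level file: luminosity given in fb^-1 -->
<Measurement Name="example" Lumi="10" LumiRelErr="0.05">
  <POI>mu</POI>
</Measurement>

<!-- Option 1, channel file: the histogram "signal" is in fb / bin -->
<Sample Name="signal" HistoName="signal" NormalizeByTheory="True">
  <NormFactor Name="mu" Val="1" Low="0" High="5" Const="True"/>
</Sample>

<!-- Expected events in one bin, N = Lumi * BinContent * NormFactor:
       Option 1: 10 * 5    * 1   = 50   (bin content 5 fb)
       Option 2: 1  * 0.25 * 200 = 50   (unit-normalized histogram, NormFactor of 200 events)
       Option 3: 1  * 50   * 1   = 50   (bin content already 50 events) -->
```

All three conventions give the same expected yield; they differ only in where the overall scale is stored.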

POI

Use this to specify which parameter is the one you'd like to measure (the parameter of interest).

ParamSetting

Specify which parameters are fixed. If you include a parameter here, it is neither a nuisance parameter nor a POI, but a fixed parameter of the model.
  • Val (Optional): The specific value is set here.
  • Const (Optional): The parameter is held fixed at the value it has where it is defined, rather than at a value set here.
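
A minimal sketch of fixing two parameters, assuming the element is placed inside a Measurement and that the parameters are listed as space-separated text. The name alpha_syst1 assumes a systematic called "syst1" whose nuisance parameter receives an "alpha_" prefix; check the parameter names in your own workspace.

```xml
<!-- Hold the luminosity parameter and one nuisance parameter constant -->
<ParamSetting Const="True">Lumi alpha_syst1</ParamSetting>
```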

ConstraintTerm

For a given parameter, specify a constraint shape other than the default Gaussian.
  • Type (Required): Specify the shape. Can be "LogNormal", "Gaussian", "Gamma", or "Uniform"
  • RelativeUncertainty (Optional): The relative uncertainty associated with the parameter (details still to be added).
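
For instance, a Gamma constraint on a systematic might be written as below. This is a sketch: the systematic name "syst2" and the 30% value are hypothetical, and the element is assumed to sit inside a Measurement with the affected parameter given as the element text.

```xml
<!-- Replace the default Gaussian constraint on "syst2" with a Gamma shape -->
<ConstraintTerm Type="Gamma" RelativeUncertainty=".3">syst2</ConstraintTerm>
```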

Channel (channel.xml file)

Channel

Top-level configuration for a channel
  • Name (Required): Name of this channel.
  • InputFile (Optional): Input file where the input histogram can be found (use an absolute path). If not specified here, it must be specified for each sample and for the data
  • HistoPath (Optional): The path (within the ROOT file) where the histogram can be found
  • HistoName (Optional): The name of the histogram to be used, unless overridden for specific samples and data

Data

Specify where the data is.
  • InputFile (Optional): Input file where the input histogram can be found (use an absolute path). If not specified by Channel, it must be specified here
  • HistoPath (Optional): The path (within the ROOT file) where the histogram can be found
  • HistoName (Optional): The name of the histogram to be used for the data
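
Putting Channel and Data together, the start of a channel file could look like this sketch. The channel name, file path, and histogram name are hypothetical, and the DOCTYPE line assumes the HistFactorySchema.dtd file shipped with ROOT.

```xml
<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>
<!-- InputFile is given once on Channel, so Data and the samples can omit it -->
<Channel Name="channel1" InputFile="/abs/path/to/example.root">
  <Data HistoName="data"/>
  <!-- StatErrorConfig and Sample elements follow here -->
</Channel>
```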

StatErrorConfig

Configure the use of bin-by-bin statistical uncertainties. This tag does not activate statistical uncertainties; they must be activated at the sample level.
  • RelErrorThreshold: The minimum relative statistical uncertainty a bin must have for it to be considered.
  • ConstraintType: The functional form of the term constraining the statistical uncertainty parameters. Current options are: "Poisson" and "Gaussian".
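
As a sketch with an illustrative 5% threshold:

```xml
<!-- Only bins with a relative statistical uncertainty of at least 5% are considered;
     their parameters are constrained with Poisson terms -->
<StatErrorConfig RelErrorThreshold="0.05" ConstraintType="Poisson"/>
```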

Sample

Signal or background contribution
  • Name (Required): Name it
  • InputFile (Optional): Input file where the input histogram can be found (use an absolute path). If not specified by Channel, it must be specified here
  • HistoPath (Optional): The path (within the ROOT file) where the histogram can be found
  • HistoName (Optional): The name of the histogram to be used for this sample
  • NormalizeByTheory (Optional): If set to "True", scale by luminosity. If set to "False", use the scaling as is
  • StatError (Optional): To include this sample's statistical uncertainty in the channel's total statistical uncertainty, set: Activate="True". By default, this will use the input histogram's uncertainties. To specify by hand a histogram describing the statistical uncertainties, set: HistoName="My_Uncertainty_Histogram_Name"
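
A background sample with its statistical uncertainty switched on might be sketched as follows; the sample and histogram names are hypothetical.

```xml
<Sample Name="background1" HistoName="background1" NormalizeByTheory="True">
  <!-- Drop the HistoName attribute to use the input histogram's own uncertainties -->
  <StatError Activate="True" HistoName="background1_statUncert"/>
</Sample>
```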

HistoSys

Specify a shape systematic
  • Name (Required): Name it
  • HistoFileHigh (Optional): Input file where the high input histogram can be found (use an absolute path).
  • HistoPathHigh (Optional): The path (within the ROOT file) where the high histogram can be found
  • HistoNameHigh (Optional): The name of the high histogram to be used for this
  • HistoFileLow (Optional): Input file where the low input histogram can be found (use an absolute path).
  • HistoPathLow (Optional): The path (within the ROOT file) where the low histogram can be found
  • HistoNameLow (Optional): The name of the low histogram to be used for this
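
A shape systematic attached to a sample could be sketched like this, assuming the varied histograms live in the same file and path as the nominal one. All names are hypothetical.

```xml
<Sample Name="background1" HistoName="background1">
  <!-- Up and down variations of the nominal template -->
  <HistoSys Name="jet_energy_scale"
            HistoNameHigh="background1_jes_up"
            HistoNameLow="background1_jes_down"/>
</Sample>
```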

OverallSys

Specify a rate systematic
  • Name (Required): Name it
  • High (Required): Relative high fluctuation. 10% is 1.10
  • Low (Required): Relative low fluctuation. 10% is 0.90
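
For example, a symmetric 5% rate uncertainty on a hypothetical signal sample would be written with factors of 1.05 and 0.95:

```xml
<Sample Name="signal" HistoName="signal">
  <!-- +5% and -5% variations of the overall rate -->
  <OverallSys Name="syst1" High="1.05" Low="0.95"/>
</Sample>
```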

NormFactor

Scaling factor, which can be used to vary the parameter of interest
  • Name (Required): Name it
  • Val (Required): Nominal value
  • High (Required): Upper edge of range to scan
  • Low (Required): Lower edge of range to scan
  • Const (Required): Set to "True"
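
A sketch of a signal-strength factor that could serve as the parameter of interest; the name SigXsecOverSM and the scan range are illustrative.

```xml
<Sample Name="signal" HistoName="signal">
  <!-- Nominal value 1 (the prediction), scanned between 0 and 3 -->
  <NormFactor Name="SigXsecOverSM" Val="1" Low="0." High="3." Const="True"/>
</Sample>
```

To measure this factor, the same name would be given in the POI element of the top-level file.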

ShapeSys

A term describing systematic uncertainties that can affect the shape of a histogram in a bin-by-bin way.
  • Name (Required): Name it
  • HistoName (Required): The name of the histogram describing the bin-by-bin uncertainties on this shape
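
A minimal sketch, with hypothetical sample and histogram names:

```xml
<Sample Name="background2" HistoName="background2">
  <!-- The referenced histogram holds one uncertainty per bin of the nominal template -->
  <ShapeSys Name="background2_shape" HistoName="background2_shape_uncertainty"/>
</Sample>
```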

ShapeFactor

A term describing the shape of a histogram in a bin-by-bin way. This term is fully flexible and completely unconstrained. A suggested usage is to make this term common to two channels. In the fit, this term will simultaneously fit to the shapes in both channels. This is a convenient way to incorporate shapes that are extracted in one channel and applied to another.
  • Name (Required): Name it
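
The suggested two-channel usage could be sketched as below. It assumes that giving the term the same Name in both channel files is what makes it common, and that the two channels share the same binning. All names are hypothetical.

```xml
<!-- In the control-region channel file -->
<Sample Name="background" HistoName="background_control_region">
  <ShapeFactor Name="background_shape"/>
</Sample>

<!-- In the signal-region channel file: same ShapeFactor name -->
<Sample Name="background" HistoName="background_signal_region">
  <ShapeFactor Name="background_shape"/>
</Sample>
```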


Major updates:
-- DanielWhiteson - 05-May-2011

%RESPONSIBLE% KyleCranmer
%REVIEW% Never reviewed

-- KyleCranmer - 21-Oct-2011

Topic revision: r5 - 2012-11-15 - KyleCranmer
 