Gen/GenTune - Monte Carlo (MC) Generators tuning with Rivet in Gauss

For an introduction to RIVET and Professor and their current state of implementation in LHCb, please consult the page containing the list of talks given to date on this subject.

Gen/GenTune is a package in the GAUSS project which provides the Gaudi algorithm RivetAnalysisHandler, wrapping the Rivet::AnalysisHandler class in order to run a set of Rivet plug-ins (analysis modules) from within the Gauss run-time environment. The algorithm operates on a deep copy of the HepMC event object provided by the ProductionTool, so it can be run as part of a large-statistics generator-only production; it is not designed to accept pile-up events. It relies heavily on skeleton code provided by Andy Buckley (to whom the main developer kindly expresses his gratitude); however, it has been adapted, and is still being adapted (since 2013), to the specifics of the LHCb framework.

RIVET 2.[0-4].x in LHCb

Gen/GenTune v2r3 is the latest release of the LHCb interface to RIVET that is compatible with versions 1.* of the library, which use the AIDA histogram system. Due to the dependence on external libraries (which are selected in the Gen/GENSER package), the latest compatible Gauss release is v49r1/v50r0 (using RIVET 1.9.0 and AIDA histogramming).

In versions 2.*, the RIVET library changed its histogram system from AIDA to YODA. YODA is a stand-alone package in the LCG stack, for which a CMT interface was developed. The Rivet 2.x classes have undergone minor, but incompatible, interface changes that are reflected in the RivetAnalysisHandler algorithm interface, and a major version change of GenTune (v3*) marks the usage of this new version of the library. At the same time, the new GenTune no longer supports HepMC versions prior to 2.06.

Gen/GenTune v3r0(p1) development is documented in the JIRA task LHCBGAUSS-598. It provides for RIVET 2.* the same functionality as the algorithm released in the v2r* package versions (compatible with RIVET 1.*). It has been released for production since Gauss v49r2 (using the RIVET 2.4.2 and YODA 1.5.9 libraries from the LCG external repository) and Gauss v51r0 (?!) for the Upgrade studies stack.

An important addition is foreseen to make the algorithm compatible with processing events containing signal, as generated by the LHCb simulation software, taking advantage of the information available at run time through the newly implemented GenFSR interface.

Although a new histogram system is used since RIVET 2.*, this does not affect the functionality of the existing analysis modules. The following notes therefore apply to the LHCb interface for both Rivet families; wherever AIDA/aida is mentioned, read YODA/yoda (unless already specified) when using the latest version of the interface package.


RivetAnalysisHandler Python Interface

This section enumerates the up-to-date properties of the RivetAnalysisHandler algorithm, each with a short description and its default value in parentheses. For earlier versions of the Python user interface (with incompatible/major changes) please see the sub-sections below.

  • MCEventLocation -- Location on TES where the HepMC events are read from (LHCb::HepMCEventLocation::Default)
  • BaseFileName -- The base file name (prefix of filenames) to write results to ("MyRivet")
  • RunName -- The name of the run to be prepended to YODA plot paths ("LHCB")
  • Analyses -- A list of names of the analyses to run ([] - i.e. empty Python list)
  • AnalysisPath -- List of additional file paths where analysis plugins should be looked for; e.g. add os.path.abspath('.') when the analysis library (*.so) is in the option file directory ([])
  • CorrectStatusID -- Switch that controls the transformation of the status IDs of particles (as set by EvtGen) back to PYTHIA defaults (False)
  • CorrectCrossingAngles -- Instructs the algorithm to automatically detect and correct for beam crossing angles (True)
  • xSectionValue -- The externally provided cross-section for the present run; the value is ignored when not forced and available from HepMC event (-1.0). Please, be aware that it is expressed in pb (picobarns) according to HepMC recommendations and not in mb (millibarns), the default unit used by LHCb!
  • forceXSection -- Forces the algorithm to set the provided cross-section value for each event (False)
  • LogSuppressionSoftLimit -- Internal statistical messages print-out suppression soft limit (30)
  • LogSuppressionHardLimit -- Internal statistical message print-out suppression hard limit (200)
  • LogSuppressedOutputFrequency -- Internal statistical message print-out suppression frequency (10)
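As an illustration, a job-options fragment exercising some of these properties might look as follows. This is a sketch only: the analysis name and paths are placeholders, and the import from Configurables is assumed to work as for other Gauss algorithms.

```python
import os
from Configurables import RivetAnalysisHandler

rivet = RivetAnalysisHandler()
rivet.BaseFileName = "MyRivet"            # prefix of the output file names
rivet.RunName = "LHCB"                    # prepended to YODA plot paths
rivet.Analyses = ["LHCB_2013_I1208105"]   # placeholder analysis module name
# also search the option file directory for compiled plugin libraries (*.so)
rivet.AnalysisPath = [os.path.abspath('.')]
rivet.CorrectCrossingAngles = True        # detect and undo beam crossing angles
```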

RivetAnalysisHandler Python Interface in Gen/GenTune v2*

How to develop a RIVET plugin in LHCb

These are the general steps to follow in the development of a new RIVET plugin for published or unpublished LHCb measurements to be used in various MC studies and tunings.

A RIVET plugin consists of:

  • a piece of C++ code sub-classing Rivet::Analysis [1] -- usually stored in a <plugin ID>.cc file (e.g. LHCB_2013_I1208105.cc, following the LHCB_<year>_I<Inspirehep ID> naming convention)
  • a file containing meta information (<plugin ID>.info) and general parameters controlling the way the plugin is treated by the framework (e.g. pT cuts, beam energies, requirement of process cross-section to be provided by the MC generator in HepMC events, etc.)
  • the reference data provided in <plugin ID>.aida or <plugin ID>.yoda depending on the RIVET version currently implemented in LHCb (1.* and 2.*, respectively). This file is either downloaded from the HepData reaction database [3] or can be created by the developer
  • finally, <plugin ID>.plot contains instructions for generating and styling the plots, such as axis type (linear/log) and labels, plot labels, legend position, etc. (see the RIVET documentation for a detailed review of the available instructions)

The development of a RIVET plugin implies the following steps:

  1. For published results (in a public paper), one has to decide which experimental points are to be sent to HepData and follow the procedure described on the submission policy page. In case you want to develop a plugin for unpublished results (e.g. an analysis note), this step should be skipped and the reference file must be generated privately/manually. For more details see this dedicated section.
    Warning, important If/when creating a reference data file manually, avoid using data set/plot names that start with the underscore (_) character. Such names are used internally, especially in YODA, and the corresponding distributions will be ignored by RIVET tools when creating comparison plots.
  2. At the moment, setting up the environment for Gauss should give you access to the official scripts provided by the RIVET team (prefixed by rivet-). The developer should use rivet-mkanalysis to generate a basic template for all the files needed by the plugin. At this stage, if the <Inspirehep ID> of the published article (see the last component in the <plugin ID> example above), which is provided as a command line argument, is found in HepData, the reference file will be downloaded automatically. Otherwise the developer will have to generate the reference file manually.
  3. The development process should be straightforward, as the basic information is nicely laid out in the template files. The important aspect is to check with the analysis proponents and verify that the cuts and their implementation match the generator-level filtering algorithm used for MC production.
  4. Once the plugin is written, it needs to be validated (as much as possible) against the very MC generators and specific configurations that were used in the published paper. As a final check, it is recommended to present the results of the validation process either in a meeting of the physics working group that proposed the paper or in a Simulation meeting.
  5. Pack everything nicely in a tarball and send it via email to the RIVET team. Expect serious delays or even no reply, and watch for new releases of RIVET, as plugins are usually released without further email exchange once their tests run fine (hopefully this behaviour will change in the future).

Step-by-step Development of a New RIVET plug-in for LHCb

Let us start a step-by-step review of the development process for a new RIVET plugin in the LHCb environment. To set up the proper environment and get the latest available version of Gen/GenTune, you must initialize the Gauss environment for a compatible version of Gauss that gives you access to the required versions of the MC generators.

Warning, important With the deployment of lxplus machines running only SLC6, the compiler fails (at the C++ standard library level) for other CMTCONFIG values and older Gauss versions. These issues are under investigation, so please use only the commands below to set up your work environment.

$ LbLogin -c x86_64-slc6-gcc48-opt

$ SetupProject Gauss v48r3

Please, keep in mind that each Gauss version comes with a pre-compiled version of Gen/GenTune (which you can use when writing new plug-ins, a.k.a. analysis modules). In order to use a particular version of the package, one can create a development environment for Gauss (add the --build-env option on the command line) and then issue getpack Gen/GenTune v2r3. Please, let the developers know if you find undocumented incompatibilities between a version of the package and the Gauss environment you need to use.

However, before continuing it is recommended to create a new working directory and cd to it.

Some of the helper rivet-* scripts are buggy, and it is under review whether they should be replaced altogether by working tools in Gen/GenTune. For now (Rivet 1.9.0) we'll follow the steps prescribed by the RIVET developers and apply workarounds to make things work.

As mentioned above, the first step in developing a RIVET plugin is to ensure the experimental data points are submitted to and published by HepData. If this is the case, RIVET offers a nice command to generate the template for your plugin. Currently it is recommended to run it as:

$ rivet-mkanalysis -v LHCB_<year>_I<record ID>

Trying this for LHCB_2013_I1208105 (the RIVET ID of the LHCb energy flow measurement plugin; we'll use this ID from here on instead of the generic ID format) will generate all the files needed for a complete plugin. Actually, in rivet 1.8.3, due to a bug solved in later versions of the script, the reference file LHCB_2013_I1208105.aida/.yoda is not retrieved from HepData. By changing the URL from which the script tries to download the reference (mind the -v argument above!), shown in the line:

Getting data file from HepData at

one can get the correct reference file with

$ wget -O LHCB_2013_I1208105.aida

At this point you should have in your working directory the following files (please bear in mind that we use LHCB_2013_I1208105 as a nicer placeholder for a generic plugin ID):

  • LHCB_2013_I1208105.aida/yoda - the reference file, which you should not need to change if it was properly downloaded from HepData. Of course, you are expected to create this file by hand in case the data points cannot be exported from HepData (details).
  • LHCB_2013_I1208105.info - the meta-information accompanying the plugin, in YAML format. Among the fields in this file, some are of particular interest:
    • Beams, Energies which dictate the compatibility of the plugin with a particular MC generator set up in terms of beam types and beam energy.
    • NeedCrossSection which indicates that the plugin requires the production cross-section to be either provided by the MC generator or set up externally.
    • Status which you'll set to VALIDATED when the plugin is ready to be sent to the RIVET team
  • LHCB_2013_I1208105.plot - which contains information and commands to be used in the generation of the plots/histograms (e.g. axis captions, axis type: log vs. linear, axis range, legend position, etc.)
  • LHCB_2013_I1208105.cc - the file which will contain the actual code of your plugin. Tip: though not officially recommended, it is easier to rewrite this file starting from the corresponding file of a similar analysis, if you know of one already published in RIVET; one should however take particular care to avoid inheriting unwanted features from the old plugin code.
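For orientation, the fields mentioned above appear in the .info file roughly as in the hedged sketch below (YAML; the field values are placeholders, and the template generated by rivet-mkanalysis remains the authoritative reference):

```yaml
Name: LHCB_2013_I1208105
Summary: <one-line description of the measurement>
Beams: [p+, p+]        # beam particle types the plugin is compatible with
Energies: [7000]       # compatible centre-of-mass energies, in GeV
NeedCrossSection: yes  # require the production cross-section from the generator
Status: UNVALIDATED    # set to VALIDATED before sending to the RIVET team
```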

Every now and then throughout the development process it is best to try to compile your plugin, to catch errors in time and avoid clogging the compilation log with strange errors produced by gcc going haywire on nesting errors. For this task RIVET provides the script rivet-buildplugin, which works well on lxplus or any system with AFS; due to some hard-coded paths (Dec 2013), however, it was found to be almost useless when used from CVMFS, so special solutions are being designed in the form of alternative scripts (where necessary), e.g. see lbrivet-buildAM in Gen/GenTune v2r3 (as of May 2015). As stated in the official RIVET documentation, the build command for a (series of) plug-in(s) (or analysis modules) is the following:

$ rivet-buildplugin Rivet<custom_name>.so <list_of_source_files *.cc> <custom_C++_compilation_flags>

The custom_name can be any name which helps you remember what was compiled into that specific library, for example when you develop alternative versions of some algorithm and need to keep track of the library that is loaded at run time. The list_of_source_files is a space-separated list of *.cc file names containing the code of the multiple analysis modules that you may want to bundle together (for instance, this is what the make command produces for the LHCb internal plug-in repository in the LbRivetPlugins package). Finally, custom_C++_compilation_flags is optional and at the moment (Jun 2015) may be useful when compiling with gcc48 or later to suppress compilation warnings. In particular, and only for these versions of the gcc compiler (watch your CMTCONFIG environment variable), you can use -Wno-unused-local-typedefs to suppress lots of compilation warnings that would otherwise make debugging quite tiresome.

When the plugin is ready and compiles without errors, the validation procedure may start. Currently, validation is done by choosing a set of well known MC generators and specific tunes for them (if possible matching the ones used in the published paper or some internal document/note), running your plugin over a specific number of events (to be put in the NumEvents field of the .info file at the end) and comparing the resulting distributions to the ones given by the reference data or another tune. At this stage, the developer should watch for excess or missing statistics in any of the distribution bins and try to explain the behaviour (usually by finding the coding error that produces it). Particular care is to be taken when the MC generators used in producing the published papers are not available, or when the LHCb framework is no longer compatible with that version, as the differences one sees may very well be explained by improvements/corrections in the generation model. Therefore, it is good to perform this activity in close collaboration with one of the proponents of the analysis. It is also good practice to prepare some slides comparing the distributions and present them, as a final check, to the physics working group which performed the analysis of the experimental data. Only when your code is formally approved by the WG and/or the proponents can the plug-in/analysis module be considered validated by LHCb. Once a plugin passes this last test, it is ready for release via e-mail to the RIVET team (don't forget to change the Status to VALIDATED in the .info file before that).

While you wait for replies from the RIVET team you can commit your tested plugin to the special package developed in the LHCb software stack for such cases: the private LbRivetPlugins package (follow the wikiword link for details).

Running RIVET plugins in LHCb

To run a plugin, either for testing or tuning purposes, one must use Gauss and Gen/GenTune with a Python based option file, as is done for every application on the LHCb stack. The package Gen/GenTune contains in the directory $GENTUNEROOT/options/example all the files needed to compile and run a working plugin in the LHCb environment. The following is adapted from the README.txt file also available at this location. For a comprehensive discussion of the beam conditions expected by RIVET and the corresponding beam settings provided by Gauss see this section.

To run your plug-in, create a Python option file similar to the one included in this example directory (also mind the package's Python interface) and issue, for instance:


Tip, idea Notice that the user should select a compatible beam option file that fixes the number of primary pp interactions to 1 (see Python code here).
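A hedged sketch of what such an option file might contain is given below; all file names, option paths and the invocation are illustrative placeholders, so please check $GENTUNEROOT/options/example for the authoritative versions shipped with the package.

```python
# myGenTuneOpts.py -- illustrative sketch only; see $GENTUNEROOT/options/example
# for the file actually shipped with the package.
from Gaudi.Configuration import importOptions
from Configurables import RivetAnalysisHandler

# Beam settings: pick a compatible AppConfig beam option file (placeholder
# comment) that fixes the number of primary pp interactions to 1.
# importOptions("$APPCONFIGOPTS/Gauss/<compatible beam options>.py")

rivet = RivetAnalysisHandler()
rivet.BaseFileName = "myRivetGaussMC"    # gives myRivetGaussMC.aida on output
rivet.Analyses = ["LHCB_2013_I1208105"]

# then run, for instance:
#   $ gaudirun.py myGenTuneOpts.py
```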

The command above will create the myRivetGaussMC.aida file as output, which can be further processed with Rivet's built-in scripts to obtain, for instance, a nice web page presenting all the histograms:

$ rivet-mkhtml --mc-errs myRivetGaussMC.aida

$ firefox ./plots/index.html &

The setEnvRivet BASH script is provided (also in the example directory) to properly set the environment before running either a modified or newly developed analysis plugin in GenTune itself (e.g. overriding plugin meta-information, separately from the plugins bundled in the RIVET release) or the auxiliary rivet-* tools used to further process the output histograms. See the Rivet manual for more specific information on these environment variables.

Please, be aware that in order for your plugin to be detected by the RIVET framework and used properly by the auxiliary tools, the .so library file containing the compiled plugin must be present in RIVET_ANALYSIS_PATH, i.e. the current directory where you're testing the plugin. Also please note that .info files use the YAML format, in which # marks the beginning of a comment extending to the end of the line.

GAUSS beam options for RIVET runs

The degree of detail for beam configuration available in Gauss needs to be drastically restricted in order to meet the beam characteristics assumed by the RIVET framework. As RIVET is mainly developed for direct comparison between measurements (unfolded for detector effects and often extrapolated to generator level) and model predictions, analysis modules should be designed assuming that exactly one primary pp interaction occurs in each (non-empty) event, at the origin. Furthermore, the HepMC event is assumed to be defined in the pp centre-of-mass frame. To meet these requirements, it is recommended that GenTune users set up the beam using the already defined option files from AppConfig, $APPCONFIGOPTS/Gauss/Beam*, which limit the number of primary pp interactions to 1 and eliminate the beam crossing angles. However, these settings still introduce a Gaussian smearing of the primary vertex position (to simulate the effect of the beam profile, which, in the absence of crossing angles, projected on the XY plane is a circle), of which users should be aware when designing and testing their analysis modules. Ideally, before submitting a collaboration-validated analysis module to the RIVET team, the developers should take care to derive a version of the code that fixes the primary vertex position to (0, 0, 0).
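In job options this typically amounts to importing one of those AppConfig beam files. The exact file names depend on the AppConfig version, so the name below is only a placeholder to be replaced by a file actually present under $APPCONFIGOPTS/Gauss:

```python
from Gaudi.Configuration import importOptions

# Placeholder file name -- list $APPCONFIGOPTS/Gauss/ and choose the beam
# option file matching your desired energy that fixes the number of primary
# pp interactions to 1 and removes the crossing angles.
importOptions("$APPCONFIGOPTS/Gauss/Beam-placeholder-nu1.py")
```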

LHCb internal repository for RIVET plug-ins (LbRivetPlugins)

As mentioned above, once you have a running piece of code that is able to reproduce the published distributions, you are welcome to commit it (the .cc, .aida, .plot and .info files, but also special option files if needed) to the LbRivetPlugins package and/or send the files to me so that I can test them along with the rest of the plug-ins residing in the package and commit them in your name.

Please, also take some time to read about the purpose of LbRivetPlugins before committing, and update the twiki page. Do not confuse this operation with the finalization of the plug-in development process, which occurs only when the plug-in is sent to the RIVET team and made available online (at least as an independent .tar.gz until it gets tested and included in a future Rivet release). The latter procedure was simplified in March 2015, when the RIVET team opened up restricted access to their on-line repository of preliminary plug-ins. However, at collaboration level, an analysis module should be considered finalized once the authors have shown the validation plots during a meeting of the physics WG that made the measurement and the analysts (ideally the proponents) have formally approved that the code reproduces the measured distributions at generator level.

An additional request (from the RIVET team) is that plug-in source code must be accompanied by validation plots and references to the Monte Carlo productions/generator steering files used to produce the events for the validation studies. Therefore, plug-in authors are kindly requested to also send me the *.aida files produced during validation and the generator steering information (event type, special tunes) used in their analyses to produce the samples for comparison between experimental data and theoretical models. These additional files are required to be able to upload a complete plug-in submission for inclusion in future versions of RIVET.

Gen/GenTune setup policy

As Gen/GenTune and Gauss are separate packages belonging to the same project, their configuration through Python option files should be as modular as possible. Currently, in the v2* series, this policy is not correctly implemented, especially with respect to beam conditions. The modular configuration policy is due to be implemented with the next major upgrade of the Gen/GenTune package (v3+), which also marks the adoption of the Rivet 2.x library in LHCb. For now, users can easily enforce this policy by disregarding the beam option files released with the package and importing in their option files the corresponding beam options from $APPCONFIGOPTS/Gauss, according to the Gauss version used for the MC event generation. As pile-up is not supported by Rivet, you are advised to use the following piece of Python code in your job options to fix the number of primary pp interactions per event to 1:

from Configurables import Generation
# Fix the number of primary interactions to 1
gaussGen = Generation("Generation")
gaussGen.PileUpTool = "FixedNInteractions"

External References

  1. RIVET code documentation
  2. Professor official page
  3. HepData official page

This is just a very short introduction to using RIVET in LHCb through the Gen/GenTune package in Gauss, so please let me know if you spot any inconsistencies or you run into trouble following these recipes.

Preliminary Policy for Early Measurement Plug-ins

Almost all proposed LHCb early measurements are production measurements for which RIVET plug-ins can and should already be developed. As a starting point, one can use an older measurement plug-in, or write the plug-in for an older similar measurement and use it to derive the plug-in for the new measurement. If distributions and binning are kept more or less the same, one can derive a pretty good reference from the data points of the previous measurement. Therefore, a missing reference file is not a show stopper.

LHCb Generator Tuning with Professor/RIVET

Here we give a short description of, and lots of useful links for, the generator tuning programme ongoing at LHCb. An important point to take into account is that at LHCb the event generation is controlled by a suite of packages with data and control interfaces provided by the GAUSS project. Thus, at LHCb, the tuning of a generator is closely linked to the complete GAUSS configuration controlling the run conditions for all software packages affecting the event. In other words, one should be aware that the tuned generator parameter values may be slightly influenced by the different software components acting on the event layout at specific generation stages.

The following links point to various pages describing the strategy and the RIVET plugins used to optimize various generators. LHCb's first goal is to provide a forward-region-specific tune of the Pythia 8 generator, using LHC measurements from Run 1 and starting from already released tunes obtained in the central rapidity region.

Pythia8 Tuning Programme

  1. Tune of the parameters controlling the heavy flavour production
  2. Light hadron production parameters tuning.
  3. Optimization of parameters controlling global event characteristics (energy flow, particle multiplicity and densities, etc.)

More details here...

RIVET and HepData in LHCb

The Durham High Energy Physics Database (HepData) has developed over the past four decades with the purpose of offering open access to scattering data from particle physics experiments. As such, it is actively sustained by the RIVET collaboration, which requires that each RIVET analysis module made publicly available through their framework be backed by a record in the HepData repository. Thus, a HepData record with at least the data points for the plots you want your RIVET analysis module to reproduce should be made publicly available before your code gets released in a future version of RIVET or on the buffer repository available here.

An impact-colour-coded wish list from the RIVET development team of the analysis modules requested from LHCb may be consulted here. In case you can provide human resources for developing such code, please feel free to contribute.

As of January 2017, the old portal was superseded by the new portal. Since February 2017, new encoded data sets can be submitted only through the new portal. The updated procedure for submitting published measurements is detailed below:

  • allocate a few minutes during one of the final EB meetings to decide with members of the WG, reviewers, EB readers and/or the Physics Coordinator(PC) which distributions are of real interest for theorists and model validation and/or (generator) tuning
  • it is highly recommended, especially for the encoding of highly requested measurements or measurements which need a further RIVET analysis module to be developed, to open a JIRA task under the LHCb Simulation project (LHCBGAUSS) and assign HepData as the affected component. This task may be used before and during the encoding to decide on the encoding strategy, on supplemental data to be published on the HepData portal, or on changes to the persons responsible for the encoding and validation of the record prior to public release.
  • contact the LHCb technical liaison for HepData (at the moment, send AlexGrecu and your WG conveners an e-mail) and specify the arXiv ID, the LHCb number of your paper and (preferably) the Inspire ID, if available. Without an Inspire ID there is no possibility to proceed any further, so expect some delay at this step until a record of your article becomes available in Inspire. Submitting your paper to arXiv has proven to ensure an early import into Inspire. (To speed things up, please also mention the physics WG in the subject of your message, e.g. "QEE: HepData record for arXiv: 1606.2131 (InspireID 1324567)")
  • the LHCb liaison will open the above-mentioned JIRA task (if it has not already been opened by you). With the release of the new portal in February 2017, any scientist is able to interact on a given record using CERN SSO or ORCiD to authenticate. The LHCb liaison(s) or coordinator(s) are responsible for creating the temporary record slot and for appointing/managing the Uploader(s) and Reviewer(s) (these persons can be nominated on the JIRA task, along with replacements).
  • The official documentation for measurements encoding is detailed here and in references therein. Some LHCb specific notes follow:
    • it was found that, especially for encoding figures, adapting the .C files prepared for the CDS record works perfectly for outputting the data in YAML format
    • quick LaTeX (or UTF-8 text) parsers can also be written in Python to output the table contents in YAML format (though the code depends on the particular format of the data source)
    • if you need further help or you are able to provide valuable code, you may want to visit and use the existing software tool advertised on the portal submission quick introduction page
    • the portal supports the .oldhepdata format, providing a (Python based) conversion service on site, which you can also clone and use offline. Some tools automating parts of the encoding can be found on the old portal, in the LHCb/Analysis/Ostap package or in the LbRivetPlugins package (to be migrated to GIT!). All these tools output the .oldhepdata format.
    • Nevertheless, please, beware that automatically generated YAML output is preliminary and always needs manual intervention/validation to ensure proper encoding (and parsing by the portal).
  • Encoder(s)/Uploader(s) must take care to reproduce the published values faithfully. Numbers are expected to have a conservative, fixed number of decimal digits. It is preferred to quote separate statistical and systematic uncertainties, extracting them from an analysis note if not present in the published paper (always checking with proponents/reviewers that the source document is up to date and corresponds to the published data). It is advised to quote separately (if possible) the systematic uncertainties corresponding to different sources. For multiple uncertainties, use sensible keywords to distinguish between the various sources of systematic uncertainty (e.g. lumi for uncertainties due to the luminosity estimation). Correlation tables can always be added as supplemental material if they may be important for theorists. As advised by the HepData group, new encoders should take a look at similar records on HepData.
  • once the record is ready, the Uploader needs to notify the Reviewer to double-check that the encoding complies as closely as possible with the goals of the database, i.e.
    • the main process/decays and observables are correctly identified - Tip, idea use build-reaction and other facilities of the old portal (until they become available at
    • since HepData is mainly a database it is important to make sure that the encoding uses same key words, observable names as other similar measurements
    • make sure the number of decimals is consistent (at least) at the level of the same table - Warning, important HepData converter will not add trailing zeroes (at least it currently does not do such modifications)
    • verify that encoded data tables are visualised on the portal (where supported) and propose alternative encoding methods if, for instance, splitting a table would reproduce distributions published in the corresponding paper (yet, such issues should be decided already at time of encoding or discussed preferably on JIRA)
    • verify that LaTeX expressions are rendered correctly on the portal page
    • ...
  • the Reviewer needs to mark all tables as Passed before Coordinator is allowed to make the record public
  • if errors are spotted after public release, the Coordinator should be asked to re-open the encoding by creating v2 of the record (and the whole process involving the portal must be followed from the start)

Once the proponents have the final form of the encoded record, they are encouraged to proceed with the development/adjustment of the RIVET analysis module code (if such code was foreseen for the analysis) so that it picks up and works with the reference file made available through the HepData web interface (even using the temporary record).

-- AlexGrecu - 2019-06-10
