Part 1: Using MadGraph to generate parton-level events

In the first part of the exercise, we will use the matrix element generator MadGraph5_aMC@NLO, or MG5 for short. MG5 can automatically compute matrix elements for many processes at leading and next-to-leading order accuracy in QCD. Because of its ease of use for processes both in and beyond the Standard Model, it is one of the most widely used software tools for modelling the hard interaction.

Setting up CMSSW

While MG5 can be used completely standalone, we will still set up a CMSSW area just to make sure that everyone has the same libraries loaded and can profit from pre-installed software such as LHAPDF.

source /cvmfs/
scram p CMSSW_7_1_30
cd CMSSW_7_1_30/src
cd -

Task 1: Using standalone MG5

In this part of the exercise, we will learn how to use MG5 to generate W boson events in proton-proton collisions.

Input files

The type of events produced and the settings used are defined in the MadGraph steering files, which are called run_card.dat and proc_card.dat.
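To make the role of the run card more concrete: MG5 run cards are plain text files with one setting per line, written as `value = name ! comment`. The following is a minimal sketch that reads such lines into a dictionary. The helper name `parse_run_card` and the sample values are assumptions made for illustration; they are not part of MG5 or of the genproductions repository.

```python
def parse_run_card(text):
    """Return {name: value} for lines of the form 'value = name ! comment'."""
    settings = {}
    for line in text.splitlines():
        # Drop the trailing comment, which starts at the first '!'.
        body = line.split("!", 1)[0].strip()
        # Skip blank lines, '#' banner lines and anything without '='.
        if not body or body.startswith("#") or "=" not in body:
            continue
        value, name = (part.strip() for part in body.split("=", 1))
        settings[name] = value
    return settings


# Illustrative excerpt only; these values are assumptions, not the tutorial card.
SAMPLE_CARD = """
#*********************************************************************
# Number of events and beam energies (example values)
#*********************************************************************
  1000 = nevents ! Number of unweighted events requested
  6500 = ebeam1  ! beam 1 total energy in GeV
  6500 = ebeam2  ! beam 2 total energy in GeV
"""

card = parse_run_card(SAMPLE_CARD)
print(card["nevents"], card["ebeam1"], card["ebeam2"])
```

Values are kept as strings on purpose, since run cards mix integers, floats and Fortran-style booleans.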

In CMS, we maintain a large archive of these steering files for all samples that are generated in the official CMS production. The cards are stored in a GitHub repository. Let's check it out:

git clone -b CMSDAS2018_26x

The cards we are interested in are located in this subfolder:

ls genproductions/bin/MadGraph5_aMCatNLO/cards/examples/wplustest_4f_LO/


The proc_card file defines the process to be generated (hence the name proc). Look at the syntax:

generate p p > w+, w+ > l+ vl

What does this statement mean? Refer to this MG tutorial for documentation on the syntax INSERTLINKHERE. What modifications of this statement can you think of? What could we replace the "w+ > l+ vl" part with?
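As a starting point for reading the statement, the sketch below splits it at the comma into a production step and the decay steps that follow, and splits each step at `>` into initial and final state. This is only an illustration of the structure, not the parser MG5 itself uses; the function name `split_generate` is hypothetical, and more advanced syntax (parentheses for nested decays, several `>` in one step, coupling-order restrictions) is deliberately not handled.

```python
def split_generate(statement):
    """Split 'generate A B > C, C > D E' into (production, [decays])."""
    body = statement.strip()
    if body.startswith("generate"):
        body = body[len("generate"):]
    steps = []
    for chunk in body.split(","):
        # Each step reads 'initial state > final state'.
        initial, final = chunk.split(">", 1)
        steps.append((initial.split(), final.split()))
    return steps[0], steps[1:]


production, decays = split_generate("generate p p > w+, w+ > l+ vl")
print("production:", production)
print("decays:", decays)
```

Labels such as p, l+ and vl are multiparticle labels, so each step stands for a whole set of concrete processes.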

Setting up MadGraph

We will download MG5 version 2.6.0 from a CMS-hosted mirror of the official release:

tar xf MG5_aMC_v2.6.0.tar.gz
rm MG5_aMC_v2.6.0.tar.gz
cd MG5_aMC_v2_6_0

While all interaction with MG5 can be entirely scripted (which is useful for larger scale production, as we shall see later), in this exercise it will be instructive to use the interactive command shell provided by MG5.


You will be asked whether you want to update to a newer version of MG5. Don't.

Running MadGraph

To get the generation started, we can now paste the contents of the proc_card line by line. What does each line do? Make sure to pay attention to the console output you get in reply to your input.

Tip: If you want to execute a shell command from inside the MG5 console, preface it with an exclamation mark, e.g.:

!cat ../genproductions/bin/MadGraph5_aMCatNLO/cards/examples/wplustest_4f_LO/wplustest_4f_LO_proc_card.dat

The generation can be started by typing launch. MG5 will ask you a few more questions. When asked about the run_card, one can either use a default run card by just pressing ENTER, edit the default run_card by hand, or provide a path to a run card of one's choice. Please provide the path to the pre-made run_card:


Inspecting the output

After the run has terminated, have a look at the reported cross-section and check out the output file. Tip: If you have trouble finding it, you can search for it:

find . -name '*.lhe.gz'

To get the human-readable LHE output file, extract the gzipped file, and inspect with less:

gzip -d output_file.lhe.gz
less output_file.lhe
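Besides paging through the file with less, it can be useful to extract a few numbers programmatically. The following is a minimal sketch, assuming the standard LHE layout: inside the `<init>` block, the first row describes the beams and each later row describes one process, starting with its cross section (XSECUP, in pb by the LHE convention). The function names and the embedded sample, including all numbers in it, are made up for illustration.

```python
import gzip
import io


def open_lhe(path):
    """Open a plain or gzip-compressed LHE file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def summarize_lhe(lines):
    """Return (summed cross section, number of events)."""
    total_xsec = 0.0
    n_events = 0
    in_init = False
    init_rows = 0
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("<init>"):
            in_init, init_rows = True, 0
        elif stripped.startswith("</init>"):
            in_init = False
        elif in_init and stripped and not stripped.startswith("<"):
            init_rows += 1
            # Row 1 describes the beams. Each later row is one process:
            # XSECUP XERRUP XMAXUP LPRUP
            if init_rows > 1:
                total_xsec += float(stripped.split()[0])
        elif stripped.startswith("<event"):
            n_events += 1
    return total_xsec, n_events


# Made-up two-event sample; the numbers are not from a real run.
SAMPLE = """<LesHouchesEvents version="3.0">
<init>
2212 2212 6.5e+03 6.5e+03 0 0 247000 247000 -4 1
1.0e+04 2.0e+01 1.0e+04 1
</init>
<event>
 5 1 1.0 80.0 0.0078 0.118
  2 -1 0 0 501 0 0.0 0.0 40.0 40.0 0.0 0.0 9.0
 -1 -1 0 0 0 501 0.0 0.0 -40.0 40.0 0.0 0.0 9.0
 24 2 1 2 0 0 0.0 0.0 0.0 80.0 80.0 0.0 9.0
-11 1 3 3 0 0 24.0 32.0 0.0 40.0 0.0 0.0 9.0
 12 1 3 3 0 0 -24.0 -32.0 0.0 40.0 0.0 0.0 9.0
</event>
<event>
 5 1 1.0 80.0 0.0078 0.118
  2 -1 0 0 501 0 0.0 0.0 40.0 40.0 0.0 0.0 9.0
 -1 -1 0 0 0 501 0.0 0.0 -40.0 40.0 0.0 0.0 9.0
 24 2 1 2 0 0 0.0 0.0 0.0 80.0 80.0 0.0 9.0
-11 1 3 3 0 0 24.0 32.0 0.0 40.0 0.0 0.0 9.0
 12 1 3 3 0 0 -24.0 -32.0 0.0 40.0 0.0 0.0 9.0
</event>
</LesHouchesEvents>
"""

xsec, n_events = summarize_lhe(io.StringIO(SAMPLE))
print("cross section:", xsec, "events:", n_events)
```

For a real file, the same function can be fed from `open_lhe("output_file.lhe.gz")`, so the file does not need to be unpacked first.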

Task 2: Using the gridpack workflow

As mentioned previously, running MG5 interactively is useful for exercises and smaller tests, but ultimately not practical for large-scale production. To avoid having to use the interactive mode, one can use gridpacks instead. A gridpack is simply an archive file that contains all the executable MG5 code needed to produce LHE events for a given process. Its advantage is that, once created, it generates events with a single command, no thinking required.

In this part of the exercise, we will use the same input cards as before to create a gridpack, run it, and compare the results to before.

Creating the gridpack

The script infrastructure to create gridpacks is maintained in the genproductions repository. Gridpacks are generated using the gridpack_generation script, which we will run in local mode, i.e. on the machine we are currently logged in to. Note that scripts are provided to run the gridpacks on other computing infrastructures such as the CERN batch system and CMSConnect, which is useful for more complicated processes.

To create a gridpack, we simply call gridpack_generation and pass the process name and card location to it:

cd genproductions/bin/MadGraph5_aMCatNLO
time ./ wplustest_4f_LO cards/examples/wplustest_4f_LO local

(Note that there are naming conventions for the cards. For a given process name $NAME, the input cards must be named ${NAME}_run_card.dat, ${NAME}_proc_card.dat, etc.)
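Because a misnamed card is an easy mistake to make, it can help to check the card directory before starting the gridpack creation. The sketch below looks for `<NAME>_run_card.dat` and `<NAME>_proc_card.dat` in a directory and reports what is missing. The helper `find_cards` is hypothetical and not part of genproductions; it only covers the two card types named in this exercise and assumes Python 3.

```python
import os
import tempfile


def find_cards(card_dir, name, kinds=("run_card", "proc_card")):
    """Map each card kind to its expected path, or None if the file is missing."""
    found = {}
    for kind in kinds:
        path = os.path.join(card_dir, "{}_{}.dat".format(name, kind))
        found[kind] = path if os.path.isfile(path) else None
    return found


# Demonstration in a scratch directory: one card is named correctly,
# the other one does not follow the convention.
with tempfile.TemporaryDirectory() as card_dir:
    for filename in ("wplustest_4f_LO_run_card.dat", "wrong_name_proc_card.dat"):
        open(os.path.join(card_dir, filename), "w").close()
    result = find_cards(card_dir, "wplustest_4f_LO")
    missing = [kind for kind, path in result.items() if path is None]

print("missing cards:", missing)
```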

After the gridpack has been created (this should take approximately 5 minutes), we can extract and run it to produce LHE events:

mkdir work
cd work
tar xf ../*.xz


The output file will be cmsgrid_final.lhe.

Comparing the output LHE files

There are multiple ways of analyzing an LHE file, each of which has its own advantages and disadvantages. For the purpose of this exercise, we will use the most straightforward tool: MadAnalysis (MA). MA is a tool designed to be used by theorists to analyze parton-level LHE files, particle-level HEPMC files or even events with DELPHES detector simulation. We can install MA directly from the MG5 console:

cd MG5_aMC_v2_6_0
install MadAnalysis5

After installation finishes, exit the MG5 console and launch MA:


LHE files can be imported as different datasets:

import wplustest_4f_LO/Events/run_01/unweighted_events.lhe as STANDALONE
import ../genproductions/bin/MadGraph5_aMCatNLO/work/cmsgrid_final.lhe as GRIDPACK

Now we can use a simple syntax to define the distributions we would like to look at:

# set STANDALONE.xsection = 1
# set GRIDPACK.xsection = 1

plot PT(mu+) 20 0 100
plot PT(mu-) 20 0 100
plot PT(l+) 20 0 100
plot PT(mu+ vm) 20 0 100
plot M(mu+ vm) 40 40 120
plot M(e+ ve) 40 40 120
plot M(ta+ vt) 40 40 120
plot M(l+ vl) 40 40 120

and start the analysis:

set main.stacking_method = superimpose

The analysis output can be viewed as HTML:

firefox ANALYSIS_0/HTML/index.html

What observations do you make? Are the two datasets consistent? What are the shapes of the lepton pT distributions? What is the shape of the pT distribution of the W system? Are these shapes physical?

Feel free to experiment here and plot other quantities you find interesting. You can check a syntax reference here: INSERTLINKHERE.
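When interpreting the plots, it helps to recall what the plotted observables are. PT and M applied to several particles are understood here as the transverse momentum and invariant mass of the summed four-momenta. The following sketch spells this out for a lepton-neutrino pair. The four-momenta are made-up numbers in GeV, given as (px, py, pz, E), and are not taken from the generated samples.

```python
import math


def add_p4(*vectors):
    """Sum four-momenta given as (px, py, pz, E) tuples."""
    return tuple(sum(components) for components in zip(*vectors))


def pt(p4):
    """Transverse momentum: sqrt(px^2 + py^2)."""
    px, py, _, _ = p4
    return math.hypot(px, py)


def mass(p4):
    """Invariant mass: sqrt(E^2 - |p|^2), clipped at zero against rounding."""
    px, py, pz, energy = p4
    m2 = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))


# Made-up massless lepton and neutrino four-momenta (GeV).
lepton = (30.0, 40.0, 0.0, 50.0)
neutrino = (-18.0, -24.0, 0.0, 30.0)
pair = add_p4(lepton, neutrino)

print("lepton pT:", pt(lepton))
print("pair pT:", pt(pair))
print("pair mass:", mass(pair))
```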

Part 2: Generating particle-level events

Setting up CMSSW

# Setup
scram p CMSSW_9_3_9_patch1
cd CMSSW_9_3_9_patch1;
cd -;

Task 3: Parton showering

To generate physical collision events, the LHE files we have created need to be showered. Parton showering adds soft and collinear QCD radiation to the hard process, and the non-perturbative effects that follow, such as hadronization, are described with phenomenological models. The most widely used tool for parton showering in CMS is Pythia 8 (P8). While we could again run P8 completely on our own (you can download it from the P8 website INSERTLINKHERE, compile it and run it on your laptop if you'd like), we are going to use the CMSSW GeneratorInterface in this exercise. Simply stated, CMSSW can act as a wrapper around various generators and chain their outputs together. This makes it quite easy to run large-scale production from a gridpack to the finished (Mini/Nano)AOD file.

Use cmsDriver to shower an existing LHE file

All CMSSW data and MC processing is controlled with the cmsDriver tool and its options. As input, the tool takes configuration fragments. Note that the fragments must be located in a CMSSW package directory structure like below, and you must build your CMSSW release area after adding or changing the file. Many people have spent hours trying to figure out why their config fragment does not do what it should. Always remember the folder structure. Always remember to build.

# Copy the config fragment for showering
mkdir -p ${CONFIG_PATH}
cd ${CMSSW_BASE}/src;
scram b;
cd -;
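The folder-structure rule can be checked with a few lines of Python before building. The sketch below assumes that a fragment must sit at `<Subsystem>/<Package>/python/<name>.py` under `$CMSSW_BASE/src`, which matches the CMSDAS/GEN/python/ path used in this exercise. The helper `fragment_path_problems` and the file name `my_fragment.py` are hypothetical, and the check cannot tell whether the area has been built since the last change.

```python
import os
import tempfile


def fragment_path_problems(cmssw_base, fragment):
    """Return a list of problems for a fragment path relative to <base>/src."""
    problems = []
    parts = fragment.split("/")
    if len(parts) != 4 or parts[2] != "python" or not parts[3].endswith(".py"):
        problems.append("expected <Subsystem>/<Package>/python/<name>.py")
    if not os.path.isfile(os.path.join(cmssw_base, "src", fragment)):
        problems.append("file not found under $CMSSW_BASE/src")
    return problems


# Demonstration with a scratch directory standing in for $CMSSW_BASE.
with tempfile.TemporaryDirectory() as base:
    good = "CMSDAS/GEN/python/my_fragment.py"  # hypothetical fragment name
    bad = "CMSDAS/GEN/my_fragment.py"          # missing the python/ level
    target = os.path.join(base, "src", good)
    os.makedirs(os.path.dirname(target))
    open(target, "w").close()
    good_problems = fragment_path_problems(base, good)
    bad_problems = fragment_path_problems(base, bad)

print("good:", good_problems)
print("bad:", bad_problems)
```

In a real session, `os.environ["CMSSW_BASE"]` would take the place of the scratch directory.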

Have a look at the fragment.

We can run the fragment using this cmsDriver command:

cmsDriver CMSDAS/GEN/python/ \
--mc \
--eventcontent RAWSIM,LHE \
--datatier GEN,LHE \
--conditions 93X_upgrade2023_realistic_v5 \
--beamspot HLLHC14TeV \
--step GEN \
--geometry Extended2023D17 \
--era Phase2_timing \
--filein file:unweighted_events.lhe \
--fileout file:GEN.root \
--no_exec \
-n 1000

At first, the many options to cmsDriver may seem prohibitively complicated. However, for practical purposes one can look up the cmsDriver command for any given central production campaign and use that.

Since we specified the --no_exec option, cmsDriver does not start the actual computation, but merely generates a CMSSW python configuration that encodes what we asked it to do. It can simply be run with cmsRun:


Without opening the output file, can you already discern some differences with respect to the LHE input file?

Inspecting the output

Since the output files are now in EDM format, we cannot use a simple text editor to inspect the events. As a simple alternative, CMSSW contains the ParticleListDrawer module:

cmsRun inputFiles=file:/path/to/GEN.root

What differences do you notice compared to the LHE events you have seen before?

For distribution plots, we also cannot use the same method as above: MA5 does not read EDM files. Instead, we will use simple CMSSW modules to plot basic quantities. Copy the configuration input file to our configuration directory and build:

cd ${CMSSW_BASE}/src;
scram b;
cd -;

Now you can run the actual plotting and check out the resulting histograms in a ROOT file:

rootbrowse analyzed.root

As in the LHE-level case, look at the distributions and try to make sense of them. What are the differences compared to before?

-- AndreasAlbert - 2018-08-02

Topic revision: r2 - 2018-08-03 - AndreasAlbert