Scripts to Produce and Run a Precompiled POWHEG Tarball

Foreword

This is a twiki page of the Generators Team concerning MC production at $\sqrt{s}=$ 8, 13, and 14 TeV with the POWHEG generator. In particular, this page teaches users how to create a precompiled POWHEG tarball. The precompiled POWHEG tarball is similar to a Madgraph5_aMC@NLO gridpack. The tarball contains the POWHEG executable pwhg_main, grid files, a script runcmsgrid.sh, and the input card powheg.input. The tarball can be used together with the generic script run_generic_tarball.sh in the CMS official production of LHE files, or run locally to produce LHE files.

The workflow is as follows.

  • Prepare a proper input card for your physics process.
  • Either keep your input card locally, or use an example input card from the generator web area (<scram_arch_version>/powheg/<powhegbox_version>/<beam_energy>*) or from the CMS generator AFS area: /afs/cern.ch/cms/generators/www/<scram_arch_version>/powheg/<powhegbox_version>/<beam_energy>. When your input card is in a local area, you need to specify the full path of the input card in the input arguments. Process-dependent additional files will be fetched by the script as well.
  • Create a POWHEG tarball by running a POWHEG job on a local SLC6 machine or using the lxplus batch system.
  • Run the precompiled tarball to produce LHE files.
    • Run the tarball locally. See the detailed instruction here
    • Run the tarball via official CMS framework. See the detailed instruction here
      • Put the output tarball in a CMS generator public area /afs/cern.ch/cms/generators/www/<scram_arch_version>/powheg/tarball/
      • For better bookkeeping, make sure you put the tarball in a directory that matches the SCRAM_ARCH version and the beam energy of the MC production.
      • If the directory does not exist, please inform the generator group Silvano Tosi <Silvano.Tosi@cern.ch> and Shin-Shan Eiko Yu <syu@cern.ch>
      • Once the tarball is seen by the frontier, one can run the tarball to produce POWHEG LHE files using the generic script run_generic_tarball.sh

Source code and input cards

Caveat

  • The old script create_powheg_tarball.sh is obsolete and not supported any more.
  • You must have a GitHub account. Please refer to http://cms-sw.github.io/cmssw/faq.html for general information about git and how to register.
  • This instruction has been tested on lxplus with the following CMSSW versions. Note that each CMSSW version has its corresponding SCRAM_ARCH. You can find the available CMSSW releases for each SCRAM_ARCH setting by typing "scram list" after doing "cmsenv".
    • slc6_amd64_gcc472: CMSSW_5_3_30
    • slc6_amd64_gcc481: CMSSW_7_1_20
  • If you want to use two different powheg.input files, one for the tarball production and one for the LHE production, remember to untar the produced tarball and replace the powheg.input file with your desired input file for the LHE production.
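The card swap described in this caveat can be sketched as follows. This is an illustration only: the tarball and card contents here are dummies standing in for a real POWHEG tarball, which also holds pwhg_main and the grid files.

```shell
# Build a dummy tarball standing in for a real POWHEG tarball
# (names and contents are illustrative)
mkdir -p demo && cd demo
echo "card used for grid production" > powheg.input
tar czf powheg_tarball.tgz powheg.input
# Untar, overwrite the card with the one meant for LHE production, re-tar
mkdir -p unpack && cd unpack
tar xzf ../powheg_tarball.tgz
echo "card used for LHE production" > powheg.input
tar czf ../powheg_tarball.tgz *
cd ../..
```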
  • Do not create multiple grids using the same work directory
  • Only POWHEG BOX 2 supports producing LHE files with scale/PDF weights in XML format. The W and Z processes in powhegboxV1_Sep2014.tar.gz can produce weights, but NOT in XML format.
  • In order to set the random seed and the number of events externally via the script runcmsgrid.sh, you MUST include the following two lines in your input file:
      numevts NEVENTS    ! number of events to be generated
      iseed    SEED    ! initialize random number sequence 
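The external setting presumably works by plain text substitution: the wrapper replaces the NEVENTS and SEED placeholders with concrete values before calling pwhg_main. A minimal sketch of that substitution (the card here is truncated to the two required lines; the actual runcmsgrid.sh may differ in detail):

```shell
# Minimal card with the two required placeholder lines (illustration only)
cat > powheg.input <<'EOF'
numevts NEVENTS    ! number of events to be generated
iseed    SEED    ! initialize random number sequence
EOF
# Substitute concrete values, as the wrapper does with its arguments
nevt=1000
seed=42
sed -i -e "s/NEVENTS/${nevt}/" -e "s/SEED/${seed}/" powheg.input
cat powheg.input
```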
  • In order to produce scale/PDF weights in the LHE files, the following settings are required in the POWHEG V2 input card. With the settings below, the script runcmsgrid.sh produces LHE files containing weights for 9 scale variations, 100 variations of NNPDF3.0, 52+1 variations of CT10, 50+1 variations of the MMHT2014nlo68cl PDF error sets, and αs variations of these 3 PDF sets. The nominal PDF set one should use is NNPDF3.0. An example output LHE file with weights can be found at /afs/cern.ch/work/s/syu/public/LHE/cmsgrid_final.lhe.
       lhans1   260000    ! pdf set for hadron 1 (LHA numbering for NNPDF 3.0)
       lhans2   260000  ! pdf set for hadron 2 (LHA numbering for NNPDF 3.0)
       renscfact  1d0   ! (default 1d0) ren scale factor: muren  = muref * renscfact 
       facscfact  1d0   ! (default 1d0) fac scale factor: mufact = muref * facscfact 
       pdfreweight 1       ! PDF reweighting
       storeinfo_rwgt 1    ! store weight information
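The 9 scale variations mentioned above correspond to the 3 × 3 combinations of the renormalization and factorization scale factors in {0.5, 1, 2}. A sketch of how such a list is enumerated (illustration only, not the actual reweighting code):

```shell
# Enumerate the 9 (renscfact, facscfact) combinations in {0.5, 1, 2}
for mur in 0.5 1.0 2.0; do
  for muf in 0.5 1.0 2.0; do
    echo "renscfact ${mur}  facscfact ${muf}"
  done
done | tee scale_variations.txt
```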
  • In order to re-use the existing grid files, the following settings are required in the powheg input data card:
       use-old-grid    1 ! if 1 use old grid if file pwggrids.dat is present (<> 1 regenerate)
       use-old-ubound  1 ! if 1 use norm of upper bounding function stored in pwgubound.dat, if present; <> 1 regenerate
  • If MiNLO is turned on, one MUST store events with negative weights.
       minlo 1            ! default 0, set to 1 to use minlo
       withnegweights 1   ! default 0, set to 1 to store negative-weighted events
  • If you are not using a MiNLO process, you are STRONGLY recommended to check whether negative weights are important, in particular in the specific phase space of the analysis for which the sample is produced. If they are, you need to make sure the parameter "withnegweights" is set to 1. You can find out how to check the fraction of negative weights here. The known processes with a significant fraction of negative weights are listed below.
    • VBF_H (1%), VBF_Z_Z (7%), VBF_HJJJ (25%)
  • The process gg_H_2DHM requires additional branching fraction input files in the working directory: br.*3_2HDM. The files are available here.
  • The process gg_H_MSSM requires an additional input file in the working directory: powheg-fh.in. The file is available here.
  • The process VBF_HJJJ requires an additional input card in the working directory: vbfnlo.input. See an example here.
  • The process W requires an additional input PDF file in the working directory: cteq6m. The file is available here.
  • The process ttH requires a large amount of memory for compilation. It is best to submit the tarball-production job to a machine that has more memory. If you want to use the LSF batch system, remember to add the following bsub flags (which request 50 GB of memory):
          bsub -q 1nw -C 0  -R "rusage[mem=50000]" $PWD/runJob.sh $PWD slc6_ttH.txt
    
    • The authors of ttH gave the following suggestions regarding the generation of tarballs (gridpacks) and LHE files.
         "We recommend using the option "fakevirt" for the generation of the grids.
          To this end, activate the fakevirt option, but set ncall2 to zero. The program
          will then generate grids quickly, but stop after that stage. After this step is completed,
          de-activate fakevirt (this is important to get true NLO predictions), set ncall2 to
          the number of your choice, and re-run pwhg_main."
    
    • Variations of the radiation damping can be done with the attached runcmsgrid_powheg_hdampgrid.sh. The example script expects a default of hdamp=mt and creates the variations specified in harray {0=off, mt/2, mt*2}. Scale variations are performed for each hdamp value. The following lines need to be added to the input card:
           hdamp 172.5
           dampreweight 1      ! h_damp reweighting
    
    • This instruction has been tested using the example POWHEG input data cards in the link. Users who try a different CMSSW version, platform, data card, or physics process do so at their own risk. If any error occurs, please report back to the generator group.

    This twiki page shows how to use the provided Python script to create a precompiled POWHEG tarball package. The package will contain the POWHEG executable pwhg_main, grid files, a script runcmsgrid.sh, as well as the input cards powheg.input and JHUGen.input (the latter is process-dependent).

    Step by step tutorial for the gg_H_mass_effect process in POWHEG BOX 2.

    The workflow is slightly different for single-job and parallel-job production: for parallel production, all jobs of a stage need to finish before entering the next stage.

    The workflow is as follows.

    • Prepare a proper input card for your physics process.
    • Prepare the input card files locally.
    • Compile the POWHEG source for the chosen process.
    • Generate the gridpack with a single processor or multiple processors, either running locally or submitting to the batch system.
    • Run the precompiled tarball to produce LHE files, using the scripts from this twiki page.


    Note!! The instructions should also work for CMSSW_5_3_26, though this has not been tested.

    Step 1: Compiling the POWHEG source in an SLC6 machine

    • Log on to an SLC6 lxplus machine

        ssh -Y lxplus.cern.ch
    

    • Create a CMSSW work area on lxplus, preferably under /afs/cern.ch/work so that you have enough disk space. Note the corresponding SCRAM_ARCH.

        setenv SCRAM_ARCH slc6_amd64_gcc481 
        cmsrel CMSSW_7_1_14
        cd CMSSW_7_1_14/src
        cmsenv
    

    • Checkout the scripts

        wget https://raw.githubusercontent.com/yuanchao/usercode/master/GenProductions/bin/run_pwg.py
        git clone git@github.com:cms-sw/genproductions.git genproductions
        mv genproductions/bin/Powheg/*.sh .
        mv genproductions/bin/Powheg/patches .
    

    • Get the input card files

        wget https://raw.githubusercontent.com/cms-sw/genproductions/master/bin/Powheg/examples/gg_H_quark-mass-effects_withJHUGen_NNPDF30_13TeV/gg_H_quark-mass-effects_NNPDF30_13TeV.input
        wget https://raw.githubusercontent.com/cms-sw/genproductions/master/bin/Powheg/examples/gg_H_quark-mass-effects_withJHUGen_NNPDF30_13TeV/JHUGen.input
    

    • One can run run_pwg.py to download and compile the POWHEG source into a directory my_ggH. If you already have a compiled POWHEG binary, you can skip this step.

        cmsenv
        python ./run_pwg.py -p 0 -i gg_H_quark-mass-effects_NNPDF30_13TeV.input -m gg_H_quark-mass-effects -f my_ggH -q 2nd
    

     Definition of the input parameters:
      (1) -p grid production stage [0]  (compiling source)
      (2) -i input card name [powheg.input]
      (3) -m process name (process defined in POWHEG)
      (4) -f working folder [testProd]
      (5) -q batch queue name (run locally if not specified)
     

    • If a proper URL path is given for the input card, run_pwg.py will download it, e.g.
       -i slc6_amd64_gcc481/powheg/V2.0/13TeV/examples/DMGG_NNPDF30_13TeV/DMGG_NNPDF30_13TeV.input
    • To run all steps of the single-process mode in one shot, just set the stage option as
       -p f
    • If an interactive job takes more than 1 hour, it may be killed. One can also submit jobs to the lxplus batch system to create the POWHEG tarball. The available queues on lxplus are '2hr', '1nd', '2nd'...


    Step 2.1: Running the grid production with single process

    • The three internal stages of POWHEG can be finished in one step

        python ./run_pwg.py -p 123 -i gg_H_quark-mass-effects_NNPDF30_13TeV.input -m gg_H_quark-mass-effects -f my_ggH -q 2nd -n 1000
    

     Definition of the input parameters:
      (1) -p grid production stage; '123' stands for a single process running through the three internal stages
      (2) -i input card name [powheg.input]
      (3) -m process name (process defined in POWHEG)
      (4) -f working folder [testProd]
      (5) -q batch queue name (run locally if not specified)
      (6) -n the number of events to run
    

    Step 2.2: Running the grid production with parallel processes

    • Each of the three internal stages of POWHEG can be run once all jobs of the previous stage have finished.

        python ./run_pwg.py -p 1 -x 1 -i gg_H_quark-mass-effects_NNPDF30_13TeV.input -m gg_H_quark-mass-effects -f my_ggH -q 2nd -t 5000 -n 1000
    

     Definition of the input parameters:
      (1) -p grid production stage '1', '2', or '3'
      (2) -x grid refinement stage '1', '2', or '3' for grid production stage '1'
      (3) -i input card name [powheg.input]
      (4) -m process name (process defined in POWHEG)
      (5) -f working folder [testProd]
      (6) -q batch queue name (run locally if not specified)
      (7) -t the total number of events to run
      (8) -n the number of events in each parallel job
    

    • In this example, 5 jobs with 1000 events each will be submitted to the queue '2nd'.
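The relation between -t, -n, and the number of submitted jobs can be sketched with shell arithmetic. This assumes the script splits the total into equal chunks of -n events, which matches the 5000/1000 example above:

```shell
# Number of parallel jobs = total events (-t) / events per job (-n),
# rounded up (assumption: the total is split into equal chunks)
total=5000
per_job=1000
njobs=$(( (total + per_job - 1) / per_job ))
echo "$njobs"
```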

    Step 3: Create the POWHEG tarball

        python ./run_pwg.py -p 9 -i gg_H_quark-mass-effects_NNPDF30_13TeV.input -m gg_H_quark-mass-effects -f my_ggH -k 1
    

     Definition of the input parameters:
      (1) -p grid production stage '9' stands for tarball creation
      (2) -i input card name [powheg.input]
      (3) -m process name (process defined in POWHEG)
      (4) -f working folder [testProd]
      (5) -k keep the validation .top plots [0]
    

    Step 4: Checking the integration grids

    https://twiki.cern.ch/twiki/bin/viewauth/CMS/PowhegBOXPrecompiledCheckGrids

    Step 5.1: Running the precompiled tar ball locally

    First un-tar the pre-built gridpack tarball with:
        tar xvzf testGrid.s_gg_H_quark-mass-effects.tgz
    

    • For single processor production, use the following command:
        ./runcmsgrid.sh <numberOfEvents> <RandomSeed> 1
    

    • For multi-processor production, use the patched version (beta) as:
        ./runcmsgrid_par.sh <numberOfEvents> <RandomSeed> 1
    

    Step 5.2: Running the precompiled tar ball via externalLHEProducer

    • An example Python config file for cmsDriver.py can be found here.

    • An LHE file cmsgrid_final.lhe or cmsgrid_final_XXXX.lhe is now produced. You could find an example of LHE file with weights at /afs/cern.ch/work/s/syu/public/LHE/cmsgrid_final.lhe.

    How to check the fraction of events with negative weights

    When "withnegweights" is set to 1, the POWHEG log file will print information on the negative weights as follows:

     tot:   10.396661228247105      +-   2.2674314776262974     
     abs:   10.552610613248710      +-   2.2674303490903318     
     pos:   10.474635920747978      +-   2.2674300770006690     
     neg:   7.7974692500833637E-002 +-   1.9475025001334601E-003
      powheginput keyword ubsigmadetails       absent; set to   -1000000.0000000000     
     btilde pos.   weights:   10.474635920747978       +-   2.2674300770006690     
     btilde |neg.| weights:   7.7974692500833637E-002  +-   1.9475025001334601E-003
     btilde total (pos.-|neg.|):   10.396661228247105       +-   2.2674314776262974     
     negative weight fraction:   6.7595163087627646E-003
     

    However, to check how the kinematic distributions are affected, one must compare histograms filled with all weights to those filled with only positive weights. You can use either Rivet or LHEAnalyzer to study the effect.
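For a quick numerical check, the fraction can be read straight off the log. A sketch that extracts it from an excerpt like the one above (the log file name is illustrative, and the excerpt is reproduced here only so the snippet is self-contained):

```shell
# Reproduce a fragment of the POWHEG log shown above, then parse it
cat > pwhg.log <<'EOF'
 btilde pos.   weights:   10.474635920747978       +-   2.2674300770006690
 btilde |neg.| weights:   7.7974692500833637E-002  +-   1.9475025001334601E-003
 negative weight fraction:   6.7595163087627646E-003
EOF
# Print the fraction rounded to 4 decimals
awk '/negative weight fraction/ {printf "%.4f\n", $4}' pwhg.log
```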

    -- YuanChao - 2015-10-21

    Topic revision: r8 - 2015-10-27 - unknown
     