
b-Jet charge ID using DNN


This page provides useful information about the implementation and training of the DNN used to identify the charge of b-jets.


Useful Links

Production of inputs


The following commands are used to produce the flat ntuples used for training the neural network:

cmsrel CMSSW_10_0_1
cd CMSSW_10_0_1/src/
git cms-init
git clone
cd DeepNTuples
git checkout 94X
# Add JetToolBox
git submodule init
git submodule update
scram b -j 4

  • Initial setup:

cd CMSSW_10_0_1/src

  • To run local tests, execute:
cmsRun DeepNtuplizer/production/ inputFiles=$file maxEvents=2000 skipEvents=1000 outputFile=output1

The full list of files can be found here. If a file fails to load remotely, copy it locally:

xrdcp root://$file .
pfl=`curl -ks "${site}&lfn=${lfn}&protocol=srmv2" | grep PFN | cut -d "'" -f4`
env -i X509_USER_PROXY=/tmp/x509up_u58751 gfal-copy -n 1 $pfl "file:///`pwd`/miniAOD.root"

Before running the code, add extra variables (such as jet_charge) to the file:

   // jet charge implementation: store the PDG ID of the matched generator-level
   // parton, and the PDG ID and pT of the first b hadron associated with the jet
   gen_parton_pdgid_ = 0; gen_hadron_pdgid_ = 0; gen_hadron_pt_ = 0;
   if(jet.genParton()) gen_parton_pdgid_ = int(jet.genParton()->pdgId());
   if(jet.jetFlavourInfo().getbHadrons().size()) {
      gen_hadron_pdgid_ = jet.jetFlavourInfo().getbHadrons().at(0)->pdgId();
      gen_hadron_pt_ = jet.jetFlavourInfo().getbHadrons().at(0)->pt();
   }

The charge of the jet can be obtained from the charge of the b quark or of the b hadron. The implementation is the following (introduced in the file):

if(jet.genParton()) gen_parton_pdgid_ = int(jet.genParton()->pdgId());
if(jet.jetFlavourInfo().getbHadrons().size()) gen_hadron_pdgid_ = jet.jetFlavourInfo().getbHadrons().at(0)->pdgId();
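
As a rough illustration of how the sign of the b quark could be read off the stored PDG IDs, the following Python sketch maps a PDG ID to +1 (b quark), -1 (anti-b quark), or 0 (neither). The helper name b_quark_sign is hypothetical and not part of DeepNTuples. It relies only on the PDG numbering convention that open-bottom mesons with a positive ID (for example B+ = 521) contain an anti-b quark, while b baryons with a positive ID (for example Lambda_b = 5122) contain a b quark. Since the b quark carries electric charge -1/3, a result of +1 corresponds to a negatively charged quark.

```python
def b_quark_sign(pdgid):
    """Return +1 for a b quark content, -1 for an anti-b, 0 otherwise.

    Hypothetical helper for illustration; it only covers bare b quarks,
    open-bottom mesons and bottom baryons.
    """
    a = abs(pdgid)
    sign = 1 if pdgid > 0 else -1
    if a == 5:                       # bare b quark (5) or anti-b quark (-5)
        return sign
    q1 = (a // 1000) % 10            # thousands digit: non-zero for baryons
    q2 = (a // 100) % 10             # hundreds digit
    q3 = (a // 10) % 10              # tens digit
    if q1 == 0 and q2 == 5 and q3 != 5:
        # open-bottom meson: a positive ID (e.g. B+ = 521) holds an anti-b
        return -sign
    if q1 == 5:
        # bottom baryon: a positive ID (e.g. Lambda_b = 5122) holds a b quark
        return sign
    return 0                         # bottomonium (e.g. 553) or no b content


for pid in (5, -5, 521, -521, 5122, 553):
    print(pid, b_quark_sign(pid))
```

Bottomonium states such as 553 contain both a b and an anti-b quark, so the sketch returns 0 for them.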

Slimming the ntuples:

To slim the ntuples, modify the ntuple_JetInfo::fillBranches() function to skip unwanted jets. For example:
  • Skim only leptonic b-jets: if(!isPhysLeptonicB_ && !isPhysLeptonicB_C_) returnval=false;
  • Skim only hadronic b-jets: if(isPhysB_==0 && isPhysBB_==0) returnval=false;
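
The same kind of selection can be sketched offline on arrays of flags. The following Python snippet assumes hypothetical NumPy arrays is_phys_b and is_phys_bb holding the isPhysB_ and isPhysBB_ flags of each jet, and builds the mask of jets that the hadronic condition above would keep. It is only an illustration of the logic, not a replacement for the change in fillBranches().

```python
import numpy as np


def keep_mask(flag_a, flag_b):
    """Keep a jet unless both flags are zero.

    Mirrors the C++ condition `if(a==0 && b==0) returnval=false`.
    """
    flag_a = np.asarray(flag_a)
    flag_b = np.asarray(flag_b)
    return ~((flag_a == 0) & (flag_b == 0))


# Toy flags for four jets (made-up values, for illustration only)
is_phys_b = np.array([1, 0, 0, 0])
is_phys_bb = np.array([0, 1, 0, 0])

mask = keep_mask(is_phys_b, is_phys_bb)
print(mask)
```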


The training is performed using the DeepJetCore package, which is available through a Singularity container:

env -i PATH=/usr/bin/ SINGULARITY_CACHEDIR="/tmp/$(whoami)/singularity" singularity run -B /home -B /eos -B /afs --bind /etc/krb5.conf:/etc/krb5.conf --bind /proc/fs/openafs/afs_ioctl:/proc/fs/openafs/afs_ioctl --bind /usr/vice/etc:/usr/vice/etc /eos/home-j/jkiesele/singularity/images/deepjetcore3_latest.sif

Build a new subpackage: BJetChargeID --data
cd BJetChargeID; source

Once you have the data in the example_data folder, convert it to the format used by the ML algorithm (modify modules/datastructure/ to set up the correct inputs and labels):

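   # labels: gen_parton_pdgid = +5 (b) gives 1, -5 (anti-b) gives 0
   truth = np.expand_dims((urfile.array("gen_parton_pdgid")/5+1)/2, axis=1)
   # features: pT and charge of the leading lepton in the jet
   lep_pt=[]; lep_charge=[]
   muons_pt = urfile.array("muons_pt")
   electrons_pt = urfile.array("electrons_pt")
   muons_charge = urfile.array("muons_charge")
   electrons_charge = urfile.array("electrons_charge")
   for i in range(urfile.numentries):
       nmu=len(muons_pt[i]); nel=len(electrons_pt[i])
       # assumed branch bodies: take the first muon unless the first
       # electron has a higher pT; jets without leptons get pt=0, charge=0
       pt = 0; charge = 0
       if nmu:
           pt = muons_pt[i][0]; charge = muons_charge[i][0]
           if nel and (pt < electrons_pt[i][0]):
               pt = electrons_pt[i][0]; charge = electrons_charge[i][0]
       elif nel:
           pt = electrons_pt[i][0]; charge = electrons_charge[i][0]
       lep_pt.append(pt); lep_charge.append(charge)
   feature_array = np.concatenate([
                           np.expand_dims(lep_pt, axis=1),
                           np.expand_dims(lep_charge, axis=1)
                           ],axis=1)

The conversion is then run with the following options:

 -i example_data/train_files.txt -o train_data -c TrainData_example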
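
To make the label encoding and the leading-lepton features concrete, here is a small self-contained Python sketch on toy data. The function names encode_label and leading_lepton_features, and all numerical values, are hypothetical. The sketch assumes that each per-jet lepton list is ordered by decreasing pT, so that index 0 is the leading candidate, and that jets without leptons get pT = 0 and charge = 0.

```python
import numpy as np


def encode_label(gen_parton_pdgid):
    """Map pdgId +5 (b quark) to 1 and -5 (anti-b quark) to 0."""
    pdgid = np.asarray(gen_parton_pdgid, dtype=float)
    return np.expand_dims((pdgid / 5 + 1) / 2, axis=1)


def leading_lepton_features(muons_pt, muons_charge, electrons_pt, electrons_charge):
    """Return an (n_jets, 2) array: pT and charge of the hardest lepton."""
    lep_pt, lep_charge = [], []
    for mu_pt, mu_q, el_pt, el_q in zip(muons_pt, muons_charge,
                                        electrons_pt, electrons_charge):
        pt, charge = 0.0, 0
        if len(mu_pt):                       # start from the leading muon
            pt, charge = mu_pt[0], mu_q[0]
        if len(el_pt) and el_pt[0] > pt:     # switch if the electron is harder
            pt, charge = el_pt[0], el_q[0]
        lep_pt.append(pt)
        lep_charge.append(charge)
    return np.stack([lep_pt, lep_charge], axis=1)


# Toy input for four jets (made-up values)
muons_pt = [[30.0, 5.0], [], [10.0], []]
muons_charge = [[-1, 1], [], [1], []]
electrons_pt = [[], [20.0], [25.0], []]
electrons_charge = [[], [1], [-1], []]

features = leading_lepton_features(muons_pt, muons_charge,
                                   electrons_pt, electrons_charge)
labels = encode_label([5, -5, 5, -5])
print(features.shape, labels.shape)
```

Both arrays have one row per jet, which is the layout the np.concatenate call with axis=1 produces in the data structure code.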

The training is done using the following script (modify the Train/ script to set your own model):

python3 Train/ train_data/dataCollection.djcdc result_train --gpu none

The results will appear in result_train. To predict the output for a new sample, run:

 result_train/KERAS_check_best_model.h5 result_train/trainsamples.djcdc example_data/test_files.txt result_test --gpu none

The output file in result_test/ will contain the predicted values.

To compare the predicted values with the truth, you can do the following:

root -l example_data/out_bjet_lep_10.root
tree->Draw("gen_parton_pdgid : tree.prob_isA","","box")
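
A numerical comparison can complement the plot. The Python sketch below counts how often a thresholded prediction agrees with the truth label. It assumes the predicted probability (such as prob_isA) and the 0/1 truth label have already been loaded into arrays, and that the probability refers to label 1; the function name confusion_counts, the 0.5 threshold, and the toy values are hypothetical.

```python
import numpy as np


def confusion_counts(truth, prob, threshold=0.5):
    """Return a 2x2 table counts[true_label, predicted_label]."""
    truth = np.asarray(truth).astype(int)
    pred = (np.asarray(prob) > threshold).astype(int)
    counts = np.zeros((2, 2), dtype=int)
    np.add.at(counts, (truth, pred), 1)   # accumulate one entry per jet
    return counts


# Toy values for five jets (made up, for illustration only)
truth = [1, 0, 1, 0, 1]
prob = [0.9, 0.2, 0.4, 0.6, 0.7]

counts = confusion_counts(truth, prob)
accuracy = np.trace(counts) / counts.sum()
print(counts)
print("accuracy:", accuracy)
```

The diagonal of the table holds the correctly identified jets, so the accuracy is the trace divided by the total number of jets.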

-- MichaelPitt - 2020-01-23

Topic revision: r11 - 2020-06-22 - MichaelPitt