How to work with files for Good Luminosity Sections in JSON format

Introduction

This TWiki describes the files that specify which luminosity sections in which runs are considered good and should be processed. In CMS, these files are in JSON format (JSON stands for JavaScript Object Notation). To find the most current good luminosity section files in JSON format, please visit

NOTE: Legend of colors for this twiki:

GRAY background for the commands to execute  (cut&paste)
GREEN background for the output sample of the executed commands
BLUE background for the configuration files  (cut&paste)
PINK background for the lines of C++ code  (cut&paste)

How to understand the text of Good Luminosity files in JSON format

A typical good lumisection file looks like:

{"132440": [[157, 378]], 
 "132596": [[382, 382], [447, 447]],
 "132598": [[174, 176]],
 "132599": [[1, 379], [381, 437]]
}

So the general format is


{"Run Number":[Lumi range, Lumi range, Lumi range, ...],
 "Run Number":[Lumi range, Lumi range, Lumi range, ...],
 ...}

where "Lumi range" can be a single lumi section like [382,382] or a range like [157,378].

In this good lumi section file:

  • 132440 is the Run Number and [[157, 378]] means good lumisections from 157 to 378 (both inclusive) in Run 132440.

  • "132596": [[382, 382], [447, 447]] means that lumisections 382 and 447 are good in Run 132596. A lumisection entry is always a range, so the single lumisections 382 and 447 are written as [382,382] and [447,447].

  • "132599": [[1, 379], [381, 437]] means that lumisections 1 to 379 and 381 to 437 are good in Run 132599.
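The reading of the format described above can be sketched in a few lines of plain Python. This is only an illustration using the standard json module, not part of the CMS tools; the function name is_good_lumi is hypothetical, and the file contents are the example shown earlier.

```python
import json

# Example file contents copied from the text above.
GOOD_LUMIS_TEXT = """
{"132440": [[157, 378]],
 "132596": [[382, 382], [447, 447]],
 "132598": [[174, 176]],
 "132599": [[1, 379], [381, 437]]}
"""

def is_good_lumi(good_lumis, run, lumi):
    """Return True if (run, lumi) lies inside one of the run's inclusive ranges."""
    # JSON object keys are strings, so the run number is converted before lookup.
    ranges = good_lumis.get(str(run), [])
    return any(first <= lumi <= last for first, last in ranges)

good_lumis = json.loads(GOOD_LUMIS_TEXT)
print(is_good_lumi(good_lumis, 132440, 157))  # True: both ends of a range are included
print(is_good_lumi(good_lumis, 132599, 380))  # False: 380 falls between the two ranges
print(is_good_lumi(good_lumis, 132601, 1))    # False: the run is not listed
```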

How to compare Good Luminosity files in JSON format

The Python script compareJSON.py runs different comparisons between two files.

With this script you can check

  • The union of two files (--or)
  • The intersection of two files (--and)
  • The subtraction of two files (--sub) - all of the lumi sections in the first file that are not in the second file.
  • The difference of two files (--diff) - all lumi sections that appear in one of the files but not in the other (i.e., compareJSON.py --diff alpha.json beta.json gives the combined results of compareJSON.py --sub alpha.json beta.json and compareJSON.py --sub beta.json alpha.json)

The Good Luminosity files in JSON format used as examples in this twiki are here: alpha.json beta.json

For all options except --diff, you can specify a third file which will contain the output. For example, to see the intersection of alpha.json and beta.json, execute the following command:

compareJSON.py --and alpha.json beta.json

which produces the following output:

{"132596": [[382, 382], [447, 447]],
 "132598": [[174, 176]],
 "132599": [[1, 379], [381, 437]]}

To have the results saved in output.json:

compareJSON.py --and alpha.json beta.json output.json

And here are the other options and their corresponding output in use:

compareJSON.py --or alpha.json beta.json

{"132440": [[157, 378]],
 "132596": [[382, 382], [447, 447]],
 "132598": [[174, 176]],
 "132599": [[1, 379], [381, 437]],
 "132601": [[1, 207]]}

compareJSON.py --sub alpha.json beta.json

{"132440": [[157, 378]]}

compareJSON.py --sub beta.json alpha.json  

{"132601": [[1, 207]]}

compareJSON.py --diff alpha.json beta.json

'alpha.json'-only lumis:
{"132440": [[157, 378]]}

'beta.json'-only lumis:
{"132601": [[1, 207]]}
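The four comparisons above correspond to ordinary set operations on (run, lumi) pairs. The following is a minimal Python 3 sketch of that idea, not the implementation used by compareJSON.py. The helper names to_pairs and to_compact are hypothetical, and the contents of beta.json are an assumption inferred from the outputs shown above.

```python
def to_pairs(compact_list):
    """Expand {"run": [[first, last], ...]} into a set of (run, lumi) pairs."""
    return {(int(run), lumi)
            for run, ranges in compact_list.items()
            for first, last in ranges
            for lumi in range(first, last + 1)}

def to_compact(pairs):
    """Collapse a set of (run, lumi) pairs back into inclusive ranges."""
    result = {}
    for run, lumi in sorted(pairs):
        ranges = result.setdefault(str(run), [])
        if ranges and ranges[-1][1] == lumi - 1:
            ranges[-1][1] = lumi         # lumi continues the current range
        else:
            ranges.append([lumi, lumi])  # gap found: start a new range
    return result

alpha = {"132440": [[157, 378]],
         "132596": [[382, 382], [447, 447]],
         "132598": [[174, 176]],
         "132599": [[1, 379], [381, 437]]}
# Assumed contents of beta.json, inferred from the outputs shown above.
beta = {"132596": [[382, 382], [447, 447]],
        "132598": [[174, 176]],
        "132599": [[1, 379], [381, 437]],
        "132601": [[1, 207]]}

a, b = to_pairs(alpha), to_pairs(beta)
print(to_compact(a | b))  # like --or
print(to_compact(a & b))  # like --and
print(to_compact(a - b))  # like --sub alpha.json beta.json
print(to_compact(b - a))  # like --sub beta.json alpha.json
```

Expanding every range into individual lumi sections keeps the sketch short; for very large files, working directly on the ranges would use less memory.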

Other Utilities for JSON formatted files

All of these scripts (compareJSON.py and all following scripts) have a --help option for further details.

Python library

The compareJSON.py script and all the scripts below are based on the Python library in FWCore/PythonUtilities. If you are programming in Python (including in the CMSSW configuration language), you may find it easier to use the Python library directly.

To use the code, first import it:

from FWCore.PythonUtilities.LumiList import LumiList   # older releases: PhysicsTools.PythonAnalysis.LumiList

Now you can construct a LumiList object in several different ways:

ll1 = LumiList(filename = 'myFile.json')
ll2 = LumiList(url = 'https://cern.ch/path/to/file_json.txt')             # Not available in all versions
ll3 = LumiList(lumis = [[1001,1], [1001, 2], [1003, 1], [1003, 3]])       # Pairs of run number, lumi number
ll4 = LumiList(runsAndLumis = {'1001' : [1, 2], '1003' : [1, 3]})         # Dictionaries where the key is the run number and the value is a list of lumis
ll5 = LumiList(runsAndLumis = [{'1001' : [1, 2], '1003' : [1, 3]}])       # A list of objects like above. This is a fast way to construct a LumiList from outputs from lots of files, etc.
ll6 = LumiList(compactList = {'1001' : [[1,2]], '1003' : [[1,1], [3,3]]}) # The same format as the regular good lumi file
ll7 = LumiList(runs = [1001, 1003])                                       # This corresponds to every lumi in the listed runs

Once you have a LumiList object (or two) you can easily do lots of things with them:

nl1 = ll1 - ll2                 # Give me all the lumis in ll1 not in ll2
nl1 = ll1 + ll2                 # Give me all the lumis in ll1 or in ll2
nl1 = ll1 | ll2                 # Same as ll1 + ll2, just different notation
nl1 = ll1 & ll2                 # Give me all the lumis that are in both ll1 and ll2
len(ll1)                        # How many runs are in the LumiList?
ll1.removeRuns([1001,2002])     # Remove runs and all their lumis from a LumiList (not available in all versions)
ll1.selectRuns([1001,2002])     # Select only these runs if they exist in LumiList (not available in all versions)
ll5.getDuplicates()             # Get a list of all the duplicates found during construction (not available in all versions)

You can also get various representations of the data in a LumiList

print ll1                                    # Print a readable representation of the LumiList; use print(ll1) in Python 3
ll1.getCompactList()                         # In the same format as the regular good lumi file
ll1.getLumis()                               # Pairs of run number, lumi number
ll1.getCMSSWString()                         # CMSSW representation: '1001:1-2,1003:1,1003:3'
ll1.getVLuminosityBlockRange()               # A VLuminosityBlockRange suitable for configuring CMSSW
nl1.writeJSON(fileName='myNewFile.json')     # Write out the results of your modifications to a new .json file

printJSON.py

printJSON.py in FWCore/PythonUtilities prints out these files in a much more human-readable fashion:

printJSON.py alpha.json 

{"132440": [[157, 378]],
 "132596": [[382, 382], [447, 447]],
 "132598": [[174, 176]],
 "132599": [[1, 379], [381, 437]]}

instead of on a single line, as they are usually stored:

cat alpha.json 

{"132440": [[157, 378]], "132596": [[382, 382], [447, 447]], "132598": [[174, 176]], "132599": [[1, 379], [381, 437]]}
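The one-run-per-line layout shown above can be reproduced with the standard json module. This is a minimal Python 3 sketch and not the code of printJSON.py; the function name format_lumi_json is hypothetical.

```python
import json

def format_lumi_json(compact_list):
    """Format a good-lumi dictionary with one run per line, sorted by run number."""
    runs = sorted(compact_list, key=int)
    entries = ['"%s": %s' % (run, json.dumps(compact_list[run])) for run in runs]
    return "{" + ",\n ".join(entries) + "}"

# The single-line contents of alpha.json shown above.
single_line = ('{"132440": [[157, 378]], "132596": [[382, 382], [447, 447]], '
               '"132598": [[174, 176]], "132599": [[1, 379], [381, 437]]}')
print(format_lumi_json(json.loads(single_line)))
```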

fjr2json.py

fjr2json.py in FWCore/PythonUtilities will read cmsRun framework job reports and print the list of lumis that have been processed in JSON format.

fjr2json.py somedir/*.xml

will run over all framework job reports in somedir and print the result in JSON format to the screen.

fjr2json.py --output=ran.json somedir/*.xml

will save the results to ran.json.

Note that if you have used CRAB, you can just use the crab -report option to retrieve the same file.

edmLumisInFiles.py

edmLumisInFiles.py in DataFormats/FWLite (tag V01-11-00 or greater) takes a list of EDM files for input and prints out the list of lumis contained in JSON format.

edmLumisInFiles.py  data_Run14*

{"140362": [[29, 31], [60, 61]],
 "141961": [[62, 64], [85, 85], [87, 87]]}

A working example is

edmLumisInFiles.py /afs/cern.ch/cms/Tutorials/TWIKI_DATA/CMSDataAnaSch/CMSDataAnaSch_Data_387.root

which gives the following output:

{"149011": [[575, 576], [699, 699]]}

The --intLumi option prints the total integrated luminosity (recorded and delivered) to the screen, together with a note pointing out that lumiCalc.py is the official method to calculate integrated luminosities (see the LumiCalc TWiki).

As with fjr2json.py, you can also use the --output option to save the results in a file.

filterJSON.py

filterJSON.py in FWCore/PythonUtilities (tag V01-04-00 or later) will read in a JSON formatted file and keep only runs that meet requested minimum or maximum run number.

filterJSON.py --min 140380 old.json

will print to the screen all runs greater than or equal to 140380.

filterJSON.py --min 140380 --max 141220 old.json --output new.json

will save to new.json all runs greater than or equal to 140380 and less than or equal to 141220.

You can also specify individual runs to be removed.

filterJSON.py --min 140380 --max 141220 --runs 140381,140385 --runs 140388 old.json --output new.json

will save to new.json all runs greater than or equal to 140380 and less than or equal to 141220, with runs 140381, 140385, and 140388 explicitly removed. You can either give many runs as a comma-separated list (e.g., 140381,140385) or use multiple --runs options.
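The selection rules described above (inclusive minimum and maximum run numbers, plus explicitly removed runs) can be sketched as follows. This is a Python 3 illustration only, not the code of filterJSON.py; the function name filter_runs and the sample data are hypothetical.

```python
def filter_runs(compact_list, min_run=None, max_run=None, remove_runs=()):
    """Keep runs with min_run <= run <= max_run that are not in remove_runs."""
    removed = {int(run) for run in remove_runs}
    kept = {}
    for run, ranges in compact_list.items():
        number = int(run)
        if min_run is not None and number < min_run:
            continue
        if max_run is not None and number > max_run:
            continue
        if number in removed:
            continue
        kept[run] = ranges
    return kept

# Hypothetical input, chosen to exercise both boundaries and the removal list.
old = {"140379": [[1, 10]], "140380": [[5, 20]], "140381": [[1, 3]],
       "141220": [[7, 9]], "141221": [[1, 2]]}
new = filter_runs(old, min_run=140380, max_run=141220, remove_runs=[140381])
print(sorted(new))  # ['140380', '141220']
```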

csv2json.py

csv2json.py in FWCore/PythonUtilities (tag V01-05-00 or later) will extract the run and luminosity section from a CSV file and print the output in JSON format. Use the --output option to save the output to a file (instead of printing to the screen).

csv2json.py input.csv --output output.json

By default, the script assumes that the 0th column is the run number and 1st column is the lumi section. You can control this with --runIndex and --lumiIndex options.
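The conversion from CSV rows to the compact range format can be sketched with the standard csv module. This is a Python 3 illustration under stated assumptions, not the code of csv2json.py: the function name csv_to_compact is hypothetical, the parameters run_index and lumi_index only mirror the role of --runIndex and --lumiIndex, and skipping rows whose run or lumi column is not an integer (such as a header line) is a choice made for this sketch.

```python
import csv
import io
import json

def csv_to_compact(lines, run_index=0, lumi_index=1):
    """Build {"run": [[first, last], ...]} from CSV rows of run and lumi numbers."""
    pairs = set()
    for row in csv.reader(lines):
        try:
            pairs.add((int(row[run_index]), int(row[lumi_index])))
        except (ValueError, IndexError):
            continue  # skip header lines, blank lines and malformed rows
    result = {}
    for run, lumi in sorted(pairs):
        ranges = result.setdefault(str(run), [])
        if ranges and ranges[-1][1] == lumi - 1:
            ranges[-1][1] = lumi         # consecutive lumi: extend the range
        else:
            ranges.append([lumi, lumi])  # otherwise start a new range
    return result

# Hypothetical CSV content; the third column is ignored.
sample = io.StringIO("run,lumi,value\n"
                     "140362,29,0.1\n140362,30,0.2\n140362,31,0.3\n"
                     "140362,60,0.4\n141961,85,0.5\n")
print(json.dumps(csv_to_compact(sample)))
```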

mergeJSON.py

mergeJSON.py in FWCore/PythonUtilities (tag V01-07-00 or later) will merge different JSON files together.

mergeJSON.py first.json second.json --output=total.json

will take the runs in first.json as well as the runs in second.json.

mergeJSON.py first.json:132000-140999 second.json:141000- --output=total.json

will take the runs in first.json between 132000 and 140999 as well as the runs in second.json greater than or equal to 141000.
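The file:MIN-MAX arguments above can be read as a file name plus an optional inclusive run window. The following Python 3 sketch illustrates that reading; it is not the code of mergeJSON.py. The names parse_spec, merge_ranges and merge_selected are hypothetical, the sample data is invented, and taking the union of the lumi ranges when a run is selected from more than one file is an assumption of this sketch.

```python
def parse_spec(spec):
    """Split 'file.json:MIN-MAX' into (filename, min_run, max_run); bounds may be omitted."""
    filename, _, run_range = spec.partition(":")
    low, _, high = run_range.partition("-")
    return filename, (int(low) if low else None), (int(high) if high else None)

def merge_ranges(ranges):
    """Sort inclusive ranges and join those that overlap or touch."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], last)
        else:
            merged.append([first, last])
    return merged

def merge_selected(inputs):
    """Combine (compact_list, min_run, max_run) tuples into one dictionary."""
    combined = {}
    for compact_list, min_run, max_run in inputs:
        for run, ranges in compact_list.items():
            number = int(run)
            if min_run is not None and number < min_run:
                continue
            if max_run is not None and number > max_run:
                continue
            combined.setdefault(run, []).extend(ranges)
    return {run: merge_ranges(ranges) for run, ranges in combined.items()}

# Hypothetical in-memory stand-ins for first.json and second.json.
files = {"first.json": {"132440": [[157, 378]], "141000": [[1, 5]]},
         "second.json": {"140500": [[1, 9]], "141000": [[4, 12]], "141961": [[62, 64]]}}
specs = ["first.json:132000-140999", "second.json:141000-"]
inputs = []
for spec in specs:
    name, low, high = parse_spec(spec)
    inputs.append((files[name], low, high))
total = merge_selected(inputs)
print(total)
```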

filterCSVwithJSON.py

filterCSVwithJSON.py in FWCore/PythonUtilities (tag V01-07-00 or later) will filter CSV files (e.g., those created by lumiCalc.py using option lumibyls or lumibylsXing), keeping only lumi sections that are in the JSON file.

filterCSVwithJSON.py short.json long.csv short.csv

will take the lumi sections in short.json from long.csv and create short.csv.

How to use Good Luminosity Section files in

CRAB

For this, please see "Running over selected luminosity" in the CRAB documentation.

Warning: When running CRAB do not follow the instructions for cmsRun. In other words do not mix Framework methods with the CRAB settings.

cmsRun

This section tells you how to use a file of good lumi sections to configure CMSSW. Usually the JSON files of luminosity sections are used as inputs to CRAB. But if you want to run interactively on the same lumi sections, you can use the small code snippet below. With CMSSW 3.8 and higher it works out of the box. For earlier releases, one can check out FWCore/PythonUtilities tag V01-00-02 and begin using it. You might also have to check out PhysicsTools/PythonAnalysis (needed for the LumiList module):

import FWCore.ParameterSet.Config as cms
import PhysicsTools.PythonAnalysis.LumiList as LumiList
myLumis = LumiList.LumiList(filename = 'goodList.json').getCMSSWString().split(',')
process.source.lumisToProcess = cms.untracked.VLuminosityBlockRange()
process.source.lumisToProcess.extend(myLumis)

A more compact syntax is available starting with CMSSW 5.0.0. For CMSSW 4.x releases it can be used after checking out PhysicsTools/PythonAnalysis V00-05-03 and building it.

import PhysicsTools.PythonAnalysis.LumiList as LumiList
process.source.lumisToProcess = LumiList.LumiList(filename = 'goodList.json').getVLuminosityBlockRange()

For Run 2 (CMSSW_7_4_X), the tool should be imported from a new location:

import FWCore.PythonUtilities.LumiList as LumiList
process.source.lumisToProcess = LumiList.LumiList(filename = 'goodList.json').getVLuminosityBlockRange()

FWLite

The example below works with CMSSW 3.8 and higher. For earlier releases, one can check out FWCore/PythonUtilities tag V01-00-02 and begin using it.

You can find another complete example in CMSSW of how to access good run/lumi lists in FWLite here.

To use a good luminosity file in FWLite, you first need a configuration file that loads whichever JSON file you want.

cat loadJson.py

The loadJson.py file is HERE and is also shown below

import FWCore.PythonUtilities.LumiList as LumiList
import FWCore.ParameterSet.Types as CfgTypes
import FWCore.ParameterSet.Config as cms

# setup process
process = cms.Process("FWLitePlots")
process.inputs = cms.PSet (
    lumisToProcess = CfgTypes.untracked(CfgTypes.VLuminosityBlockRange())
)

# get JSON file correctly parsed
JSONfile = 'Cert_132440-139790_7TeV_StreamExpress_Collisions10_JSON.txt'
myList = LumiList.LumiList (filename = JSONfile).getCMSSWString().split(',')

process.inputs.lumisToProcess.extend(myList)

In your FWLite code, load that configuration file. If it defines lumisToProcess (that is, a good luminosity file in JSON format was given), copy the ranges into a vector:

   PythonProcessDesc builder (argv[1], argc, argv); // or "myConfigFile.py"
   edm::ParameterSet const& inputs =
      builder.processDesc()->getProcessPSet()->
      getParameter<edm::ParameterSet>("inputs");
   
   // stays empty if the configuration does not define lumisToProcess
   std::vector<edm::LuminosityBlockRange> jsonVector;
   if ( inputs.exists("lumisToProcess") ) 
   {
      std::vector<edm::LuminosityBlockRange> const & lumisTemp =
         inputs.getUntrackedParameter<std::vector<edm::LuminosityBlockRange> > ("lumisToProcess");
      jsonVector.resize( lumisTemp.size() );
      std::copy( lumisTemp.begin(), lumisTemp.end(), jsonVector.begin() );
   }

Finally, you want to be able to check if this given event is part of the good luminosity file or not. If no good luminosity file is loaded, this function will always return true.

// needs <algorithm> (std::find_if), <vector> and <boost/bind.hpp> (boost::bind, _1)
bool jsonContainsEvent (const std::vector< edm::LuminosityBlockRange > &jsonVec,
                        const edm::EventBase &event)
{
   // if the jsonVec is empty, then no JSON file was provided so all
   // events should pass
   if (jsonVec.empty())
   {
      return true;
   }
   bool (* funcPtr) (edm::LuminosityBlockRange const &,
                     edm::LuminosityBlockID const &) = &edm::contains;
   edm::LuminosityBlockID lumiID (event.id().run(), 
                                  event.id().luminosityBlock());
   std::vector< edm::LuminosityBlockRange >::const_iterator iter = 
      std::find_if (jsonVec.begin(), jsonVec.end(),
                    boost::bind(funcPtr, _1, lumiID) );
   return jsonVec.end() != iter;

}

You would call this function from inside your event loop:


   for( event.toBegin(); ! event.atEnd(); ++event )
   {
      if ( ! jsonContainsEvent (jsonVector, event) )
      {
          // this event is not in a good lumi section
          continue;
      }
   } // event loop

-- SudhirMalik - 30-Jul-2010

Topic revision: r28 - 2018-07-02 - FrancescaRicciTam
 