Analyzing Millions of Gigabytes of LHC Data for CMS - Discover the Higgs on OSG


  • The demo shows an analysis workflow: discover the Higgs on OSG
    • Analysis code was prepared in the CMSSW framework:
      • EDAnalyzer accessing reconstructed tracks and writing out a ROOT file with histograms:
        • transverse momentum of reconstructed tracks: p_T [GeV]
        • di-track invariant mass: m_mu,mu [GeV]
        • invariant mass of two di-track objects: m_Z,Z [GeV]
    • Dataset discovery: use the DBS/DLS discovery page to check the availability and location of the data sample: Higgs->ZZ->4mu
    • Analysis job execution on the GRID using CRAB
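
The three histogrammed quantities can be sketched in a few lines of Python. This is only an illustration of the kinematics, not the EDAnalyzer code used in the demo: the helper names, the input format (px, py, pz in GeV) and the choice to treat every track as a muon are assumptions made for this sketch.

```python
import math

MUON_MASS_GEV = 0.1057  # assumption: every track is treated as a muon


def four_vector(px, py, pz, mass=MUON_MASS_GEV):
    """Return (E, px, py, pz) in GeV for a track with the given momentum."""
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def transverse_momentum(p4):
    """p_T: momentum component perpendicular to the beam (z) axis."""
    _, px, py, _ = p4
    return math.hypot(px, py)


def invariant_mass(*p4s):
    """Invariant mass of the summed four-vectors: sqrt(E^2 - |p|^2)."""
    energy, px, py, pz = (sum(c) for c in zip(*p4s))
    m2 = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


# Two hypothetical back-to-back tracks, 45 GeV each along +x and -x.
mu_plus = four_vector(45.0, 0.0, 0.0)
mu_minus = four_vector(-45.0, 0.0, 0.0)

print(transverse_momentum(mu_plus))       # p_T of one track
print(invariant_mass(mu_plus, mu_minus))  # di-track invariant mass
```

Passing four tracks to `invariant_mass` gives the mass of two di-track objects combined, which is the quantity labeled m_Z,Z above.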

  • Components of the CMS software and computing environment used in the demo
    • CMSSW: CMS software framework and EDM (Event Data Model)
    • DBS/DLS discovery webpage:
      • DBS: Dataset Bookkeeping System, database of datasets and their files
      • DLS: Dataset Location Service, database of location(s) of datasets (which dataset is available at which site)
    • CRAB: CMS Remote Analysis Builder, user tool to submit batch analysis jobs to the GRID and to control them
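
The division of labor between DBS and DLS can be pictured as two lookups keyed by the dataset name. The sketch below is a toy model only: the dictionaries, the `discover` helper, the dataset name and the file names are all hypothetical and do not reflect the real services or their interfaces.

```python
# Toy stand-ins for the two catalogues; every entry below is hypothetical.
DBS = {  # dataset name -> files belonging to the dataset
    "/HiggsToZZTo4Mu/example/RECO": ["file_001.root", "file_002.root"],
}
DLS = {  # dataset name -> sites hosting a copy of the dataset
    "/HiggsToZZTo4Mu/example/RECO": ["UNL", "Wisconsin"],
}


def discover(dataset):
    """Return (files, sites) for a dataset, or raise if it is unknown."""
    if dataset not in DBS:
        raise KeyError(f"dataset not registered in DBS: {dataset}")
    return DBS[dataset], DLS.get(dataset, [])


files, sites = discover("/HiggsToZZTo4Mu/example/RECO")
print(len(files), "files, available at", sites)
```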

  • CMSSW:
    • Based on a bus model: the user schedules modules, which are run by the main framework application, cmsRun
    • User interaction with the framework application is done through a configuration file called the parameter-set
    • The parameter-set instantiates modules; each instance is identified by its module label
    • Four different types of modules, of which two are the main user modules:
      • EDProducer: uses input from the event and produces new output which is stored in the event
      • EDAnalyzer: uses input from the event and performs operations on the input, does not store anything in the event (preparation shown in this demo)
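
The difference between the two user module types can be illustrated with a toy version of the bus model. This is not the CMSSW API: the classes, the `run_schedule` function (standing in for cmsRun) and the dictionary used as the event are assumptions made for this sketch.

```python
import math


class PtProducer:
    """Plays the role of an EDProducer: reads the event, stores a new product."""

    def __init__(self, label):
        self.label = label  # module label, used as the key of the product

    def run(self, event):
        event[self.label] = [math.hypot(px, py) for px, py in event["tracks"]]


class PtAnalyzer:
    """Plays the role of an EDAnalyzer: reads the event, stores nothing in it."""

    def __init__(self, label, source):
        self.label = label
        self.source = source  # label of the product to read
        self.seen = []        # analyzer-private result, e.g. histogram entries

    def run(self, event):
        self.seen.extend(event[self.source])


def run_schedule(modules, events):
    """Stand-in for cmsRun: run every scheduled module on every event."""
    for event in events:
        for module in modules:
            module.run(event)


producer = PtProducer("trackPt")
analyzer = PtAnalyzer("ptHistos", source="trackPt")
events = [{"tracks": [(3.0, 4.0)]}, {"tracks": [(6.0, 8.0), (0.0, 1.0)]}]
run_schedule([producer, analyzer], events)
print(analyzer.seen)  # [5.0, 10.0, 1.0]
```

The order of the schedule matters in this toy: the analyzer can only read `trackPt` because the producer ran before it on the same event.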

  • Locations:
    • User interface (UI): interactive login nodes at Fermilab (UAF)
    • GRID sites: one of the seven US-CMS T2 sites
      • University of Nebraska, Lincoln (UNL, OSG middleware)
      • University of Wisconsin, Madison (Wisconsin, OSG middleware)
      • California Institute of Technology (Caltech, OSG middleware)
      • Massachusetts Institute of Technology (MIT, OSG middleware)
      • Purdue University (Purdue, OSG middleware)

Setup environment and prepare user area

Analysis code preparation

Dataset discovery

Analysis job execution on OSG

jobtype                = cmssw
scheduler              = condor_g

datasetpath            = <dataset name discovered with discovery page>
pset                   = <parameter-set for analysis code>
total_number_of_events = 100
events_per_job         = 10
output_file            = <histogram file name>

se_white_list          = <destination site>
virtual_organization   = cms
lcg_catalog_type       = lfc
lfc_host               =
lfc_home               = /grid/cms
  • create jobs
crab -create
  • submit jobs
crab -submit all -continue
  • status check
crab -status -c
  • output retrieval
crab -getoutput -c
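
With total_number_of_events = 100 and events_per_job = 10, the configuration above corresponds to ten jobs. The arithmetic can be sketched as follows; the `split_jobs` helper is hypothetical, and CRAB's real splitting may take further constraints (such as file boundaries) into account.

```python
import math


def split_jobs(total_events, events_per_job):
    """Return (first_event, n_events) per job, with first_event counted from 0."""
    n_jobs = math.ceil(total_events / events_per_job)
    return [
        (i * events_per_job, min(events_per_job, total_events - i * events_per_job))
        for i in range(n_jobs)
    ]


jobs = split_jobs(total_events=100, events_per_job=10)
print(len(jobs))  # 10
print(jobs[-1])   # (90, 10)
```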

Finalize analysis: histograms

  • Post-processing: merge the histogram files of the individual jobs using the ROOT tool hadd
cd crab_?_*_*/res
hadd histograms.root *.root
  • Show histograms
root histograms.root
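
Conceptually, merging the per-job files means adding the contents of histograms with the same name bin by bin. The sketch below illustrates that idea on plain lists; it is not how hadd is implemented, and the `add_histograms` helper and the bin contents are made up for the example.

```python
def add_histograms(histograms):
    """Sum bin contents across histograms that share the same binning."""
    n_bins = len(histograms[0])
    if any(len(h) != n_bins for h in histograms):
        raise ValueError("all histograms must have the same number of bins")
    return [sum(bins) for bins in zip(*histograms)]


# Hypothetical bin contents of the same histogram from three jobs.
job_outputs = [
    [0, 2, 5, 1],
    [1, 3, 4, 0],
    [0, 1, 6, 2],
]
merged = add_histograms(job_outputs)
print(merged)  # [1, 6, 15, 3]
```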


Topic revision: r2 - 2007-03-24 - OliverGutsche