4.1 Data Analysis in the Full Framework

Goals of this page

This page provides an up-to-date overview of the role of the CMSSW Framework (also known as the Full Framework) in user analysis.

Introduction

The Full Framework is CMS's main tool for data processing and is therefore closely connected with user analysis in several ways:

  1. it is used to produce data 'upstream' of the user's analysis: in HLT, Reconstruction, and Skimming
  2. it is used in PAT-tuple production in group and user skims
  3. it can also be used for making histograms and plots
  4. it can be used by the user to further adjust the content of the PAT files used in the analysis

The objective of this chapter is to demonstrate the key applications of the Full Framework described in points 3 and 4 above.

Making plots in Full Framework, and the Interactive Analysis

The Full Framework can create and fill histograms, and can manage many histograms produced by a number of EDAnalyzers. There are two ways to use this capability:

  • monitoring and validation of Full Framework jobs
  • interactive use

The CMS Framework is well suited for creating and filling large numbers of histograms that keep track of what is happening in the event processing. Histograms produced this way can be used to check quickly whether the job is working correctly and whether the output file makes sense, without analyzing the output file more thoroughly in FWLite. The same approach can be extended to more detailed validation. In general, the Full Framework is a good tool for producing a large set of plots whose definitions are known in advance.
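The book-once, fill-per-event pattern behind such monitoring histograms can be illustrated with a minimal plain-Python sketch. This is a toy model only: it does not use CMSSW or ROOT, and the names `Histogram1D`, `MonitoringAnalyzer`, `run_job`, and the event layout (a dictionary with a `muon_pt` list) are assumptions made for illustration, not part of the framework.

```python
class Histogram1D:
    """Fixed-width 1D histogram with underflow and overflow counters (toy model)."""

    def __init__(self, name, nbins, lo, hi):
        if nbins <= 0 or hi <= lo:
            raise ValueError("need nbins > 0 and hi > lo")
        self.name, self.nbins, self.lo, self.hi = name, nbins, lo, hi
        self.counts = [0] * nbins
        self.underflow = 0
        self.overflow = 0

    def fill(self, x):
        """Increment the bin containing x, or the under/overflow counter."""
        if x < self.lo:
            self.underflow += 1
        elif x >= self.hi:
            self.overflow += 1
        else:
            width = (self.hi - self.lo) / self.nbins
            index = min(int((x - self.lo) / width), self.nbins - 1)
            self.counts[index] += 1

    def entries(self):
        """Total number of fills, including underflow and overflow."""
        return sum(self.counts) + self.underflow + self.overflow


class MonitoringAnalyzer:
    """Hypothetical stand-in for an analyzer module: book once, fill per event."""

    def __init__(self):
        self.histograms = {}

    def begin_job(self):
        # Book all histograms once, before the event loop starts.
        self.histograms["muon_pt"] = Histogram1D("muon_pt", 10, 0.0, 100.0)
        self.histograms["n_muons"] = Histogram1D("n_muons", 5, 0.0, 5.0)

    def analyze(self, event):
        # Called once per event; here an event is just a dict (an assumption).
        muons = event.get("muon_pt", [])
        self.histograms["n_muons"].fill(len(muons))
        for pt in muons:
            self.histograms["muon_pt"].fill(pt)


def run_job(events):
    """Drive the toy analyzer over a sequence of events and return its histograms."""
    analyzer = MonitoringAnalyzer()
    analyzer.begin_job()
    for event in events:
        analyzer.analyze(event)
    return analyzer.histograms
```

A quick look at the entry counts and the overflow counter of each histogram after `run_job` is the kind of sanity check described above: it shows whether every event was seen and whether the filled quantities fall in the expected range.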

However, this approach is less interactive. In FWLite, one can iterate rapidly through the think-click-plot-think cycle. In cmsRun, the same cycle takes longer (sometimes many times longer, depending on the application), which slows down the progress of the analysis.

That said, if the analysis requires detailed Geometry, Alignment, or other kinds of Calibration Constants (possibly requiring database access), the Full Framework is the only way to obtain them.

Adjusting the content of the user's PAT-tuple

Users will naturally want to control what is in their analysis data sample (e.g., PAT-tuples), as that defines what is possible in the interactive stage of the analysis. In addition, a data sample that is too large slows down the interactive analysis: because of the larger I/O, it takes longer to process the same number of events. So the key is to learn how to

  • add the necessary information (in terms of other ED products, or UserData attached to PAT objects)
  • remove the unnecessary information (in terms of making PAT objects which are 'just right' in size)
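In CMSSW, the content of an output file is usually selected with a list of keep/drop statements given to the output module through its `outputCommands` parameter. The following is a minimal sketch in plain Python of how such a list might be assembled. The helper `build_output_commands` is hypothetical, and the module labels in the usage example are illustrative; the labels that actually apply depend on the user's own configuration.

```python
def build_output_commands(keep_labels, drop_everything_first=True):
    """Build keep/drop statements of the form 'keep *_<label>_*_*'.

    Each pattern has four underscore-separated fields (type, module label,
    instance, process), so a label containing an underscore would be
    ambiguous and is rejected here.
    """
    commands = ["drop *"] if drop_everything_first else []
    for label in keep_labels:
        if not label or "_" in label:
            raise ValueError("invalid module label: {!r}".format(label))
        commands.append("keep *_{}_*_*".format(label))
    return commands


# Usage example with illustrative labels (assumptions, not a recommendation).
commands = build_output_commands(["selectedPatMuons", "selectedPatJets"])
for line in commands:
    print(line)

# In a cmsRun configuration the list would typically be handed to the output
# module, for example as outputCommands = cms.untracked.vstring(*commands).
```

Starting from `drop *` and keeping only the listed products keeps the output small by default, so anything new has to be added deliberately.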

These topics will be covered in detail in the PAT workbook sections below.

-- Main.Altan Cakir - 09 Oct 2017

Topic revision: r11 - 2019-01-29 - StevenClark

