Difference: CutLang (1 vs. 3)

Revision 3 - 2019-07-31 - SezenSekmen

Line: 1 to 1
Changed:
<
<

CutLang: A particle physics analysis description language and runtime interpreter

>
>

CutLang: A HEP analysis description language and runtime interpreter

 
Line: 11 to 11
 

Analysis description language syntax and writing rules

Changed:
<
<
The analysis description language (ADL) in CutLang is a declarative domain-specific language customized to express physics analysis-specific concepts. The current CutLang ADL is based on the [[https://arxiv.org/abs/1605.02684]Les Houches Analysis Description Accord proposal (sections 15 and 16)]].
>
>
The analysis description language (ADL) in CutLang is a declarative domain-specific language customized to express physics analysis-specific concepts. The current CutLang ADL is based on the Les Houches Analysis Description Accord proposal (sections 15 and 16).
  The ADL in particular focuses on event processing operations. Its core includes:
  • simple and composite object definitions (jets, muons, Ws, RPV stops, ...)

Revision 2 - 2019-07-30 - SezenSekmen

Line: 1 to 1
Deleted:
<
<
META TOPICPARENT name="TWiki.WebPreferences"
 

CutLang: A particle physics analysis description language and runtime interpreter

Added:
>
>

Introduction

CutLang is an analysis description language and runtime interpreter for high energy collider physics data analyses. The analysis description language aims to express all elements of a data analysis in an easy and unambiguous way. The runtime interpreter reads and interprets the analysis operations directly and runs the analysis on event data.

CutLang aims to serve as a regular tool for the high energy physics community in general, from experimental analysts and phenomenologists to educators, in areas ranging from analysis design to preservation.

Analysis description language syntax and writing rules

The analysis description language (ADL) in CutLang is a declarative domain-specific language customized to express physics analysis-specific concepts. The current CutLang ADL is based on the [[https://arxiv.org/abs/1605.02684][Les Houches Analysis Description Accord proposal (sections 15 and 16)]].

The ADL in particular focuses on event processing operations. Its core includes:

  • simple and composite object definitions (jets, muons, Ws, RPV stops, ...)
  • object and event variable definitions (isolation variables, transverse mass, aplanarity, angular variables, BDTs, ...)
  • event selection definitions (signal, control, validation, ... regions)
  • some standard visualization operations (such as filling histograms), which the CutLang ADL includes in order to direct its runtime interpreter.

Further operations with selected events (background estimation methods, scale factor derivations, etc.) can vary greatly, and thus may not easily be considered within the ADL scope.

The ADL consists of:

  • a plain text file describing the analysis using a HEP-specific language with syntax rules that include standard mathematical and logical operations and 4-vector algebra.
  • a library of self-contained functions encapsulating variables that are nontrivial to express with the ADL syntax.

The current ADL syntax is capable of describing the majority of generic event processing operations; however, work is in progress to further improve and generalize its scope.

Blocks

The ADL consists of blocks with a keyword-value structure:

blocktype blockname
  # general comment
  keyword1 value1
  keyword2 value2
  keyword3 value3 # comment about value3

Blocks allow a clear separation of analysis components.
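
To make the keyword-value structure concrete, the following Python sketch splits an ADL-like text into blocks. It is not part of CutLang, whose parser is built with Lex & Yacc. The function name parse_blocks, the example block names goodjets and presel, and the statements inside them are illustrative assumptions. Block headers are recognized by the block types listed in this document.

```python
# Illustrative sketch only: CutLang itself parses ADL files with Lex & Yacc.
BLOCK_TYPES = {"object", "obj", "region", "algo", "info", "table"}

def parse_blocks(text):
    """Split ADL-like text into blocks of (keyword, value) statements."""
    blocks = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        first, _, rest = line.partition(" ")
        rest = rest.strip()
        if first in BLOCK_TYPES:
            # "blocktype blockname" opens a new block
            blocks.append({"type": first, "name": rest, "statements": []})
        elif blocks:
            # "keyword value" belongs to the most recent block
            blocks[-1]["statements"].append((first, rest))
        else:
            raise ValueError(f"statement outside any block: {line!r}")
    return blocks

# Hypothetical ADL-like input; names and cuts are made up for illustration.
EXAMPLE = """
object goodjets
  # general comment
  take jets
  select Pt(jets) > 30   # comment about the cut
region presel
  select Size(goodjets) >= 2
"""

blocks = parse_blocks(EXAMPLE)
for block in blocks:
    print(block["type"], block["name"], block["statements"])
```

Each block ends up as a small record, which mirrors the separation of analysis components that the block structure is meant to provide.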

Types of blocks

block name      functionality
object / obj    Object definition block. Produces an object type from an input object type by applying selections.
region / algo   Event selection block.
info            Contains analysis information. NOT implemented yet.
table           Generic block for tabular information. NOT implemented yet.

Keywords

Types of keywords:

keyword name       functionality
define             Define variables and constants.
select             Select objects or events based on criteria that follow the keyword.
reject             Reject objects or events based on criteria that follow the keyword.
take / using / :   Define the mother object type in an object block.
sort               Sort an object in ascending or descending order with respect to a property.
weight             Weight the events with a (currently constant) number in the region block.
histo              Fill histograms.
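
One way to read the select, reject and weight keywords in a region block is as an ordered sequence of cuts. The Python sketch below illustrates that reading. It assumes that cuts are applied in the order written and that a constant weight scales every count. The function run_region, the event fields and the cut values are hypothetical and are not CutLang code.

```python
def run_region(events, steps, weight=1.0):
    """Apply select/reject steps in order and record a weighted cutflow."""
    passed = list(events)
    cutflow = [("all events", weight * len(passed))]
    for keyword, label, criterion in steps:
        if keyword == "select":
            passed = [ev for ev in passed if criterion(ev)]
        elif keyword == "reject":
            passed = [ev for ev in passed if not criterion(ev)]
        else:
            raise ValueError(f"unsupported keyword: {keyword}")
        cutflow.append((label, weight * len(passed)))
    return passed, cutflow

# Made-up events: number of jets, number of muons, missing transverse energy.
events = [
    {"njets": 3, "nmuons": 0, "met": 250.0},
    {"njets": 1, "nmuons": 0, "met": 300.0},
    {"njets": 4, "nmuons": 1, "met": 220.0},
    {"njets": 2, "nmuons": 0, "met": 90.0},
]

steps = [
    ("select", "njets >= 2", lambda ev: ev["njets"] >= 2),
    ("reject", "nmuons > 0", lambda ev: ev["nmuons"] > 0),
    ("select", "met > 200", lambda ev: ev["met"] > 200.0),
]

selected, cutflow = run_region(events, steps, weight=0.5)
for label, count in cutflow:
    print(f"{label:12s} {count}")
```

Here reject is simply the logical complement of select, which matches the way the two keywords are described above.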

Operators

Type                              Operators
Comparison operators              >, <, =>, =<, ==, [] (include), ][ (exclude)
Mathematical operators            +, -, *, /, ^
Logical operators                 AND / &&, OR / ||
Ternary operators                 condition ? true-case : false-case
Optimization operators            ~= (closest to), != (furthest from) (optimal particle sets are assigned negative indices)
Lorentz vector addition operator  LV1 + LV2
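
The range and optimization operators can be paraphrased in Python as follows. This is a sketch of the intended meaning, not CutLang's implementation: whether the range boundaries are inclusive is an assumption, and the helper names and the dijet masses are made up for illustration.

```python
def include(x, low, high):
    """Paraphrase of '[]': x lies inside the range (boundaries assumed inclusive)."""
    return low <= x <= high

def exclude(x, low, high):
    """Paraphrase of '][': x lies outside the range."""
    return not include(x, low, high)

def closest_to(candidates, value_of, target):
    """Paraphrase of 'closest to': the candidate whose value is nearest the target."""
    return min(candidates, key=lambda c: abs(value_of(c) - target))

def furthest_from(candidates, value_of, target):
    """Paraphrase of 'furthest from': the candidate whose value is farthest away."""
    return max(candidates, key=lambda c: abs(value_of(c) - target))

# Made-up invariant masses (GeV) for three jet pairs, keyed by jet indices.
dijet_mass = {(0, 1): 95.2, (0, 2): 78.9, (1, 2): 120.4}

best_pair = closest_to(dijet_mass, dijet_mass.get, target=80.4)
worst_pair = furthest_from(dijet_mass, dijet_mass.get, target=80.4)
print(best_pair, include(dijet_mass[best_pair], 70.0, 90.0))
print(worst_pair, exclude(dijet_mass[worst_pair], 70.0, 90.0))
```

The sketch does not model the negative indices that CutLang assigns to optimal particle sets.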

Functions

Default functions

Some generic, widely used functions are included in the ADL and are understood by the CutLang interpreter.

Type                     Functions
Mathematical functions   abs(), sin(), cos(), tan(), log(), sqrt()
Reducers                 Size() (under consideration: sum(), min(), max(), any(), all(), ...)
HEP-specific functions   dR(), dphi(), m()
Object attributes        CutLang syntax treats all object attributes as functions, e.g. Pt(jets[0])
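
For readers unfamiliar with the HEP-specific functions, the Python sketch below spells out the standard definitions usually meant by dR(), dphi() and m(). The function signatures, the sign convention of dphi and the helper four_vector are assumptions for illustration; CutLang's own versions are based on ROOT classes and may differ in detail.

```python
import math

def dphi(phi1, phi2):
    """Azimuthal angle difference, wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    return d - 2.0 * math.pi if d > math.pi else d

def dR(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2)."""
    return math.hypot(eta1 - eta2, dphi(phi1, phi2))

def four_vector(pt, eta, phi, mass=0.0):
    """(E, px, py, pz) from transverse momentum, pseudorapidity, azimuth and mass."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def m(*vectors):
    """Invariant mass of the sum of the given four-vectors."""
    energy, px, py, pz = (sum(c) for c in zip(*vectors))
    m2 = energy ** 2 - px ** 2 - py ** 2 - pz ** 2
    return math.sqrt(max(m2, 0.0))  # clamp tiny negative rounding errors

# Two massless, back-to-back objects with 50 GeV of transverse momentum each.
jet1 = four_vector(50.0, 0.0, 0.0)
jet2 = four_vector(50.0, 0.0, math.pi)
print(dR(0.0, 0.0, 0.3, 0.4), dphi(3.0, -3.0), m(jet1, jet2))
```

Summing the four-vectors component by component inside m() plays the role of the Lorentz vector addition operator LV1 + LV2.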

External / user functions

Variables that cannot be expressed using the available operators or standard functions are encapsulated in self-contained functions that are called from the ADL file:

  • Variables with non-trivial algorithms: MT2, aplanarity, razor variables, ...
  • Non-analytic variables: object/trigger efficiencies, variables computed with MVAs, ...

IMPORTANT: External functions are intended only for expressing complex object or event variables. Writing any simpler analysis operation that can already be expressed with the ADL syntax (e.g., an event selection sequence) as an external function defeats the purpose of the ADL.

The external functions are written in a general-purpose language (C++ in the current CutLang implementation). They are compiled with the interpreter once and are recognized afterwards.
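
CutLang's external functions are written in C++, as stated above. Purely to illustrate what a self-contained variable function looks like, here is a Python sketch of the razor variables M_R and R, using the standard razor definitions for two megajets. The function name, the argument layout and the input values are assumptions and do not reflect CutLang's actual interface.

```python
import math

def razor(j1, j2, met):
    """Razor variables (M_R, R) for two megajets and the missing transverse momentum.

    j1, j2: (px, py, pz) momenta of the megajets, treated as massless.
    met:    (px, py) components of the missing transverse momentum.
    """
    p1 = math.sqrt(sum(c * c for c in j1))
    p2 = math.sqrt(sum(c * c for c in j2))
    m_r = math.sqrt(max((p1 + p2) ** 2 - (j1[2] + j2[2]) ** 2, 0.0))

    pt1 = math.hypot(j1[0], j1[1])
    pt2 = math.hypot(j2[0], j2[1])
    met_mag = math.hypot(met[0], met[1])
    met_dot_jets = met[0] * (j1[0] + j2[0]) + met[1] * (j1[1] + j2[1])
    m_tr = math.sqrt(max(met_mag * (pt1 + pt2) - met_dot_jets, 0.0) / 2.0)

    return m_r, (m_tr / m_r if m_r > 0.0 else 0.0)

# Made-up inputs in GeV: two perpendicular megajets recoiling against the MET.
m_r, r = razor((100.0, 0.0, 0.0), (0.0, 100.0, 0.0), (-100.0, -100.0))
print(m_r, r)
```

A function like this takes only plain kinematic inputs and returns numbers, which is the sense in which such variables are self-contained.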

Analysis description language writing rules

  • Multiple blocks of each type can be written.
  • Multiple derivations from the same initial object type can be made (e.g. jets --> cleanjets --> verycleanjets).
  • Using an object region block, Size(object) >= 0
  • ...
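
The jets --> cleanjets --> verycleanjets chain mentioned above can be pictured as successive filtering, where each derived collection takes the previous one as its mother type. The Python sketch below is only an analogy. The function derive, the attribute names and the cut values are hypothetical.

```python
def derive(mother, *criteria):
    """Return the objects of the mother collection that pass every criterion."""
    return [obj for obj in mother if all(cut(obj) for cut in criteria)]

# Made-up jets with transverse momentum (GeV) and pseudorapidity.
jets = [
    {"pt": 55.0, "eta": 0.3},
    {"pt": 25.0, "eta": 1.1},
    {"pt": 40.0, "eta": 3.0},
    {"pt": 80.0, "eta": -1.9},
]

cleanjets = derive(jets, lambda j: j["pt"] > 30.0)
verycleanjets = derive(cleanjets, lambda j: abs(j["eta"]) < 2.4)
forwardjets = derive(jets, lambda j: abs(j["eta"]) > 2.4)  # second derivation from jets

print(len(jets), len(cleanjets), len(verycleanjets), len(forwardjets))
```

The input collection is never modified, so several derived collections can coexist and be used side by side.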

Runtime interpreter and framework

CutLang runtime interpreter:

  • No compilation. Runs directly on the ADL file.
  • Written in C++; works in any modern Unix environment.
  • Based on ROOT classes for Lorentz vector operations and histograms.
  • ADL parsing by Lex & Yacc: relies on automatically generated dictionaries and grammar.

CutLang framework: CutLang interpreter + tools and facilities

  • Reads events from ROOT files in multiple input formats, such as Delphes, ATLAS & CMS open data, LVL0, CMSnanoAOD and FCC. More can be easily added.
  • All event types are converted into predefined particle object types.
  • Includes many internal functions.
  • Output in ROOT files: analysis algorithms, cutflows and histograms for each region are stored in a separate directory.

Download

The CutLang interpreter and several ADL examples can be downloaded, and running instructions can be obtained, from here.

CutLang also has a binder interface.

Documentation

  • S. Sekmen, G. Unel, "CutLang: A Particle Physics Analysis Description Language and Runtime Interpreter", Comput.Phys.Commun. 233 (2018) 215-236, arXiv:1801.05727

-- SezenSekmen - 2019-07-30

Revision 1 - 2019-07-30 - SezenSekmen

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="TWiki.WebPreferences"

CutLang: A particle physics analysis description language and runtime interpreter

-- SezenSekmen - 2019-07-30

     