ADL: An Analysis Description Language for particle physics data analysis

Analysis description language syntax and writing rules

Analysis Description Language (ADL) is a domain-specific, declarative language that describes the physics content of a collider analysis in a standard and unambiguous way.

  • Domain specific: Customized to express analysis-specific concepts. Reflects conceptual reasoning of particle physicists.
  • Declarative: Tells what to do but not how to do it.
  • Easy to read: Clear, self-describing syntax rules.
  • Designed for everyone: experimentalists, phenomenologists, students, interested public…

ADL web page: cern.ch/adl.

ADL is a language, independent of software frameworks. Any framework recognizing ADL can run analyses written in ADL.

  • Focus directly on physics, not on programming.
  • Communicate analyses easily between groups: experimentalists, phenomenologists, students, and the public.

ADL syntax is based on two earlier languages: the Les Houches Analysis Description Accord proposal (sections 15 and 16) and CutLang.

ADL in particular focuses on event processing operations. Its core includes:

  • simple and composite object definitions (e.g. jets, muons, W bosons, RPV stops, …)
  • object and event variable definitions (isolation variables, transverse mass, aplanarity, angular variables, BDTs, ...)
  • event selection definitions (signal, control, validation, ... regions)
  • some standard visualization operations, used to direct the runtime interpreter

Further operations with selected events (background estimation methods, scale factor derivations, etc.) can vary greatly, and thus may not easily be considered within the ADL scope.

ADL consists of:

  • a plain text ADL file describing the analysis algorithm using an easy-to-read DSL with clear syntax rules.
  • a library of self-contained functions encapsulating variables that are non-trivial to express with the ADL syntax (e.g. MT2, ML algorithms). These functions can be internal or external (user-provided).

The current ADL syntax can describe the majority of generic event processing operations; work is in progress to further improve and generalize its scope.

Blocks

ADL consists of blocks with a keyword-expression structure:

blocktype blockname
  # general comment
  keyword1 expression1
  keyword2 expression2
  keyword3 expression3 # comment about value3

Blocks allow a clear separation of analysis components.
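
The block layout above can be read mechanically. The following is a minimal Python sketch, not the CutLang parser. It assumes that a block header is an unindented two-word line, that every indented line is one keyword followed by its expression, and that `#` always starts a comment (so a `#` inside a quoted title would be mishandled). The function name `parse_blocks` and the sample text are illustrative.

```python
def parse_blocks(text):
    """Parse ADL-like text into a list of (blocktype, blockname, statements)."""
    blocks = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue  # blank line or comment-only line
        if not raw[0].isspace():
            # Unindented line: assumed to be "blocktype blockname".
            blocktype, blockname = line.split(None, 1)
            blocks.append((blocktype, blockname.strip(), []))
        else:
            # Indented line: first word is the keyword, the rest the expression.
            if not blocks:
                raise ValueError("statement outside of a block: " + raw)
            parts = line.split(None, 1)
            keyword = parts[0]
            expression = parts[1] if len(parts) > 1 else ""
            blocks[-1][2].append((keyword, expression))
    return blocks


SAMPLE = """\
object goodJets
  # jets passing a pT threshold
  take Jet
  select pT(Jet) > 30

region baseline
  select pT(goodJets[0]) > 200 # leading jet
"""

blocks = parse_blocks(SAMPLE)
for blocktype, blockname, statements in blocks:
    print(blocktype, blockname, statements)
```

Keeping each block as a flat list of (keyword, expression) pairs preserves the order of the statements, which matters for a sequence of selections.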

Blocks used in core analysis algorithm description:

| Block name | Functionality | Related keywords |
| object / obj | Object definition block. Produces an object type from an input object type by applying selections. | take, select, reject |
| region | Event categorization. | select, reject, weight, bin, sort, counts, histo, save |
| info | Contains analysis information such as the experiment, center-of-mass energy, luminosity, publication details, etc. | |
| table | Generic block for tabular information, such as efficiency values versus variable ranges. | tabletype, nvars, errors |

Blocks used for expressing analysis results, counts, etc.:

| Block name | Functionality | Related keywords |
| countsformat | Expresses the processes for which external counts are included, and the format of the counts. | process |

Keywords

Keywords used in core analysis algorithm description:

| Keyword name | Functionality | Usage | Usage example | Related block |
| select | Select objects or events based on the criteria that follow the keyword. | select [condition(s)] | select pT(Jet[0]) > 200 (in region); select pT(Jet) > 30 (in object) | object, region |
| reject | Reject objects or events based on the criteria that follow the keyword. | reject [condition(s)] | reject pT(Jet[0]) < 200 (in region); reject pT(Jet) < 30 (in object) | object, region |
| bin | Describe search bins: a single bin on each line. | bin [condition(s)] | bin HT [] 500 800 and MET [] 300 500 | region |
| bins | Describe search bins: multiple bins on a single variable. | bins [variable] [bin boundaries] | bins HT 500 800 1000 | region |
| define | Define variables and constants. | define [name] = [expression] | define W = Jet[0] + Jet[1] | -- |
| take / : | Define the mother object type (defined in the input file or in another object block). | take [object from which the new object is derived] | take AK4jets | object |
| sort | Sort an object collection in ascending or descending order with respect to a property. | sort [attribute]([object]) ascend / descend | sort m(Jet) descend | region |
| weight | Weight events. | weight [weightname] [weight value or function] | weight HLTSingleJet 0.95 | region |
| tabletype | Specifies the type of the table (e.g. efficiency, scale factor, ...). | tabletype [table type] | tabletype efficiency | table |
| nvars | Number of variables in a table. | nvars [number of variables] | nvars 2 | table |
| errors | Whether errors are indicated in the table (currently only True or False). | errors True or False | errors True | table |
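
The difference between `bin` (one bin per line, possibly combining several conditions) and `bins` (several bins on one variable) can be sketched in Python as follows. This is an illustration of the intended meaning, not interpreter code. It assumes that bins built from a boundary list are half-open, that values outside the listed boundaries fall in no bin, and that the `[]` ranges include their endpoints; none of these conventions is stated on this page. The helper names `bin_index` and `in_bin` are hypothetical.

```python
from bisect import bisect_right


def bin_index(value, boundaries):
    """Index of the bin [b_i, b_{i+1}) containing value, or None if outside.

    Assumption: no underflow or overflow bin is kept.
    """
    if value < boundaries[0] or value >= boundaries[-1]:
        return None
    return bisect_right(boundaries, value) - 1


def in_bin(event, conditions):
    """True when every (variable, low, high) range holds for the event.

    Assumption: range endpoints are included.
    """
    return all(low <= event[var] <= high for var, low, high in conditions)


# Hypothetical event variables in GeV.
event = {"HT": 650.0, "MET": 320.0}

# bins HT 500 800 1000
ht_bin = bin_index(event["HT"], [500, 800, 1000])

# bin HT [] 500 800 and MET [] 300 500
in_search_bin = in_bin(event, [("HT", 500, 800), ("MET", 300, 500)])

print(ht_bin, in_search_bin)
```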

Keywords used in analysis visualization and output control:

| Keyword name | Functionality | Usage | Usage example | Related block |
| histo | Fill 1- or 2-dimensional histograms. | histo [hname], "[htitle]", [nbins], [min], [max], [varname] | histo hjet1pt, "jet 1 pT (GeV)", 40, 0, 1000, pT(jets[0]) | region, object |
| save | Save per-event variables or whole events. Useful for making reduced ntuples or inputs to ML analyses. | save [filename] [format] [variables] | save DNNvars jetHT bjetHT MET dR(jet[0], jet[1]) | region |
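
As an illustration of what the `histo` arguments mean, here is a minimal Python sketch of a fixed-width 1D histogram defined by `[nbins]`, `[min]` and `[max]`. It is not how any ADL interpreter stores histograms; the dictionary layout, the underflow and overflow handling, and the helper names `make_histogram` and `fill` are assumptions made for the example, and the pT values are invented.

```python
def make_histogram(nbins, xmin, xmax):
    """Create an empty fixed-width histogram with underflow/overflow counters."""
    return {"nbins": nbins, "xmin": xmin, "xmax": xmax,
            "counts": [0.0] * nbins, "underflow": 0.0, "overflow": 0.0}


def fill(hist, value, weight=1.0):
    """Add one weighted entry to the bin that contains value."""
    if value < hist["xmin"]:
        hist["underflow"] += weight
    elif value >= hist["xmax"]:
        hist["overflow"] += weight
    else:
        width = (hist["xmax"] - hist["xmin"]) / hist["nbins"]
        index = int((value - hist["xmin"]) / width)
        index = min(index, hist["nbins"] - 1)  # guard against float rounding
        hist["counts"][index] += weight


# Mirrors: histo hjet1pt, "jet 1 pT (GeV)", 40, 0, 1000, pT(jets[0])
hjet1pt = make_histogram(40, 0, 1000)
for leading_jet_pt in [12.0, 24.9, 25.0, 310.0, 999.9, 1500.0]:
    fill(hjet1pt, leading_jet_pt)

print(hjet1pt["counts"][:2], hjet1pt["overflow"])
```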

Keywords used for expressing analysis results, counts and errors from external sources, e.g. papers (BG estimations, signal cutflows, etc.):

| Keyword name | Functionality | Usage | Usage example | Related block |
| process | Specify a process and the format in which its external counts are given. | process [name], "[title]", [errors] | process est, "Total estimated BG", stat, sys; process obs, "Observed data" | countsformat |
| counts | Give counts and errors in the format defined in the countsformat block, for all processes declared there with the process keyword. | counts [values and errors of process 1] , [values and errors of process 2] , ... | counts results 50.7 + 6.7 - 5.0 +- 5.1 , 50 | region |
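
One possible reading of the `counts` example can be sketched in Python. The interpretation is an assumption drawn from the example alone: the first word after `counts` is taken to be the name of the countsformat block, `+ a - b` is read as an asymmetric error, `+- c` as a symmetric one, and the comma-separated entries are matched in order to the declared processes. The function names are hypothetical, and negative central values are not handled.

```python
import re

# Tokens: the two-character "+-" must be tried before the single "+" and "-".
TOKEN = re.compile(r"\+-|\+|-|\d+(?:\.\d+)?")


def parse_entry(entry):
    """Parse 'value [+ up - down] [+- sym] ...' into (value, [(up, down), ...])."""
    tokens = TOKEN.findall(entry)
    value = float(tokens[0])
    errors, i = [], 1
    while i < len(tokens):
        if tokens[i] == "+-":
            sym = float(tokens[i + 1])
            errors.append((sym, sym))
            i += 2
        elif tokens[i] == "+" and i + 3 < len(tokens) and tokens[i + 2] == "-":
            errors.append((float(tokens[i + 1]), float(tokens[i + 3])))
            i += 4
        else:
            raise ValueError("unrecognized error term in: " + entry)
    return value, errors


def parse_counts(line, process_names):
    """Split a counts line into its format name and one entry per process."""
    keyword, format_name, rest = line.split(None, 2)
    if keyword != "counts":
        raise ValueError("not a counts line: " + line)
    entries = [parse_entry(e) for e in rest.split(",")]
    return format_name, dict(zip(process_names, entries))


format_name, counts = parse_counts(
    "counts results 50.7 + 6.7 - 5.0 +- 5.1 , 50", ["est", "obs"])
print(format_name, counts)
```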

Keywords providing generic and publication information about the analysis:

| Keyword name | Functionality | Related block |
| title, experiment, sqrtS, lumi | Generic information about the analysis | info |
| id, publication, arXiv, hepdata, doi | Publication information | info |

Operators

  • Comparison operators: >, <, =>, =<, ==, [] (include), ][ (exclude)
  • Mathematical operators: +, -, *, /, ^
  • Logical operators: AND / &&, OR / ||
  • Ternary operator: condition ? true-case : false-case
  • Optimization operators: ~= (closest to), != (furthest from). Optimal particle sets are assigned negative indices.
  • Lorentz vector addition operator: LV1 + LV2
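
The less familiar operators can be paraphrased in Python. The sketch below covers the range operators `[]` and `][`, the ternary operator, and the "closest to" operator `~=`. It illustrates the intended meaning only: whether range endpoints are included is an assumption, the helper names are hypothetical, and the dijet masses are invented numbers.

```python
def include(x, low, high):
    """x [] low high: True when x lies inside the range (endpoints assumed included)."""
    return low <= x <= high


def exclude(x, low, high):
    """x ][ low high: True when x lies outside the range."""
    return not include(x, low, high)


def closest_to(candidates, value_of, target):
    """~= : pick the candidate whose value is closest to the target."""
    return min(candidates, key=lambda c: abs(value_of(c) - target))


# Hypothetical dijet masses (GeV), keyed by the pair of jet indices.
dijet_mass = {(0, 1): 95.2, (0, 2): 79.1, (1, 2): 130.4}

# Choose the jet pair whose mass is closest to 80.4 GeV.
best_pair = closest_to(dijet_mass, dijet_mass.get, 80.4)

# condition ? true-case : false-case
leading_pt = 250.0
event_weight = 0.95 if leading_pt > 200 else 1.0

print(include(650, 500, 800), exclude(650, 500, 800), best_pair, event_weight)
```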

Functions

Default functions

Some generic, widely used functions are included in ADL and are understood by the CutLang interpreter.

  • Mathematical functions: abs(), sin(), cos(), tan(), log(), sqrt()
  • Reducers: Size() (under consideration: sum(), min(), max(), any(), all(), ...)
  • HEP-specific functions: dR(), dphi(), m()
  • Object attributes: CutLang syntax treats all object attributes as functions, e.g. Pt(jets[0])
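
For readers less familiar with the HEP-specific functions, the following Python sketch gives the usual textbook definitions of an azimuthal separation, an angular distance and an invariant mass, together with Lorentz vector addition. It is not taken from the CutLang source. The argument layout, the helper names `four_vector` and `add`, and the convention that `dphi` returns a value in [0, π] are assumptions.

```python
import math


def dphi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi] (assumed convention)."""
    d = abs(phi1 - phi2) % (2 * math.pi)
    return 2 * math.pi - d if d > math.pi else d


def dR(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, dphi(phi1, phi2))


def four_vector(pt, eta, phi, mass):
    """Build (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def add(v1, v2):
    """Lorentz vector addition, component by component (LV1 + LV2)."""
    return tuple(a + b for a, b in zip(v1, v2))


def m(v):
    """Invariant mass; small negative m^2 from rounding is clipped to zero."""
    energy, px, py, pz = v
    return math.sqrt(max(energy * energy - px * px - py * py - pz * pz, 0.0))


# Two massless, back-to-back 50 GeV jets, combined as in: define W = Jet[0] + Jet[1]
jet0 = four_vector(50.0, 0.0, 0.0, 0.0)
jet1 = four_vector(50.0, 0.0, math.pi, 0.0)
w_candidate = add(jet0, jet1)
print(m(w_candidate), dR(0.0, 0.0, 0.3, 0.4))
```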

External / user functions

Variables that cannot be expressed using the available operators or standard functions are encapsulated in self-contained functions that are called from the ADL file:

  • Variables with non-trivial algorithms: MT2, aplanarity, razor variables, …
  • Non-analytic variables: Object/trigger efficiencies, variables computed with MVAs, …

IMPORTANT: External functions are limited to expressing complex object or event variables. Writing a simpler analysis operation that can already be expressed with the ADL syntax (e.g., an event selection sequence) as an external function goes against the purpose of ADL.

External functions are written in a general-purpose language (C++ in the current CutLang implementation). They are compiled with the interpreter once and are recognized afterwards.
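
To show what "self-contained" means in practice, here is a sketch of the razor variables written as standalone functions. It is in Python for brevity, whereas CutLang external functions are C++, and it is not the CutLang implementation. It uses the commonly quoted definitions of M_R and M_T^R for two megajets treated as massless; the megajet clustering step is omitted, and the function names and argument layout are assumptions.

```python
import math


def razor_mr(j1, j2):
    """M_R from two megajet momenta, each given as (px, py, pz) in GeV."""
    p1 = math.sqrt(sum(c * c for c in j1))
    p2 = math.sqrt(sum(c * c for c in j2))
    pz_sum = j1[2] + j2[2]
    return math.sqrt(max((p1 + p2) ** 2 - pz_sum ** 2, 0.0))


def razor_mtr(j1, j2, met):
    """M_T^R from two megajets and the missing transverse momentum (mex, mey)."""
    met_mag = math.hypot(met[0], met[1])
    pt_scalar_sum = math.hypot(j1[0], j1[1]) + math.hypot(j2[0], j2[1])
    dot = met[0] * (j1[0] + j2[0]) + met[1] * (j1[1] + j2[1])
    return math.sqrt(max((met_mag * pt_scalar_sum - dot) / 2.0, 0.0))


def razor_r2(j1, j2, met):
    """R^2 = (M_T^R / M_R)^2, defined as 0 when M_R vanishes."""
    mr = razor_mr(j1, j2)
    return (razor_mtr(j1, j2, met) / mr) ** 2 if mr > 0 else 0.0


# Invented kinematics: two parallel 100 GeV megajets recoiling against the MET.
megajet1 = (100.0, 0.0, 0.0)
megajet2 = (100.0, 0.0, 0.0)
met = (-200.0, 0.0)
print(razor_mr(megajet1, megajet2), razor_mtr(megajet1, megajet2, met))
```

Each function depends only on its arguments, so it can be compiled once and then called by name from an analysis description.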

ADL writing rules

  • Multiple blocks of each type can be written.
  • Multiple derivations from the same initial object type can be made (e.g. jets --> cleanjets --> verycleanjets)
  • ...
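
The derivation chain mentioned above (jets --> cleanjets --> verycleanjets) amounts to repeated filtering, where each new object type takes an earlier one as its input and leaves it unchanged. A minimal Python sketch follows. The helper `derive`, the jet records and the eta cut are hypothetical; only the pT > 30 threshold echoes an example from this page.

```python
def derive(parent, select=(), reject=()):
    """Build a new object collection from `parent` (the `take` source).

    An object is kept when it passes every `select` cut and no `reject` cut.
    """
    return [obj for obj in parent
            if all(cut(obj) for cut in select)
            and not any(cut(obj) for cut in reject)]


# Invented jets, already ordered by decreasing pT (GeV).
jets = [
    {"pt": 250.0, "eta": 0.4},
    {"pt": 80.0, "eta": -2.0},
    {"pt": 45.0, "eta": 2.9},
    {"pt": 28.0, "eta": -1.1},
]

# object cleanjets:      take jets,      select pT > 30
cleanjets = derive(jets, select=[lambda j: j["pt"] > 30])

# object verycleanjets:  take cleanjets, reject |eta| > 2.4  (hypothetical cut)
verycleanjets = derive(cleanjets, reject=[lambda j: abs(j["eta"]) > 2.4])

print(len(jets), len(cleanjets), len(verycleanjets))
```

Because every derivation returns a new list, several object types can be derived from the same parent without interfering with one another.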

Making ADL executable

ADL can be rendered executable by any framework that can parse and understand its syntax.

-- SezenSekmen - 2020-06-07

Topic revision: r5 - 2022-11-19 - SezenSekmen