AyanaHollowaySandbox
Introduction
This page introduces users to the Streaming Test datasets. These datasets are a first attempt at producing Monte Carlo datasets that resemble real detector data in the inclusive or exclusive data streaming scenarios, and at creating the bookkeeping tools to interact with them. By using the streaming test datasets, you can provide feedback on the two streaming scenarios and on the prototype tools.
Before using these pages, you should probably have answers in mind for the following questions:
- What trigger(s) will collect the events for my analysis?
- What processes (signal or background) am I trying to understand with this data?
From these pages you can then learn
- What datasets contain the events passing my required trigger(s), and how can I get them?
- Will the streaming test data give me a reliable (signal or background) estimate for my process?
You will also be able to find
- How to check the status and location of streamed and mixed datasets, and who is responsible for providing these.
- (If and) where the RunLumiDB (luminosity and conditions database prototype) is available, and how to use it
Overview
What are raw data streams?
The concept of data streams is discussed in DataStreaming. In short, they are datasets of raw data categorized by physics content (trigger signature). For the purpose of the streaming test, there are five categories of physics triggers:
- Electron
- Muon
- Tau and missing energy
- Photon
- Jet and SumET
For the streaming test, events from csc production are written to different datasets according to their trigger signatures.
What are the streaming datasets?
There are 110 streaming datasets in the Streaming Test. These roughly represent the output of 10 half-hour runs in early data-taking. Events from different runs belong to different datasets. For each run, the events are also separated into 11 streams (five inclusive and six exclusive), which are described below.
Inclusive and Exclusive (sets of) datasets
For each run in the Streaming Test, the same data is available in two classification schemes: inclusive stream datasets and exclusive stream datasets. The inclusive stream set-of-datasets is made up of five datasets for each run, corresponding to the five trigger categories. Events that pass more than one trigger type are written to each of the corresponding datasets. The exclusive stream set-of-datasets is made up of six datasets for each run: one per trigger category, plus a special dataset called the overlap dataset, which receives the events that pass more than one trigger type.
ATLAS will use either the inclusive stream or exclusive stream scenario. Users of streaming test data are invited to try both scenarios and give feedback.
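The difference between the two schemes can be sketched in a few lines of standalone C++. This is only an illustration of the routing rule described above, not the code used to produce the datasets: the `Category` enum and the function names are hypothetical, and the stream labels (`inclMuo`, `exclMuo`, `overlap`) are taken from the dataset name patterns on this page.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// The five trigger categories of the streaming test (hypothetical enum).
enum class Category { Jet, Ele, Muo, Pho, Tau };

// Short tag as it appears in the dataset names, e.g. "Muo".
inline std::string tag(Category c) {
    switch (c) {
        case Category::Jet: return "Jet";
        case Category::Ele: return "Ele";
        case Category::Muo: return "Muo";
        case Category::Pho: return "Pho";
        case Category::Tau: return "Tau";
    }
    return "";  // not reached for valid enum values
}

// Inclusive scheme: the event is copied to every category that fired.
inline std::vector<std::string> inclusiveStreams(const std::set<Category>& fired) {
    std::vector<std::string> out;
    for (Category c : fired) out.push_back("incl" + tag(c));
    assert(out.size() == fired.size());  // one copy per fired category
    return out;
}

// Exclusive scheme: the event is written once; events firing more than
// one category go to the overlap stream.
inline std::vector<std::string> exclusiveStreams(const std::set<Category>& fired) {
    if (fired.empty()) return {};              // no trigger: not streamed
    if (fired.size() > 1) return {"overlap"};  // multi-category event
    return {"excl" + tag(*fired.begin())};
}
```

For an event that fires both an electron and a muon trigger, `inclusiveStreams` returns two destinations (`inclEle`, `inclMuo`), while `exclusiveStreams` returns only `overlap`.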
What was used to make the streaming datasets?
- 3.6M events from existing csc11 RDO files from 11.0.42.X production
- 18 pb-1 of the following SM processes from csc11 event generation:
- Inclusive dijets (pT above 560 GeV)
- Gamma-jet (pT above 70 GeV)
- Drell-Yan electron and muon pair production
- W -> e nu, W -> mu nu, W -> tau nu (leptonic and hadronic tau decays)
- Z -> e e, Z -> mu mu
- Top pair production, single top production
- For other processes, simulating 18 pb-1 was not feasible. A smaller (sample-dependent) integrated luminosity was used for these processes:
- Lower pT dijets
- Lower pT Gamma-jet
Since these processes are mostly collected by prescaled triggers, we were able to correct the number of events from these processes in some streams. That is, if the integrated luminosity generated for the sample was at least 18 pb-1 divided by the trigger prescale, the number of events passed by the prescaled trigger will be correct. However, please read these [[#PreScale][important caveats]] related to sample weighting if you are interested in high cross section processes (as signal or background).
- McEventCollections are removed from the streamed RDO events.
- Trigger decisions from 12.0.3 (patched) LVL1+LVL2 simulation. The triggers are used to
- filter the input, and to
- sort the output into streams.
- The ESD/AOD are being made with 12.0.6.2 production, using geometry ATLAS-DC3-02 and trigger menu STR-01.
- Trigger objects in the reconstructed data (TriggerDecision, L2Result, TrigJet collections, etc.) are from release 12.0.6.2 and TriggerMenu STR-01. Currently, this menu does not include any EventFilter algorithms. This means that objects created by EF are not available in the ESD or AOD files.
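The prescale correction mentioned above for the smaller samples can be written as a short check. This is a sketch of one reading of that condition, not the procedure actually used: it assumes that a sample generated with integrated luminosity L can reproduce the 18 pb-1 yield of a trigger with prescale N whenever L x N >= 18 pb-1, in which case a fraction 18 / (N x L) of the triggered events is kept. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Target integrated luminosity of the streaming test, in pb^-1.
const double kTargetLumiPb = 18.0;

// True if a sample of the given integrated luminosity is large enough for a
// trigger with this prescale to yield the correct number of events.
// (Assumed reading of the condition "at least 18 pb^-1 / prescale".)
inline bool prescaledYieldIsCorrect(double sampleLumiPb, double prescale) {
    assert(sampleLumiPb > 0.0 && prescale >= 1.0);
    return sampleLumiPb * prescale >= kTargetLumiPb;
}

// Fraction of triggered events to keep so that the sample mimics 18 pb^-1
// under the prescale. Capped at 1 when the sample is too small; in that
// case the stream contains too few events from this process.
inline double keepFraction(double sampleLumiPb, double prescale) {
    assert(sampleLumiPb > 0.0 && prescale >= 1.0);
    return std::min(1.0, kTargetLumiPb / (prescale * sampleLumiPb));
}
```

Under these assumptions, a 2 pb-1 sample is sufficient for a trigger prescaled by 9 or more, but not for an unprescaled trigger, which is why the event counts for other triggers (such as lepton signatures) cannot be made correct at the same time.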
How do I get the streaming data?
- The replication status of datasets from the streaming test can be checked [http://panda.atlascomp.org/?mode=listRDOReplications here] for RDO and [http://panda.atlascomp.org/?mode=listAODReplications here] for AOD. If you want to download one or more files from these datasets to test your analysis code, you can use dq2 tools to find them:
- Choose the stream for your studies based on the trigger-to-stream table below. The dataset names corresponding to these streams are:
| Number | Stream | Dataset names |
| 0 | Jet (jet and sum-ET triggers) | streamtest.04???.inclJet.* , streamtest.04???.exclJet.* |
| 1 | Ele | streamtest.04???.inclEle.* , streamtest.04???.exclEle.* |
| 2 | Muo | streamtest.04???.inclMuo.* , streamtest.04???.exclMuo.* |
| 3 | Pho | streamtest.04???.inclPho.* , streamtest.04???.exclPho.* |
| 4 | Tau (tau and missing ET triggers) | streamtest.04???.inclTau.* , streamtest.04???.exclTau.* |
- Find the datasets for your stream from one or more runs. To find all inclusive muon datasets, you might run
[user@host] % dq2_ls streamtest.*.inclMuo.*
Note: to find all exclusive datasets containing muon-triggered events, you must also include the overlap streams:
[user@host] % dq2_ls streamtest.*.exclMuo.* ; dq2_ls streamtest.*.overlap.*
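The rule for choosing wildcard patterns can be captured in a small helper. This is a sketch only: `datasetPatterns` is a hypothetical function, and it merely builds the pattern strings shown in the commands above; it does not call dq2.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build the dq2_ls wildcard patterns covering one stream in one scheme.
// streamTag is one of "Jet", "Ele", "Muo", "Pho", "Tau".
inline std::vector<std::string> datasetPatterns(const std::string& streamTag,
                                                bool exclusive) {
    assert(!streamTag.empty());
    std::vector<std::string> patterns;
    const std::string scheme = exclusive ? "excl" : "incl";
    patterns.push_back("streamtest.*." + scheme + streamTag + ".*");
    // In the exclusive scheme, events that also fired another trigger
    // category live in the overlap datasets, so those must be listed too.
    if (exclusive) patterns.push_back("streamtest.*.overlap.*");
    return patterns;
}
```

Each returned string can be passed to `dq2_ls` as in the examples above; the exclusive case always yields two patterns.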
Details about the triggers used to create streams
The trigger decision used to create the streams is ''not'' stored as a TriggerDecision object, but is available in the event header of the RDO, ESD, and AOD. The TriggerDecision in the ESD/AOD uses a similar menu, STR-01. Some signatures in the table below are not immediately available in STR-01, because of steering restrictions or restrictions on the number of possible LVL1 thresholds.
The following menu was used in the trigger decision. (It is not an official TriggerMenuVersion, so to re-run this trigger sequence some special XML files are needed.) To use exactly this menu, you can copy these files:
and use them in your trigger configuration. You might need the following configuration scripts:
| Signature | LVL1 Requirements | LVL2 Requirements | LVL2 Bit | Stream | STR-01 equivalent? |
| jet25 | | | 0 | 0 (= Jet) | y |
| jet50 | | | 1 | 0 | y |
| jet90 | | | 2 | 0 | y |
| jet170 | | | 3 | 0 | y |
| jet300 | | | 4 | 0 | y |
| - | | | | | |
| jet550 | | | 6 | 0 | y (different threshold at LVL1) |
| 4jet50 | | | 7 | 0 | y |
| 4jet110 | | | 8 | 0 | n |
| sumet1000 | | | 9 | 0 | y |
| sumjet1000 | | | 10 | 0 | y |
| - | | | | | |
| e15i | | | 12 | 1 (= Ele) | n |
| e25i | | | 13 | 1 | y |
| 2e15i | | | 14 | 1 | y |
| e15i&mu10 | | | 15 | 1 | n (see below) |
| - | | | | | |
| mu6 | | | 17 | 2 (= Muo) | y |
| mu10 | | | 18 | 2 | y |
| mu20 | | | 19 | 2 | y |
| 2mu10 | | | 20 | 2 | y |
| - | | | | | |
| g20i | | | 22 | 3 (= Pho) | y |
| g60 | | | 23 | 3 | y |
| 2g20i | | | 24 | 3 | y |
| tau35i | | | 25 | 4 (= Tau) | y |
| tau35i&etmiss45 | | | 26 | 4 | n (see below) |
| jet45&etmiss45 | | | 27 | 4 | n (see below) |
| jet70&etmiss70 | | | 28 | 4 | n (see below) |
| etmiss200 | | | 29 | 4 | y |
| etmiss1000 | | | 30 | 4 | y |
| - | | | | | |
In STR-01, the signatures that include combinations of two kinds of trigger (like tau35i&etmiss45) are not possible in the steering. If you want to see whether these triggers were satisfied in an ESD or AOD file, you can do something like
sg.retrieve(td, "MyTriggerDecision");
// Require both component signatures of the combined trigger
useEvent = (td->isTriggered("tau35i", TriggerDecision::L2) && td->isTriggered("etmiss45", TriggerDecision::L1));
To see which triggers were passed in the streaming/filtering stage (using 12.0.3 offline trigger simulation code), do
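#include "StreamMix/StreamTrigConfig.h"
...
sg.retrieve(eh, "StreamingEventHeader");
TriggerInfo* ti = eh->trigger_info();
// Test the bit of the combined signature in the LVL2 trigger word.
// Note! The use of l2TriggerInfo() may change in future offline releases.
if ((1 << StreamTrigConfig::TAU_35I_MET_45) & ti->l2TriggerInfo()) { /* event passed tau35i&etmiss45 at the streaming stage */ }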
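The same bit test can be generalized to list every signature set in the LVL2 word. The following standalone sketch does not use Athena or StreamMix: the `Signature` struct and both functions are hypothetical, the bit and stream numbers are copied from the signature table on this page, and it assumes (as the test above suggests) that bit n of the 32-bit word is set when the signature with LVL2 bit n passed.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// One row of the signature table: name, LVL2 bit, stream number.
struct Signature { const char* name; int bit; int stream; };

static const Signature kSignatures[] = {
    {"jet25", 0, 0}, {"jet50", 1, 0}, {"jet90", 2, 0}, {"jet170", 3, 0},
    {"jet300", 4, 0}, {"jet550", 6, 0}, {"4jet50", 7, 0}, {"4jet110", 8, 0},
    {"sumet1000", 9, 0}, {"sumjet1000", 10, 0},
    {"e15i", 12, 1}, {"e25i", 13, 1}, {"2e15i", 14, 1}, {"e15i&mu10", 15, 1},
    {"mu6", 17, 2}, {"mu10", 18, 2}, {"mu20", 19, 2}, {"2mu10", 20, 2},
    {"g20i", 22, 3}, {"g60", 23, 3}, {"2g20i", 24, 3},
    {"tau35i", 25, 4}, {"tau35i&etmiss45", 26, 4}, {"jet45&etmiss45", 27, 4},
    {"jet70&etmiss70", 28, 4}, {"etmiss200", 29, 4}, {"etmiss1000", 30, 4},
};

// Names of all signatures whose bit is set in the LVL2 word, in table order.
inline std::vector<std::string> passedSignatures(std::uint32_t l2Word) {
    std::vector<std::string> passed;
    for (const Signature& s : kSignatures) {
        assert(s.bit >= 0 && s.bit < 32);  // must fit in a 32-bit word
        if (l2Word & (std::uint32_t{1} << s.bit)) passed.push_back(s.name);
    }
    return passed;
}

// Stream numbers (0 = Jet ... 4 = Tau) the event belongs to. More than one
// entry means the event sits in the overlap dataset in the exclusive scheme.
inline std::set<int> streamsFor(std::uint32_t l2Word) {
    std::set<int> streams;
    for (const Signature& s : kSignatures) {
        if (l2Word & (std::uint32_t{1} << s.bit)) streams.insert(s.stream);
    }
    return streams;
}
```

For example, a word with bits 18 and 26 set decodes to `mu10` and `tau35i&etmiss45`, which belong to streams 2 and 4.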
What else should I know?
- This page is not finished.
- Because different software releases were used to "stream" the data and to make ESD/AOD/TAG files, there are some small discrepancies in the trigger decisions accessible in the TriggerDecision object and in the EventHeader. For the same reason, there might be events in the streamed datasets which appear not to pass any triggers in the TriggerDecision. Finally, the STR-01 table does not include any prescales, which can cause more apparent discrepancies.
- Combining triggers of different types is not implemented in the TriggerHLTSteering. If you wish to "confirm the trigger decision offline" using the TriggerDecision for this kind of complex trigger, check the corresponding signatures in the TriggerDecision, and require that both signatures are satisfied.
- As noted above, many csc production samples used (especially QCD) are smaller than 18 pb-1. Using prescales, we have endeavored to produce the correct number of events in the final samples which pass triggers dedicated to these processes (such as low-pT jet triggers). However, it was not possible to simultaneously produce the correct number of these events passing other triggers (such as lepton signatures).
--
AyanaHolloway - 17 Jan 2007