L1 Trigger Hardware Validation
Goal of this page
Documents the general strategy for validating the Level-1 trigger hardware using bit-level emulation of the trigger electronics.
Introduction
The validation workflow encompasses the following:
- unpacking of the detector and trigger digis
- execution of the full emulator chain over those data
- (templated) comparison between data and emulator
- automated histogramming through DQM, both online and offline
Emulator
The L1 trigger emulator is a CMSSW software implementation that replicates the trigger hardware logic.
Detector conditions and data unpacking
Data unpacking is performed using the corresponding standard sequence.
The detector conditions are set via the global tag employed.
The HardwareValidation sequence
The sequence implements the full (TPG through GT) emulator chain and the templated comparison.
Emulator sequence
The emulator chain involves every trigger stage, for both muon and calorimeter paths, from trigger primitive generation (TPG), regional reconstruction, all the way to the final global trigger decision bits. Each module takes as input the corresponding hardware-readout data collections. This allows the independent verification of each trigger component.
The emulator configuration must coincide with that employed by the hardware. This synchronization is provided by the o2o (online-to-offline) database mechanism.
Data-Emulator comparison
The common comparison between the hardware readout data and the corresponding emulator response is done in a templated fashion. The outcome of the comparison is added to the event.
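The idea of a single comparison routine shared by all subsystems can be sketched as follows. This is a minimal illustration in plain Python, not the L1Comparator code: the names `ComparisonRecord` and `compare_collections`, the tuple representation of a digi, and the default equality criterion are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, Sequence, TypeVar

T = TypeVar("T")  # any digi/candidate type; the routine is generic in it


@dataclass
class ComparisonRecord(Generic[T]):
    """Hypothetical per-event outcome of one data-emulator comparison."""
    system: str
    matched: List[T]
    data_only: List[T]
    emul_only: List[T]

    @property
    def agrees(self) -> bool:
        # Full agreement: nothing left unmatched on either side.
        return not self.data_only and not self.emul_only


def compare_collections(
    system: str,
    data: Sequence[T],
    emul: Sequence[T],
    same: Callable[[T, T], bool] = lambda d, e: d == e,
) -> ComparisonRecord[T]:
    """Match each data object to at most one emulator object using `same`."""
    remaining = list(emul)
    matched: List[T] = []
    data_only: List[T] = []
    for d in data:
        for i, e in enumerate(remaining):
            if same(d, e):
                matched.append(d)
                del remaining[i]  # an emulator object is matched only once
                break
        else:
            data_only.append(d)
    return ComparisonRecord(system, matched, data_only, remaining)


# Illustrative digis as (bx, rank) tuples; the values are made up.
rec = compare_collections("GCT", [(0, 12), (0, 7)], [(0, 12), (1, 7)])
print(rec.agrees, rec.data_only, rec.emul_only)
```

The equality criterion is passed in as `same`, so a subsystem whose default notion of equality is unsuitable can supply its own without changing the routine.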
Analysis
The final step involves the filling and displaying of analysis histograms. These are implemented using the DQM framework.
Standard plots
The results of the comparison are retrieved, and used to generate a common set of histograms for every subsystem. This is the primary information passed to the shift crew for monitoring.
Subsystem-specific and expert level modules
Further detailed information specific to each trigger component is provided, as appropriate, through the corresponding subsystem DQM modules.
Subsystem experts to add brief description.
Workflows
Automated workflows are implemented through the DQM framework.
Online
The L1TEMU application is run in real time, off the Storage Manager, at P5.
It executes data unpacking, the HardwareValidation sequence, and the dqm-based analyses. It is run as a standalone job deployed by the central dqm teams, over a sampling of the data.
It is intended to give feedback to the shift crew at P5 during the course of the run.
Offline
The offline workflow processes the full statistics of the collected datasets.
The execution of the emulator sequence at this stage provides a detailed analysis of the L1 data, allowing the identification of lower frequency disagreements, and yields a more robust certification of the L1 trigger data.
The HardwareValidation sequence is currently not being executed in the offline (re)processing of the data at Tier-0. Discussions have been ongoing with the Offline teams, and requests have been placed for this to be corrected. Meanwhile, due to lack of input, the offline histograms being produced are empty.
Standalone
Standalone jobs have involved quasi-prompt, CAF-based and offline analyses of global run data, as well as pattern-based tests during commissioning.
Description of monitorables and shift instructions
Histogram description
Online DQM
Offline DQM
gui lore
Testing and Results
Re-run validation job.
Get packages: cvs co L1Trigger/HardwareValidation; cvs co DQM/L1TMonitor.
Execute the test job as cmsRun testEmulMon_cfg.py (or cmsRun l1temulator_dqm_sourceclient-live_cfg.py for online testing) after activating the subsystem of interest therein.
Inspect results: l1demon.root and dump.txt.
(i) Over CRAFT data. (ii) Over first collision data (November 4th, GR09_H_V6::All, e.g. input).
Historic trends
Level of agreement per subsystem for the last many runs: see history
Representative results obtained for the 100 runs prior to 120111:
Summary, GMT, RPC, DTTF, DTTPG, CSCTF, CSCTPG, GCT, RCT, HCAL, ECAL, GT
Agreement level trends per subsystem for last 100 runs: see trend
The trend plots show the agreement level per subsystem as a function of run, where non-zero values indicate increasing disagreement.
Technically, the trend is realized as the evolution of the x-mean of the standard agreement-type histogram for each subsystem.
They may be accessed from the DQM GUI as follows:
- select a subsystem's error flag plot under L1TEMU,
- click on the CMS logo or 'View details',
- in the Strip Chart tab select to display x mean along with the number of previous runs (default 100).
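For orientation, the x-mean used for the trend can be sketched in a few lines of plain Python. The bin encoding below (x = 0 for agreement, higher x for disagreement categories) and the per-run bin contents are assumptions for illustration only; the actual histogram layout is defined in the DQM code.

```python
def x_mean(bin_centers, counts):
    """Weighted mean of the x axis of a 1D histogram; None if it is empty."""
    total = sum(counts)
    if total == 0:
        return None  # empty histogram, e.g. subsystem not in the run
    return sum(x * n for x, n in zip(bin_centers, counts)) / total


# Assumed encoding: x = 0 agree, x = 1, 2 disagreement categories.
centers = [0, 1, 2]

# Made-up bin contents for three hypothetical runs.
per_run_counts = {
    1001: [100, 0, 0],  # full agreement
    1002: [98, 1, 1],   # a few disagreements
    1003: [0, 0, 0],    # no entries
}

trend = {run: x_mean(centers, counts) for run, counts in per_run_counts.items()}
for run, value in sorted(trend.items()):
    print(run, value)
```

With this encoding a run in full agreement sits at exactly zero, and any entries in the disagreement bins pull the mean upwards, which is why non-zero trend values flag a problem.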
Craft 09
Here are representative results obtained online for run 110998:
Summary, GMT, RPC, CSCTF, CSCTPG, DTTF, DTTPG, GCT, RCT, ECAL, HCAL, GT
Note that a grid-enabled browser is needed to access the plots.
First collisions
Summary for golden run 122314
Browsing the emulator dqm results in the central gui
Emulator DQM results are automatically displayed in the DQM online GUI. To view them:
- install grid certificates (one time)
- open the GUI
- click on Run # and select the run number to be inspected
- in Workspace, select L1TEMU for a summary selection
- in Workspace, select Everything/L1TEMU for visualizing all results
- navigate the folder structure from subsystems to desired plots
Status summary
| L1 | trigger subsystem | o2o validated | masked | agreement exp'd | agreement obs'd | threshold warning | threshold error | sign-off shift use | comments |
| Muon | GMT | yes | mip, iso | 1.0 | 1.0 | | | yes | |
| | RPC | yes | | 1.0 | 0.6 | | | not yet | after fw fixes 98% agreement expected |
| | DT TF | yes | | <1.0 | ~0.7 | 0.90 | 0.80 | yes | update threshold online |
| | CSC TF | yes | | 1.0 | ~0.8 | | | not yet | |
| | DT TPG | | | <1.0 | ~0.03 | | | not yet | considerably less emulator than data; global tag expected in 39x |
| | CSC TPG | yes | | 1.0 | ~0.9 | | | not yet | improvements scheduled |
| Calo | GCT | yes | | 1.0 | 1.0 | 0.999 | 0.95 | yes | update threshold online |
| | RCT | yes | | 1.0 | ~0.98 | | | yes | few runs still fluctuate, update threshold online |
| | Hcal TPG | yes | | 1.0 | ~0.3 | | | not yet | more data than emulator (zero-suppression) |
| | Ecal TPG | | | 1.0 | - | | | not yet | more data than emulator non-empty towers |
| Global | GT | yes | | 1.0 | - | | | not yet | comparison criteria to be updated |
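How an observed agreement level would translate into a shift alarm can be sketched as below. The threshold values are those listed in the table for DT TF and GCT; the rule itself (agreement below the warning threshold raises a warning, below the error threshold raises an error) is an assumed reading of the two threshold columns, and the function name `classify` is hypothetical.

```python
# (warning, error) agreement thresholds taken from the status table.
THRESHOLDS = {
    "DT TF": (0.90, 0.80),
    "GCT": (0.999, 0.95),
}


def classify(subsystem, agreement):
    """Map an agreement fraction (0..1) to an alarm level for one subsystem."""
    if subsystem not in THRESHOLDS:
        return "no threshold"  # table lists no thresholds for this subsystem
    warning, error = THRESHOLDS[subsystem]
    if agreement < error:
        return "error"
    if agreement < warning:
        return "warning"
    return "ok"


# Observed agreement levels as quoted in the table.
print(classify("DT TF", 0.7))   # below the 0.80 error threshold
print(classify("GCT", 1.0))     # above the 0.999 warning threshold
print(classify("RPC", 0.6))     # no thresholds listed
```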
historical trend: RPC, DTTF, CSCTF, DT, CSC, GCT, RCT, Hcal, Ecal, All
Development notes
common
cleanup leftover using directives (collected by Jim)
DQM/L1TMonitorClient/interface/L1TdeECALClient.h:21:using namespace std;
DQM/L1TMonitorClient/interface/L1TDTTPGClient.h:21:using namespace std;
DQM/L1TMonitorClient/interface/L1THcalClient.h:36:using namespace std;
DQM/L1TMonitor/interface/L1TdeECAL.h:29:using dedefs::DEnsys;
DQM/L1TMonitor/interface/L1TDEMON.h:27:using dedefs::DEnsys;
L1Trigger/HardwareValidation/interface/L1Comparator.h:37:using dedefs::DEnsys;
L1Trigger/HardwareValidation/plugins/L1DummyProducer.h:40:using dedefs::DEnsys;
L1Trigger/HardwareValidation/plugins/L1EmulBias.h:37:using dedefs::DEnsys;
adapt checks of configuration keys to enable subsystems, needed given change in TSC naming convention (done, tag deployed); related: add summary of subsystem configuration check
changed operator default in struct defining object ordering from true to false; this cured job getting stuck processing csc alct which have no specification for template overloading implemented as expected (10.3.12)
beginJob(void) migration. (complete)
cleanup of spurious dependencies in buildfiles (done)
for use of L1TriggerKey link against CondFormats/L1TObjects instead of CondTools/L1Trigger (Werner) (tbd)
Add run configuration check, TSC key, for disabling alarms/info from excluded subsystems. (done)
Expand gui layouts, one-line plot description in quick collection.
Revise granularity of information to be recorded in run registry for l1 trigger data certification. (tbd)
Verify ascii dump with digi comparison for rpc (rare event unexpected printout). (tbd)
As mentioned in dpg meeting (09.10.26), the following actions have been implemented: (i) masked all subsystems but gmt, gct (until others sign-off), (ii) updated shift instructions accordingly, (iii) gct qt's thresholds set to 99% (error and warning). (done)
Offline workflow. The hardware validation sequence needs to be incorporated: update cmsdriver, update default workflow. Standard emulator dqm code is integrated (producing empty results until emulator sequence is activated). Reminder of this outstanding request sent; Offline and Trigger management made aware. (F'09)
Develop and maintain the hwv sequence in confdb, allowing HLT farm based l1 emulator dqm. Volunteer welcome.
GMT
Another occurrence of data-only, e.g. run 122885 (trend), shows absence of emulated or empty digis; may be related to online db issues. (investigating)
RPC
Need to verify online the expected agreement after firmware modifications; 98% is expected once these fixes are activated (M.Szleper). (S'09)
Spy on a recent run: 117953
DT TF
Need to confirm online expected agreement after improvements in dqm code (J.Troconiz).
DT TPG
Need to investigate the near absence of non-empty emulated objects relative to data.
CSC TF
Issue detected online in 350 when accessing csc tf information in the comparator: extraction of L1MuRegionalCand from CSCCorrelatedLCTDigiCollection (Feb'10). (J.Gartner: this collection can be removed from the check for the time being; to be revised later; detailed checks of data-emul are being performed) (tbd)
Subsystem specific code being integrated. In the process of using this to check standard plots (J.Gartner). (S'09) (tbd)
Restrict readout bx window (-1,0,+1) for comparison (J.Gartner). (done)
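The bx window restriction amounts to filtering both collections before they are compared. A minimal sketch, assuming candidates are represented as dicts with a `bx` field (a representation chosen for the example, not the actual data format):

```python
def in_bx_window(cands, window=(-1, 0, 1)):
    """Keep only candidates whose bunch crossing lies in the readout window."""
    return [c for c in cands if c["bx"] in window]


# Made-up candidates; only bx matters for the selection.
data = [{"bx": -2, "pt": 5}, {"bx": 0, "pt": 12}, {"bx": 1, "pt": 7}]
selected = in_bx_window(data)
print([c["bx"] for c in selected])
```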
CSC TPG
Plan to implement new class containing detid+digis to be passed for comparison (L.Redjimi, S.Valuev). (F'09)
Problem observed in long beam splash run (120) while sorting (very large) csc alct collections; to be fixed; ctp is disabled in the meantime.
Begin processing the 7th record. Run 120020, Event 3066, LumiSection 307 at 08-Nov-2009 15:31:43 CET
...
L1Comparator::process -ing system:6 (CTP), data type 22...
L1Comparator::process debug (size 1131,1113).
DEcompare: creating instance of type: CSC tpa, data size:1131 ncand:1131, emul size:1113 ncand:1113.
L1Comparator::process system:CTP(id 6) type:CSC tpa(22) ndata:1131 nemul:1113 (size 1131,1113).
L1Comparator::process print:
data: CSC ALCT #1: Valid = 1 Quality = 3 Accel. = 0 PatternB = 0 Key wire group = 5 BX = 0
...
GCT
Small disagreements exceptionally observed for run 121993.
Small jet disagreement observed in test enable events. Cause identified as overflow handling; firmware to be fixed (J.Marrouche, J.Brooke, G.Heath).
Plan to extend dqm code to energy sums, rings (A.Tapper). Done.
Warning & error thresholds to be set to 99%; this is what can be expected until the firmware fix (for the errors seen in CRAFT09) is installed (J.Brooke).
RCT
comment (from S.Savin): EEs are still not in time; in the usual config they feed only 5%, but in some runs (e.g. EE-only tests) expect to fail at 100% level
variation of agreement level in run range 121861-122923
Experts not yet comfortable allowing inspection of plots by shifters (S.Savin). (now activated)
Asked experts to check and validate standard plots (eg employ expert code). Revisited several recent runs (with M.Bacthis), standard results appear in agreement with expert plots.
The calorimeter runs different parts and configurations, which change dqm output; prefer to wait until situation stabilizes.
Spy on a recent run: 116734
HCAL TPG
Poor agreement observed (110998) is thought to be due to zero-suppression, which needs implementation for standard plots (P.Tsang). (S'09)
ZS run recipe used offline: allow |emul.et - data.et| < threshold (~12), and shift +/-1 SOI for HB/HE
o2o in place starting with craft run 109354; add: process.HcalTPGCoderULUT.LUTGenerationMode = cms.bool(False)
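The ZS recipe above can be read as a relaxed matching criterion. The sketch below is one possible interpretation in plain Python, not the offline code: it assumes each tower is given as a list of Et values per time sample, takes the threshold of 12 quoted above, and treats "shift +/-1 SOI" as allowing the data sample to sit one sample before or after the emulator sample of interest. The function name `zs_compatible` is hypothetical.

```python
def zs_compatible(data_et, emul_et, soi, threshold=12, max_shift=1):
    """True if emul Et at the SOI matches data Et within `threshold`,
    allowing the data sample to be shifted by up to `max_shift` samples."""
    for shift in range(-max_shift, max_shift + 1):
        idx = soi + shift
        if 0 <= idx < len(data_et) and abs(emul_et[soi] - data_et[idx]) < threshold:
            return True
    return False


# Made-up Et values per time sample for a single tower.
emul = [0, 35, 0, 0]
print(zs_compatible([0, 0, 40, 0], emul, soi=1))  # data one sample late, |35-40| < 12
print(zs_compatible([0, 0, 0, 40], emul, soi=1))  # data two samples late, outside window
```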
ECAL TPG
change this in the online cfg (from Pascal, mar 11); offline needs adjustment
process.load("SimCalorimetry.EcalTrigPrimProducers.ecalTriggerPrimitiveDigis_readDBOffline_cff")
Agreement is observed in private dqm (P.Paganini). Asked experts to verify. (S'09)
o2o is being deployed.
GT
GT-expert dqm module crash reported by online central dqm team, to be followed up by experts (09/12).
Awaiting correct comparison criteria from experts (09/11).
Bit lists display now correctly (Th.Themel).
Negative response for the system follows from use of embedded definition of collection equality in the subsystem's DataFormats.
Expert confirms this 'definition is clearly not appropriate'; will look into it, and provide a compatible one. (V.Ghete).
Investigating standard plots. Expert module being deployed.
Further discussions are now logged in the forum: e-space
Responsible: NunoLeonardo
Created 13 Oct 2009