Comments on the TurboSP paper

Version 1.1

Sent on 6th September 2018.

Agnieszka Dziurda

Comments without replies have been addressed.


General/Major

- The previous paper used only "Turbo Stream"; now, reading the paper, I found Turbo model, Turbo bandwidth, Turbo trigger, Turbo trigger line, Turbo events, Turbo mode and so on. Actually "Turbo stream" is used maybe three times. I would use less jargon in the entire document or, if you really want to keep some of the above names, define them in the text. For example: Turbo bandwidth -> bandwidth of the Turbo stream; Turbo events -> events selected by the Turbo stream; etc.

We've reduced the Turbo terms down to just 'stream' and 'model', and have tried to reduce the amount of jargon elsewhere (e.g. trigger lines → selections).

- I am just thinking if it makes sense to introduce TurboSP as an acronym for the Turbo Selective Persistence. It is not done so far in the text, but is commonly used during our presentations.

We couldn't find a place to fit this in, so have left TurboSP out. We've been consistent in using 'selective persistence' to describe the technique, so hopefully anyone who's heard 'TurboSP' will pick up that we mean the same thing.

- When you describe the Run 2 trigger, I would bring the numbers in line with the Reco&HLT paper (CERN-LHCb-DP-2016-002). In particular: line 81: 1 MHz -> about 1 MHz; line 92: 12 kHz -> about 12.5 kHz. The Reco&HLT paper will be out earlier than this paper (the review is almost over), therefore I would like you to cite it.

The citation's a good idea. I can't find any reference to LHCb-DP-2016-002 internally though, where is it being written?

Added a (placeholder) citation anyway and adjusted the numbers.

- You use the five-times-higher luminosity argument many times. I would like to make it clear that it applies only to LHCb. In particular, in line 6 you wrote "experiments", which is not true. For instance, ATLAS goes from a maximum luminosity of 2.1*10^34 cm-2s-1 in Run 2 to only ~2.3 in Run 3, so the five-times argument is correct but only for LHCb.

It was actually a TODO in the paper source to check this! Corrected and highlighted.

- When you discuss the 10 PB buffer I would again cite Reco&HLT paper as it is described there. In particular, for line 126: "days or even weeks" I think we have some official numbers here.

I tried to find something but couldn't. Do you know where?

- Across the paper you say "performing physics analysis inside the trigger", which sounds as if you would get the physics observables out of the trigger rather than selected candidates. I would be in favour of saying "performing physics selection inside the trigger".

- You are using a lot of "can", "may" etc. I would avoid it.

Have tried to remove the instances where these are weasel words ("may lead to performance improvements") but there are quite a few instances where something is unavoidably conditional ("may be required for a measurement").

- You discuss TurboSP in the context of Run 3. I would be very careful here. Especially the text between 154-172 needs a lot of polishing. I see the message which you would like to send but there are a couple of issues:


the computing model for Run 3 is not finalized, therefore I would avoid hard statements like "cannot cope", changing them to a more optimistic message.
the comparison between today's charm and tomorrow's beauty needs to be carefully stated. First, it should be in line with the Computing Model TDR. Secondly, thanks to TurboSP we could add more lines and expand an already rich physics program, while for Run 3 we might be in the situation where it is TurboSP or nothing. So we change from "we might want to use" to "we have to use", if you see what I mean. My conclusion would be to rephrase this paragraph and make it more optimistic and less debatable.

Rephrased to be more optimistic.

- You might want to cite http://cds.cern.ch/record/1670985 for example in line 8, but also in other places.

- line 122-123: from your sentence it sounds like entire HLT2 was run only in the LHC downtime. I would write it more in the spirit of: "In the LHCb trigger system about 20% of the L0 accepted events were temporarily buffered on the EFF nodes and processed in the gaps between stable beam periods. This deferred triggering method allowed LHCb to increase the data sample for physics analysis." (but please do not copy-paste as it is from other publication)

- line 180: I think you overestimated 60. Looking at the monitoring plots from the entire Run 2 it is more like 40.

Indeed! I pulled 60 out of thin air, and looking for the real number was on the TODO list. Is there something citable for this? Maybe LHCb-DP-2016-002 will state it.

- line 283: 10% is really promising. Could you please explain more how did you get this number? or could you please point me to the presentation?

This number is based on studies performed by the flavour tagging group. They keep every long and upstream track that is associated to the same PV as the signal candidate, and this adds around 10 kB to the average event size. For reference, a signal candidate is typically 4-6 kB, and the raw event is around 110 kB.
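As a quick check of how these sizes relate, the sketch below expresses the quoted figures as fractions of the raw event. It uses only the numbers in the reply above. Reading the 10% at line 283 as the extra 10 kB relative to the 110 kB raw event is an assumption, and the variable names are illustrative.

```python
# Sizes quoted in the reply above, in kB.
EXTRA_TRACKS_KB = 10.0             # added by keeping same-PV long and upstream tracks
SIGNAL_CANDIDATE_KB = (4.0, 6.0)   # typical signal candidate, low and high end
RAW_EVENT_KB = 110.0               # approximate raw event size


def fraction_of_raw(size_kb: float) -> float:
    """Return a size as a fraction of the raw event size."""
    return size_kb / RAW_EVENT_KB


# Extra tracks alone, relative to the raw event (assumed meaning of the 10%).
extra_fraction = fraction_of_raw(EXTRA_TRACKS_KB)

# Signal candidate plus the extra tracks, relative to the raw event.
total_fractions = tuple(
    fraction_of_raw(candidate + EXTRA_TRACKS_KB) for candidate in SIGNAL_CANDIDATE_KB
)

print(f"extra tracks: {extra_fraction:.1%} of the raw event")
print(f"candidate + extra tracks: {total_fractions[0]:.1%} to {total_fractions[1]:.1%}")
```

Under these assumptions the extra tracks come to roughly 9% of the raw event, and a candidate together with the extra tracks to roughly 13% to 15%.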

- line 298: resulting in more accurate physics measurements. In principle you are right, but I would not make the explicit translation: larger calib sample -> more accurate physics measurements. I would rather make the link: larger calib sample -> more precise efficiency determination.


Minor

For the editorial comments I put sometimes "?" as my taste says it, but I guess you as native speakers can judge if I am correct or wrong.

Figures:


Fig 1: I like the scheme, however I am thinking how to make it more informative. What if you add in the green boxes: L0: hardware trigger; HLT1: partial reconstruction; HLT2: full reconstruction?

Fig 1: mentioned on page 2, and is on page 4, please move to page 3.

Fig 2: Do not break words within boxes

Fig 3: I see the dashed/solid lines, but can you highlight what is saved with a color, perhaps green? The figure itself is more informative itself, but then when I read text I don't see immediately translation. To easily follow the text, I would assume seeing on the plot some tracks labels as charged pions charged etc. I am also thinking if it makes sense to somehow indicate the min/max event size, to give a feeling about the magnitude.

Fig 4: would make it more general: VELO raw bank -> sub-detector raw bank

Text:


- line 5: it would read better for me with "to meet this challenge" at the beginning of sentence i.e. "To meet this challenge, at the Large Hadron Collider (LHC), both..."

- line 14: trigger line is kind of jargon, maybe better to define it as the set of selections etc

- line 23: such for -> such as?

- line 28: Run 2 -> Run 2 in 2015.

- line 29: Run 1 -> Run 1 (2010-2012)

- line 32: I would maybe write something like: "broadening the physics programme of the LHCb experiment."

- line 47: if the computing model document is public by the time you are done with review, you might want to also cite it here

- line 81: 1 MHz -> about 1 MHz.

- line 81: event filter farm -> Event Filter Farm (EFF)

- line 82: high lever trigger (HLT) -> High Level Trigger (HLT)

- line 84: simplified charged particle reconstruction -> a partial reconstruction of charged tracks

- line 92: to be consistent with the Reco&HLT paper I would write 12.5 kHz

- line 98: selections -> selection?

- line 99: depending on the stream the event is sent to -> depending on destination stream?

- line 113: permanent storage -> the permanent storage.

- line 116: as I am not a native speaker I had to read it 3 times to understand: "each stage more processing time than the previous", maybe try to rephrase it?

- line 120: in periods when data are not being collected -> during inter-fill periods

- line 123: (2010–2012) - remove if given earlier

- line 137: offline-equivalent -> offline-quality

- line 137: in the final trigger stage (HLT2)

- line 138: a second reconstruction offline -> any additional reconstruction offline.

- line 138: offline, if instead -> offline. The objects...

- line 139: can be written out (...) directly -> are directly written out to the per...

- line 139: can then be -> are

- line 139: Analyses -> Physics measurements?

- line 141: can reduce -> reduces

- line 142: Analysis of trigger-level information offline has been used -> I don't get what you mean.

- line 146: the trigger reconstruction is worse than if the full event were reconstructed offline -> and the quality of the reconstruction is worse than achieved offline.

- line 156: The reduced event formats described thus far, maybe it is my poor English, but I would prefer: Thus, the reduced event formats cannot be...

- line 181-183: rephrase.

- line 240: as introduced -> has been introduced

- line 241: which line authors can set -> remove

- line 241: a flag -> a user flag?

- line 249: if at all -> remove

- line 273: One expected use-case is flavour tagging, on which a large number of the analyses rely to determine the initial flavour of a beauty or anti-beauty meson. -> One expected use-case is flavour tagging, a determination of the initial flavour of a beauty or anti-beauty meson, which a large number of the analyses rely on.

- line 318, 339: beam-beam -> proton-proton

- line 318: The trigger lines -> These trigger lines

- line 321: these triggers -> these trigger lines

- line 327: the Turbo bandwidth -> the bandwidth used for the Turbo stream

- line 340: and so and a

- line 352: for more physics -> for broadening physics programme

- line 358: which would have otherwise -> which otherwise would have

- line 366: physics analysis -> physics measurements

- line 369, 370: triggers -> trigger lines

Version 1.2

Sent on 3rd October 2018.

Agnieszka Dziurda

Comments without replies have been addressed.

- I am just thinking if it makes sense to introduce TurboSP as an acronym for the Turbo Selective Persistence. It is not done so far in the text, but is commonly used during our presentations.

We couldn't find a place to fit this in, so have left TurboSP out. We've been consistent in using 'selective persistence' to describe the technique, so hopefully anyone who's heard 'TurboSP' will pick up that we mean the same thing.

Ok, fair enough.

- When you describe the Run 2 trigger, I would bring the numbers in line with the Reco&HLT paper (CERN-LHCb-DP-2016-002). In particular: line 81: 1 MHz -> about 1 MHz; line 92: 12 kHz -> about 12.5 kHz. The Reco&HLT paper will be out earlier than this paper (the review is almost over), therefore I would like you to cite it.

The citation's a good idea. I can't find any reference to LHCb-DP-2016-002 internally though, where is it being written?

Added a (placeholder) citation anyway and adjusted the numbers.

The TWiki page is here: https://twiki.cern.ch/twiki/bin/view/LHCbInternal/HLTReco_Run2Perf I think the CDS entries have not yet been created; it is something to follow up. It should be out soonish.

Thanks. I've updated the title in the bibliography entry, and will do the rest once it's out.

- When you discuss the 10 PB buffer I would again cite Reco&HLT paper as it is described there. In particular, for line 126: "days or even weeks" I think we have some official numbers here.

I tried to find something but couldn't. Do you know where?

The Reco&HLT paper says:

"The total disk buffer of the EFF is 10 PB, distributed such that farm nodes with faster processors get a larger portion of the disk buffer. At an average event size of 55 kB passing HLT1, this buffer allows for up to two weeks of consecutive HLT1 data taking before HLT2 has to be executed. Therefore, it is large enough to accommodate both regular running (where, as we will see, the alignment and calibration is completed in a matter of minutes) and to serve as a safety mechanism to delay HLT2 processing in case of problems with the detector or calibration."

It's nicer to be more concrete, so I changed the sentence to:

As the buffer is so large, `real time' can be up to two weeks during nominal data-taking~\cite{LHCb-DP-2016-002}.
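As a rough cross-check of the quoted figures, the sketch below computes the average HLT1 output rate at which a 10 PB buffer would fill in two weeks at 55 kB per event. Several assumptions here are not in the quoted text: decimal prefixes (1 PB = 10^15 bytes, 1 kB = 10^3 bytes), continuous writing for the full two weeks, and no HLT2 processing freeing space in the meantime.

```python
# Figures from the quoted passage; unit conversions assume decimal prefixes.
BUFFER_BYTES = 10e15               # 10 PB disk buffer
EVENT_BYTES = 55e3                 # 55 kB average event size after HLT1
FILL_TIME_S = 14 * 24 * 3600       # two weeks, assumed to be continuous running

# Number of HLT1-accepted events that fit in the buffer.
events_in_buffer = BUFFER_BYTES / EVENT_BYTES

# Average rate that would fill the buffer in exactly two weeks.
average_rate_hz = events_in_buffer / FILL_TIME_S

print(f"events in buffer: {events_in_buffer:.2e}")
print(f"implied average rate: {average_rate_hz / 1e3:.0f} kHz")
```

Under these assumptions the buffer holds about 1.8e11 events, which corresponds to an average rate of about 150 kHz sustained over the two weeks.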

I am happy with all other answers.

Here are my additional comments after reading second draft.


General

1. After a second read, I think it would be better to have the upgrade discussion in one place. I would suggest:
-- split Sec 4, leave Sec. 4 for Run 2, and add Sec 5 about prospects.
-- move the discussion which you have at the end of Sec. 2.1 to Sec. 5.

I think that's a great idea, otherwise the focus does jump around a lot from 'today' to 'upgrade'.

Added a Section 5, and reworked the paragraphs there to flow better and give a little more detail.

2. When you talk about the complete reconstruction persistence, line 249, you say "The full reconstruction is slightly smaller than the raw event". I would expand it a bit more by explaining to the reader what exactly the differences between the two are. I see the gain from 48 to 69 kB in the event size, but what is the cost? What are you able, or not able, to do afterwards? etc

I think most of this information is given in the paragraph, but I've reworded the sentence as:

"In comparison to saving the raw event this saves disk space and requires no further processing offline, however the information needed to re-run the reconstruction offline is discarded."


Specific

line 18: "allowing for a continuation offline of whatever analysis was performed in the trigger", rephrase

Have gone for "allowing for the analysis performed in the trigger to be continued offline".

line 34: the experiment -> the LHCb experiment (let's make it clear)

line 35: I would remove comment about ATLAS/CMS.

line 38: the Upgrade detector -> the upgraded detector ?

line 39: a large fraction of beauty program -> some fraction of beauty program

line 54: I would add at the beginning something like: "The paper is organised as follows"

Fig 1: caption "Schematic of the LHCb trigger scheme" - rephrase

Have changed to "An overview of the LHCb trigger scheme".

Fig 1: Having a second look, I would change a top box: "LHCb detector". L0 is the detector response, so it is also "LHCb detector" etc etc. Maybe write beam-beam collision, or bunch crossing rate, or pp collision or something like that. In addition, I am a bit puzzled by the numbers, you have half of them in Hz and second half in MB, while for instance in the text you refer to 12.5 kHz, it is nowhere in the plot (I get it is not easy, because you quote separately for each stream)

Added the individual stream output rates, taken from Figure 33 in the Run 2 performance paper. Note that these don't quite add up to 12.5 kHz.

line 94: all detector information -> all available detector information.

line 109: separate offline application -> maybe some Ref?

I can't find a suitable 'Brunel reference', even after looking here:

http://lhcb-comp.web.cern.ch/lhcb-comp/Reconstruction/LHCbNotesOfInterest.html

line 187: around 40 tracks -> on average 40 tracks

line 231-237: comprises: - the set (...) , - the tracking (...) .

Are you suggesting lowercase "the"? Uppercase is consistent with the formatting of the parent list (which has "The" and "All").

Fig 3. I think we misunderstood each other here. In terms of colors, I was hoping to see something like in your PPG talk :)

Ah! I see. Updated.

line 281: which many measurements require -> which many, for instance time-dependent, measurements require
Topic revision: r3 - 2018-10-25 - AlexPearce
 