Difference: StrippingWorkFlow (1 vs. 6)

Revision 6 - 2009-10-01 - unknown

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"

LHCb data reduction (stripping)

Line: 49 to 49
 This step reruns the stripping selections taking as input the DST produced in step 2, and produces several DSTs, one per stripping stream. The stripping result is stored as a DST object. It re-runs the L0 and replaces on the RawEvent the L0 decision RawBank produced by Boole. It also runs HLT1 and stores the result as an additional RawBank on the RawEvent.

Program version: DaVinci v24r2p3
Changed:
<
<
Options AppConfig/v3r4/options/DaVinci/DVStrippingDST-MC09.py, to be updated
>
>
Options AppConfig/v3r7/options/DaVinci/DVStrippingDST-MC09.py
 
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602

Open questions

Changed:
<
<
  • Which is the right DaVinci version? DaVinci v24r2p3 should be used, as it contains fixes to some Hlt lines. This version is tagged and ready for release, and will be announced as soon as all tests have been performed in the nightlies.
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required? No. Juan has committed new options to CVS, a new release of AppConfig is needed
>
>
 
  • Have the output DST and the new options been checked? Not yet. Juan will produce a DST using the new options and give it to Thomas who will check the content.

Answered questions

Added:
>
>
  • Which is the right DaVinci version? DaVinci v24r2p3 should be used, as it contains fixes to some Hlt lines.
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required? Yes.
  • Are the L0 and HLT1 run using a correct TCK, and is the L0Conf().TCK option still needed? Yes, as confirmed to Juan by Miriam and Gerhard.
  • Does the information in the RawBanks contain all that is needed for people interested in TIS/TOS studies? In principle yes, since the same information is stored as in the real data. If something is missing this is the occasion to find out!
  • Is the stripping result stored on the DST? Yes: there is a TES directory created for each stripping selection that passes, and this is saved to the DST

Revision 5 - 2009-09-30 - unknown

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"

LHCb data reduction (stripping)

Line: 57 to 57
 Open questions
  • Which is the right DaVinci version? DaVinci v24r2p3 should be used, as it contains fixes to some Hlt lines. This version is tagged and ready for release, and will be announced as soon as all tests have been performed in the nightlies.
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required? No. Juan has committed new options to CVS, a new release of AppConfig is needed
Deleted:
<
<
  • Are the L0 and HLT1 that are run using a correct TCK? In principle yes, but Juan will check with Gerhard, in particular whether the L0Conf().TCK option is still needed
 
  • Have the output DST and the new options been checked? Not yet. Juan will produce a DST using the new options and give it to Thomas who will check the content.

Answered questions

Added:
>
>
  • Are the L0 and HLT1 run using a correct TCK, and is the L0Conf().TCK option still needed? Yes, as confirmed to Juan by Miriam and Gerhard.
 
  • Does the information in the RawBanks contain all that is needed for people interested in TIS/TOS studies? In principle yes, since the same information is stored as in the real data. If something is missing this is the occasion to find out!
  • Is the stripping result stored on the DST? Yes: there is a TES directory created for each stripping selection that passes, and this is saved to the DST
  • Are all B candidates and intermediate resonances stored on the DST? Yes.

Revision 4 - 2009-09-29 - MarcoCattaneo

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"

LHCb data reduction (stripping)

Changed:
<
<
The LHCb computing model foresees that not all events recorded by the data acquisition are made available for physics analysis. After an initial reconstruction of all events, an event selection step (stripping) is executed - this is an analysis job that selects events based on "stripping sekections" provided by the physics groups. Only events passing one or more of these selections are made available for further analysis.
>
>
The LHCb computing model foresees that not all events recorded by the data acquisition are made available for physics analysis. After an initial reconstruction of all events, an event selection step (stripping) is executed - this is an analysis job that selects events based on "stripping selections" provided by the physics groups. Only events passing one or more of these selections are made available for further analysis.
 

Stripping framework

The stripping framework is a framework based on DaVinci within which the stripping selections are defined. More details are available here
Line: 17 to 17
 In MC09, the RAW data was produced by Boole v18r1 and the RDST data by Brunel v34r7. In MC production, these datasets are not stored in separate files, they are combined in a single DST file which is the output of the Brunel production step.

Step 1: DaVinci (stripping)
Changed:
<
<
This step reads the RDST from the MC09 production (taken from the MC09 DST files), it runs the default stripping and produces a “full” event tag collection (FETC): it stores the result of the stripping selection for each event on the RDST.
>
>
This step reads the RDST from the MC09 production (taken from the MC09 DST files), it runs the default stripping and produces a “full” event tag collection (FETC): it stores the result of the stripping selection for each event read from the RDST.
 
Changed:
<
<
Program version: DaVinci v24r2p2
>
>
Program version: DaVinci v24r2p1
 
Options AppConfig/v3r4/options/DaVinci/DVStrippingETC-MC09.py
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
Line: 27 to 27
  The output is an FETC. This output is not stored permanently; it is only used as input to Step 2
Added:
>
>
Open questions
  • How can the stripped dataset be normalised? In the past, the FETC was used to normalise the stripped dataset to the number of input events. If we do not save the FETC, and in the presence of failed jobs, this normalisation will be lost. This problem will eventually be solved by using the File Summary Records (FSR), but this is not ready for this stripping. Marco will check whether the FETC can be saved this time, or whether an alternative method exists using the book-keeping. In any case this should not be a showstopper for this production.
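
The normalisation issue raised above can be illustrated with a minimal sketch. It assumes a hypothetical per-job record (JobRecord, with made-up field names and numbers) standing in for whatever the FETC, the book-keeping or, later, the FSR would provide. The point is only that input events must be counted for the same set of jobs whose output actually exists.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobRecord:
    """Hypothetical per-job bookkeeping entry (illustrative names only)."""
    n_input: int      # events read by the stripping job
    n_selected: int   # events passing at least one stripping selection
    succeeded: bool   # False if the job failed and its output is missing


def retention(jobs: List[JobRecord]) -> float:
    """Fraction of input events kept, counting only jobs whose output exists."""
    good = [j for j in jobs if j.succeeded]
    n_in = sum(j.n_input for j in good)
    if n_in == 0:
        raise ValueError("no successful jobs to normalise against")
    return sum(j.n_selected for j in good) / n_in


# Made-up numbers: two successful jobs and one failed job.
jobs = [
    JobRecord(n_input=1000, n_selected=50, succeeded=True),
    JobRecord(n_input=1000, n_selected=30, succeeded=True),
    JobRecord(n_input=1000, n_selected=0, succeeded=False),
]
print(retention(jobs))
```

If the failed job's 1000 input events were included in the denominator, the retention would be underestimated, which is the loss of normalisation described in the question.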
 
Step 2: Brunel
This step runs the reconstruction to produce a DST containing all the events selected by Step 1. The inputs are the FETC produced by Step 1 and the RAW from the MC09 production (taken from the MC09 DST files), the output is a DST containing the OR of all the stripping selections. There are two possibilities for the version of Brunel:
  1. The same version used in the original MC09 production. This guarantees consistency between the stripping done in Step 1, and that to be done in Step 3.
Changed:
<
<
  1. A more recent version, using the same version of REC as the version of DaVinci to be used in Step 3. This guarantees consistency of the reconstruction (in particular track pattern recognition) between Steps 2 and 3, but means that some of the events selected in Step 1 will not be re-selected in Step 3. Conversely, any events that would have been selected by step 3 but were not selected by step 1 are lost. Note that there isn’t yet a DaVinci version using the same REC version as the latest version of Brunel (v35r6) , only v35r5. Neither has undergone the extensive validation of the MC09 version.

Open questions

  • Which of options 1. and 2. is preferred by the physics groups? The computing group prefers Option 1, since it has already been commissioned and is better tested. Note that option 1 is a pure computing exercise in the case of MC data: since the input is an MC DST and the output is also an MC DST produced by an identical programme, Steps 1 and 2 could be skipped in a future stripping of MC data that uses option 1.
>
>
  1. A more recent version, using the same version of REC as the version of DaVinci to be used in Step 3. This guarantees consistency of the reconstruction (in particular track pattern recognition) between Steps 2 and 3, but means that some of the events selected in Step 1 will not be re-selected in Step 3. Conversely, any events that would have been selected by step 3 but were not selected by step 1 are lost.
 
Added:
>
>
For this stripping exercise it is decided to use option 1, since it has already been commissioned and is better tested. Note that option 1 is a pure computing exercise in the case of MC data: since the input is an MC DST and the output is also an MC DST produced by an identical programme, Steps 1 and 2 could be skipped in a future stripping of MC data that uses option 1.
 
Changed:
<
<
Program version: Brunel v34r7 (Option 1), Brunel v35r5 (Option 2)
Options AppConfig/v2r9p1/options/Brunel/MC09-Stripping.py (Option 1), AppConfig/v3r4/options/Brunel/MC09-Stripping.py (Option 2)
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602
>
>
Program version: Brunel v34r7
Options AppConfig/v2r9p1/options/Brunel/MC09-Stripping.py
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602
  The output is a DST. This output is not stored permanently; it is used as input to Step 3.
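
As an illustration of "the OR of all the stripping selections", here is a minimal sketch in plain Python. The FETC is modelled as a list of rows with per-selection decisions; the row layout and the selection names (SelA, SelB) are assumptions made for the example, not the real FETC format.

```python
from typing import Dict, List


def passes_any(decisions: Dict[str, bool]) -> bool:
    """An event is kept if at least one stripping selection accepted it."""
    return any(decisions.values())


# Hypothetical FETC content: one row per input event.
fetc: List[dict] = [
    {"event": 1, "decisions": {"SelA": True, "SelB": False}},
    {"event": 2, "decisions": {"SelA": False, "SelB": False}},
    {"event": 3, "decisions": {"SelA": True, "SelB": True}},
]

# Events that Step 2 would reconstruct and write to the DST.
selected = [row["event"] for row in fetc if passes_any(row["decisions"])]
print(selected)
```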

Step 3 DaVinci (stripping)
This step reruns the stripping selections taking as input the DST produced in step 2, and produces several DSTs, one per stripping stream. The stripping result is stored as a DST object. It re-runs the L0 and replaces on the RawEvent the L0 decision RawBank produced by Boole. It also runs HLT1 and stores the result as an additional RawBank on the RawEvent.
Changed:
<
<
Program version: DaVinci v24r2p2
Options AppConfig/v3r4/options/DaVinci/DVStrippingDST-MC09.py
>
>
Program version: DaVinci v24r2p3
Options AppConfig/v3r4/options/DaVinci/DVStrippingDST-MC09.py, to be updated
 
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602

Open questions

Changed:
<
<
  • Is this the right DaVinci version, or should we use v24r2p3? DaVinci v24r3 has some further bug fixes and could be ready in time.
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required? No.
  • Are the L0 and HLT1 that are run using a correct TCK?
  • Does the information in the RawBanks contain all that is needed for people interested in TIS/TOS studies?
  • Is the stripping result stored on the DST?
  • Are all B candidates and intermediate resonances stored on the DST? Yes.
  • What happens when different selections in a stream select the same B candidates? It gets copied twice, since the output locations are different for each selection.
  • Juan mentioned the possibility to save other information (e.g. PV refit). Is any of this already in? (Not necessary for this stripping.) If the ReFitPVs property of the relevant DVAlgorithms is set to true, this gets done automatically. If not, some options have to be set up.
>
>
  • Which is the right DaVinci version? DaVinci v24r2p3 should be used, as it contains fixes to some Hlt lines. This version is tagged and ready for release, and will be announced as soon as all tests have been performed in the nightlies.
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required? No. Juan has committed new options to CVS, a new release of AppConfig is needed
  • Are the L0 and HLT1 that are run using a correct TCK? In principle yes, but Juan will check with Gerhard, in particular whether the L0Conf().TCK option is still needed
  • Have the output DST and the new options been checked? Not yet. Juan will produce a DST using the new options and give it to Thomas who will check the content.

Answered questions

  • Does the information in the RawBanks contain all that is needed for people interested in TIS/TOS studies? In principle yes, since the same information is stored as in the real data. If something is missing this is the occasion to find out!
  • Is the stripping result stored on the DST? Yes: there is a TES directory created for each stripping selection that passes, and this is saved to the DST
  • Are all B candidates and intermediate resonances stored on the DST? Yes.
  • What happens when different selections in a stream select the same B candidates? It gets copied twice, since the output locations are different for each selection.
  • Is it possible to save other information (e.g. PV refit)? If the ReFitPVs property of the relevant DVAlgorithms is set to true, this gets done automatically. If not, some options have to be set up. Not needed for this stripping.
  • It is quite possible that some of the output DSTs will contain zero selected events, resulting in empty or non-existent output files. Is this a problem for DIRAC? Possibly, but this is likely to be very rare and will be fixed only if needed. This is a short term problem since in future even files with no events will contain at least an FSR.
  • Are the names of the streams OK and interfaced to the book-keeping? Yes
  The outputs of this step are one DST per stripping stream.
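
The answers above (one output location per passing selection, candidates copied once per selection, one DST per stream) can be sketched as follows. This is a toy model only: the stream definitions, selection names and location strings are hypothetical and do not reproduce the real TES paths or the DaVinci implementation.

```python
from typing import Dict, List

# Hypothetical stream definition: stream name -> selections it contains.
STREAMS: Dict[str, List[str]] = {
    "StreamX": ["SelA", "SelB"],
    "StreamY": ["SelC"],
}


def route(candidates_by_selection: Dict[str, List[str]]) -> Dict[str, Dict[str, List[str]]]:
    """Return {stream: {location: candidates}} for one event.

    candidates_by_selection maps each selection that passed to the
    candidates it selected. A location is created only for passing
    selections, and an event goes to a stream only if at least one of
    that stream's selections passed.
    """
    out: Dict[str, Dict[str, List[str]]] = {}
    for stream, selections in STREAMS.items():
        locations = {}
        for sel in selections:
            cands = candidates_by_selection.get(sel, [])
            if cands:
                # Illustrative location name; each selection gets its own copy.
                locations[f"{stream}/{sel}"] = list(cands)
        if locations:
            out[stream] = locations
    return out


# The same candidate selected by two selections of one stream is stored twice.
result = route({"SelA": ["B_cand_1"], "SelB": ["B_cand_1"]})
print(result)
```

In this toy event the candidate appears under both StreamX/SelA and StreamX/SelB, and nothing is written for StreamY.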
Deleted:
<
<
More open questions
  • It is quite possible that some of the output DSTs will contain zero selected events, resulting in empty or non-existent output files. Is this a problem for DIRAC?
  • Are the names of the streams OK and interfaced to the book-keeping?
 
Step 4: Merger
This is a technical step that does not require any specific application, just gaudirun.py with standard DST merging options. For each stripping stream it combines the (small) DSTs of individual stripping jobs into a larger DST whose file size is optimized for storage.
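
How the small files of one stream might be grouped before merging can be sketched with a simple greedy pass. The file names, sizes and the 1000 MB target are invented for the example; the actual target size and the grouping logic used in production are not specified on this page.

```python
from typing import List, Tuple


def group_for_merge(files: List[Tuple[str, float]], target_mb: float) -> List[List[str]]:
    """Greedily pack files, in order, into groups of at most target_mb.

    A file larger than the target still gets a group of its own.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    total = 0.0
    for name, size in files:
        # Close the current group if adding this file would exceed the target.
        if current and total + size > target_mb:
            groups.append(current)
            current, total = [], 0.0
        current.append(name)
        total += size
    if current:
        groups.append(current)
    return groups


# Hypothetical per-job DSTs of one stripping stream, sizes in MB.
files = [("job1.dst", 300), ("job2.dst", 400), ("job3.dst", 500), ("job4.dst", 200)]
merge_groups = group_for_merge(files, target_mb=1000)
print(merge_groups)
```

Each inner list would then be the input of one merging job for that stream.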
Line: 77 to 78
 This is a DaVinci job that, for each stripping stream, reads the merged DST and uses the stripping result stored on the DST to produce a “selection” ETC (SETC). This SETC contains the stripping result for each event on the stripped DST, and can be used by analysis jobs to select only events passing a given stripping selection in a given stripping stream.
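
The use of the SETC by an analysis job can be pictured with a small sketch. The SETC is modelled here as a list of rows holding the set of selections each event passed; this layout and the names SelA and SelB are assumptions for illustration, not the real ETC schema.

```python
from typing import List

# Hypothetical SETC for one stripping stream: one row per event on the merged DST.
setc = [
    {"event": 10, "passed": {"SelA"}},
    {"event": 11, "passed": {"SelB"}},
    {"event": 12, "passed": {"SelA", "SelB"}},
]


def events_passing(rows: List[dict], selection: str) -> List[int]:
    """Return the events an analysis job would read for one stripping selection."""
    return [row["event"] for row in rows if selection in row["passed"]]


print(events_passing(setc, "SelA"))
```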

Open questions

Changed:
<
<
  • Which DaVinci version should be used? v24r2p2 or v24r2p3 (preference for the latter, but not a show-stopper if it isn't released in time)
  • Which are the options for doing this, and are they released in AppConfig? New options need to include the necessary L0 and Hlt options.
>
>
  • Which DaVinci version and options should be used? This functionality has not yet been implemented. Juan will give Anton the stripped DST from the Step 1 tests, and Anton will then make the necessary changes to DaVinci. In any case this step is performed in a separate production from Steps 1 to 3.
 

-- MarcoCattaneo - 2009-09-29

Revision 3 - 2009-09-29 - unknown

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"

LHCb data reduction (stripping)

Line: 19 to 19
 
Step 1: DaVinci (stripping)
This step reads the RDST from the MC09 production (taken from the MC09 DST files), it runs the default stripping and produces a “full” event tag collection (FETC): it stores the result of the stripping selection for each event on the RDST.
Changed:
<
<
Program version: DaVinci v24r2p1
>
>
Program version: DaVinci v24r2p2
 
Options AppConfig/v3r4/options/DaVinci/DVStrippingETC-MC09.py
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
Line: 48 to 48
 
Step 3 DaVinci (stripping)
This step reruns the stripping selections taking as input the DST produced in step 2, and produces several DSTs, one per stripping stream. The stripping result is stored as a DST object. It re-runs the L0 and replaces on the RawEvent the L0 decision RawBank produced by Boole. It also runs HLT1 and stores the result as an additional RawBank on the RawEvent.
Changed:
<
<
Program version: DaVinci v24r2p1
>
>
Program version: DaVinci v24r2p2
 
Options AppConfig/v3r4/options/DaVinci/DVStrippingDST-MC09.py
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602

Open questions

Changed:
<
<
  • Is this the right DaVinci version, or should we use v24r2p2?
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required?
>
>
  • Is this the right DaVinci version, or should we use v24r2p3? DaVinci v24r3 has some further bug fixes and could be ready in time.
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required? No.
 
  • Are the L0 and HLT1 that are run using a correct TCK?
  • Does the information in the RawBanks contain all that is needed for people interested in TIS/TOS studies?
  • Is the stripping result stored on the DST?
Changed:
<
<
  • Are all B candidates and intermediate resonances stored on the DST?
  • What happens when different selections in a stream select the same B candidates?
  • Juan mentioned the possibility to save other information (eg. PV refit). Is any of this already in? (Not necessary for this stripping)
>
>
  • Are all B candidates and intermediate resonances stored on the DST? Yes.
  • What happens when different selections in a stream select the same B candidates? It gets copied twice, since the output locations are different for each selection.
  • Juan mentioned the possibility to save other information (e.g. PV refit). Is any of this already in? (Not necessary for this stripping.) If the ReFitPVs property of the relevant DVAlgorithms is set to true, this gets done automatically. If not, some options have to be set up.
  The outputs of this step are one DST per stripping stream.
Line: 77 to 77
 This is a DaVinci job that, for each stripping stream, reads the merged DST and uses the stripping result stored on the DST to produce a “selection” ETC (SETC). This SETC contains the stripping result for each event on the stripped DST, and can be used by analysis jobs to select only events passing a given stripping selection in a given stripping stream.

Open questions

Changed:
<
<
  • Which DaVinci version should be used?
  • Which are the options for doing this, and are they released in AppConfig?
>
>
  • Which DaVinci version should be used? v24r2p2 or v24r2p3 (preference for the latter, but not a show-stopper if it isn't released in time)
  • Which are the options for doing this, and are they released in AppConfig? New options need to include the necessary L0 and Hlt options.
 

-- MarcoCattaneo - 2009-09-29

Revision 2 - 2009-09-29 - MarcoCattaneo

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"

LHCb data reduction (stripping)

Line: 20 to 20
 This step reads the RDST from the MC09 production (taken from the MC09 DST files), it runs the default stripping and produces a “full” event tag collection (FETC): it stores the result of the stripping selection for each event on the RDST.

Program version: DaVinci v24r2p1
Changed:
<
<
Options AppConfig/v3r4/options/DaVinci/DVStrippingETC-MC09.py
Database: SQLDDDB v5r11 with tags:
SIMCOND MC09-20090402-vc-md100
>
>
Options AppConfig/v3r4/options/DaVinci/DVStrippingETC-MC09.py
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
 
DDDB MC09-20090602
Added:
>
>
The output is an FETC. This output is not stored permanently; it is only used as input to Step 2

Step 2: Brunel
This step runs the reconstruction to produce a DST containing all the events selected by Step 1. The inputs are the FETC produced by Step 1 and the RAW from the MC09 production (taken from the MC09 DST files), the output is a DST containing the OR of all the stripping selections. There are two possibilities for the version of Brunel:
  1. The same version used in the original MC09 production. This guarantees consistency between the stripping done in Step 1, and that to be done in Step 3.
  2. A more recent version, using the same version of REC as the version of DaVinci to be used in Step 3. This guarantees consistency of the reconstruction (in particular track pattern recognition) between Steps 2 and 3, but means that some of the events selected in Step 1 will not be re-selected in Step 3. Conversely, any events that would have been selected by step 3 but were not selected by step 1 are lost. Note that there isn’t yet a DaVinci version using the same REC version as the latest version of Brunel (v35r6) , only v35r5. Neither has undergone the extensive validation of the MC09 version.

Open questions

  • Which of options 1. and 2. is preferred by the physics groups? The computing group prefers Option 1, since it has already been commissioned and is better tested. Note that option 1 is a pure computing exercise in the case of MC data: since the input is an MC DST and the output is also an MC DST produced by an identical programme, Steps 1 and 2 could be skipped in a future stripping of MC data that uses option 1.

Program version: Brunel v34r7 (Option 1), Brunel v35r5 (Option 2)
Options AppConfig/v2r9p1/options/Brunel/MC09-Stripping.py (Option 1), AppConfig/v3r4/options/Brunel/MC09-Stripping.py (Option 2)
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602

The output is a DST. This output is not stored permanently; it is used as input to Step 3.

Step 3 DaVinci (stripping)
This step reruns the stripping selections taking as input the DST produced in step 2, and produces several DSTs, one per stripping stream. The stripping result is stored as a DST object. It re-runs the L0 and replaces on the RawEvent the L0 decision RawBank produced by Boole. It also runs HLT1 and stores the result as an additional RawBank on the RawEvent.

Program version: DaVinci v24r2p1
Options AppConfig/v3r4/options/DaVinci/DVStrippingDST-MC09.py
Database: SQLDDDB v5r11
Database tags: SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602

Open questions

  • Is this the right DaVinci version, or should we use v24r2p2?
  • Do the options above re-run L0, run HLT1, and store the results in RawBank as required?
  • Are the L0 and HLT1 that are run using a correct TCK?
  • Does the information in the RawBanks contain all that is needed for people interested in TIS/TOS studies?
  • Is the stripping result stored on the DST?
  • Are all B candidates and intermediate resonances stored on the DST?
  • What happens when different selections in a stream select the same B candidates?
  • Juan mentioned the possibility to save other information (eg. PV refit). Is any of this already in? (Not necessary for this stripping)

The outputs of this step are one DST per stripping stream.

More open questions

  • It is quite possible that some of the output DSTs will contain zero selected events, resulting in empty or non-existent output files. Is this a problem for DIRAC?
  • Are the names of the streams OK and interfaced to the book-keeping?

Step 4: Merger
This is a technical step that does not require any specific application, just gaudirun.py with standard DST merging options. For each stripping stream it combines the (small) DSTs of individual stripping jobs into a larger DST whose file size is optimized for storage.

Step 5: Tagger
This is a DaVinci job that, for each stripping stream, reads the merged DST and uses the stripping result stored on the DST to produce a “selection” ETC (SETC). This SETC contains the stripping result for each event on the stripped DST, and can be used by analysis jobs to select only events passing a given stripping selection in a given stripping stream.

Open questions

  • Which DaVinci version should be used?
  • Which are the options for doing this, and are they released in AppConfig?
 

-- MarcoCattaneo - 2009-09-29

Revision 1 - 2009-09-29 - MarcoCattaneo

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="LHCbComputing"

LHCb data reduction (stripping)

The LHCb computing model foresees that not all events recorded by the data acquisition are made available for physics analysis. After an initial reconstruction of all events, an event selection step (stripping) is executed - this is an analysis job that selects events based on "stripping sekections" provided by the physics groups. Only events passing one or more of these selections are made available for further analysis.

Stripping framework

The stripping framework is a framework based on DaVinci within which the stripping selections are defined. More details are available here

Stripping workflow

The sequence of steps needed to produce a stripped DST ("stripping workflow") is shown in the following picture:
stripping_workflow.JPG

MC09 stripping workflow

What follows describes the workflow used to strip MC09 data; it refers to the picture above.

In MC09, the RAW data was produced by Boole v18r1 and the RDST data by Brunel v34r7. In MC production, these datasets are not stored in separate files, they are combined in a single DST file which is the output of the Brunel production step.

Step 1: DaVinci (stripping)
This step reads the RDST from the MC09 production (taken from the MC09 DST files), it runs the default stripping and produces a “full” event tag collection (FETC): it stores the result of the stripping selection for each event on the RDST.

Program version: DaVinci v24r2p1
Options AppConfig/v3r4/options/DaVinci/DVStrippingETC-MC09.py
Database: SQLDDDB v5r11 with tags:
SIMCOND MC09-20090402-vc-md100
DDDB MC09-20090602

-- MarcoCattaneo - 2009-09-29

META FILEATTACHMENT attachment="stripping_workflow.JPG" attr="" comment="" date="1254216028" name="stripping_workflow.JPG" path="stripping workflow.JPG" size="31066" stream="stripping workflow.JPG" tmpFilename="/usr/tmp/CGItemp57897" user="cattanem" version="1"
 