HLT2 piquet guide

Controlling Hlt2

Accessing LHCb2 (Hlt2 control panel)

  • One screen next to the shift leader should show LHCb2.
  • From remote, you can log in to a UI machine (ssh ui from plus); a sketch of the sequence is given after this list.
  • Run /group/online/ecs/Shortcuts315/LHCb/ECS/ECS_UI_FSM.sh .
  • Right click on LHCb_Hlt2.
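If you connect from remote, a minimal sketch of the sequence (assuming X forwarding is needed for the graphical panel; host names as in the bullets above):
ssh -Y ui                                                # from plus, log in to a UI machine
/group/online/ecs/Shortcuts315/LHCb/ECS/ECS_UI_FSM.sh    # launch the ECS UI, then right click on LHCb_Hlt2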

Default state

  • During normal physics data taking, HLT2 should run automatically.
  • If it is in NOT_READY state or deallocated, ask the shift leader why this is the case and follow up with the Run Chief if you are not convinced.
  • If you are on shift coming out of a technical stop, ask in which mode it should be.
  • In order to process incoming runs automatically, Hlt2 has to be running and runs have to be marked as "ready to be processed" by default.

Automatic marking of runs to be processed by Hlt2

  • You can check the settings on the LHC TOP panel next to the shift leader. Where it says 'MD, RAMP, ...', there is a button called "Settings" in a box called "Big Brother".
  • Click the settings button, and you see a new panel with a box called "Big Brother Automatic Actions".
  • There is a checkbox in there called 'Auto start HLT2'. If it is unchecked, runs will not be marked as "ready to be processed".

Activating Hlt2

  • Switch on the Auto Pilot
  • Alternatively, change the state to Allocate, then Configure, then Start Run.
  • Hlt2 will start processing all runs which are marked "ready to be processed".
  • If the option "Auto start HLT2" was not set in Big Brother, you have to follow the instructions below to change the status of runs recorded by Hlt1.
  • The same holds if a calibration or alignment task has failed.

Stopping Hlt2

  • In case you need to stop Hlt2 due to a problem and to avoid loss of data:
  • Send the STOP command from the Hlt2 run control and make sure the auto pilot is switched off.

Adjusting the processing configuration of runs

Sometimes you have to manipulate the status ("ready to be processed"), the alignment version, or the TCK configuration of a run that has not yet been processed by Hlt2.

Before doing that, make sure you know exactly what you are supposed to change. If you select the wrong things, data may be lost.

  • Open the FarmStatus panel by clicking on the button with the blue circle and an "i" in it, next to Farm Node status.
  • You can show legends for the color coding by clicking on the buttons at the bottom.
  • Right click on the run in the list of runs on the left side.
  • Select "View local RunDB". You can select more than one run by holding shift. A new window pops up.

Changing the status

  • If runs are not ready to be processed, you should see that the status of the runs is set to -1, which means they are not processed when Hlt2 is activated.
  • Right click on one run or select several runs, select "Set READY for Hlt2".
  • You cannot select this when one of the alignment or calibration tasks did not produce an xml file with the calibration constants for this run.

Changing alignment versions

  • If an alignment or calibration task did not produce an xml file with the calibration constants for a run, the xml file has to be selected by hand.
  • This task is usually performed by the Alignment piquet; contact them before taking any action.
  • Right click on a run in question and select "Change Alignment version". The ones where the calibration succeeded have a version number and the box on the right is ticked.
  • Check with the corresponding person which version of the xml files you should apply before doing anything.
  • In case the run is very short (< 5 minutes), take the version from the previous or next run.
  • Always write down the run numbers in the logbook, subject: "Manually applied alignment version for ".
  • More information is found here

Changing the trigger configuration

  • The Hlt2 TCK version is set when the data are processed by Hlt1. (Add here where to change the default.)
  • If a bug is discovered in the TCK, it might be necessary to change the Hlt2 TCK.
  • First you have to change the default Hlt2 trigger configuration. Follow the instructions here on how to install a new TCK and how to change the trigger configuration.
  • In the FarmStatus panel
    • select the runs to be updated
    • right click and click on "View Local RunDB"
  • Select the runs, right click and select "Change Trigger Configuration". Apply the one you just updated.
  • Make sure the TCK was updated for all selected runs!
    • reopen the last window ("View Local RunDB")
    • the HLT2 TCK is displayed in parentheses in the Params (rightmost) column.
    • sometimes not all runs have been updated! -> repeat the procedure and check again.
    • Hover over the run in the list for more information

Hlt2 Checklist

When changing the Hlt2 TCK, follow these instructions before switching on the Hlt2 Automatic mode.
  • First step, manual running:
    • Find the JIRA task which describes the changes in this TCK such that you know to which lines you should pay particular attention.
    • Look at the Hlt2 FarmStatus to find a run number with the new Hlt2 TCK. Preferably, use a long run with many files.
    • Then adapt the following commands to the current Moore version, Hlt1 output rate, Hlt2 TCK and file for a given run.
# Set up the Moore online environment (adapt the Moore version and platform to the current ones)
source /group/hlt/sattelite/MooreOnlinePit_v25r4/InstallArea/x86_64-slc6-gcc49-opt/setupMoore.sh
# Rate test: adapt --input_rate (Hlt1 output rate), --TCK (new Hlt2 TCK) and --inputdata (file from the chosen run)
python $PRCONFIGROOT/Moore_RateTest.py --Online --evtmax=10000 --input_rate=81000 --split="Hlt2" --TCK=0x21371609 --inputdata=/net/hlta0101/localdisk1/hlt1/Run_0181197_20160804-001709.hlta0101.mdf | tee log_ratetest.txt
    • Look at the log file and check whether new and changed lines accept some events; obviously this is only possible if the line has a sufficient rate. The main point is that the job does not crash and the log file shows no unexpected warnings or errors (a sketch of such a check follows below).
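A quick scan of the rate-test log can be done with grep; the patterns below are only a sketch, and Hlt2YourNewLine is a hypothetical line name, to be replaced by the lines listed in the JIRA task:
grep -iE "error|warning" log_ratetest.txt     # look for unexpected warnings or errors
grep "Hlt2YourNewLine" log_ratetest.txt       # check that a new or changed line accepts events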

  • Second step, process some runs:
    • Open an error logger. Check in the farm status which nodes are idling and filter in the error logger on their output.
    • Mark one or a few runs as ready to be processed as described above. Preferably, use a short run, so that in case of errors the data loss is small.
    • The idling nodes should start processing these runs.
    • Watch the error logger for unexpected errors and warnings.
    • Important when routing bits changed: Go to
      /daqarea/lhcb/data/2016/RAW/<stream>/LHCb/COLLISION16/<run>
      and check that files appear for the given run for every expected stream. Inspect the raw files and check that the streams are as expected (a sketch follows after this checklist).
    • Open the presenter, go to History mode, select the runs which are processing, and look at the mass plots and Hlt2 rates.
  • If the above looks ok, Hlt2 can be switched to Automatic mode. Mark the remaining runs as ready to be processed.
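For the routing-bit check mentioned above, a minimal sketch; the run number and stream names are only examples, substitute the run you marked and the streams expected for the new TCK:
run=181197
for stream in FULL TURBO TURCAL; do
  echo "== $stream =="
  ls -lh /daqarea/lhcb/data/2016/RAW/$stream/LHCb/COLLISION16/$run
done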

Troubleshooting

If HLT2 is in ERROR for a long period of time

  • Check the FarmStatus page, in particular whether many nodes are yellow (bad) or have a zero or NaN input rate (select show input rate from drop down menu on the top left of the panel). If either of these is the case, call the online piquet.
  • If not, go down into the PVSS panel for HLT2 and check how many excluded nodes or subfarms you have. The excluded nodes/subfarms which are in red are meant to be out. If you have more than a dozen nodes or any subfarms which are excluded but appear in white in this panel, try to reinclude them. If this does not work, call the online piquet.
  • If neither of these is the case, dig further into the panel and try and understand if there is one specific node which keeps going into error. If so, call the online piquet and ask them to check this node.

If you suspect a problem

If you have a reason to suspect a problem with the HLT2 processing but the system itself is not in error, you can do a few things:
  • Check the configuration from the PVSS panel, paying special attention to the TCK and database tags
  • Check the alignment and calibration versions and status for the suspect runs, checking with the alignment/calibration responsibles in case of any doubts (call in the first instance)
  • Copy an HLT1 file from a specific node's localdisk/hlt1 folder to your online home area, then run HLT2 on it by hand with the same configuration, and check the output against what you see in the DAQAREA from the processing in the farm (a sketch follows below this list).
  • Make a note of any differences and follow up immediately.
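A minimal sketch of the by-hand check; node, directory and file names are examples taken from the rate-test command above, adapt them to the run under investigation:
mkdir -p ~/hlt2_check
cp /net/hlta0101/localdisk1/hlt1/Run_0181197_20160804-001709.hlta0101.mdf ~/hlt2_check/
# run Hlt2 on the copied file with the same TCK and alignment, e.g. with Moore_RateTest.py as in the
# checklist above, pointing --inputdata to the copied file, and compare with the farm output in the DAQAREA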

To check the configuration

  • The HLT2 configuration is shown on the right hand side of the split HLT configuration panel.

Conditions

  • Check that HLT2 is using All.py
  • Check if the reported versions are correct; if in doubt, check the xml files for the versions that will be used. The files are in /group/online/hlt/conditions/LHCb/2015/RunNumber/*.xml (see the example below).
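For example, to list the condition xml files that will be used for a particular run (the run number below is only a placeholder, substitute the run you are checking):
ls -l /group/online/hlt/conditions/LHCb/2015/181197/*.xml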

Monitoring

  • If there is a problem with the top level adder, you can get more information by starting errlog on hlt02:
# log in to hlt02 as the online user (X forwarding for graphical output)
ssh -Y online@hlt02
# start the error logger, selecting the lhcb2 (Hlt2) partition
errlog -s lhcb2

Pictures

Hlt2.png Hlt2_FarmStatus.png
-- RoelAaij - 2015-06-07