HLT piquet guide: new Moore versions and TCKs

There are many interdependent parts of the online system, so it is important to follow the procedures outlined on this page whenever a new version of Moore and/or new TCKs are installed. In particular, the run coordinator, run chief, online team and OPG should all be kept up to date, and consulted about any decisions that need to be made.

Things that must be ready before a new Moore and/or TCK version can be installed and used in the Pit

  • Read the full page. If something is not clear, ask and insist that the TWiki page is improved.
  • The release of Moore(Online) must have been:
    • Tagged (1 day)
    • Tested (timing, rate and independence tests; 3 days)
    • Released and deployed (1 day)
  • The Moore(Online) release must be available on cvmfs (see the check after this list)
  • The TCKs and the TCK/HltTCK package must have been:
    • Created (1-few days depending on number of changes)
    • Checked (visual inspection of dumps) (1-2 days)
    • Tagged (1/2 day)
    • Released and deployed (1/2 day) -> Important for monitoring and alignment, see here for instructions.
  • If you need an emergency release of HltTCK or another package, contact Albert Puig [albertDOTpuigATepflDOTch]
  • Only if all these have been completed can this version of Moore be installed for use in the Pit.
    • Requires installation, update of Trigger configuration, creation of the functor cache and checkpointing. (1-2 hours)
  • The trigger configuration needs to be tested before taking data:
    • Configure and start a run from LHCb (15 minutes)
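A quick way to check the cvmfs availability point above is to list the release area. This is only a sketch: it assumes that $LHCBRELEASES points at the cvmfs release area (as in the HltTCK check further down) and that Moore(Online) releases live in the MOORE and MOOREONLINE subdirectories; adapt the paths if the layout differs.

    ls $LHCBRELEASES/MOORE/ $LHCBRELEASES/MOOREONLINE/ | grep vXrY

If the new version does not show up, wait for cvmfs to propagate before continuing.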

How to install a new version of Moore vXrY for use in the pit

  • send Rosen your ssh public key so he can add it to the hlt_oper user.
  • ssh hlt_oper@plus
  • cd /group/hlt/sattelite
  • ./installMoore.py vXrYpZ
  • Create the manifest file for the CondDB and DDDB tags (see below)
  • Empirical observation: if you get an error that data packages are not found, try again; wait a few minutes, log out and log back in, and try again.
  • If you have problems, read the manual instructions at the bottom of the page, try to understand the problem and ask.
  • Make sure the functor cache is properly created, see below for instructions. Creating the functor cache can take a considerable amount of time (45 minutes).
  • Create the trigger configuration in the run control, see below for detailed instructions.

How to create the manifest file for the CondDB and DDDB tags

  • In case the database tags need to be updated without installing a new Moore version, follow the instructions here.
  • Log in to plus as user online (ssh online@plus)
  • cd /group/online/hlt/conditions/manifest
  • Create MOORE_vXrYpZ in that folder; you can look at the previous files for hints on the syntax (see the example after this list)
  • email lhcb-hlt-piquet if unsure about which tags to use
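As a sketch of the expected syntax, following the format described in the manual instructions at the bottom of this page: one entry per line, each entry pairing an LHCBCOND tag with a DDDB tag, and only tag names that exist in /group/online/hlt/conditions/ are eligible. The tag names below are only illustrative; use the tags agreed for the new version:

    LHCBCOND_cond-20120831 : DDDB_dddb-20120831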

How to create and release a new TCK

  • First follow the instructions here to create the new TCK. This is expected to be a duty for the HLT piquets.
  • A released version of TCK/HltTCK is important for the monitoring and the reconstruction, and it is needed for offline processing. So make sure it is released before taking data. Please follow these instructions for releasing TCKs. You should discuss with Rosen, as Moore release manager, or one of the other HLT experts, before proceeding to this step.

Installing a new TCK in the production version of Moore in the pit

  • These steps are needed when only the TCK needs to be updated, not the Moore version.
  • Make sure the TCK/HltTCK package is released and appears in CVMFS at $LHCBRELEASES/DBASE/TCK/HltTCK/ (see the check after this list)
  • Run the following commands, replacing vXrY with the current MooreOnline version. Check that there are no errors in the output.
    ssh hlt_oper@plus
    cd /group/hlt/sattelite
    ./installMoore.py --manifest vXrY
    
  • In case of errors or other issues with the above, notify Rosen.
  • The functor cache should be automatically created by the installation script.
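A quick way to perform the CVMFS check mentioned above, using the same $LHCBRELEASES path; the version you expect to see is the newly released TCK/HltTCK version:

    ls $LHCBRELEASES/DBASE/TCK/HltTCK/

If the new version is not listed yet, wait for CVMFS to propagate (or notify Rosen) before running the installation command.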

How to manually create the functor cache for a new TCK

The creation of functors from Python uses a significant amount of memory. To mitigate this problem, functors can be translated to C++ code, in the so-called functor cache. Using the functor cache is important for the stable running of Hlt1 and Hlt2; without it we cannot run the optimal number of processes. Creating the functor cache can take a considerable amount of time (45 minutes).

  • To (re)create the functor cache manually, run the installation script again with the manifest option:

    installMoore.py --manifest v<latest>

  • To test the proper functioning of the functor cache:
    • Set up Moore from the satellite area, choosing the latest version:

      source /group/hlt/sattelite/MooreOnlinePit_v<latest>/InstallArea/x86_64-slc6-gcc49-opt/setupMoore.sh

    • Get an options file, for example one of those used to create and test the TCK, and add the following lines to disable the creation of functors from Python:

      from Moore import Funcs
      outputTrans = {".*Factory.*" : {"UsePython" : {"^.*$": "False"}}}
      Funcs._mergeTransform(outputTrans)

    • If the job runs through, the functor cache is working. If it crashes, you have to create the functor cache again.
    • If you want to try running HLT2 over data at the pit, also add the following to your options file (specifying your desired input file):

      # imports, in case the options file does not already provide them
      from Moore.Configuration import Moore
      from Configurables import CondDB

      Moore().Split = "Hlt2"
      Moore().UseTCK = True
      Moore().inputFiles = ["/net/hltf1111/localdisk1/hlt1/Run_0194681_20170710-014652.hltf1111.mdf"]
      Moore().DataType = "2017"

      import sys
      rd = '/group/online/hlt/conditions/RunChangeHandler'
      sys.path.append(rd)

      CondDB().UseDBSnapshot = True
      CondDB().EnableRunChangeHandler = True
      CondDB().EnableRunStampCheck = False
      CondDB().IgnoreHeartBeat = True
      CondDB().Tags['ONLINE'] = 'onl-20170512'

      from Configurables import MagneticFieldSvc
      MagneticFieldSvc().UseSetCurrent = True

      # load the run-change-handler condition map from the path added above
      import All
      CondDB().RunChangeHandlerConditions = All.ConditionMap
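To actually run the test, execute the options file inside the environment provided by setupMoore.sh. A minimal sketch, assuming the generic Gaudi runner gaudirun.py is available in that environment, with MyTestOptions.py as a placeholder name for the options file prepared above:

    gaudirun.py MyTestOptions.py

If the setup at the pit uses a different runner, use that instead; the point is only that the job must initialise and run without falling back to Python functor creation.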

Update the trigger configuration in the run control

The following procedure should only be carried out after discussion in the run meeting, with agreement from the run chief and the HLT group.
  • If all is ok, check with the run coordinator whether you can update the Physics trigger configuration for LHCb (better do it from the Pit).
  • The version of Moore and TCK is coupled to the "Trigger Configuration"
  • To change this, click on RunInfo, then Trigger Configurations, Create/Edit.
  • In the Trigger Configuration panel:
    • Select the trigger configuration that you want to update from the Trigger Configurations dropdown list on LHCb (top of left panel)
    • Check that you can see the new TCK in the list
    • In the 'Runtime Conds' box choose 'AllHlt1.py'.
    • On the left modify and save the Hlt1 configuration.
    • On the right modify and save the Hlt2 configuration. Make sure that the L0 TCK is common between Hlt1 and Hlt2 (see the note after this list).
    • Create checkpoints for both HLT1 and OnlineBrunel. HLT2 is not needed at the moment while we run it in forking mode. See below for detailed instructions.
  • Test the configuration at the earliest opportunity even if not in data taking. See below for detailed instructions.
  • Make sure that the TCK package is also updated in all other parts of the system (monitoring, online brunel, etc.)
  • Optional: test with FEST. Note that this requires preparation of files that have been processed with the correct L0 using L0App.
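A note on the L0 consistency check above: this relies on the general TCK numbering convention (not specific to this page) that the HLT TCK is a 32-bit number whose lowest 16 bits are the L0 TCK. Under that assumption, a quick shell check with purely illustrative TCK values looks like:

    # compare the lowest 16 bits (the L0 part) of the Hlt1 and Hlt2 TCKs
    printf '0x%04x 0x%04x\n' $(( 0x21291600 & 0xFFFF )) $(( 0x11291600 & 0xFFFF ))

If the two printed values agree, the Hlt1 and Hlt2 TCKs are based on the same L0 TCK.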

Instructions for Checkpointing

We are currently running without checkpointing; skip this step.

Checkpointing saves the initialised HLT, which speeds up running. This is particularly important for HLT1 and for OnlineBrunel in the reconstruction farm. You only need to create one set of checkpoints per Moore version and type of TCK: if we install a set of new TCKs without changing the Moore version or the type of TCK, then we only need to create the Moore1 checkpoint for one of them. Recreating checkpoints is also needed when the TCK is based on a new L0 part. Also, you may need to do this from a Linux machine rather than Windows.

December 3rd 2015 - Seems we should always make new checkpoints for TCKs, even if only the L0 part is different

  • Select RunInfo
  • Select Trigger configuration: Create/Edit
  • Select Checkpointing…
  • Select Moore1 for HLT1, Moore2 for HLT2, OnlineBrunel for the reconstruction farm
  • Select Create Checkpoint, do not click Ok just yet...
  • When it says finished, press Ok...
  • Then select Test Checkpoint, wait for finish, etc.
  • Then Create gzip and md5
  • Wait for finish
  • You then want to test that it works, and that the HLT runs as expected, see below.

Instructions for testing the new configuration

  • Once the checkpoint is prepared and the Trigger configuration is created, test it.
  • Talk to the run chief about testing the TCK in the run control at the earliest opportunity, even if not in data taking. It absolutely has to happen before the next physics fill.
  • Ask the shift leader to switch on the new configuration and start a run (outside of normal data taking).
  • If Hlt2 is fully running, you may need to reduce the number of Hlt1 tasks in the architecture. Otherwise the farm nodes may run out of memory.
  • This step also distributes the checkpoint among the farm nodes.
  • Look at LHCb error logger if problems occur and check that there is some Hlt1 output rate, once the run has started. If in doubt, ask.

Manual instructions for installation and debugging of installation

If the installation script fails, check the error output and fix the problem. The manual instructions below are still valid and can be used to diagnose the problem and to install anyway; the script performs the same operations in the order listed here.
  • Find the correct Online project version from the CMakeLists.txt file of the MooreOnline release you wish to install
  • Next do lb-dev --name MooreOnlinePit_vXrY --dev-dir /group/online/dataflow/SwData --dev-dir /group/online/dataflow/cmtuser MooreOnline vXrY to create the local area for this Moore version. The --name is to avoid having the directories always suffixed with "Dev" which is confusing, but you MUST remember to suffix with _vXrY explicitly.
  • export OnlineDev_DIR=/group/online/dataflow/cmtuser/OnlineDev_vXrY/InstallArea/${CMTCONFIG}, where vXrY is the same as in the CMakeLists.txt
  • cd MooreOnlinePit_vXrY
  • edit CMakeLists.txt and add "OnlineDev vXrY" to the "USE MooreOnline ..." statement at the end of the file
  • getpack -p anonymous TCK/HltTCK vXrY. Unless you know otherwise (check with Roel if unsure!) you should get the latest released version.
  • If you got the head of TCK/HltTCK, go into TCK/HltTCK and do "ln -s . v3r9999"
  • getpack -p anonymous MooreScripts vXrY. Unless you know otherwise (check with Roel if unsure!) you should get the latest released version.
  • make -j 8 install
  • ../createSetupMoore.py MooreOnlinePit vXrY > InstallArea/${CMTCONFIG}/setupMoore.sh
  • ./run bash (this starts an environment with our project which we just created)
  • cd TCK/HltTCK
  • ./scripts/createTCKmanifest config.cdb manifest
  • If there are TCKs in the released version that you do not want to be visible in the ECS control panels, edit the manifest/MOORE_vXrY file and remove their corresponding entries.
  • cp manifest/MOORE_vXrY ../../InstallArea/manifest
  • ./MooreScripts/scripts/PostInstall.py InstallArea
  • Create a file called project_versions.txt in the InstallArea directory that contains the following two lines:
         Moore vXrY
         Online vXrY
         
    Where the Moore version is the one you are installing and the online version the one you found above.
  • ln -s /group/hlt/sattelite/MooreOnlinePit_vXrY /group/hlt/MOORE/MooreOnline_vXrY
  • add a file MOORE_vXrY to /group/online/hlt/conditions/manifest with the correct LHCBCOND and DDDB tags in it. The format of the file is one entry per line, each entry in the format "LHCBCOND_cond-20120831 : DDDB_dddb-20120831". Only tag names which exist in /group/online/hlt/conditions/ are eligible to be included in this file.
  • Create checkpoints for HLT1 and OnlineBrunel.
  • Talk to the run chief about testing the TCK in the run control at the earliest opportunity, even if not in data taking.
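For orientation, here is the first part of the manual procedure collected into a single shell session, started from the satellite area as hlt_oper (as in the scripted installation above), with vXrY as a placeholder; the remaining steps (./run bash, createTCKmanifest, manifest editing, PostInstall.py, symlink and conditions manifest) follow as listed above:

    # from /group/hlt/sattelite
    lb-dev --name MooreOnlinePit_vXrY --dev-dir /group/online/dataflow/SwData --dev-dir /group/online/dataflow/cmtuser MooreOnline vXrY
    # the OnlineDev version is the Online project version found in CMakeLists.txt, not the Moore version
    export OnlineDev_DIR=/group/online/dataflow/cmtuser/OnlineDev_vXrY/InstallArea/${CMTCONFIG}
    cd MooreOnlinePit_vXrY
    # edit CMakeLists.txt here: add "OnlineDev vXrY" to the "USE MooreOnline ..." statement
    getpack -p anonymous TCK/HltTCK vXrY
    getpack -p anonymous MooreScripts vXrY
    make -j 8 install
    ../createSetupMoore.py MooreOnlinePit vXrY > InstallArea/${CMTCONFIG}/setupMoore.sh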

-- RoelAaij - 04-May-2015
