
Guidelines for LAr errors


For General How To's relevant for all Calo/Fwd, please see Calo TroubleShooting page.

For Tile only, check the Tile TroubleShooting page.

For Calibration runs, check the TroubleShootingCalibration page.

General Troubleshooting Guidelines

  • Be sure to read the Known Issues page for the most up-to-date information
  • Look for your error (DCS, TDAQ, ERS, OHP...) or "HOW TO..."
  • Follow the procedure
  • If your problem is not listed, if the fix does not work, or when in doubt: call the LAr Run Coordinator (70136)

ERS

ERROR [ERR_UNCONFIG] object 'TBB_XXXX_XXX': TBB_XXXX_XXX Error! TBB Delays different than originally configured!

If this appears at the stop of the run, please make a separate elog entry so people are informed. No need to call anybody.

WARNING RODC_XXX ers::Message [ERR_LOAD] object 'FEB_XXX_AAA_XX': not responding to SPAC

If the entire crate has this problem (all FEBs + CALIB_XXX_AAA + TBB_XXX_AAA complain) you should swap the SPAC Slave-to-Master.
  • To swap it, follow the instructions in this elog.

WARNING RODC_EMBCX rcc::ActionMessage Action 'MONITOR': FEB_EMBCX_XXX_FX reg. error +3.3V SCA or -1.7V SCA or +5V Analog [Left/Right]

  • The understanding is that these messages most probably correspond to real, particle-induced, SEUs. Where exactly?
    • An SEU within the voltage regulators themselves is "probable". Within the digital Controller ASIC "not very probable". Other sources "cannot be excluded".
  • For further reading, check this and this occurrence.

WARNING ROS::ROSRobinNPExceptions RobinNP ::clearRequest: The RobinNP could not delete 100 events because they were not in its buffer

  • If you have a large number of these messages at the start of a run, particularly if they are all for a single ROS, call the RC immediately. This usually means the ROS was not configured properly at the start of the run; a TTC restart should fix it.
    • Note: we are currently keeping a special eye out for ROS-LAR-EMECA-03.
  • If you ignore these warnings at the start of the run, they will likely be followed by:

RODC_XX rcc::Generic XX DSP(s) have no received TTC event => ROD_XX_YY_NbTtcEvents

  • May occur at the start of run.
  • In case the warning shows up every minute, call the RC immediately. The corresponding TTC partition will need to be restarted.
    • Note: we are currently keeping a special eye out for RODC_EMECA2 (related to the above ROS)
  • If the warning only appears once, it's possibly a fake alarm. Post an ELOG. Call the RC if there are any related warnings (FEB errors, DQMD, ROS warnings).
  • If this was a real alarm, it will likely be followed by:

Warning rc::HardwareError PU_XXXX RODC_XXXX

  • Uh oh! LAr has now gone busy. Call the LAr Run Coordinator IMMEDIATELY. Do a dump of the ROS/RODs status logs.
    • Post an elog with the details (see this entry for a nice example of what to include)
  • If the busy persists for over a minute the offending PUs will be stopless removed automatically (if in stable beams, flat top, squeeze or adjust), otherwise the Run Control shifter will get a popup. Run Control should NOT click "yes" for stopless removal before you have spoken to the LAr Run Coordinator.

  • Just for completeness: the messages regarding the stopless removal look like this:
    • CHIP-ATLAS  Information chip::msg::Recovery : Automatic recovery of category "Stopless" and of type "REMOVAL" for component(s) "RODC_EMECA2/PU_EMECA2_01_02" has been initiated/completed successfully

CaloReceiverGains _A or CaloReceiverGains _C Failed to load the receiver gains to XXXX crate, module XX

  • The following error message, concerning the CaloReceiverGains_A or CaloReceiverGains_C segments, might appear when a run starts, during the 'CONFIGURE' step:

    • FATAL CaloReceiverGains_A rc::UserRoutineFailed User routine: CONFIGURE failed.
      
      Failed to load the receiver gains to EMEC C crate, module 10 : 00000000
            => acknowledge signal not received after data transfer
            => length of data record not equal to the number contained in the first byte
            => data received is inconsistent with function code
      
      ERROR CaloReceiverGains_A ers::Message Something went wrong with the USB communication to the receiver crates 
            FAILED_USBCOMM
    • In that case, ask Run Control to reconfigure the affected segment until there are no more error messages (one iteration is generally enough). If it systematically fails for all 16 modules (from 6 to 21), call the Receivers expert (Carlos) (16-5196)!
    • If you cannot reach the Receivers expert, call the L1Calo on-call phone (16-5213). If they cannot be reached, call the TDAQ on-call expert (16-2772).
    • This USB communication problem is solved by sending a VMEbus SYS_RESET to the problematic crates. This can be done by anyone with the DCS:TDQ:expert role (i.e. by Receivers expert, the DCS shifter, etc). If this does not solve the problem the next (brute-force) solution is to power cycle the problematic crates. This action can be done by anyone with the DCS:TDQL1CALO:expert role (i.e. by Receivers expert, the DCS shifter, etc). More details can be found in section Troubleshooting in the L1Calo Receivers Experts Page.

QPLL Unlock

If you see QPLL unlock message(s) in ERS, for example:

  • WARNING RODC_HECFCALC1 rcc::ActionMessage Action 'MONITOR': FEB_HECC1_02L_L2 QPLL Unlocked

  • Please check DQMD and OHP for FEB errors. If the errors continue past 2 LBs, call the LAr Run Coordinator.
  • Post a separate elog entry, including:
    • The time / LB of the incident
    • Whether or not there were FEB errors [Check OHP and the DQMD History under the Data Integrity branch]
    • The ERS messages (short version)
      • Tip: you may need to press SHIFT while refreshing Konqueror to make sure you get the newest version.
      • Please wait several minutes before taking this screenshot, make sure that the time axis includes the time of the QPLL (notice 1 hr shift to UTC time)

Note: An automatic "Scac Init" is performed at every ECR (=Event Counter Reset), which happens behind the scenes every 5 seconds, and resynchronizes the Switched Capacitor Array Controller chips on the FEBs (the QPLL chips are actually resynchronized automatically). No shifter intervention is required, except to record it in your shift summary (thanks!).

Background Information: ATLAS e-News article about the ATLAS Clock

L1Calo Hot Towers

More information on L1Calo hot towers can be found here.

WARNING l1calo-trigger-monitor-app trigmon::HotTower HEC L1Calo tower 0x071c0f00 is hotter than its neighbours. Eta=-1.55 Phi=-0.15 (-16,62). Please check L1 Empty trigger rates for unusual activity (EM3,J10,Tau8 etc.)

WARNING l1calo-trigger-monitor-app trigmon::VeryHotTower HEC L1Calo tower 0x05170201 is MUCH hotter than its neighbours. Eta=-1.75 Phi=1.91 (-18,19). Please check L1 Empty trigger rates for unusual activity (EM3,J10,Tau8 etc.)

  • The LArTriggerTowerNoiseKiller will automatically begin searching for a noisy cell within the trigger tower:
    • WARNING [void addActingTT(...) at LArHotTTMon/src/LArHotTTMon.cpp:364] BEGIN SEARCH FOR NOISY CELL  in tower 0x5170103
  • If the tool finds a noisy cell it will be disabled for the remainder of the run:
    • WARNING [void checkResult(...) at LArHotTTMon/src/LArHotTTMon.cpp:496] FOUND NOISY CELL  in tower 0x5170103 . Cell with online ID: 974838016 Will remain disabled for the rest of the current run.
  • Please note the details in the shift summary:
  •      LAr Cells permanently disabled (and NOT re-enabled):
         Time     |   TT ID   | Cell ID
         00:13:34 | 0x5170103 | 974838016
         
  • If the tool fails to identify the problematic cell please interact with the trigger shifter (and make a note in the shift chronology, including the TT ID). If the noise is disruptive to data taking it will be up to the shift leader and the L1Calo on-call (165213) to decide whether to mask the tower.
  • There is no need to report these ERS messages unless the tower is also causing high trigger rates and the trigger shifter had to be contacted. If this happens then please just add a sentence to the shift chronology explaining there was a hot TT (include the ID) causing high rates and note any action taken.
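
When several hot-tower warnings arrive in one shift, it can help to pull the tower ID and coordinates out of the ERS text before noting them in the shift chronology. The following is a minimal Python sketch, not an official tool: the regular expression is modelled only on the two example messages quoted above, and the function name parse_hot_tower and the returned field names are hypothetical.

```python
import re

# Pattern modelled on the two example messages quoted above; other message
# variants may exist and would need their own handling.
HOT_TOWER_RE = re.compile(
    r"trigmon::(?P<kind>HotTower|VeryHotTower)\s+"
    r"(?P<partition>\S+)\s+L1Calo tower\s+(?P<tower>0x[0-9a-fA-F]+)"
    r".*?Eta=(?P<eta>-?\d+(?:\.\d+)?)\s+Phi=(?P<phi>-?\d+(?:\.\d+)?)"
)


def parse_hot_tower(message):
    """Return the tower details from a hot-tower ERS line, or None if no match."""
    match = HOT_TOWER_RE.search(message)
    if match is None:
        return None
    return {
        "kind": match.group("kind"),
        "partition": match.group("partition"),
        "tower": match.group("tower"),
        "eta": float(match.group("eta")),
        "phi": float(match.group("phi")),
    }


example = (
    "WARNING l1calo-trigger-monitor-app trigmon::VeryHotTower HEC L1Calo tower "
    "0x05170201 is MUCH hotter than its neighbours. Eta=-1.75 Phi=1.91 (-18,19). "
    "Please check L1 Empty trigger rates for unusual activity (EM3,J10,Tau8 etc.)"
)
print(parse_hot_tower(example))
```

The tower ID is kept as a string so that it can be pasted unchanged into the shift summary.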

L1Calo Isolated Spikes

WARNING l1calo-trigger-monitor-app trigmon::Spike HEC L1Calo tower 0x071c0f00 experienced a spike in its PPM rate. Eta=-1.55 Phi=-0.15 (-16,62). Please check L1Calo Mapping Tool and L1 Empty trigger rates (EM3,J10,TAU8 etc.) for unusual activity.

  • Please check the L1 trigger rates using the trigger rate presenter and the L1Calo map.
  • If the spikes are causing high rates and becoming problematic (persistent spikes from the same tower causing disruption to the data flow), please interact with the trigger shifter. If the spikes are disruptive to data taking it will be up to the shift leader and the L1Calo on-call (165213) to decide upon the course of action.
  • There is no need to include all ERS messages pertaining to spikes in the "ERS messages (LAr)" section of the shift summary. Instead, add a sentence in the shift chronology if you had to get in contact with the trigger shifter / LAr RC noting the trigger tower ID and the action taken. If the spikes did not cause high rates and were not problematic then there is no need to report this issue at all.

OHP Repository LArHistogramming does not exist!

  • Ask the run control desk to restart the LArHistogramming server.
  • If they do not know how to do it, instructions can be found in the corresponding "How to..." section.

PMG-AGENT_sbc-lar-rcc-PARTITION-XX.cern.ch

  • Ask the SysAdmins on call to restart the pmg agent.
  • If the SysAdmins answer is that the crate is down then AND ONLY THEN:
  • Click on the PARTITION -> ROD -> crate number
  • In the box at the top left next to the crate number that says "ON", click and select "GOTO_OFF." This will turn the crate OFF, then repeat the process to turn it ON again. If you cannot click GOTO_OFF because it is gray, click on the field with a key in the upper-left corner. You need to sign in with your Point 1 account.
  • Ask the Shift Leader to bring the full LArg partition down, and up again :
    • In the "Run Control" tab, select "LArg"
    • In the right part of the TDAQ GUI, open the "command" tab. Click "out"
    • Bring the LAr back to Shutdown and then back to the state of the other systems before putting it back in
    • Inform the Run Coordinator

TriggerMonitor

The triggerMonitor application (hosted under the l1calo segment) now provides several different types of warnings of interest to LAr (and Tile), which can be related to unusual noise in the detector or to certain misconfigurations of the trigger menu that can have an impact on LAr operations.

  • Most spikes in trigger rates are OK, but rates which spike very frequently or which are persistently noisy may require some action to be taken. Shifters should keep an eye on TRP (please look at the rates before prescaling TBP) and the L1CaloMap to investigate these warnings, and contact the LAr RC in the event of a persistent problem. See the L1Calo Mapping Tool documentation.
  • You are asked to provide a summary of trigger activity in the shift summary.

PPM trigger rates:

  • [PARTITION] L1Calo tower 0xYYYYYYYY is hotter than its neighbours....
  • [PARTITION] L1Calo tower 0xYYYYYYYY experienced a spike in its PPM rate....
    • This indicates that the Pre-Processor Module trigger rate on one of the L1Calo towers is unusually high or experienced a spike. It is important to determine whether this is a transient or permanent problem and whether this noise in the trigger tower is at high enough energy to register in the L1 global triggers. Actions for this are described above.
    • Check the 'Trigger Presenter' tool and determine whether there are any correlated high rates in the L1 empty items. The expected rates vary significantly from run to run, but on the order of 100 Hz is normal during a physics run with stable beam. If there is a sustained rate which is many times higher than this, please post a separate elog. If it is extremely high (we have occasionally seen rates as high as 100 kHz to several MHz) then you should call the run coordinator! Spikes are OK if they do not occur too frequently, but if there are large and frequent spikes from the same tower with corresponding spikes in the global L1 empty rates, please post a separate elog.
    • You should also check the Current Status page to see if there is a known problem on this tower
    • Please also keep an eye on problematic towers using the L1Calo mapping tool. See the L1Calo Mapping Tool documentation.
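
The rate guidance above can be written down as a small triage helper. This is only a hedged Python sketch: the 100 Hz and 100 kHz figures come from the text, but the text does not quantify "many times higher", so the factor of 10 used here (ELOG_FACTOR) is purely an assumption for illustration, and the function name is hypothetical.

```python
NORMAL_RATE_HZ = 100.0       # "on the order of 100 Hz" during stable beam
EXTREME_RATE_HZ = 100_000.0  # lower end of the quoted "100 kHz to several MHz"
ELOG_FACTOR = 10.0           # ASSUMPTION: stand-in for "many times higher"


def triage_l1_empty_rate(sustained_rate_hz, elog_factor=ELOG_FACTOR):
    """Suggest an action for a sustained L1 empty rate, following the text above."""
    if sustained_rate_hz >= EXTREME_RATE_HZ:
        return "call the run coordinator"
    if sustained_rate_hz >= elog_factor * NORMAL_RATE_HZ:
        return "post a separate elog"
    return "no action"


for rate in (80.0, 5_000.0, 250_000.0):
    print(rate, "->", triage_l1_empty_rate(rate))
```

Since expected rates vary significantly from run to run, any such thresholds should be treated as a starting point and not as a substitute for the shifter's judgement.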

Global Trigger rates:

  • [TRIGGER] experienced a spike in its rate ([RATE]). Please check Trigger Rate Presenter & L1Calo mapping for more information...
  • [TRIGGER] trigger rate is very high([RATE]). Please check Trigger Rate Presenter for more information...
    • If you see these warnings, check the L1Calo Mapping Tool and TRP to see whether the spike or high rate is correlated with any hot tower. If they are correlated or the rate stays high for a long time, please post a separate elog. Instructions on what to do in this situation can be found above. Note that the EF_*_LArNoiseBurst triggers should be handled differently, see below.
  • [TRIGGER] is disabled(PS=-1), please check Trigger Rate Presenter for more information...
    • If this happens during physics data taking, it means that there may be an error in the trigger configuration. Check with the Trigger shifters for additional information and call the SW on-call / LAr RC if necessary. For the moment, some messages may appear at the start of STABLE BEAMS regarding some FIRSTEMPTY items; this is expected and will be removed with the next patch.
  • [TRIGGER] experienced 5 spikes during the past 60 minutes. Please check Trigger Rate Presenter & L1Calo mapping for more information...
    • This should not happen very often. The LAr RC or SW on-call should be notified about repeated spikes, which can cause persistent prescaling of the L1 rates.

  • EF_xe45_LArNoiseBurst experienced a spike in its rate (>1Hz). Please check Trigger Rate Presenter & L1Calo mapping for more information...
  • EF_xe45_LArNoiseBurst trigger rate is very high (>1Hz). Please check Trigger Rate Presenter & L1Calo mapping for more information...
    • Occasional spikes in the LArNoiseBurst triggers are possible. But if the spikes occur often, or if you see a high rate for any LArNoiseBurst trigger, inform the LAr RC or SW on-call and post a separate elog.

TDAQ GUI - "Run Control"

LAr is Busy

Updated by Nikiforos on October 19th, 2011 according to change in ATLAS/LAR policy for start/stop of run during stable beams and after the enabling of the automatic stopless removal.

  • If we have stable beams and there is no XOFF from the ROS (i.e. underlying ROS and/or TDAQ problems) the busy PU will be automatically disabled and the Main.RunControl desk should receive a pop-up window just announcing the stopless removal. Call the LAr Run Coordinator and relay as much information as possible (PU that was disabled, etc).
  • If the Run Control shifter or Shift Leader tells you that LAr is Busy and they got a YES/NO pop-up window IN STABLE BEAMS it means that most likely there is an underlying DAQ issue. Ask them to click NO and call TDAQ on call while you call the LAr Run Coordinator.
  • If the Run Control shifter or Shift Leader tells you that LAr is Busy AND NOT IN STABLE BEAMS (or impending stable beams) :
    • Ask them NOT to click anything in the pop-up window
    • Find out which PUs (=processor unit) are busy (listed in the pop-up window at the run control desk)
    • CALL the LAr Run Coordinator immediately so that he/she can tell you what to do
    • Especially if more than 1 PU is busy, make sure that the DAQ/HLT shifter is informed of the situation so they can check if there is an XOFF from the ROS
    • You may also see related message(s) in ERS:
      • WARNING RODC_<partition>X rc::HardwareError PU_<partition>_XX_XX RODC_<partition>X
    • Be sure to mention any additional ROS or SFI messages in ERS
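
The three cases above can be summarised as a small decision function. This is an illustrative Python sketch of the written procedure only: the function name and the argument encodings ("announcement" for the pop-up that merely announces a stopless removal, "yes_no" for the YES/NO pop-up) are labels invented for this sketch, and how "impending stable beams" maps onto the stable_beams flag is left to the caller.

```python
def lar_busy_actions(stable_beams, popup):
    """Return the shifter actions for a LAr busy, following the cases above.

    stable_beams: True if the machine is in stable beams.
    popup: "announcement" (stopless removal already done) or "yes_no".
    Both argument encodings are assumptions made for this sketch.
    """
    if popup not in ("announcement", "yes_no"):
        raise ValueError(f"unknown pop-up kind: {popup!r}")

    if not stable_beams:
        # Third case in the text: nothing is clicked before the LAr RC decides.
        return [
            "ask Run Control NOT to click anything in the pop-up window",
            "find out which PUs are busy",
            "call the LAr Run Coordinator immediately",
            "make sure the DAQ/HLT shifter is informed (XOFF check)",
        ]
    if popup == "yes_no":
        # Second case: a YES/NO pop-up in stable beams hints at a DAQ issue.
        return [
            "ask Run Control to click NO",
            "ask them to call TDAQ on call",
            "call the LAr Run Coordinator",
        ]
    # First case: automatic stopless removal has already happened.
    return [
        "call the LAr Run Coordinator and relay which PU was disabled",
    ]


for step in lar_busy_actions(stable_beams=True, popup="yes_no"):
    print("-", step)
```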

  • If the super-shifter or a shadow shifter is present, please ask that person for assistance while you are performing the following. Otherwise, while you are in communication with the LAr run coordinator, please:
    • Open up the CTP Busy Monitoring Presenter (from the "Busy" button on the DAQ panel).
      • In the column "CTPOUT 13", one of the LAr partitions will be red and show 100% busy.
    • In OHP, go in the panel LAr_Expert/ROD_Crates:
      • On the 2D histograms, you see for each partition which ROD_CRATE / DSP are busy.
        • Use kSnapshot (under the General tab) to get a picture of the panel.
        • Things to know : 4 PU (Processing Unit) per ROD. 2 DSPs per PU. 1 DSP (Digital Signal Processor) = 1 FEB

  • Document what happened in a separate ELOG entry.
  • Mark the affected lumiblocks for the relevant partition in red in the DQ section of the Shift Summary.
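
As a quick cross-check of the "things to know" above (4 PUs per ROD, 2 DSPs per PU, 1 DSP per FEB), the number of FEBs behind a set of busy PUs can be worked out as follows. This is a trivial Python sketch; the constant and function names are hypothetical.

```python
PUS_PER_ROD = 4   # Processing Units per ROD
DSPS_PER_PU = 2   # Digital Signal Processors per PU
FEBS_PER_DSP = 1  # each DSP handles one FEB

FEBS_PER_ROD = PUS_PER_ROD * DSPS_PER_PU * FEBS_PER_DSP


def febs_behind(n_busy_pus):
    """Number of FEBs read out through the given number of busy PUs."""
    return n_busy_pus * DSPS_PER_PU * FEBS_PER_DSP


print("FEBs per ROD:", FEBS_PER_ROD)
print("FEBs behind 3 busy PUs:", febs_behind(3))
```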

RODC_PARTITION_X does not configure

This error happens when starting a run, during the transition from Config to Start.

  • Ask the Shift Leader to do the following :
    • In the "Run Control" tab, click on LArg (not the ROD crate itself or the RootController)
      • Take the LArg segment "out" (done on the Status tab)
      • Click "restart" for the whole LArg segment; if the error persists, try "restart" again
      • Once LArg is brought back to the same state as the rest of the systems, put it back in

LAr Monitoring segment crashed

ERROR/FATAL RODC_XXXX User routine: PREPAREFORRUN failed

When starting a run, you get the following sequence of errors:

  • ERROR RODC_EMBA2 rcc::ActionMessage Action 'PREPARE_FOR_RUN': RodDb::PrepareForRunDb : DSP 6 FebID 0x39458000 Failed to send Calibration Constants
  • FATAL RODC_EMBA2 rc::UserRoutineFailed User routine: PREPAREFORRUN failed.
    • Ask the Shift Leader to restart the whole partition object (not only the ROD crate, but the next higher category). Please make an elog entry quoting the error message and the run number!!

DCS

If something turns into WARNING or ERROR :

  • First you need to precisely track down the problem by navigating through the tree :
    • LAr -> Partition (EMBA, FCALC...)-> ROD/FEC/LVPS... -> Crate number
  • Check if this problem is already listed on the Current Status page
    • If it's listed, get back to work
    • If the problem is NOT listed, call the LAr Run Coordinator!

If anything is dead, grey, or unconnected in the FSM tree and is NOT listed in the Current Status page, call the LAr Run Coordinator.

  • Any DCS messages that are not on the Current Status page should be put in your shift summary elog under "Observed Problems"
  • To copy the message easily, right click on the alarm in DCS and click on "Insert into elog"; you can copy the message from the subject line of the window that pops up. Then click "Cancel".
  • Please also note the time of the alarm

HV Trip

In case of a HV channel trip, you will see some combination of alarms in DCS:

  • Alert 1: LAR [sys] [side] HV [sector] [mod-chan] TRIP TRUE
    Alert 2: LAR [sys] [side] HV [sector] [mod-chan] trip Autorecovery TRUE 
    Alert 3: LAR [sys] [side] HV [sector] [mod-chan] Not at Op Voltage TRUE
    

  • If Alert 1 remains on the alarm screen, please call the LAr HW on-call immediately at 70137, then the LAr RC at 70136. During a fill, s/he will likely reset the alarm and set the channel Vop to 0 (Alert 1 disappears, Alert 3 goes to CAME) and request that you call when the fill ends.
  • Alerts 2 and 3 will go to WENT state at the end of a successful trip autorecovery.

For all types of trip, please do the following:

  • Find the channel in the FSM, either by navigating through the menu or by clicking directly on the channel in the FSM screen. Make sure to select the matching module, channel number from the alert.
  • Look at the voltage and current plots in the bottom panels; for trips with autorecovery, make sure that the channel is restored to its nominal value properly.
  • If anything looks strange (e.g., the channel is drawing a lot of current, or the same channel trips a second time), call the LAr RC at 70136, then the LAr HW on-call at 70137.
  • Inform the shift leader that there was a HV trip (for their information, and in case there is a corresponding trigger spike).
  • Post a separate elog entry according to the following template (adding LAr and L1Calo as affected systems):
    SUBJECT: Run <Run Number>: LAr HV Trip: [sys] [side] HV [sector] [mod-chan] <ALARM DESCRIPTION>
    TRIP TIMESTAMP:
    RECOVERY TIMESTAMP:
    DURATION:
    LHC STATUS:
    ATLAS RUN NUMBER:
    AFFECTED LBS:
    ETA,PHI REGION:
    NOISE/TRP-SPIKE DURING RECOVERY [Y/N]:
     If Y: (please include LVL1 in systems affected and remove this line)
     list of L1_X_EMPTY triggers affected:
     relevant ERS message from l1calo-trigger-monitor-app:
    
    
  • To find the DESCRIPTION of the alarm:
    • Right click on the alarm and select "Details". Copy the text under "Description".
  • To find the correct TIMESTAMPS:
    • Right click on the Autorecovery alert, and select "Details". Under "Acknowledgement Attributes", look at the "Partner Time" for the initial timestamp of the trip.
    • The "Duration" tells you the time between the trip and the recovery to nominal (cross-check with the alarm timestamp).
    • In the case where there is no autorecovery or the channel is switched off in response to a failed autorecovery, the time at which <10 V is reached can be obtained from the voltage and current plots in the FSM.
  • To find the LHC STATUS:
  • To find the ATLAS RUN NUMBER and AFFECTED LBS (important for Data Quality!):
    • See how to look up the LB. If the trip happened when no run was in progress, enter 0 for the run and lumi-block numbers.
    • Be sure to note the LB ranges over which the module was ramping (e.g. ramping back up, or in the case of a failed autorecovery also the LB range over which it was ramping down). If an autorecovery failed then it should be noted at which LB the voltage reached 0V, and then it should also be noted that the module remained at 0V for the remainder of the run. This means you may need to post more than one elog.
  • To find the corresponding ETA,PHI REGION:
  • Either the HW on-call expert or LAr run coordinator will acknowledge all of the alarms at the end of the fill.
  • To find if there was a trigger spike during the trip recovery period (e.g. while the auto-recovery process was ramping the channel back on), follow the instructions to create plots using the trigger presenter (TRP). An example of a trigger spike is attached here. Please answer yes or no and give the time / LB. For trigger spikes please also state the affected trigger(s).
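
To avoid retyping the elog template above for every trip, the fields can be filled in programmatically. The following Python sketch is only an illustration and not an existing shifter tool: the function name, its arguments and the timestamp format are assumptions, and the values in the example call are placeholders, not real channel names. The "If Y" helper line of the template is dropped, as the template itself requests.

```python
from datetime import datetime


def format_hv_trip_elog(system, side, sector, mod_chan, alarm_description,
                        trip_time, recovery_time, lhc_status, affected_lbs,
                        eta_phi_region, run_number=0, spike_triggers=(),
                        ers_message=""):
    """Fill in the HV trip elog template quoted above.

    run_number defaults to 0, the value requested when no run is in progress.
    spike_triggers lists the affected L1_X_EMPTY triggers; empty means "N".
    """
    lines = [
        f"SUBJECT: Run {run_number}: LAr HV Trip: "
        f"{system} {side} HV {sector} {mod_chan} {alarm_description}",
        f"TRIP TIMESTAMP: {trip_time:%Y-%m-%d %H:%M:%S}",
        f"RECOVERY TIMESTAMP: {recovery_time:%Y-%m-%d %H:%M:%S}",
        f"DURATION: {recovery_time - trip_time}",
        f"LHC STATUS: {lhc_status}",
        f"ATLAS RUN NUMBER: {run_number}",
        f"AFFECTED LBS: {affected_lbs}",
        f"ETA,PHI REGION: {eta_phi_region}",
        f"NOISE/TRP-SPIKE DURING RECOVERY [Y/N]: {'Y' if spike_triggers else 'N'}",
    ]
    if spike_triggers:
        lines.append(" list of L1_X_EMPTY triggers affected: "
                     + ", ".join(spike_triggers))
        lines.append(" relevant ERS message from l1calo-trigger-monitor-app: "
                     + ers_message)
    return "\n".join(lines)


# Placeholder values only; copy the real ones from the DCS alarm details.
print(format_hv_trip_elog(
    "[sys]", "[side]", "[sector]", "[mod-chan]", "<ALARM DESCRIPTION>",
    trip_time=datetime(2011, 5, 20, 0, 13, 34),
    recovery_time=datetime(2011, 5, 20, 0, 19, 4),
    lhc_status="STABLE BEAMS", affected_lbs="<LB range>",
    eta_phi_region="<region>",
))
```

The duration is computed from the two timestamps, so it should still be cross-checked against the "Duration" shown in the alarm details.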

Background information:

  • An autorecovery procedure has been enabled for the EMB, EMEC and HEC (except hospital channels). (as of May 20th, 2011, autorecovery of EMB trips is also enabled).
  • HV trips in the EMBPS, FCal, or hospital channels will not be automatically recovered and require expert intervention.
  • If you notice any strange behavior (for example, if the channel changes state to NOT_READY/RAMPING in DCS or there is a message in ERS like ddcct_LAR_EMECC ddc::AppWarning TTC Partition is not ready for data taking without a corresponding alarm), please collect as much information as possible and post a separate elog entry.

LAR FCAL HV Current Sum

When you see an alarm in DCS with the text LAR FCAL A HV Current Sum or LAR FCAL C HV Current Sum, please submit a separate elog entry containing the following information:

  • TITLE : LAr FCal HV Current Sum Alarm
  • TIMESTAMP of the alarm
  • DESCRIPTION of the alarm
  • CAME text (HIGH or LOW)
  • MAGNETS status (ON, OFF, RAMPING UP, RAMPING DOWN,...)
  • LHC status (as described on LHC Page 1)
  • ATLAS RUN NUMBER + LB (how to look up the LB)

The alert text will be HIGH or LOW compared to thresholds of 100 μA (see elog 97499 for the full description). This is an alarm designed to spot new shorts in the FCal.

FEC with Voltage == 0

Several Front-End Crates regularly lose the connection to the ELMB (in short: ELMB = device checking the crate voltage values and sending them to DCS).

  • Symptom : all the voltage values are set to zero (graphs on the left plot and the 5 values in the boxes below)
  • What to do :
    • BEFORE calling the LAr Run Coordinator, please collect the following information: if a run is ongoing, check in OHP whether the regions corresponding to the faulty crates with presumably "zero voltage" are actually read out or not. If they are correctly sending data (i.e. no hole in the detector coverage plots), it is a link problem, not a voltage problem.
    • Call the LAr Run Coordinator

HEC LV FSM nodes go to status unknown. You will also see an alarm in the alarm panel

  • In DCS FSM, log in and navigate to FSM node HEC FCAL A (or C) and then HEC_A_LV.
  • You must restart the OPC server (within 3 minutes!) by clicking button "OPC server". Otherwise, the HV of the HEC will be dropped.
  • Also call the LAr Run Coordinator!

ArchiveBufferNumber Warnings / Errors

Alarm description: LAR ATLLARXXX ArchiveBufferNumber, Disk buffer number is...

  • What to do:
    • If it is a WARNING, check that the "Online Value" is less than 100, and the alarm direction is WENT. Then post a separate elog entry with the full alarm description (including all systems affected, e.g. ATLLARSCS or ATLLARHV2), the value, and the time of the alarm.
    • If it is an ERROR, and/or the "Online Value" is greater than 100, call the DCS General on-call phone (Sergey).
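
The two rules above can be expressed as a short triage function. This is a hedged Python sketch with hypothetical names: the text does not say what to do for a value of exactly 100 or for a WARNING that has not yet gone to WENT, so those cases fall back on the general guideline of this page (when in doubt, call the LAr Run Coordinator), which is an assumption of this sketch.

```python
def archive_buffer_action(severity, online_value, direction):
    """Suggest an action for an ArchiveBufferNumber alarm.

    severity: "WARNING" or "ERROR"; direction: "CAME" or "WENT".
    """
    if severity == "ERROR" or online_value > 100:
        return "call the DCS General on-call phone"
    if severity == "WARNING" and online_value < 100 and direction == "WENT":
        return "post a separate elog entry"
    # ASSUMPTION: cases not covered by the text are treated as "in doubt".
    return "when in doubt, call the LAr Run Coordinator"


print(archive_buffer_action("WARNING", 42, "WENT"))
print(archive_buffer_action("WARNING", 150, "CAME"))
```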

ROD goes off

Serious problem to be addressed with maximum urgency. Call LAr RC!
  • For experts :
    • If one or more ROD crates appear to be OFF in FSM, click on "GOTO_ON".
    • Then ask for a TTC restart of the affected partition.
    • Afterwards, ask for a restart of the LAr Monitoring segment.

Monitoring

Continuous FEB Mon Errors in DQMD/OHP

  • If you see continuous FEB Mon Errors in DQMD (NBins is non-zero in every event in the history plot for a given partition):
    • Call the LAr run coordinator immediately

DSP Energy errors in DQMD/OHP

  • Some additional information on the allowed energy ranges (tolerances) for the OHP plot in Data_Integrity / DSP_Physics / Eonline - Eoffline. This can be used to help diagnose errors in DQMD under Data Integrity, DSP energy.
    • E < 2^13, +/- 1MeV
    • E < 2^16, +/- 8MeV
    • E < 2^19, +/- 64MeV
    • E < 2^22, +/- 512MeV
  • Please try to correlate errors with lumi block number if possible
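
The tolerance table above can be encoded as a simple lookup, which may help when judging whether an Eonline - Eoffline difference is within the allowed range. This is a minimal Python sketch with hypothetical names. It assumes that the energy thresholds are in MeV (the text only gives units for the tolerances) and that the bands apply to the magnitude of the energy; energies of 2^22 and above are not covered by the table, so the function returns None for them.

```python
# (upper bound on |E|, tolerance in MeV), copied from the list above.
TOLERANCE_BANDS = [
    (2**13, 1),
    (2**16, 8),
    (2**19, 64),
    (2**22, 512),
]


def dsp_energy_tolerance_mev(energy):
    """Return the allowed |Eonline - Eoffline| in MeV, or None if out of range.

    ASSUMPTION: energy is in MeV and the bands apply to its absolute value.
    """
    for upper_bound, tolerance in TOLERANCE_BANDS:
        if abs(energy) < upper_bound:
            return tolerance
    return None


for e in (500, 2**13, 300_000, 2**22):
    print(e, "->", dsp_energy_tolerance_mev(e))
```

Each band is 8 times wider than the previous one and carries an 8 times larger tolerance.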

Problems with the LAr MonitoringSegment

WARNING LArMonitoringSegment:pc-tdq-mon-73 rc::ApplicationWarning Application LArMonitoring-CaloMon-L1Calo on host pc-tdq-mon-73.cern.ch exited. Exit value:139.Logs are: /local_sw/logs/tdaq-03-00-01/ATLAS/LArMonitoring-CaloMon-L1Calo_pc-tdq-mon-73.cern.ch_1310737730.out/err.

  • If you see an error such as the one above, please check that the application restarted automatically. Go to the Run Control panel in your TDAQ window, open up the LArg->LArMonitoringSegment branch tree, and check that the application in question is "Up" or "Running"
  • If the application does not restart, see "restart the LAr monitoring segment".



-- Main.VincentPascuzzi - 2017-04-03

Topic revision: r1 - 2017-04-05 - VincentPascuzzi