Pre-requisites: You need access rights to USC55 and Cessy control room. Computer account on .cms is needed as well as sudo rights to become ecalpro/ecaldev. You must know how to set up a tunnel to P5.

ECAL trigger expert deontology: the ECAL trigger expert on call must be reachable by phone at all times (24/7). When called, he/she should try to fix the problem. He/she may be unable to use a computer for a limited amount of time (1-2 h). In that case, if the matter is urgent, he/she can try to reach another expert and get him/her to fix the problem. If that is not possible, it is his/her responsibility to get back, as soon as possible, to a situation that allows him/her to do his/her duty.


General considerations

Things to Do BEFORE your shift:

1. Get a P5 account

In order to do ANYTHING as a trigger expert you need to have access to the P5 computing cluster. Make sure you apply for an account well in time before your shift. In order to get a P5 account email Giacomo (giacomo.cucciati@cernNOSPAMPLEASE.ch) and cc Alex (alexandre.zabi@cernNOSPAMPLEASE.ch) stating that you will be Trigger Expert on call soon and need a P5 account.

Important - Please make sure you ask for sudo rights for your P5 account when you apply for it. Normally they are given to you automatically, but people are super busy at times and can forget. Without sudo rights you won't be able to edit any files and consequently won't be able to mask anything in the database.

2. Access to Webpages

In order to access certain webpages you need to run a tunneling script.

The tunnel script is here:

http://azabi.web.cern.ch/azabi/TEMP/cmstunnel.sh

Once you have the script run it with the command - ./cmstunnel.sh . You will then need to supply two passwords. First your regular CERN account password followed by your P5 password. Keep the tunnel running as long as you need to access the relevant webpages that require tunneling.

Note - You need to run the tunnel on the system where you launch your web browser. Usually that will be your laptop/PC, so you'll need to run the tunnel on your laptop (unless you are launching a web browser remotely, in which case you need to run the tunnel on the remote machine).

3. Relevant Webpages

The most important web pages are:

1: http://cms-project-ecal-p5.web.cern.ch/cms-project-ECAL-P5/ (doesn't require tunnel)

2: http://l1page.cms/main/FirstPage (requires tunnel and proxy)

3: ECAL Trigger Operations, the most up-to-date page for operations, masking histories and so on: https://twiki.cern.ch/twiki/bin/view/CMS/ECALTriggerOperation

It is suggested that you use Firefox, as it is known to cause the least fuss when accessing CMS/CERN webpages. The proxy that you need to add is here: http://azabi.web.cern.ch/azabi/TEMP/cms.pac . To add the proxy, go to Firefox->Preferences->Advanced->Network->Settings->Automatic Configuration Url: and add the above link.

Important: Firefox (newer versions) always adds a www by default at the beginning of a web address. You need to get rid of that. Otherwise the second link might not work. Instructions on how you can do that can be found at the following stack exchange page : http://superuser.com/questions/584305/firefox-says-server-not-found-because-its-adding-www-how-do-i-stop-this.

4. Access and Training

You should apply for access to the CMS control room at P5 prior to your shift. You can ask for access here: https://edh.cern.ch/Desktop/dir.jsp

Go to ACCESS REQUEST and ADD .

You also need to complete certain training courses in order to be able to be given access to P5. For the CMS control room at P5, (as of now) you need to pass the following (online) courses:

1. CERN Safety Introduction

2. Building Safety (Level 2) a.k.a. Beam Facilities

3. CMS (Level 4 C)

You can take the courses here: https://espace.cern.ch/info-safetytraining-official/safety_training_for_me/Pages/default.aspx

A full catalogue of what courses you need for any particular access can be found here: https://apex-sso.cern.ch/pls/htmldb_edmsdb/f?p=273:1:17263051253359::::: . Go to my accesses.

Phone numbers/experts list

Who to call?

Please always try the trigger virtual phone first: 7-0133. If there is no answer, find out who is on call and use the phone numbers below to call him/her directly. If he/she doesn't answer (which must not happen), call any of the experts below.

Trigger experts so far (by order of appearance): Pascal Paganini (16-5719), Alex Zabi (16-0737), Sean Lynch (16-0065), David Petyt (16-9326), Dmitri Konstantinov (16-9359), Matt Ryan (16-8778) and Jean Fay (16-9612)

Redirecting phone number of ECAL Trigger Expert on-call

From a fixed CERN landline:

* To cancel the deviation: *7 70133 0101 #1

* To redirect to your phone: *7 70133 0101 *1 16xxxx (xxxx your phone)

* Rerouting won't work unless you use a CERN landline for it. You also need to have a CERN cell phone. While you dial the sequence above, the automatic voice on the phone will keep saying that you are dialing a wrong number. Ignore the voice and keep dialing to the end of the sequence, at which point you will hear "votre manœuvre a été enregistrée". That means it's done.

* The usual ECAL trigger phone number is 5719.

Very useful links

Geometry/mapping of ECAL trigger towers

You will have to deal with TCC number, FED id, crystal number in CCUid etc. Following an idea from David Petyt, I've created a root tree containing all this information. You can get it here: http://fay.web.cern.ch/fay/TRIGGER/EcalTPGParam.root

Useful commands are (in root):

To know which FED/TCC/tower a crystal (ieta=1, iphi=15) in EB belongs to:

tpgmap->Scan("fed:tcc:tower","ieta==1 && iphi==15")

To know which FED/TCC/tower/CCU/VFE a crystal (ix=50, iy=15, iz>0) in EE belongs to, and to get its number within the CCU (useful for the crystal masking scheme):

tpgmap->Scan("fed:tcc:tower:CCU:VFE:xtalInCCU","ix==50 && iy==15 && iz>0")

To know which TCC controls which part of the detector and where the board is installed:

tpgmap->Scan("det:crate:TCCslot","tcc==1")

Want to know all useful info about a given tower (ietaTT=-7, iphiTT=55)? Type for instance:

tpgmap->Scan("det:fed:CCU:tcc:tower:ietaTT:iphiTT:TCCch:TCCslot:SLBch:SLBslot:crate","ietaTT==-7 && iphiTT==55")

Want to know the id in the oracle DB for a given crystal and its correspondence in CMSSW? (useful if you check the DB content with https://cmsdaq0.cern.ch/cmswbm/cmsdb/servlet/ECALTriggerData?CONF_ID=108 for example)

tpgmap->Scan("dbId:cmsswId","ix==50 && iy==15 && iz>0", "colsize=10")

Here is a brief description of the branches of this root TTree:

  • fed: fed nb
  • tcc: tcc nb
  • tower: tower nb within the TCC [EB:1 to 68, EE:1 to 28]
  • stripInTower: strip nb within the tower [1 to 5]
  • xtalInStrip: crystal nb within the strip [1 to 5]
  • CCU: CCU nb
  • VFE: VFE nb [1 to 5]
  • xtalInVFE: crystal nb within the VFE [1 to 5]
  • xtalInCCU: crystal nb within the CCU [0 to 24]

  • ieta: crystal eta index
  • iphi: crystal phi index
  • ix: crystal x index
  • iy: crystal y index
  • iz: z side
  • hashedId: crystal hashed index
  • ic: crystal nb within the SM/sector
  • cmsswId: crystal DetId.rawId()
  • dbId: crystal EcalLogicId.getLogicID() (format used in DB)
  • ietaTT: tower eta index
  • iphiTT: tower phi index

  • TCCch: channel input of the TCC
  • TCCslot: TCC slot in VME crate
  • SLBch: channel input of the SLB [1-8]
  • SLBslot: SLB slot on the TCC
  • ietaGCT: GCT eta index
  • iphiGCT: GCT phi index
  • det: SM or EE sector nb (ex EB+2)
  • crate: VME crate name

Some further useful info and pictures are listed below. In the barrel, there is 1 TCC per FED. A barrel TCC has 68 trigger towers. In the endcaps there are 4 TCCs per FED: 2 in the outer region, 2 in the inner. A TCC covering the outer region has 16 trigger towers. In the inner region, a TCC has 28 trigger towers.

TCC Numbering, FED id etc:
TCCGeom.png

Trigger towers in EB :
EB.png

Trigger towers in EE- :
EEMinus.png

Trigger towers in EE+ :
EEPlus.png

The 2 innermost eta rings of EE

The last 2 eta rings in EE (ieta=27-28) are special: the towers are duplicated so that ECAL still sends 4 towers per 20-degree sector per eta ring to the Regional Calorimeter Trigger (as is the case for all 20-degree sectors, even in the barrel, where 20 degrees correspond to 1 SM). At the detector level, there are so few crystals in this large-eta region that it was impossible to build 4 towers. Therefore, to fulfill the RCT constraints, the towers are artificially doubled in the TCC (we're talking about virtual towers in trigger jargon here). However, in order to conserve the total energy, the ET assigned to each member of a pair of duplicated towers is simply divided by 2. Consequence: if you want your rechits to match the TP, multiply the TP by 2! Beware that by looping over crystals and getting the trigger tower, you find only 1 member of each pair of duplicated towers; you miss the second one. This is the case in the pictures above, where we looped over the crystals in order to draw the boundaries of the towers.
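The factor of 2 for the duplicated towers can be sketched in a few lines of Python. This is only an illustration: the function name and the use of a signed ietaTT are assumptions, not part of any ECAL software.

```python
# Hypothetical helper, for illustration only.
DUPLICATED_ETA_RINGS = (27, 28)  # the two EE eta rings with duplicated towers

def tp_et_for_rechit_comparison(tp_et, ieta_tt):
    """Return the TP ET to compare with the summed rechit ET of a tower.

    In the duplicated rings each virtual tower carries half of the energy,
    so the TP is multiplied by 2 before comparing it with the rechits.
    """
    if abs(ieta_tt) in DUPLICATED_ETA_RINGS:
        return 2.0 * tp_et
    return tp_et
```

For example, a TP of 5 GeV read in ring 27 or 28 should be compared with 10 GeV of rechit energy, while a TP in any other ring is compared as is.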

Geometry of RCT/GCT

That's here...

Connection of fibers to TCC-EE

Since it's a real mess, it's better to look at this drawing:

TCCEE-fibre-connection.png

Anyway, I don't expect "standard experts" to touch these fibers.

The famous trigger rules

These rules are there to avoid the overflow of the buffers of the different front-end detectors. The trigger rules are:

  • No more than 1 Level 1 Accept per 75 ns (minimum 2 bx between L1A), dead time 5×10^-3.
  • No more than 2 Level 1 Accepts per 625 ns (25 bx), dead time 1.3×10^-3.
  • No more than 3 Level 1 Accepts per 2.5 µs (100 bx), dead time 1.2×10^-3.
  • No more than 4 Level 1 Accepts per 6 µs (240 bx), dead time 1.4×10^-3.
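
To make the rules concrete, here is a minimal Python sketch that checks a list of L1A bunch-crossing numbers against the four windows above. It is not the trigger firmware logic; the function name is hypothetical, and it assumes that "75 ns" means a window of 3 bx (so two consecutive L1As must be at least 3 bx apart).

```python
# (max accepts, window length in bx), taken from the four rules above.
# The 3 bx window for the first rule (75 ns) is an assumption of this sketch.
TRIGGER_RULES = [(1, 3), (2, 25), (3, 100), (4, 240)]

def find_trigger_rule_violations(l1a_bx):
    """Return a list of (max_accepts, window_bx, first_bx, last_bx) tuples.

    l1a_bx: absolute bunch-crossing numbers at which an L1A was issued.
    A rule is broken when max_accepts + 1 accepts fit in one window.
    """
    bx = sorted(l1a_bx)
    violations = []
    for max_accepts, window in TRIGGER_RULES:
        for i in range(len(bx) - max_accepts):
            first, last = bx[i], bx[i + max_accepts]
            if last - first < window:
                violations.append((max_accepts, window, first, last))
    return violations
```

With this reading, L1As at bx 0, 3, 30 and 110 respect all rules, while L1As at bx 0, 3 and 10 break the second rule (3 accepts within 25 bx).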

Most common problems/requests during runs

It's interesting to check whether one of the experts has already encountered the problem and how he/she managed to fix it. Check this page and add your own contributions there. By the way, *if you have a TCC with a problem or a TT suddenly becoming noisy, always check the page: http://cms-project-ecal-p5.web.cern.ch/cms-project-ECAL-P5/query/main_query.htm (for the password ask me or Nicolo, but it's easy to guess). You will see if it's a known case. Don't forget to add your contribution if needed.*

A TCC has an error at configuration level

I consider that the Trigger Expert must be able to cope with such problems since the TCC is the ECAL Trigger board. You will find an error message in the ECAL Function Manager.

The infamous power-up problem: "Fatal error in configuring [...]" (Note 19.05.2015 Alex: this is outdated)

There is a known bug in the TCC-EB: sometimes, after a power cycle of the crates, their FPGAs are not properly configured. In this case there is an error when ECAL is being configured: almost all registers don't have the expected value (click on the TCC error message in the ECAL Function Manager to see the error message). The error message can be weird, like "Fatal error in configuring etc". You can confirm this by starting the TCC gui (see instructions below): the whole left panel (showing the link status) will be grey instead of green or red. The only solution is to reboot the crate.

The QPLL lock stability: "Error in TCCBarrelSupervisor: hardware checking failed TCC [...]"

Before any state transition, the QPLL is checked and must be locked. If the QPLL has been unlocked, you can get a message like this one (click on the TCC error message in the ECAL Function Manager to see it):

Error in TCCBarrelSupervisor: hardware checking failed TCC in slot:5, serialNumber:TCC68-V1.2-0019 QPLL NOT locked TTCRX ready SRP ON QPLL had been NOT locked or with errors TTCRX stay ready DCM locked Phase beetwen Clock and TTCrx set.

A QPLL error means that the clock is not stable. The TCC barrel and the TCC endcap differ in that respect: for historical reasons (and by mistake), the main clock of the TCC barrel is the one coming from (TTCci-)RCT through the following long chain: Machine interface->RCT TTCvi->RCT TTCfanouts->ECAL TTCfanout->TCC EB. The main clock of the TCC endcap, on the contrary, is the expected one, delivered by (TTCci-)ECAL: Machine interface->ECAL TTCci->TCC-EE. Hence, when the QPLL is not locked in all boards of the barrel, the RCT clock is not arriving at the TCCs. Any component of the chain can be responsible, and usually you have to check with the trigger shifter whether RCT or GCT is playing with their clock. If it happens for a TCC-EE, the problem is due to our TTCci.

Wrong SLB latency "Error in TCCBarrelSupervisor::checkSLBLatency() [...]"

The SLBs are mezzanine boards on the TCC EB or EE used as an interface with RCT. Basically they act as a FIFO, holding the trigger primitives for a given time so that all trigger primitives from both ECAL (EB and EE) and HCAL (HF HB HE HO) are aligned at the input of RCT. This time can be measured, and this is what we call the SLB latency. The SLB measures it by counting the number of clocks between the BC0 received via ECAL (called TX-BC0) and the BC0 received via RCT (called RX-BC0). If the latency is not the one expected, the TCC issues an exception with this kind of message: slbId: 1 syncId: 1 latency measured/expected: 6/8. It's a major problem, since it means that the trigger provided by this board won't be in time with the other triggers. There are 2 possible scenarios: 1) All SLBs report the error. In that case, clearly one of the 2 BC0s is wrong. It can be due to the ECAL TTCci or the RCT TTCci; check with the ECAL DAQ expert or the RCT expert. 2) Only the SLBs of one TCC suffer from this problem. First try to reconfigure. If it persists, I know only 1 way to fix it: reboot the crate of that TCC (call the ECAL Run Field Manager). You can check the SLB latency yourself using the script showSLBLalency.sh as described below.
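
When an exception contains many such lines, a few lines of Python can help to list which SLBs are off. This is a minimal sketch that assumes the messages follow exactly the format quoted above ("slbId: 1 syncId: 1 latency measured/expected: 6/8"); the function names are hypothetical and not part of the TCC software.

```python
import re

# Pattern for the message format quoted in the text above.
LATENCY_RE = re.compile(
    r"slbId:\s*(\d+)\s+syncId:\s*(\d+)\s+latency measured/expected:\s*(\d+)/(\d+)"
)

def parse_slb_latency(message):
    """Return one dict per SLB latency report found in the text."""
    reports = []
    for slb, sync, measured, expected in LATENCY_RE.findall(message):
        reports.append({
            "slbId": int(slb),
            "syncId": int(sync),
            "measured": int(measured),
            "expected": int(expected),
        })
    return reports

def wrong_latencies(reports):
    """Keep only the reports where measured and expected latency differ."""
    return [r for r in reports if r["measured"] != r["expected"]]
```

If the wrong latencies cover all SLBs, you are in scenario 1 (a BC0 problem); if they are confined to one TCC, you are in scenario 2.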

ECAL trigger rate is too high!

This is the most common reason for being called by ECAL or Trigger shifters. Here is what we have to do:

Expected ECAL trigger rate

2011:

During collisions (starting 11 March 2011) with a luminosity of 300e30 cm-2s-1 (April 2011). L1_SingleEG2 does not exist anymore. The rates can be looked at on this page (you need a tunnel):

http://cmswbm.cms/cmsdb/servlet/L1Summary?RUN=163174

* L1_SingleEG5 (bit 47 prescale 50) 1,100 kHz

* L1_SingleEG12 (bit 50) 2.5 kHz

* L1_SingleEG15 (bit 51) 1.3 kHz

* L1_SingleEG20 (bit 52) 0.5 kHz

* L1_SingleEG30 (bit 53) 0.15 kHz

During Cosmics runs:

* L1_SingleEG5 (bit 50) 0.9 Hz

2010:

When noise is stable, without collision here is what we expect:

  • L1_SingleEG2 (bit 46): rather stable, expected rate 10-15 Hz
  • L1_SingleEG5 (bit 47): very stable, expected rate below 1-5 Hz

When there are beams, these rates can be much higher, depending obviously on the luminosity. With the current luminosity (May 25, 2E29 cm-2 s-1) we have:

  • L1_SingleEG2 (bit 46): 900 Hz
  • L1_SingleEG5 (bit 47): 70 Hz
  • L1_SingleEG20 (bit 47): 3 Hz

Always check these rates against the rate of algo bit 124 (minimum bias). The EG rates should scale with bit 124. So far we have:

  • L1_BscMinBiasOR_BptxPlusORMinus (bit 124): ~7 kHz.

Check ECAL trigger rate

yes, sometimes "they" think it's ECAL but they're wrong!

  • New: to check trigger rates, connect to https://cmswbm.web.cern.ch/cmswbm/cmsdb/servlet/TriggerRates. It's easier than the method described below (however still valid and can be useful to know)
  • Let's first check that ECAL is IN the trigger. Go to the Trigger online web page (P5 tunnel needed): http://l1page.cms/main/FirstPage and click on the key used by GT (the link usually starts with TSC_xxxx). Look at the RCT key name and check the meaning of this key in the trigger wiki (there is a link in the right panel of http://l1page.cms/main/FirstPage). At least EB or EE must be involved.
  • Now, check the L1-algos used: still go to http://l1page.cms/main/FirstPage and click on the GT Cell. In the trigger Supervisor menu on the left, click on Control Panels and GT Trigger Monitor. Check the bit corresponding to ECAL algo. Until recently bit 45 (ie ECAL L1 single EG1, meaning 1 single Electromagnetic candidate above 1 GeV) was used. Now (Nov 2009), the bottom line is bit 46 (Single_EG2) . This bit must appear in the table. If not, ECAL is not involved in the trigger decision. Warning: with splashes only EG5 is used: bit 47.
  • Now you're at the point where you can read the trigger rate coming from ECAL. Check the ECAL bit rate. This rate is meaningful only when the Trigger is in Running status. You can also check rates for higher trigger thresholds like L1 single EG5, EG10 (bit >46) etc. In order to see them, deselect the compact button. The matching between trigger bits and trigger names is given by the so-called Trigger Menus, which are listed here: https://twiki.cern.ch/twiki/bin/view/CMS/GlobalTriggerAvailableMenus. For instance, in CRAFT09, we used https://twiki.cern.ch/twiki/bin/view/CMS/GlobalTriggerMenu_L1Menu_Commissioning2009_v3. For beam, it evolves too fast to follow! Look at: GlobalTriggerAvailableMenus
  • What rates should we expect for ECAL triggers with bit 46? If only EB is involved, we usually see a rate of a few Hz. If only EE, a few Hz as well. If both, you sum up the previous numbers, i.e. around 10 Hz. I personally consider that there is a problem if the rate is at least an order of magnitude above the expectations. Also check whether there is dead time by looking at the dead time counter values in the same window (bottom of the page). Usual numbers are around 2-3%. Above that, there may be a problem, such as L1A signals coming too close together and therefore being suppressed by the trigger rules.
  • Other useful information is the trigger rate history (this is continuously monitored by the Trigger Shifter). Still go back to http://l1page.cms/main/FirstPage and click on the top on Trigger Rate Plot. You see the overall L1 Trigger rate as function of time. Usually we're mainly interested in the contribution from ECAL. Go to http://cmswbm.cms/cmsdb/servlet/L1Summary?RUN= and provide the ongoing run number. Scroll down until you see Single EG1 and click on the rate number. Any drop/increase/spikes indicates a problem.
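
The two rules of thumb above (a rate at least an order of magnitude above expectation, a dead time above the usual 2-3%) can be written down as a tiny Python sketch. The function names and default cuts are just a transcription of the text for illustration, not an official criterion.

```python
def rate_is_suspicious(observed_hz, expected_hz, factor=10.0):
    """True if the observed rate is at least `factor` times the expected one."""
    return observed_hz >= factor * expected_hz

def dead_time_is_suspicious(dead_time_fraction, usual_max=0.03):
    """True if the dead time is above the usual 2-3% quoted in the text."""
    return dead_time_fraction > usual_max
```

For instance, with bit 46 expected around 10 Hz (EB and EE both in), an observed 150 Hz would be flagged, while 30 Hz would not.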

Fix high ECAL trigger rate

Ok, let's assume we see a huge rate of several kHz (it can come in spikes). If ECAL is the source of the problem, it must be related to noisy towers.

  • Let's check the Trigger Tower rates (monitored at 40 MHz by the TCC boards). Go to the so-called Non-Event Data (XMAS) monitoring: http://ecalod-web01/xmas.html (still using an ssh tunnel) and click on Trg. You get the map of the rates, the color (Z scale) being the rate in Hz. Set the threshold to 8 ADC, i.e. 2 GeV (the Trigger Primitive LSB is 250 MeV). There are 2 situations:
    • the calibration sequence is off (so no light is flashed during the abort gap): the rate you see is therefore directly related to the physics L1 trigger rate. If you see a tower having several kHz, clearly it's noisy and has to be masked. To mask it, please follow these instructions
    • the calibration sequence is on: the rate you see is dominated by the light flashed in the orbit gap. There is no way to identify a noisy tower, its level being much lower than the laser/led/test pulse activity. Since the calibration cycles over the whole detector, there are times when nothing is happening in some regions of the detector, and you can see the noisy towers there. But it's not easy. So the best is to call the Trigger shifter (75257) and ask him/her to switch off the calibration sequence temporarily (for 5 minutes). The plot will then be much cleaner.
  • An alternative and complementary approach is to check the Trigger Primitive occupancy in the DQM: https://cmsweb.cern.ch/dqm/online/session/. Choose in turn the workspaces EcalBarrel and EcalEndcap, click on Layout, choose Shift view; one of the plots displayed is the TPG occupancy. Beware that this plot is only filled when there is an L1A. In this sense, it gives poorer (and biased) information than the plots in http://ecalod-web01/xmas.html
  • Now imagine that in spite of a large bit 46 trigger rate, you don't find any noisy trigger towers: this excludes ECAL detector. The problem can be in between ECAL and RCT (a flaky link for instance) or RCT-GCT or GCT-GT. The best is to call the RCT expert (click on Contacts on the upper part of page http://l1page.cms/main/FirstPage).
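
As an illustration of the first check above (calibration sequence off), here is a minimal Python sketch that converts the ADC threshold to GeV using the 250 MeV LSB and picks out hot towers from a rate map. The dictionary layout, the function names and the 1 kHz default cut are assumptions made for this example ("several kHz" in the text); this is not how the XMAS page works internally.

```python
TP_LSB_GEV = 0.25  # Trigger Primitive LSB quoted in the text (250 MeV)

def adc_to_gev(adc_counts):
    """Convert a Trigger Primitive threshold from ADC counts to GeV."""
    return adc_counts * TP_LSB_GEV

def noisy_towers(rates_hz, min_rate_hz=1000.0):
    """Return (tower, rate) pairs at or above min_rate_hz, highest rate first.

    rates_hz: dict mapping a tower label, e.g. (tcc, tower), to its rate in Hz.
    """
    hot = [(tower, rate) for tower, rate in rates_hz.items() if rate >= min_rate_hz]
    return sorted(hot, key=lambda item: item[1], reverse=True)
```

A threshold of 8 ADC gives adc_to_gev(8) = 2.0 GeV, as stated above; the towers returned by noisy_towers are the candidates for masking.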

Masking a noisy trigger tower/strip/crystal

The recipe given below adds the noisy tower/crystal to the list stored in the DB. The mask will be applied permanently only after reconfiguring ECAL. If you want to mask a tower immediately you can use this instruction. But don't forget to mask it in the DB as well otherwise, at the next run (ECAL being reconfigured) your temporary mask will be lost.

Some detailed instructions in the slides prepared by Alex Zabi: ECALTriggerShifter.pdf ECALTriggerShifter.pptx

Important:

Before executing any of the following recipes: Make sure you add the following line to your .bashrc file in your user space. After you have added the line, you need to logout and login again.

source ~ecaldev/utils/bashrc

Also, please make sure you edit the .xml file as ecalpro, then log out of ecalpro and run the update script as yourself, as outlined below. Otherwise, it won't work.

Case of trigger towers

We NOW use ONLY DBGui masking at the Trigger Tower level, until we fix the problem with masking by script. The same instructions also exist in the "ECALTriggerOperation" Twiki: https://twiki.cern.ch/twiki/bin/view/CMS/ECALTriggerOperation

To Mask Trigger Tower Using DBGui

We are currently developing the DBGui to be as user-friendly as possible. The following instructions will be updated once we migrate our testing version to the production sandbox. You need to launch the server test script ("To Launch the server_test from Sandbox_newTCC") to access the testing (5001) version of the DBGui.

  • Tunnel into P5 (run cmstunnel.sh script)
  • Go to DBGui page following the links below -- 5000 for the current production version, 5001 for the testing version

http://srv-s2f19-30-01.cms:5000/static/index.html#/login
http://srv-s2f19-30-01.cms:5001/static/index.html#/login

  • In DB String box, go to ECAL@P5
  • Enter DB password (upon request, ask Alex/Tutanon)
  • Go to FE Trigger, click on Bad Trigger Towers
  • Go to second page, and click on edit for BEAMV6_TRANS_SPIKEKILL
  • This will bring you to the page with ECAL geometry both EE and EB as shown below

DBGuiMasking.png

  • Double click on the "bad" trigger tower in EB or EE, this will bring you to the new page with TCC/TT numbers (for EB) and FED/TCC/TT numbers (for EE)
  • You can also check the currently masked TTs by clicking on the Show All button. Note that if you have opened the list of currently masked TTs, you will have to go back to the previous page to mask the bad TT
  • Click on Add to list
  • Click NEXT
  • Check ONLY the update box corresponding to the run key you just modified, OTHERWISE YOU WILL MODIFY EVERY KEY!!
  • Click Validate
  • Uncheck the Dry Run box and click on Commit
  • You will see Commit Successfully
  • Write on ELOG the new Conf_ID and which TTs are masked
  • Request ECAL DoC to request a "Green Recycle"

To Mask Trigger Tower Using Script

  • connect to cms node and as ecalpro edit ~/DAQ/RunTime/GREN/config/db/Bad_Trigger_tower.xml

In this file, add a set of lines similar to:

<Row><Parameter column="FED_ID" type="unsigned int">###</Parameter>

<Parameter column="TT_ID" type="unsigned int">###</Parameter>

<Parameter column="TCC_ID" type="unsigned int">###</Parameter></Row>

Note that you do not need to specify TCC_ID for EB but you DO need to specify TCC_ID for EE.
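
To avoid typos when writing these rows by hand, one could generate them with a short Python snippet. This is only a sketch: the function name is hypothetical, the numbers in the example are purely illustrative, and it assumes that one tower corresponds to one <Row> holding the FED_ID, TT_ID and (for EE) TCC_ID parameters.

```python
import xml.etree.ElementTree as ET

def make_tower_mask_row(fed_id, tt_id, tcc_id=None):
    """Build one <Row> for Bad_Trigger_tower.xml and return it as a string.

    tcc_id is left out when None (EB case) and written when given (EE case).
    """
    row = ET.Element("Row")
    fields = [("FED_ID", fed_id), ("TT_ID", tt_id)]
    if tcc_id is not None:
        fields.append(("TCC_ID", tcc_id))
    for column, value in fields:
        param = ET.SubElement(
            row, "Parameter", {"column": column, "type": "unsigned int"}
        )
        param.text = str(value)
    return ET.tostring(row, encoding="unicode")
```

For example, make_tower_mask_row(610, 12) returns a single-line row with FED_ID 610 and TT_ID 12, which can be pasted into the file as ecalpro.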

  • Then as yourself run the command:

updateTrgTowerMasks.sh -k BEAMV6_TRANS_SPIKEKILL -d ~ecalpro/DAQ/RunTime/GREN/config/db/Bad_Trigger_tower.xml

At the end of the program, you can see a new ConfID being generated which you can use to check the DB online:

https://cmsdaq0.cern.ch/cmswbm/cmsdb/servlet/ECALTriggerData?CONF_ID=580

  • Write an elog. Every time you mask a tower you need to add it to the following page:

http://cms-project-ecal-p5.web.cern.ch/cms-project-ECAL-P5/query/main_query.asp

If you need to apply a mask on the fly, it's explained here.

Case of crystals mask
  • connect to cms node and as ecalpro edit ~/DAQ/RunTime/GREN/config/db/Bad_Trigger_xtal.xml

In this file, add a set of lines similar to:

<Row><Parameter column="FED_ID" type="unsigned int">###</Parameter>

<Parameter column="TT_ID" type="unsigned int">###</Parameter>

<Parameter column="TCC_ID" type="unsigned int">###</Parameter>

<Parameter column="CRY_ID" type="unsigned int">###</Parameter>

<Parameter column="STATUS" type="unsigned int">1</Parameter></Row>

NB: In the TT_ID column, put the CCU_ID instead of the TT_ID! You can find the CRY_ID in the EcalTPGParam.root file, where it is named xtalInCCU.

  • Then as yourself run the command:

updateTrgXtalsMasks.sh -k BEAMV6_TRANS_SPIKEKILL -d ~ecalpro/DAQ/RunTime/GREN/config/db/Bad_Trigger_xtal.xml

  • Write an elog.

Case of strip mask

  • connect to cms node and as ecalpro edit ~/DAQ/RunTime/GREN/config/db/StripConf.xml
In this file, add a set of lines similar to:

<Row><Parameter column="FED_ID" type="unsigned int">###</Parameter>

<Parameter column="TT_ID" type="unsigned int">###</Parameter>

<Parameter column="TCC_ID" type="unsigned int">###</Parameter>

<Parameter column="ST_ID" type="unsigned int">###</Parameter>

<Parameter column="STATUS" type="unsigned int">1</Parameter></Row>

You can find the ST_ID in the EcalTPGParam.root file, where it is named stripInTower.

  • Then as yourself run the command:

updateTrgStripMasks.sh -k BEAMV6_TRANS_SPIKEKILL -d ~ecalpro/DAQ/RunTime/GREN/config/db/StripConf.xml

  • Write an elog.

SLB errors are reported by the Non-Event Data Monitoring (a.k.a XMAS)

Go to the Non-Event Data monitoring: http://ecalod-web01/ecalXMAS/ (still using a ssh tunnel) and click on the tab slb. Scary hein? Yes, we usually see errors like this:
slb.png

Well, there is a known bug (one more!) in the TCC-EB where the SLBs nb 2, 5 and 8 have TX-errors as soon as the run is long enough... So far, we haven't found a way to fix this. Analyzing the data offline doesn't indicate that these errors are an issue (so far...). So, if you see such errors, simply ignore them! On the contrary, if you see an error in the SLBs of the TCC-EE or in the SLBs 1, 3, 4, 6, 7, 9 of the TCC-EB, something is going wrong. Please check the SLB latency for these boards. Follow these instructions. The expected latencies in EB and EE are given in ECALTriggerSettings (click on SLB). If the latency seems crazy, it's probably better to stop the run and re-configure. Then cross your fingers...
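
The triage rule above fits in a few lines of Python. This is only a sketch with hypothetical names; treating a non-TX error on SLBs 2, 5 and 8 as something to follow up is an assumption of this example, since the text only covers TX-errors.

```python
# SLBs of the TCC-EB with the known, harmless TX-error bug described above.
KNOWN_TX_ERROR_SLBS_EB = {2, 5, 8}

def slb_error_needs_action(detector, slb_number, error_type="TX"):
    """Return False only for the known TX-errors on TCC-EB SLBs 2, 5 and 8.

    detector: "EB" or "EE". Any other combination should be followed up
    by checking the SLB latency of the board.
    """
    known_bug = (
        detector == "EB"
        and slb_number in KNOWN_TX_ERROR_SLBS_EB
        and error_type == "TX"
    )
    return not known_bug
```

So a TX-error on SLB 5 of a TCC-EB can be ignored, while an error on SLB 3 of a TCC-EB, or on any SLB of a TCC-EE, calls for a latency check.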

Checking DQM

Reminder about DQM

Looking to TP quantities
  • in the Workspace tab, select the detector you want to monitor (EB or EE)
  • to see TP occupancy:
  • to see TP timing comparison with emulator:
    • 01 ECAL Shift > 05-Trigger + Selective Readout
    • Plot 03 Trigger Most Frequent Timing

Looking to L1 EM candidates quantities
  • use CMS DQM and select workspace L1T
  • Plots showing the occupancy of iso and non-iso EM candidates and the bx distribution are there.

Loading new ECAL trigger configuration (this cannot be done by trigger expert on call and it is outdated anyway Alex 19.05.2015)

Now that trigger parameters are in the DB, we need to populate the DB with the new parameters. Have a look to this page. Config file to be modified is produceTPGParameters_beamv2.py (April 29th 2010). It was before produceTPGParameters_beamv1.py. They all belong to the directory: /nfshome0/ecalpro/trigger_config/CMSSW_3_5_4/src/CalibCalorimetry/EcalTPGTools/test

Current ECAL Trigger settings

2011

The available trigger keys are:

- BEAMV6_TRANS_SPIKEKILL : This is the default key to be used for collisions and cosmics. It implements the spike killer and the transparency corrections for the EndCap.

- BEAM_ZEROTESLA_V1 (NOT USED ANYMORE) Key to be used for zero tesla running. It takes into account the specific masking of towers and crystals (still needed?) and the fact that the EE response moves up by 15-20%.

- BEAM_ZEROTESLA_V1_SPIKEKILL (NOT USED ANYMORE) Same, but with spike killing, in case collisions are to be recorded with the field off.

These are also listed in this twiki:

https://twiki.cern.ch/twiki/bin/view/CMS/OnlineWBL1TriggerKeys2011

2010 They are listed here

Modify ECAL Trigger Settings

This is not advised and should only be done by an expert, but modifications can be made in case of emergency. All parameters are defined in a python file, which you reach as follows:

1) go to cmsusr1

2) log in as ecalpro

3) source /nfshome0/cmssw2/cmsset_default.sh

export SCRAM_ARCH=slc5_amd64_gcc434

cd /nfshome0/ecalpro/trigger_config/CMSSW_3_11_1/src/

4) cmsenv

5) cd CalibCalorimetry/EcalTPGTools/test

6) emacs -nw produceTPGParameters_beamv5.py

7) Change the parameters. The DB input parameters (pedestals, gain ratios, intercalibration constants and adcToGeV) are not updated automatically anymore. They are protected, and the same parameters are retrieved every time for collision running. To update them, remove the line:

firstRun = cms.untracked.uint32(161310)

and use:

firstRun = cms.untracked.uint32(100000000)

This should take the latest values from the DB. Please make sure that you are writing to the OMDS: writeToDB = cms.bool(True),

8) save and close

9) cmsRun produceTPGParameters_beamv5.py should write all these into the DB, then save the CONF_ID and put it in an elog.

10) go to the following page:

https://cmswbm.web.cern.ch/cmswbm/cmsdb/servlet/ECALTriggerData?CONF_ID=300

The last number (here 300) is your config Id, just use the one you got when running the produceTPGParameters.

11) VERY IMPORTANT: You should also save the strip masking file, which is not attached to the new configuration (this should be fixed later on). Log out of ecalpro and, as yourself, just do:

cd /nfshome0/ecaldev/francesca

updateTrgStripMasks.sh -k BEAMV5 -d ~ecalpro/DAQ/RunTime/GREN/config/db/StripConf.xml

Need to change ECAL thresholds

What thresholds are we talking about? There are 2 kinds: Trigger Primitive thresholds and selective readout thresholds

Trigger Primitive thresholds

The L1 Electromagnetic candidate is made from the sum of 2 Trigger Primitives of adjacent towers. It is this sum that GT requires to be above the GT threshold. In addition, in order to minimize the noise contribution, a threshold is applied to each Trigger Primitive. If the Et of the TPG is below this threshold (current values are 250 MeV in EB and EE), it is set to 0. This threshold is defined by the Look-Up Table used to compress the Trigger Primitives from 10 bits to 8 bits in the TCC board. Change the 2 lines with LUT_threshold_EB and LUT_threshold_EE.

Trigger Tower Flags thresholds aka selective readout thresholds

Change the 4 lines of the kind TTF_lowThreshold_EB in the config file mentioned above.

sFGVB (strip Fine Grain Veto Bit) thresholds

See here for instructions on how to update the current trigger key to change the sFGVB threshold. The precise values to use should be decided in consultation with James Jackson/ECAL DPG. When a new value is uploaded, please record the following details in the ECAL ELOG:

Trigger key/version, sFGVB threshold (in ADC counts), Link to CONF_DB entry, Starting run number

Please also note any changes to the sFGVB threshold in the corresponding JIRA entry: ECAL-26

Use the sFGVB to kill spikes: changing the ECAL TPG key

It is now possible to switch on the spike killing using the chosen working point: a 281 MeV threshold for the sFGVB and 8 GeV for the ECAL TPG. You just need to ask the trigger shifter (or ECAL DG) to change the ECAL TPG key to BEAMV5_SPIKEKILL (or BEAM_ZEROTESLA_V1_SPIKEKILL if the magnet is OFF) and to recycle and reconfigure ECAL. Do not forget the recycle, otherwise the front-end won't update its key.

Use the sFGVB to kill spikes: changing the ET threshold on TPG for spike killing

How to setup the spike killer on the TCC. This is the procedure to follow step by step, please take your time. A few checks are to be done to make sure the configuration went through:

1) Go to cmsusr1

2) Log in as ecalpro

3) source /nfshome0/cmssw2/cmsset_default.sh

export SCRAM_ARCH=slc5_amd64_gcc434

cd /nfshome0/ecalpro/trigger_config/CMSSW_3_11_1/src/

4) cmsenv

5) cd CalibCalorimetry/EcalTPGTools/test

6) \emacs -nw produceTPGParameters_beamv5.py

7) in this config file you should see these lines:

SFGVB_Threshold = cms.uint32(9),

SFGVB_SpikeKillingThreshold = cms.int32(-1), ## (GeV) Spike killing threshold applied in TPG ET in TCC (-1 no killing)

The first one is the sFGVB threshold. Here we have changed the saturation scale to 128 GeV, which means that an old sFGVB threshold of 17 ADC counts now corresponds to half that value, so I chose 9 ADC counts here.

The second one is the spike killing threshold, which is in GeV; I do the conversion inside the code.

--> -1 means no killing

--> any other value (8 GeV for example) will be loaded into the boards

8) save and close

9) Run cmsRun produceTPGParameters_beamv5.py, which should write all of this into the DB. Then save the CONF_ID and put it in an elog.

10) go to the following page:

https://cmswbm.web.cern.ch/cmswbm/cmsdb/servlet/ECALTriggerData?CONF_ID=300

The last number (here 300) is your config Id, just use the one you got when running the produceTPGParameters.

Once the page is up, scroll down to the spike data and look at the file (). If you have no killing, you should read 1023. If you have 8 GeV, you should read 64 (8*1024/128).

11) This step can only be done by the trigger expert on call. If you are not the expert on call, move to the next step. To check what threshold is implemented in the barrel TCCs, you need to open a GUI on any TCC EB; please have a look at the twiki section (Opening a TCC GUI during global run). The TCC should have been configured by the DAQ shifter. This check can also be done while the TCCs are running.

Once the GUI is open and you have clicked connect, go to the debug menu on the top right and write the following address in the first field: 888014. Press enter on your keyboard and then click on the small "R" letter to read. This reads the register holding the spike threshold in hex; the result should show in the second field and say "3FF" = 1023, the maximum threshold, i.e. no killing. If the threshold is 8 GeV, the value is 64, which corresponds to 0x40.

This should be the same on any TCC you check. That can help if you change the threshold for a test and want to make sure all is OK.

12) VERY IMPORTANT: You should also save the strip masking file, which is not attached to the new configuration; this should be fixed later on. Log out of ecalpro and, as yourself, just do:

cd /nfshome0/ecaldev/francesca

updateTrgStripMasks.sh -k BEAMV5 -d ~ecalpro/DAQ/RunTime/GREN/config/db/StripConf.xml

You're ready to go! All the best
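The values to expect in steps 10 and 11 can be cross-checked with a short sketch of the conversion quoted there (GeV * 1024 / 128, with -1 meaning no killing, read back as 1023). The function names are hypothetical, and clamping to 1023 for very large inputs is an assumption; this is not the conversion code used by produceTPGParameters itself.

```python
SATURATION_GEV = 128  # saturation scale quoted in step 7
FULL_SCALE = 1024     # scale used in the 8*1024/128 example of step 10
NO_KILLING = 1023     # value read back when killing is disabled


def spike_threshold_counts(threshold_gev):
    """Convert a spike killing threshold in GeV to the value shown in the DB."""
    if threshold_gev < 0:  # -1 in the config file means no killing
        return NO_KILLING
    # Clamping to 1023 is an assumption made for this sketch.
    return min(NO_KILLING, int(threshold_gev * FULL_SCALE / SATURATION_GEV))


def register_hex(counts):
    """Hex string as it would appear in the TCC GUI debug field."""
    return format(counts, "X")


print(spike_threshold_counts(8), register_hex(spike_threshold_counts(8)))    # 64 40
print(spike_threshold_counts(-1), register_hex(spike_threshold_counts(-1)))  # 1023 3FF
```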

EMERGENCY:

These are the instructions in case it is intended to configure the spike killer with a file and not the DB. Note that this is not possible now, it requires a modification directly into the TCCBarrelSupervisor code. This should ONLY be used in case of extreme emergency and can only be setup by SuperExperts.

You need to go to the ECAL_current folder: /nfshome0/ecalpro/DAQ/ECAL/ECAL_current/ecal/ecalTCC/TCCBarrel/supervisor/src/common/TCCBarrelSupervisor.cc and change:

//configureSpikeThresholds(SPIKETHRESH_CONFIGURATION_FILE)

configureSpikeThresholdsFromDB() ; //modif-alex 18/02/2011

Comment out the second line and uncomment the first, then run make install.

The rest should then follow the instructions below.

Before doing anything, know that changing this parameter won't be visible in the DB. It is the same as changing the mapping or timing delays: these are files loaded directly by the supervisor onto the TCC. Be careful when editing these kinds of files.

Setting the sFGVB to kill spikes requires the 2 following steps:

step1: set the sFGVB threshold (see previous item). This parameter determines the threshold used in the pattern recognition of spikes at the FE (FENIX) level: if 2 crystals are found to be above this threshold in a strip, the strip sFGVB is set to 1. The sFGVB is an OR of all 5 single-strip sFGVB bits. sFGVB==0 is the case of spikes.

step2: You need to set the threshold above which the TPG is zeroed if sFGVB==0. That can be done by going to the following folder: /nfshome0/ecalpro/DAQ/RunTime/GREN/config/tcc (login as ecalpro) and editing the file SpikeThresholds.txt. The first column is the TCC id (37 to 72 for the Barrel TCCs) and the 68 values are the spike killer thresholds. Right now the value 1023 means that we kill spikes if the TPG >= 128 GeV, which is the saturation, so no killing is applied. As a first working point we will use 8 GeV, which means a threshold of 64. I have prepared the files SpikeThresholds_killing_8gev.txt.test and SpikeThresholds_killing_12gev.txt.test, either of which can replace the SpikeThresholds.txt file. You just have to replace SpikeThresholds.txt with one of them in that directory; then ECAL has to be reconfigured. A copy of the non-killing file is saved under the name SpikeThresholds.txt.nokilling (that allows you to go back to the previous setting and then reconfigure ECAL). Look at step 11) above to check on the board directly.
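The two steps can be illustrated with a small software model. This is only a sketch of the logic as described above, not the FENIX or TCC firmware: the function names and ADC values are hypothetical, and reading "2 crystals above threshold" as "at least 2, strictly above" is an assumption.

```python
def strip_sfgvb(crystal_adc, threshold_adc):
    """1 if at least two crystals of the strip are above threshold, else 0."""
    n_above = sum(1 for adc in crystal_adc if adc > threshold_adc)
    return 1 if n_above >= 2 else 0


def tower_sfgvb(strips, threshold_adc):
    """OR of the five single-strip bits of a trigger tower."""
    return 1 if any(strip_sfgvb(s, threshold_adc) for s in strips) else 0


def apply_spike_killer(tp_counts, sfgvb, kill_threshold_counts):
    """Zero the TP when sFGVB == 0 and the TP is at or above the kill threshold."""
    if sfgvb == 0 and tp_counts >= kill_threshold_counts:
        return 0
    return tp_counts


quiet = [2, 2, 2, 2, 2]
spike_tower = [[2, 2, 200, 2, 2], quiet, quiet, quiet, quiet]    # one isolated crystal
shower_tower = [[2, 50, 200, 2, 2], quiet, quiet, quiet, quiet]  # energy shared by neighbours

print(apply_spike_killer(100, tower_sfgvb(spike_tower, 9), 64))    # killed
print(apply_spike_killer(100, tower_sfgvb(shower_tower, 9), 64))   # kept
print(apply_spike_killer(100, tower_sfgvb(spike_tower, 9), 1023))  # no killing
```

With the kill threshold at 1023, no TP below saturation is ever zeroed, which is why that value corresponds to "no killing".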

Useful scripts in case of emergency/expert debugging

Masking on the fly towers

Masking on the fly is possible when the run is ongoing and you have a high rate coming from a specific place in ECAL. If the rate is way too high, the run will crash anyway. In order to mask a hot trigger tower on the fly, you must first identify the location precisely using the DQM. You will need the TCC number and the Trigger Tower number. NOTE: In the case of the EE, you have 4 TCCs per sector, so please look at the mapping carefully.

  • Please log onto cmsusr :

1) ssh cmsusr1 -XY

2) dev

3) cd /nfshome0/ecaldev/DAQ/ECAL/Sandbox_newTCC/ecal/ecalTCC/NewTCC/cmd/scripts

4) ./setTccTtActive.sh tccId TT 0 dev

NOTE: if you need to reactivate the Trigger Tower, please use ./setTccTtActive.sh tccId TT 1 dev

IMPORTANT : if the Trigger Tower needs to be masked for good, please mask in the DB [here] and ask to reconfigure ECAL when starting a new run.

What follows is outdated (Alex 19.05.2015)

  • For the barrel: as ecaldev in the /nfshome0/ecaldev/utils directory type ./EBTCCmasks.sh EBnb TTnb. Example: to mask EB-1 TT67, type ./EBTCCmasks.sh EB-1 67. If you want to mask a whole supermodule, set TTnb = 0. Example: ./EBTCCmasks.sh EB-1 0 masks all towers of EB-1.
  • For the endcap: as ecaldev in the /nfshome0/ecaldev/utils directory type ./EETCCmasks.sh EEnb TCCnb TTnb. Example: to mask EE-1 TCC24 TT7, type ./EETCCmasks.sh EE-1 24 7. If you want to mask a whole EE sector, set TCCnb = 0 and TTnb = 0. Example: ./EETCCmasks.sh EE-1 0 0 masks all towers of EE-1. If you want to mask all the towers of a TCC, set TTnb = 0. Example: ./EETCCmasks.sh EE-1 24 0 masks all towers of TCC24 (belonging to EE-1).

Emergency masking of all ECAL TPG (EB and EE)

In order to mask large areas of ECAL such as a TCC, a FED, the ECAL barrel or the endcap, please use these scripts. You need to log in as dev and go to /nfshome0/ecaldev/DAQ/ECAL/Sandbox_newTCC/ecal/ecalTCC/NewTCC/cmd/scripts

  • Masking all channels of a given TCC: ./disactiveAllTccChan.sh tccId dev
  • Masking all TT of a given FED: ./maskAllFedChan.sh FEDID
  • Masking all the Barrel: DeActivate_Barrel.sh
  • Masking all the EndCap: DeActivate_EndCap.sh

This is obsolete (Alex 19.05.2015): As ecaldev in the /nfshome0/ecaldev/utils directory type sh maskALLECALTPG.sh, which should mask all the barrel (EB-, EB+) and then the endcaps (EE- and EE+). It will take a few seconds to be noticeable on the xmas monitoring.

WHEN TO USE IT: If a high trigger rate is reported, as usual look at xmas and then at the DQM for TP occupancy. If you see a hot tower, mask it on the fly. If the rate does not go down, look at the L1T monitoring, where you can investigate the RCT plots of EG candidates. Look at the rank (ET) plot and the EG candidate occupancy to see if there is clearly a problem in a given RCT region (for example an L1 candidate with the same ET value all the time). If so, the trigger shifter should be alerted and then the RCT expert called. If the expert says "I see hot towers in ECAL" and you know they are not causing any high rate, then mask all of ECAL TPG to isolate the problem. This kind of issue should not be assigned to "ECAL TPG".

Opening a TCC GUI during global run

It is possible to open a GUI at any time without interrupting the global run. As ecaldev, you just need to type tccGUI.sh TCCID dev and the GUI opens.

NOTE: What follows is obsolete (Alex 19.05.2015)

So far, not very professional. As ecalpro connect to the VME pc (02l, 06h etc) that controls the TCC. If you don't know which crate it is, look here. Then:

  • cd DAQ/ECAL/ECAL_current/ecal ; . config/env.sh ; cd ecalTCC/scripts
  • For a TCC-EB:
    • if in the slot 5: ./runTCCGui 23010 50 55555 66666 or ./runTCCGui 23010 51 55555 66666 (try any of these 2 possibilities to see which one works)
    • if in the slot 11: ./runTCCGui 23010 53 55555 66666
    • if in the slot 17: ./runTCCGui 23010 54 55555 66666
  • For a TCC-EE:
    • if in the slot 2,3,4 or 5: ./runTCC48Gui 23010 50 55555 66666
    • if in the slot 8,9,10 or 11: ./runTCC48Gui 23010 52 55555 66666
    • if in the slot 14,15,16 or 17: ./runTCC48Gui 23010 53 55555 66666
  • Click on the connect button to access the board registers etc. If you don't see the connect button, resize the window; I know it sounds weird but it works!
  • Don't forget to kill the process with the killTCC command.

What are the important registers to check? Well:

  • Click on the Configuration tab to access the register names and content (scroll down to see all registers).
  • QPLL_LOCK, QPLL_LOCKED_INT must be at 1 => meaning that the clock is stable. You can click several times on the "Read All Registers" button to read the values again.
  • TTCRX_READY must be 1 => meaning that the TTCRX chip is able to interpret BGo commands (start/stop/etc) coming from the ECAL TTCci.

It would be great to have something more user friendly.

Opening a SLB GUI / Spying SLB during global run

To open an SLB GUI, run as ecaldev: startSLBGui.sh [crateNb] [crateposition], where crateNb is 02, 03 etc. and crateposition is d (=EB-), h (=EB+), l (=EE).

To spy on the SLB status, there are a bunch of useful scripts that work as ecaldev with the same options as the SLB GUI:

  • showSLBStatus.sh : smallest (common to all the others). Serial, Fw, position, nSynchs and status of controller
  • showSLBLatency.sh : showSLBStatus.sh + latency values
  • showSLBSummary.sh : showSLBStatus.sh + Synch status and error counters
  • showSLBReport.sh: gives all info.

Monitoring gap flag transition

Gap flag bit is used to check the timing alignment of the towers in the barrel and the pseudostrips in the endcaps. This bit is sent by the FE board with the usual data (trigger primitive, fine grain). For any TCC id you can check at which bunch crossing the gap flag flips from 0 to 1 (indicating the beginning of the LHC abort gap) thanks to this recipe:

  • connect to cmsusr node as pro
  • cd ~/TPG-Log
  • open a root session
  • type .x phaseMonitoring.C(tccid,10) where the tccid is the id of the TCC (what else can it be???)
  • if you want to check all the TCC at the same time, just type .x phaseMonitoring.C(-1,10). The 10 is used to present only the 10 bx around the one expected.

You will see a plot with the tower (1-68 in EB) or pseudostrip number (1-48 max in EE) on the X-axis and the bunch crossing number at which the gap flag is raised on the Y-axis. When everything is perfectly aligned, all towers/strips must flip at the same time. The number in Z is the number of measurements; all channels should have the same number. If not, a red point is drawn. There are several possible reasons for such a problem: 1) the channel is not properly timed in. This is exactly what you want to monitor with the gap flag. 2) the channel may have hamming/link errors, in which case the gap flag is not reliable: if you look at the plot of a single TCC (for example h_17->Draw("text"); for TCC id 17), the gap flag BX is spread over the whole BX range [0,3563] when there is a hamming/link problem. Channels with known problems are recorded here.
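The alignment check can be pictured with a short sketch. It is not the phaseMonitoring.C macro: the function name and the BX values are hypothetical, and taking the most common flip BX as the reference is an assumption made here (the macro works around "the one expected").

```python
from collections import Counter


def misaligned_channels(flip_bx):
    """Return (reference BX, {channel: BX}) for channels not flipping at the reference BX.

    flip_bx maps a tower or pseudostrip number to the BX at which its
    gap flag goes from 0 to 1.
    """
    if not flip_bx:
        return None, {}
    # Assumption: the reference is the BX shared by most channels.
    reference, _ = Counter(flip_bx.values()).most_common(1)[0]
    outliers = {ch: bx for ch, bx in flip_bx.items() if bx != reference}
    return reference, outliers


# Hypothetical measurements for four towers of one TCC.
reference, outliers = misaligned_channels({1: 3445, 2: 3445, 3: 3446, 4: 3445})
print(reference, outliers)
```

A channel that shows up here with a flip BX far from the reference, or with different values from one measurement to the next, is more likely to have a hamming/link error than a timing offset.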

Interpreting RCT monitoring

Even if it's not an ECAL expert task, it's always useful to understand errors from the downstream systems. The RCT online monitoring (go to the RCT cell in the L1 front page: http://l1page.cms/main/FirstPage) is rather limited but provides the following information:

  • BC0 mismatch: it checks that the BC0 marker sent by the SLBs and the BC0 used in the RCT subsystem are in sync. Technically: BC0 uses an XOR (with the local BC0) and latches on a rising edge of the result of the XOR. Clearing this error (if any) during a run is fine: if BC0 is still in disagreement it will return on the next orbit, meaning that there is clearly a BC0 mismatch between ECAL and RCT.
  • SLB-RCT Link errors: in RCT jargon this is rather a hamming error (comparison of the hamming code computed from the data with the one sent with the data). Link errors are latched on the rising edge (high=error). If a link error stays high, however, and the error is cleared, it isn't a reliable measurement. But link errors are cleared during the idle phase and they should then stay logic low after the idle->data transition. Probably not 100% ideal, but I wouldn't trust a link that goes into error after the idle->data transition. In other words, clearing this error has no effect.
  • Phase errors respond to the 40/120 phase, and the position of the idle->data transition is used to help determine this. Remember that the SLB-RCT link works at 120 MHz and not at 40 MHz, so 3 words are sent per 40 MHz clock. These errors are not latched, but if the phase is out of whack, it is expected to stay that way. RCT relies on the stop, resync, start sequence to make sure the links are aligned (with the correct phase set up beforehand, of course). Clearing errors has no effect on phase errors.
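The BC0 mismatch latch described in the first bullet can be modelled in a few lines. This is a simplified software picture, not the RCT firmware: the function name and the sampled bit sequences are hypothetical.

```python
def bc0_mismatch_latched(remote_bc0, local_bc0):
    """Return True if the XOR of the two BC0 signals ever shows a rising edge.

    remote_bc0 and local_bc0 are equal-length sequences of 0/1 samples,
    one per clock tick.
    """
    latched = False
    previous_xor = 0
    for remote, local in zip(remote_bc0, local_bc0):
        xor = remote ^ local
        if xor == 1 and previous_xor == 0:  # rising edge of the XOR result
            latched = True                  # stays set until cleared
        previous_xor = xor
    return latched


print(bc0_mismatch_latched([0, 1, 0, 0], [0, 1, 0, 0]))  # markers aligned
print(bc0_mismatch_latched([0, 1, 0, 0], [0, 0, 1, 0]))  # markers one tick apart
```

Clearing the latch and running the check again over the next orbit would set it again as long as the two markers remain offset, which matches the behaviour described for a genuine ECAL-RCT mismatch.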

Downloading firmware in the TCC

In principle this task is done only by a "Super Expert". But just in case, it may be useful to give a few instructions here. For the TCC endcap, Irakli Mandjavidze wrote a very detailed documentation that can be found here: The TCC endcap programming documentation. For the TCC Barrel, there are a few details given here.

Opening the dbGUI to change spike killing - gap flag configuration

Log onto ecalod-xmas, then type dev. cd /nfshome0/ecaldev/DAQ/Tools/ecalTools/ecalEdit and launch the GUI by issuing the command: ./runGuipro.sh --also-write

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| DBGuiMasking.png | r1 | 73.9 K | 2016-07-01 11:05 | TutanonSinthuprasith | DBGuiMasking |
| EB.png | r1 | 24.3 K | 2009-11-04 22:26 | PascalPaganini | |
| ECALTriggerShifter.pdf | r1 | 17998.1 K | 2015-08-19 09:36 | SilviaTaroni | Ecal Trigger Shifter Hands on |
| ECALTriggerShifter.pptx | r1 | 3380.0 K | 2015-08-19 09:36 | SilviaTaroni | Ecal Trigger Shifter Hands on |
| EEMinus.png | r1 | 73.5 K | 2009-11-04 22:27 | PascalPaganini | |
| EEPlus.png | r1 | 73.8 K | 2009-11-04 22:27 | PascalPaganini | |
| FE-TCCtiming.pptx | r1 | 58.6 K | 2010-03-25 13:18 | AndreDavid | |
| Tcc48ProgrammingTools.pdf | r2 r1 | 438.8 K | 2010-06-04 14:11 | PascalPaganini | The TCC endcap programming documentation |
| sfgvb_instructions.txt | r1 | 2.7 K | 2010-08-18 15:21 | DavidPetyt | |
| slb.png | r1 | 107.0 K | 2010-03-12 18:04 | PascalPaganini | |
Topic revision: r99 - 2017-04-20 - FanboMeng
 