%CERTIFY%

---+!! <nop>%TOPIC%

%STARTINCLUDE%

---+ Introduction

This page describes the debugging procedures for FELIX systems. The first part of the twiki focuses on the *hardware* and is relevant for the on-call *readout expert*; the second is dedicated to the *software* and is relevant for on-call *DAQ/HLT experts*.

%RED% <img alt="ALERT!" border="0" height="16" src="%PUBURL%/TWiki/TWikiDocGraphics/warning.gif" title="ALERT!" width="16" /> *Note on logging:* whenever an intervention is made on a system at P1, a dedicated [[https://atlasop.cern.ch/elisa/display][elog]] entry should be made describing what was done and why, plus the expected impact on the system.%ENDCOLOR%

%BLUE%<b><img alt="ALERT!" border="0" height="16" src="%PUBURL%/TWiki/TWikiDocGraphics/warning.gif" title="ALERT!" width="16" /> Note on jargon, cards and devices: </b>one physical FELIX *card* is seen as two PCI *devices* by the computer. In many tools the card is selected via option =-c=, the device with =-d=. The enumeration starts at 0: card 0 includes devices 0 and 1, card 1 corresponds to devices 2 and 3. Each device serves the e-links corresponding to one MTP connector (i.e. 12 GBT links per device on 24-channel cards). There is one TTC connector and one BUSY output per card.%ENDCOLOR%

---+ Hardware

---++ Diagnostic tools

---+++ Card identification, physical location

Some FELIX PCs are equipped with two cards. Card #0 is installed in the bottom slot of the FELIX PC, card #1 in the top slot. The location of the FELIX PCs is reported in the <a class="twikiTable" target="_self" title="Table">Table at the end of the page.</a>

---+++ Optical power on GBT links

Each FELIX card is equipped with four optical transceivers called MiniPODs. Two MiniPODs emit light (TX), two receive light (RX). The emitted and received optical power can be visualised with the command
<verbatim>flx-info -c <card number></verbatim>
The output is pasted below. The power is reported in the second table.
<verbatim>
How to the read the table below:
# = FLX link endpoint OK (no LOS)
- = FLX link endpoint not OK (LOS)
First letter: Current channel status
Second letter: Latched channel status
Example: #(-) means channel had lost the signal in the past but the signal is present now.

Latched / current link status of channel:
       |  0   |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10   |  11   |
       |======|======|======|======|======|======|======|======|======|======|=======|=======|
1st TX | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#)  | #(#)  |
1st RX | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-)  | -(-)  |
2nd TX | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#) | #(#)  | #(#)  |
2nd RX | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-) | -(-)  | -(-)  |


Optical power (rx or tx) of channel in uW:
       |    0    |    1    |    2    |    3    |    4    |    5    |    6    |    7    |    8    |    9    |   10    |   11    |
       |=========|=========|=========|=========|=========|=========|=========|=========|=========|=========|=========|=========|
1st TX |  845.10 |  872.70 |  873.30 |  888.70 |  898.90 |  852.70 |  923.10 |  852.40 |  978.20 |  862.30 |  911.90 |  773.30 |
1st RX |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |
2nd TX |  911.40 |  964.50 |  899.80 | 1016.60 |  940.30 |  929.50 |  976.30 |  954.80 | 1065.40 |  974.00 |  965.20 |  902.70 |
2nd RX |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |    0.00 |
</verbatim>

| *Observation* | *Conclusion* | *Action* |
| RX power of a channel is 0 | No light coming from the corresponding fibre. | Cross-check with the sub-detector expert that light is expected. If yes, swap two MTP cables and check if the dark channel correlates with the cable or the connector on the card. If the problem is on the FLX card, replace the FELIX PC. |
| RX power close to or below 100 uW | Too little light for safe data transmission. | Swap two MTP cables and check if the weak channel correlates with the cable or the connector on the card. |
| TX power significantly below 800 uW | Laser (MiniPOD) ageing. | Replace the PC at the next occasion without beam. |
| RX light received but link status (first table) not #(#) | Link not aligned. | Run "flx-init -c <card number>" once more. Cross-check with the detector expert that the front-end is in a good state. If the problem persists, exchange the PC. |

---+++ TTC fibre diagnostic

The status of the TTC optical connection can be checked with
<verbatim>flx-info -c <card number></verbatim>
The output contains a section called "TTC (ADN2814) status" reporting one of the messages listed below.

| *Message* | *Action* |
| _The TTC optical connection is up and working_ | Nothing to do. |
| _No light arriving. Check the fibre connection to the FLX-712_ | Swap the TTC fibre with a fibre from a neighbouring FLX-712 reporting no issues. If still no light is detected, replace the FELIX PC. |
| _Light is arriving but the FLX-712 may have an internal problem_ | <p>1. Swap the TTC fibre with a fibre from a neighbouring FLX-712.</p> <p>2. Log on to the neighbouring FELIX (from where you have taken the fibre) and run "flx-info -c [X] ADN2814".</p> <p>3. If on that FELIX you get "The TTC optical connection is up and working", you can conclude that the FLX-712 in the first FELIX has a problem. Replace the entire FELIX PC.</p> <p>4. If you get the same error on the other FELIX as well, the problem is upstream in the TTC system. You cannot fix this.</p> |

---+++ BUSY LEMO diagnostic

The BUSY state can be switched manually. To verify the correct functionality:

   1. assert BUSY with <verbatim>fttcbusy -d <device number> -m 1 -i 0</verbatim>
   1. measure the output of the LEMO connector with a voltmeter. The output must be 0 V (BUSY on = logical zero)
   1. de-assert BUSY with <verbatim>fttcbusy -d <device number> -m 1 -i 0</verbatim>
   1. use the voltmeter to check that you have a logical 1 (between 3.3 and 5 V)

If you do not get the expected logic levels, something is wrong with the FLX-712. Replace the entire PC with a spare.

---++ Issues and solutions

---+++ flx-init reports "Lock not found" (failed recovery of TTC clock)

flx-init produces the following output:
<verbatim>
Card type: FLX-712
Configuring Si5345...
Si5345 hard reset
Si5345 configuration done
Enabling Si5345 output
Si5345: LOS register = 0x20
Si5345: Sticky LOS register = 0xf0
Si5345: LOL register = 0x02
Si5345: LOL register = 0x02
Si5345: LOL register = 0x02
Si5345: LOL register = 0x02
Si5345: LOL register = 0x02
Si5345 ERROR: Lock not found in 5 secs
Si5345: Sticky LOL register = 0x06
</verbatim>
Follow the instructions of section "TTC fibre diagnostic".

---+++ FELIX card not detected

If a FELIX card disappears from the system, query the driver with
<verbatim>cat /proc/flx</verbatim>
and look for errors such as

| *Error message* | *Action* |
| ERROR: 1 card(s) were ignored because of a problem with the power status | Reboot the PC. If the problem persists, note down the output of =lspci &#124; grep CERN= and replace the PC. |

---+ Monitoring via IS, ERS

---+ Software

---+++ Diagnostic for the commissioning phase

---+ List of FELIX nodes

The table below reports the list of all the FELIX nodes in USA15. The acronym HCA stands for Host Channel Adapter and indicates the type of Mellanox network card installed.

<table border="1" cellpadding="0" cellspacing="1" id="Hosts">
<tbody>
<tr><th>hostname</th><th>location</th><th>installed</th><th>firmware</th><th>HCA</th><th>notes</th></tr>
<tr><th>NSW others</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td style="text-align: left;">pc-tdq-flx-nsw-spare-00</td> <td>2-5-1 U37</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr> <td style="text-align: left;">pc-tdq-flx-nsw-tp-a-00</td> <td>2-5-1 U34</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td>(was pc-tdq-flx-nsw-stgc-tp-00)</td> </tr>
<tr> <td style="text-align: left;">pc-tdq-flx-nsw-tp-c-00</td> <td>2-5-1 U32</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td>(was pc-tdq-flx-nsw-mm-tp-00)</td> </tr>
<tr> <td style="text-align: left;">pc-tdq-flx-nsw-spare-01</td> <td>3-5-1 U42</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr><th>NSW mm</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td style="text-align: left;">pc-tdq-flx-nsw-mm-06 _to_ 11</td> <td>3-5-1 U29/40</td> <td>NO</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr> <td style="text-align: left;">pc-tdq-flx-nsw-mm-00 _to_ 05</td> <td>2-5-1 U19/30</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr><th style="text-align: left;">NSW stgc</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-nsw-stgc-08 _to_ 15</td> <td>4-5-1 U24/39</td> <td>NO</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-nsw-stgc-00 _to_ 07</td> <td>4-5-1 U06/21</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr><th style="text-align: left;">BIS 7/8</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-rpc-bis-00</td> <td>5-5-1 U17</td> <td>YES</td> <td>GBT</td> <td>x1 25 GbE </td> <td> </td> </tr>
<tr><th>LAr LDPB</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-lar-ldpb-07 _to_ 13</td> <td>5-16-2 U26/39</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-lar-ldpb-00 _to_ 06</td> <td>4-16-2 U26/39</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr><th style="text-align: left;">L1Calo</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-l1c-trex-01</td> <td>7-11-2 U24</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-l1c-trex-00</td> <td>7-11-2 U22</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-l1c-gfex-00</td> <td>7-11-2 U20</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-l1c-jfex-00</td> <td>7-11-2 U18</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-l1c-efex-00</td> <td>7-11-2 U16</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr><th>Tile</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-til-00</td> <td>5-5-1 U25</td> <td>YES</td> <td>FULL</td> <td>x1 100 GbE </td> <td> </td> </tr>
<tr><th style="text-align: left;">Spares</th> <td> </td> <td> </td> <td> </td> <td> </td> <td> </td> </tr>
<tr> <td>pc-tdq-flx-spare-00 _to_ 01</td> <td>5-5-1 U22/24</td> <td>NO</td> <td>-</td> <td>-</td> <td> </td> </tr>
</tbody>
</table>

---+ List of SW ROD nodes

| *hostname* | *location* | *HCA* | *notes* |
| pc-tdq-swrod-rpc-bis-00 | 5-5-1 U37 | x1 25 GbE, x1 40 GbE | |
| pc-tdq-swrod-til-00 | 5-5-1 U36 | x1 100 GbE, x1 40 GbE | |
| pc-tdq-swrod-nsw-00 _to_ 08 | 5-5-1 U27/35 | x1 25 GbE, x1 40 GbE | |
| pc-tdq-swrod-lar-06 _to_ 13 | 7-16-2 U23/30 | x1 100 GbE, x1 40 GbE | |
| pc-tdq-swrod-lar-00 _to_ 05 | 6-16-2 U27/32 | x1 100 GbE, x1 40 GbE | |
| pc-tdq-swrod-l1c-00 _to_ 05 | 7-11-2 U32/38 | x1 100 GbE, x1 40 GbE | |

<!-- *********************************************************** -->
<!-- Do NOT remove the remaining lines, but add requested info as appropriate -->
<!-- *********************************************************** -->

-----
<!-- For significant updates to the topic, consider adding your 'signature' (beneath this editing box) -->
*Major updates*:%BR%
-- Main.MarkusJoos - 2020-05-25

<!-- Person responsible for the page: Either leave as is - the creator's name will be inserted; Or replace the complete REVINFO tag (including percentages symbols) with a name in the form Main.TwikiUsersName -->
%RESPONSIBLE% %REVINFO{"$wikiusername" rev="1.1"}% %BR%
<!-- Once this page has been reviewed, please add the name and the date e.g. Main.StephenHaywood - 31 Oct 2006 -->
%REVIEW% *Never reviewed*

%STOPINCLUDE%
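*Appendix: card/device numbering and RX power check as a sketch.* The numbering rule from the introduction (card 0 holds devices 0 and 1, card 1 holds devices 2 and 3) and the first two rows of the optical power table can be written down as a small Python helper. This is an illustrative sketch only and not part of the FELIX software: the function names are hypothetical, and the 100 uW default is an assumption taken from the "close to or below 100 uW" row of the table.

```python
from typing import Tuple


def devices_for_card(card: int) -> Tuple[int, int]:
    """Return the two PCI device numbers of a FELIX card (numbering starts at 0)."""
    if card < 0:
        raise ValueError("card number must be >= 0")
    return (2 * card, 2 * card + 1)


def card_for_device(device: int) -> int:
    """Return the card number that a given PCI device belongs to."""
    if device < 0:
        raise ValueError("device number must be >= 0")
    return device // 2


def classify_rx_power(power_uw: float, low_threshold_uw: float = 100.0) -> str:
    """Classify an RX optical power reading in uW.

    The 100 uW default is an assumption based on the table row
    'RX power close to or below 100 uW'; adjust it as needed.
    """
    if power_uw <= 0.0:
        return "no light"
    if power_uw <= low_threshold_uw:
        return "too little light"
    return "ok"


if __name__ == "__main__":
    # Which value to pass to -d for the second card of a two-card PC.
    print(devices_for_card(1))
    # Which value to pass to -c for device 3.
    print(card_for_device(3))
    # An all-zero RX row, as in the pasted flx-info output.
    print(classify_rx_power(0.0))
```

The helper only encodes the bookkeeping described on this page; the actual readings still come from =flx-info=, and the actions to take remain those listed in the tables above.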
Topic revision: r2 - 2021-10-17 - CarloAlbertoGottardo