WIB Troubleshooting

  • Make sure the WIB is powered on (and that the power supply output is enabled)

  • Errors in the DTS CDS, such as LOL=1 or LOS=1, could mean the timing signal isn't reaching the WIB. It could also mean that you are looking for the timing signal in the wrong place, i.e. the backplane instead of the front panel. If initialization takes a long time, the connection may be poor (for example, an 8 deg fiber plugged into a 0 deg interface).

  • If the WIB gets stuck in W_RDY and W_ALIGN states, it could be because the timing group (partition) number isn't set correctly.

  • You can control the timing with the commands listed in ~np04daq/notes/timing.txt, and check the WIB timing status with the BUTool command: status 10 DTS DTS_SI5344 DTS_CDS

FEMB Troubleshooting

  • If you aren't getting data from the FEMBs on the WIB, and are getting random data from spy_femb, try the following commands.
Run "fread 1 VERSION_ID" to check that it properly returns the FEMB FW version, then run "fread 1 TIME_COMPILED" to check that it gives a number different from the VERSION_ID. If both work, the FEMBs are up and the issue is somewhere else.

  • Check FEMB ADC synchronization: run fread 1 ADC_ASIC_SYNC_STATUS where 1 is the FEMB number (1-4). It should be all 0 if the ADCs are synchronized.

Reporting Problems to the Elog

Please describe the problem and attach the output of one of the following commands to an elog entry:

cd /nfs/sw/wib/WIBSoftwareTrunk
source env.sh
cd scripts/
dump_vst_wib_rce.sh # for VST RCE WIB
dump_vst_wib_felix.sh # for VST FELIX WIB
dump_coldbox_wibs.sh # for cold box WIBs

artDAQ Exception Glossary

Invalid name
You tried to use a register name that isn't in the register table.

Bad File
You can't access the BUTool table files. Run source env.sh first.

Convert State

  • On configuring:
    • If the timing system is locked in the RUN state:
      1. The convert state starts out in WAIT_FOR_SYNC 0x0.
      2. Once it receives a sync (or two), it goes into IN_SYNC.
        • If another sync arrives at the wrong time, it goes into OUT_OF_SYNC.
    • If the timing system is not locked (W_RDY 0x6? state):
      • The convert state is IDLE.
  • In local clock mode, there are corresponding FAKE_* modes.
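
The transitions listed above can be sketched as a small state machine. This is only an illustration of the description, not the firmware logic: the function names and the on_time flag are hypothetical, a single sync is assumed to be enough to reach IN_SYNC (the text says "a sync (or two)"), and the FAKE_* local clock modes are left out because their names are not spelled out here.

```python
from enum import Enum


class ConvertState(Enum):
    IDLE = "IDLE"
    WAIT_FOR_SYNC = "WAIT_FOR_SYNC"
    IN_SYNC = "IN_SYNC"
    OUT_OF_SYNC = "OUT_OF_SYNC"


def initial_convert_state(timing_locked):
    """State right after configuring, depending on whether timing is in RUN."""
    return ConvertState.WAIT_FOR_SYNC if timing_locked else ConvertState.IDLE


def on_sync(state, on_time):
    """Advance the convert state when a sync arrives.

    on_time is a hypothetical flag: True if the sync arrived when expected.
    """
    if state is ConvertState.WAIT_FOR_SYNC:
        # Assumption: the first sync is enough to declare IN_SYNC.
        return ConvertState.IN_SYNC
    if state is ConvertState.IN_SYNC and not on_time:
        return ConvertState.OUT_OF_SYNC
    # IDLE and OUT_OF_SYNC are left unchanged in this sketch.
    return state
```

For example, starting from initial_convert_state(True) and feeding one sync gives IN_SYNC; a later sync with on_time=False gives OUT_OF_SYNC.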

Timing System Endpoint State PDTS_STATE

Hex Code   Binary Code   Name      Description
0x0        0b0000        W_RST     Starting state after reset
0x1        0b0001        W_SFP     Waiting for SFP LOS to go low
0x2        0b0010        W_CDR     Waiting for CDR lock
0x3        0b0011        W_ALIGN   Waiting for comma alignment, stable 50MHz phase; CDR is locked but data is bad
0x4        0b0100        W_FREQ    Waiting for good frequency check
0x5        0b0101        W_LOCK    Waiting for 8b10b decoder good packet
0x6        0b0110        W_RDY     Waiting for time stamp initialisation
0x8        0b1000        RUN       Good to go
0xC        0b1100        ERR_R     Error in rx
0xD        0b1101        ERR_T     Error in time stamp check

The system should quickly cycle through some of these states. If an endpoint is stuck in one of these states, then:

Red means really bad, with a possible hardware problem. Check that the timing system is powered on and the cables are connected. It could mean there is no timing signal or that the CDR has failed to lock.

Yellow means time stamp syncs aren't being sent. Check the timing system configuration (software problem). A short-term fix is to log in to np04-srv-012 and run:

source /nfs/sw/timing/pro/software/timing-board-software/tests/env.sh
pdtbutler master DUNE_FMC_MASTER send TimeSync
pdtbutler master DUNE_FMC_MASTER send TimeSync

Green means good. This is how it should be
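
When reading a raw PDTS_STATE value from a status dump, a small lookup can save a trip back to the table. The sketch below simply transcribes the table above, keyed by the values in its binary column; the helper name describe_pdts_state is hypothetical and not part of BUTool or the WIB software.

```python
# PDTS_STATE codes, keyed by the binary column of the table above.
PDTS_STATES = {
    0b0000: ("W_RST", "Starting state after reset"),
    0b0001: ("W_SFP", "Waiting for SFP LOS to go low"),
    0b0010: ("W_CDR", "Waiting for CDR lock"),
    0b0011: ("W_ALIGN", "Waiting for comma alignment, stable 50MHz phase"),
    0b0100: ("W_FREQ", "Waiting for good frequency check"),
    0b0101: ("W_LOCK", "Waiting for decoder good packet"),
    0b0110: ("W_RDY", "Waiting for time stamp initialisation"),
    0b1000: ("RUN", "Good to go"),
    0b1100: ("ERR_R", "Error in rx"),
    0b1101: ("ERR_T", "Error in time stamp check"),
}


def describe_pdts_state(code):
    """Return a one-line description for a PDTS_STATE code."""
    name, description = PDTS_STATES.get(
        code, ("UNKNOWN", "Code not listed in the table")
    )
    return f"0x{code:X} {name}: {description}"


print(describe_pdts_state(0x6))
```

Codes that do not appear in the table are reported as UNKNOWN rather than guessed.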

More Advanced Commands from Dan Gastler

Something to check when you are having problems like this is to look at the FEMB data using "spy_femb N"
This dumps the raw FEMB links, so I usually check links 0, 4, 8, and 12 to look at the first links on each FEMB.
If things are running, you should be able to see some FEMB frames in the data.
Look for something like the following: 
  0141: 0 0x1F
  0142: 0 0xFE
  0143: 0 0xE1
  0144: 0 0x1F
  0145: 0 0xFE
  0146: 1 0x3C
  0147: 1 0x3C
  0148: 1 0x3C
  0149: 1 0x3C
  0150: 1 0xBC
  0151: 1 0x3C
  0152: 0 0x71
  0153: 0 0xDD
  0154: 0 0xC7
  0155: 0 0x32
  0156: 0 0x00
  0157: 0 0x00
  0158: 0 0x00
  0159: 0 0x00

This is the end of one frame and the beginning of another. (1 0xBC followed by 1 0x3C is the start of a frame).

It is also useful to see if the FEMBs are responding to the WIB fast commands.
If you write a '1' to FEMB_CNC.FEMB_STOP, then all the spy_femb dumps should just give "1 0x3C" for idle.
Then writing a '1' to FEMB_CNC.FEMB_START should return them to something like the above dump.
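
Scanning a long dump by eye is error-prone, so the two checks described above (finding a frame start and recognising an idle link) can be sketched in a few lines. This assumes each dump line has the form "index: k-flag 0xBYTE" as in the example above; the function names are hypothetical and the dump text has to be captured from spy_femb separately.

```python
import re

# Matches dump lines such as "  0150: 1 0xBC" -> (index, k-flag, byte).
LINE_RE = re.compile(r"^\s*(\d+):\s+([01])\s+0x([0-9A-Fa-f]{2})\s*$")


def parse_spy_dump(text):
    """Turn a spy_femb text dump into a list of (index, flag, byte) tuples."""
    words = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            index, flag, byte = match.groups()
            words.append((int(index), int(flag), int(byte, 16)))
    return words


def find_frame_starts(words):
    """Indices where '1 0xBC' is immediately followed by '1 0x3C'."""
    starts = []
    for (index, flag, byte), (_, next_flag, next_byte) in zip(words, words[1:]):
        if (flag, byte) == (1, 0xBC) and (next_flag, next_byte) == (1, 0x3C):
            starts.append(index)
    return starts


def is_idle(words):
    """True if the dump is non-empty and every word is the idle '1 0x3C'."""
    return bool(words) and all((f, b) == (1, 0x3C) for _, f, b in words)
```

On the example dump above, find_frame_starts reports index 150. After writing a '1' to FEMB_CNC.FEMB_STOP, is_idle should return True for every link.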

You can also run "status 4 FEMB_CNT" to see whether there are any bad RX Start counts or whether the frame rate is anything other than 2000000 (+/- a few is OK).
Here is what it looks like now:
>status 4 FEMB_CNT 
Process done
      FEMB1 LINK1| 0xCA739DE5|         2000000|            0|               0|
      FEMB1 LINK2| 0xCA73AA10|         2000000|            0|               0|
      FEMB1 LINK3| 0xCA73B663|         2000000|            0|               0|
      FEMB1 LINK4| 0xCA73C27F|         2000000|            0|               0|
      FEMB2 LINK1| 0xCA73CFEB|         2000000|            0|               0|
      FEMB2 LINK2| 0xCA73DC47|         2000000|            0|               0|
      FEMB2 LINK3| 0xCA73E87B|         2000000|            0|               0|
      FEMB2 LINK4| 0xCA73F4C5|         2000000|            0|               0|
      FEMB3 LINK1| 0xCA740171|         2000000|            0|               0|
      FEMB3 LINK2| 0xCA740D2D|         2000000|            0|               0|
      FEMB3 LINK3| 0xCA7419AF|         2000000|            0|               0|
      FEMB3 LINK4| 0xCA74258F|         2000000|            0|               0|
      FEMB4 LINK1| 0xCA743277|         2000000|            0|               0|
      FEMB4 LINK2| 0xCA743EEE|         2000000|            0|               0|
      FEMB4 LINK3| 0xCA744B83|         2000000|            0|               0|
      FEMB4 LINK4| 0xCA7458A4|         2000000|            0|               0|
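
With 16 links per WIB it can help to flag the suspicious rows automatically. The sketch below parses output shaped like the table above. It makes several assumptions that are not confirmed here: the third column is taken to be the frame rate, the last two columns are taken to be error counters, and "+/- a few" is interpreted as a tolerance of 5. The function names are hypothetical.

```python
NOMINAL_FRAME_RATE = 2_000_000
TOLERANCE = 5  # assumed meaning of "+/- a few"


def parse_femb_cnt(text):
    """Parse 'status 4 FEMB_CNT' output into one dict per link row."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Data rows look like: name | hex word | frame rate | count | count |
        if len(cells) < 5 or not cells[0].startswith("FEMB"):
            continue
        rows.append({
            "link": cells[0],
            "frame_rate": int(cells[2]),        # assumed column meaning
            "error_counts": [int(cells[3]), int(cells[4])],  # assumed
        })
    return rows


def suspicious_links(rows):
    """Names of links with an off-nominal frame rate or nonzero counters."""
    bad = []
    for row in rows:
        rate_off = abs(row["frame_rate"] - NOMINAL_FRAME_RATE) > TOLERANCE
        if rate_off or any(row["error_counts"]):
            bad.append(row["link"])
    return bad
```

For the output shown above, suspicious_links returns an empty list, since every link reports 2000000 and zero counts.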

SVN: 325

-- JustinHugon1 - 2017-11-02

Topic revision: r15 - 2018-07-27 - DavidGeoffreySavage