The CRT produces 3 kinds of log files which I've sorted in order of likely relevance to end-users. To see any log files, you'll have to LogInToCRTServer first.

/log/boardreader/

All board reader's stdout is redirected to a log file. This can include some of the other kinds of log files below as well as the LArSoft message facility. At the beginning of each hour, all log files more than 3 days old are deleted by a cron job. You can probably find much of this information on Kibana without reading log files by hand. To view the log file for a particular board reader:

  1. cd /log/boardreader/ #works on servers that start other board readers too
  2. Log file names encode DAQ partition (part0), board reader name (crt0), and time when they were produced.
  3. For example:

    part0-crt3-20180920102434.log

    Was produced for the crt3 board reader (name of leaf node under BoardReaders/CRT in run control window) on partition 0 on September 20, 2018 for a run that booted at 10:24:34.

TRACE Buffer

TRACE can send log messages to a live buffer in addition to stdout. You might need TRACE to see some lower-level debugging message from the CRT board reader. The TRACE buffer shows the most recent messages first, and old messages are constantly pushed off the end of the buffer and lost (as far as I know). To view TRACE logs for the CRT board reader:

  1. Set up to use TRACE with the "artdaq folder selection", refered to as <artdaq area> here, in the partition that you want to debug:

    source <artdaq area>/setupDUNEARTDAQ_forTRACE #For old artdaq areas, setupDUNEARTDAQ_forTRACE might be missing. You should use setupDUNEARTDAQ_forRunning instead

  2. TRACE provides a couple of commands that send the current TRACE buffer to stdout. If you want to read them line-by-line and search them, I recommend you use less:

    tshow | tdelta -ct 1 | less
  3. You'll see something like:

    0 11-02 11:27:38.255805 177952 177952 19 MetricManager dbg . do_stop Complete
    1 11-02 11:27:38.255803 177953 177953 17 MetricManager dbg . do_stop Complete
    2 11-02 11:27:38.255801 177953 177953 17 MetricManager dbg . Joining Metric-Sending thread
    3 11-02 11:27:38.255794 177952 177952 19 MetricManager dbg . Joining Metric-Sending thread
    4 11-02 11:27:38.255793 177953 177953 17 MetricManager dbg . Stopping Metrics
    5 11-02 11:27:38.255792 177952 177952 19 MetricManager dbg . Stopping Metrics
    6 11-02 11:27:38.255783 177952 177952 19 MetricManager dbg . MetricManager is shutting down...
    7 11-02 11:27:38.255780 177953 177953 17 MetricManager dbg . MetricManager is shutting down...
    8 11-02 11:27:38.255553 177952 177952 19 crt0_xmlrpc_commander dbg . server terminated
    9 11-02 11:27:38.255539 177953 177953 17 crt1_xmlrpc_commander dbg . server terminated
    10 11-02 11:27:38.254723 178089 178089 22 MetricManager dbg . do_stop Complete
    11 11-02 11:27:38.254721 178089 178089 22 MetricManager dbg . Joining Metric-Sending thread
    12 11-02 11:27:38.254720 178089 178089 22 MetricManager dbg . Stopping Metrics

    In the first line:
    • 11-02 means the message was written on November 2
    • 11:27:38.255805 is the time in hours:minutes:seconds:fractions when the message was written
    • MetricManager could be either the name of an artdaq "component" or a name specified as the second argument to the TLOG() macro in a source file (like CRTInterface in TLOG(TLVL_INFO, "CRTInterface"))
    • dbg, which could also be the number of a custom TRACE logging level, tells us this is a TLVL_DEBUG message. Other message types are

Backend log files

The CRT backend writes its own log files at ~np04daq/crt/readout_linux/log/ . Their contents are probably not helpful to anyone except Camillo, but their timestamps can tell experts about when hardware problems occurred. There are 2 kinds of CRT backend log files:

  1. "out" files like usb_22_out.log: Generally tells us when the backend reaches certain stages of running. Once the CRT board reader has configured in a well-behaved job, the "out" logfile for each CRT board USB stream (1 per board reader) should look like:

    Run Number: 201809142781
    Target Path: CRTDAQ/DATA/
    Targer Directory: /data1/CRTDAQ/DATA/
    Inhibit: -1
    USB 22: inhibiting writing to file
    Inhibit: -3
    Run number to be assigned: 201809142781
    Data folder: /data1/CRTDAQ/DATA/
    Inhibit: -2
    USB 22: releasing inhibit for baseline
    Inhibit: -1
    USB 22: inhibiting writing to file
    Inhibit: 0
    USB 22: releasing inhibit

    Any other messages, especially at the end, probably indicate a hardware problem that will show up in the "err" file, and you should contact the CrtExperts. To map board reader name to USB stream, ask the CrtExperts or read about CrtHardware.

  2. "err" files like usb_22_err.log: "err" files only get written to when something goes wrong. They have always seemed rather cryptic to me, but Camillo can sometimes extract useful information from them. "err" files seem to be constantly appended to and never emptied, so you should not necessarily be worried that they have something in them. Things written many times to "err" files almost always indicate hardware problems that the CrtExperts need to handle. Occasionally, "err" files have 1 or 2 lines written to them in an otherwise-well-behaved run, and this hasn't seemed to stop us from taking good data yet. An "err" file should look like:

    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22
    libusb: warning [handle_timeout] async cancel failed -5 errno=22

    This is an error that seems to be particularly common and not necessarily a cause for alarm. Leave "err" files to the CrtExperts if you can.

-- Main.anolivie - 2018-11-02

Edit | Attach | Watch | Print version | History: r1 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r1 - 2018-11-02 - AndrewPaulOlivier
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    CENF All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright & 2008-2020 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback