ROS oncall expert

Last recipe (Feb 8 2011)

  • Copy setupDAQ from ../joos directory
  • "source setupDAQ 301d"
  • more /proc/robin
  • cd /clients/daq_area/tools/ros/ROBIN
  • upload_fw_301_classic 0 (or something similar)
  • robinscope -m 0 -l 0 -s (I checked that swVersion was 0xa0018, as Louis said it should be)
  • Useful place to look for tips: the .bashrc and .bash_profile files in the ../joos directory
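The swVersion check in the recipe above can be sketched as a small shell helper. This is only an illustration: check_sw_version is a hypothetical function name, and the value is assumed to be read by hand from the "robinscope -m 0 -l 0 -s" output, since the layout of that output is not described here.

```shell
# Expected swVersion, as quoted in the recipe above.
EXPECTED_SW_VERSION=0xa0018

# check_sw_version VALUE   (hypothetical helper, not part of the ROS tools)
# VALUE is the swVersion copied by hand from the robinscope status output.
check_sw_version() {
    reported="$1"
    # Accept only values of the form 0x<hex digits>.
    case "$reported" in
        0[xX]*) ;;
        *) echo "not a hex value: $reported"; return 2 ;;
    esac
    hex="${reported#0[xX]}"
    case "$hex" in
        ""|*[!0-9a-fA-F]*) echo "not a hex value: $reported"; return 2 ;;
    esac
    # Compare numerically so that 0xA0018 and 0x000a0018 also match.
    if [ "$(( $reported ))" -eq "$(( $EXPECTED_SW_VERSION ))" ]; then
        echo "swVersion OK ($reported)"
    else
        echo "swVersion MISMATCH: got $reported, expected $EXPECTED_SW_VERSION"
        return 1
    fi
}
```

Usage: "check_sw_version 0xa0018" prints an OK line; any other value prints a mismatch and returns a non-zero status.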

Sanity check after intervention (April 15 2011)

  • Ping the machine
  • After a new cooler installation: from grape, check "IPMI sensor data"
  • Check robin sanity (I don't remember with which button)
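The first step of the check above (pinging the machine) could be wrapped as follows. This is a minimal sketch: check_host_alive and the PING_CMD override are hypothetical names introduced for illustration, and the host name is whatever ROS PC was worked on.

```shell
# check_host_alive HOST   (hypothetical helper)
# Sends three pings and reports whether the host answered.
# PING_CMD can be overridden, e.g. for testing; it defaults to the system ping.
check_host_alive() {
    host="$1"
    if "${PING_CMD:-ping}" -c 3 "$host" >/dev/null 2>&1; then
        echo "$host: reachable"
    else
        echo "$host: NOT reachable"
        return 1
    fi
}
```

The IPMI sensor and robin sanity checks are done from grape, so they are not covered by this sketch.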

About optical links (Sept 26 2011)

  • A single ROL can be restarted with "robinscope -m X -l Y -r". This may resolve some S-Link related problems. During a run it should only be done if the respective link has been disabled (stopless removal) by the control room.
  • If this does not help, one could restart the ROBIN with "robinscope -m X -l Y -R". In that case all active ROLs of that ROBIN (up to 3) have to be disabled.
  • In case this does not work either, the next step is a reboot of the PC.
  • It is difficult to define when to use robinscope and when to do the reboot immediately (as you did). In your particular case rebooting the computer from home was a good decision because it is the operation with the highest potential for fixing an intermittent problem and the time that one could save with robinscope is minimal.
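The escalation described above can be summarised as a helper that prints the command for each step without running it. This is a sketch only: rol_recovery_cmd is a hypothetical name, the robinscope options are the ones quoted above, and the reboot step is left as a reminder because the notes above do not say how the reboot is triggered.

```shell
# rol_recovery_cmd STEP MODULE LINK   (hypothetical helper)
# Prints, but does not run, the action for one escalation step:
#   1 = restart a single ROL, 2 = restart the whole ROBIN, 3 = reboot the PC.
rol_recovery_cmd() {
    step="$1"
    module="$2"
    link="$3"
    case "$step" in
        1) echo "robinscope -m $module -l $link -r" ;;
        2) echo "robinscope -m $module -l $link -R" ;;
        3) echo "reboot the PC" ;;
        *) echo "unknown step: $step" >&2; return 1 ;;
    esac
}
```

For example, "rol_recovery_cmd 1 0 2" prints "robinscope -m 0 -l 2 -r". Printing rather than executing keeps the decision with the expert, since during a run the affected links have to be disabled by the control room first.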

MBOX2 error (April 15 2011)

  • After installing the new cooling system, a cross-check of the system gave an error from robin 1: the MBOX2 value was different from the expected reference (which is the same for everybody...5ff..)
  • The MBOX2 value tells us the status of the application that runs on the PowerPC in the robin
  • To cure the problem:
  1. Warm reset of robin
    • Log in to the machine
    • Set the environment (typical alias go301)
    • "robinscope -m 1 -R" resets the robin (if you don't remember the options, use "robinscope -h" for help).
  2. Reset the PC (if warm reset of robin has not worked).
    • Using grape (doing it directly on the PC is equivalent), press "IPMI power down".
    • You can cross-check that everything is down by pressing "IPMI sensor" (no values should show up in any entry).
    • "IPMI power up".
  3. Change robin (if warm reset of robin and PC reset have not worked).

Useful twikis

-- TeresaFonsecaMartin - 08-Feb-2011

Topic revision: r3 - 2011-09-26 - unknown