Checking the ILCDirac Server Status

Dirac Web Interface

Look at the SystemAdministration:
  • Click on the machines and select "Show Errors". In many cases the error message is just a warning and can be ignored; telling the two apart takes some time to get used to.

  • Look at the system load.

Checking the status on the machines

Log on to the VOBOX of interest: voilcdirac01, voilcdirac02, voilcdirac03, etc.

 ssh voilcdirac01 

Become the dirac user; you need to be dirac to start or stop services:

 sudo su dirac

Source the dirac environment if needed:

 source /opt/dirac/bashrc

Then go to the startup directory, which holds the services and agents running on the machine:

 cd /opt/dirac/startup

Check the disk space with

df -h
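Since /opt/dirac is the filesystem that matters on these machines, the check can be narrowed to it. The following is a minimal sketch, not part of the Dirac tooling: the function names and the 90% warning threshold are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: warn when the filesystem holding a directory is nearly full.
# The 90% threshold below is an assumption, not a value from this page.

# Print the usage percentage (digits only) of the filesystem containing $1.
# "df -P" forces one output line per filesystem; column 5 is the capacity.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print WARNING if usage $1 has reached threshold $2, otherwise OK.
classify_usage() {
    if [ "$1" -ge "$2" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

dir="/opt/dirac"
# Only report when the directory exists on this machine.
if [ -d "$dir" ]; then
    usage="$(disk_usage_percent "$dir")"
    echo "$dir: ${usage}% used, $(classify_usage "$usage" 90)"
fi
```

Run on a VOBOX, this prints a single line for /opt/dirac instead of the full df -h table.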

/opt/dirac should never be 100% full; when it is, the services start to have problems. In the worst case, the web page fails because it cannot write anything to its cache. To "fix" the situation, restarting the services is usually enough: the MySQL cache is emptied and some disk space is recovered. This lets the agents work again (in particular the JobCleaningAgent). The rest of this section explains how to restart them.

All services and agents run under the runit framework (http://smarden.org/runit/). Dirac comes with a set of handy commands for supervising them:

 runsvctrl t path/to/service 
restarts the service at path/to/service (example: DataManagement_FileCatalog). To restart an agent properly, create an empty file called stop_agent under /opt/dirac/control/System/Agent.
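Creating the stop_agent file can be sketched as a small helper. The function name request_agent_stop and the CONTROL_DIR variable are hypothetical; the default directory is the one given above, and the system and agent names are supplied by the caller.

```shell
#!/bin/sh
# Sketch: request an agent restart by creating its stop_agent file.
# CONTROL_DIR and request_agent_stop are illustrative names, not Dirac commands.
CONTROL_DIR="${CONTROL_DIR:-/opt/dirac/control}"

# $1 = system name, $2 = agent name
request_agent_stop() {
    target="$CONTROL_DIR/$1/$2"
    # Refuse to create the file anywhere but an existing control directory.
    if [ ! -d "$target" ]; then
        echo "no control directory: $target" >&2
        return 1
    fi
    # An empty file is all that is needed.
    touch "$target/stop_agent"
}
```

For example, assuming the JobCleaningAgent lives under the WorkloadManagement system on your installation, the call would be request_agent_stop WorkloadManagement JobCleaningAgent.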

 runsvctrl d path/to/service
takes down the service
 runsvctrl u path/to/service
brings the service back up after it has been taken down with the previous command.

One can also use

 runsvstat *
to see what is running and what is down. All services on volcd01 should be running.
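When many services are listed, it can help to show only the ones that are not running. This is a minimal sketch: list_not_running is a hypothetical helper, and it assumes each runsvstat line has the form "name: run (pid N) S seconds" or "name: down S seconds", so check the pattern against the output on your machine.

```shell
#!/bin/sh
# Sketch: keep only the runsvstat lines whose state is not "run".
# Assumes the second whitespace-separated field is the state
# (for example "run" or "down") and that service names contain no spaces.
list_not_running() {
    awk '$2 != "run" { print }'
}
```

Typical use, from the startup directory: runsvstat * | list_not_running. An empty result means everything is reported as running.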

-- AndreSailer - 2014-10-30

Topic revision: r2 - 2014-12-08 - AndreSailer
 