DIRAC3 Administration Procedure

Reboot of CERN VOBoxes

Rebooting a machine for a kernel upgrade

When a reboot is needed, the VOC or AM can perform it, provided they have permission to use SMS and to reboot the machine. Four basic steps are needed to reboot a machine without raising any alarms:

Putting the machine into maintenance state

This can be done on lxvoadm.cern.ch:
sms set maintenance 'kernel upgrade' 'scheduled software upgrade' volhcb01

Rebooting the machine

This has to be done on the machine itself with superuser permissions (either as root or via sudo). The machine can be rebooted immediately or at a scheduled time:
# reboot the machine right away
sudo shutdown -r now
# reboot the machine at a specific time
# (useful if users are connected to the machine via ssh: they will be warned)
sudo shutdown -r 16:00

(If you log in with your own username and get a message stating that the machine is in maintenance and that you are not allowed to connect, you can still log in by pressing Ctrl+C as soon as that message appears.)

Checking if the machine came back OK

After the machine reboots, it is absolutely necessary to check that there are no alarms before putting it back into production. Log in to the machine and execute:
sudo lemon-host-check
This command must report that there are no active exceptions. The machine takes a few minutes to finish booting; until then, the command reports [ERROR] No monitoring agent process running. If you see that, wait a couple of minutes and try again until you see something like this:
[INFO] lemon-host-check version 1.2.2 started by root at Thu May 22 15:40:46 2008 on volhcb01.cern.ch
[VERB] Exceptions: 0 - Running actuators: 0 - Disabled exceptions: 2 - State: Maintenance

If lemon-host-check reports any active exceptions, the maintenance state must not be cleared! Please contact vobox.support@cern.ch if you don't know how to solve the problem.
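The wait-and-retry check described above could be scripted roughly as follows. This is only a sketch: the function names active_exceptions and wait_for_clean_host are made up for illustration, and the parsing assumes the "[VERB] Exceptions: N - ..." line always looks like the sample output shown above.

```shell
#!/bin/sh
# Sketch: poll a host check until it reports zero active exceptions.
# Helper names are hypothetical; the output format is assumed from the sample above.

# Read check output on stdin and print the active exception count.
# Prints nothing if no "[VERB] Exceptions: N" line is present
# (for example while the monitoring agent is not running yet).
active_exceptions() {
    sed -n 's/^\[VERB\] Exceptions: \([0-9][0-9]*\) .*/\1/p' | head -n 1
}

# Usage: wait_for_clean_host TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 once the count is 0, 1 if exceptions are active,
# and 2 if the check never produced a count.
wait_for_clean_host() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        count=$("$@" 2>&1 | active_exceptions)
        case "$count" in
            0)  echo "no active exceptions"
                return 0 ;;
            '') echo "monitoring agent not ready yet" ;;
            *)  echo "$count active exception(s): do not clear maintenance"
                return 1 ;;
        esac
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 2
}

# Example call on the rebooted machine (10 tries, one minute apart):
# wait_for_clean_host 10 60 sudo lemon-host-check
```

Passing the check command as arguments keeps the retry logic separate from lemon-host-check itself, so the loop can be tried out with a stand-in command first.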

Putting the machine back into production

As in the first step, this is done on lxvoadm.cern.ch:

sms clear maintenance 'kernel upgrade' volhcb01
# and just to be sure
sms get volhcb01

VOLHCB06

To keep the /storage disk clean, a cron job runs every day. It works as follows:

  • cleaning is triggered only if the disk usage is higher than 90%
  • in the SAM area, files older than 2 months are removed
  • in the test area (used for test productions), files older than 4 days are removed.
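The cleaning rules above can be sketched in shell as follows. This is not the actual cron script: the function names are hypothetical, the directory names in the example call are placeholders (the real SAM and test area paths are not given here), and "2 months" is approximated as 60 days.

```shell
#!/bin/sh
# Sketch of a daily cleanup following the rules above.
# Function names and example directories are hypothetical.

# Print the usage of the filesystem holding $1 as an integer percentage.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Delete regular files under directory $1 whose last modification
# is more than $2 days old.
purge_older_than() {
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}

# Usage: clean_storage MOUNT_POINT THRESHOLD_PERCENT SAM_DIR TEST_DIR
# Does nothing unless usage of MOUNT_POINT is above THRESHOLD_PERCENT.
clean_storage() {
    mount_point=$1
    threshold=$2
    sam_dir=$3
    test_dir=$4
    [ "$(disk_usage_percent "$mount_point")" -gt "$threshold" ] || return 0
    purge_older_than "$sam_dir" 60    # roughly 2 months (assumption)
    purge_older_than "$test_dir" 4
}

# Example call with placeholder directory names:
# clean_storage /storage 90 /storage/sam /storage/test
```

Checking the threshold first means the find commands only run when the disk is actually filling up, which matches the first rule in the list.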

Creation of the WorkflowLib tarball

As soon as the WorkflowLib package is tagged with a tag name like wkf-vXrY, you can create the tarball with the following command:

dirac-distribution -W wkf-vXrY

Then you need to copy this tarball to an SE:

dirac-dms-add-file LFN:/lhcb/applications/WorkflowLib-wkf-vXrY.tar.gz /afs/cern.ch/lhcb/distribution/DIRAC3/WorkflowLib-wkf-vXrY.tar.gz CERN-disk
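Since the tag appears three times in the two commands above, a small helper that fills it in can avoid typos. The sketch below only prints the commands so they can be reviewed before being run. The function names are hypothetical, and the tag check assumes that X and Y in wkf-vXrY are plain numbers.

```shell
#!/bin/sh
# Sketch: print the two release commands for one WorkflowLib tag.
# Function names are hypothetical.

# Succeed only for tags shaped like wkf-vXrY with numeric X and Y
# (assumed convention).
is_workflowlib_tag() {
    printf '%s\n' "$1" | grep -Eq '^wkf-v[0-9]+r[0-9]+$'
}

# Print the tarball creation and upload commands with the tag filled in.
workflowlib_release_commands() {
    tag=$1
    if ! is_workflowlib_tag "$tag"; then
        echo "unexpected tag name: $tag" >&2
        return 1
    fi
    tarball="WorkflowLib-$tag.tar.gz"
    echo "dirac-distribution -W $tag"
    echo "dirac-dms-add-file LFN:/lhcb/applications/$tarball /afs/cern.ch/lhcb/distribution/DIRAC3/$tarball CERN-disk"
}

# Example: workflowlib_release_commands wkf-v9r3
```

The example tag wkf-v9r3 is invented for illustration; use the tag that was actually created for the release.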
Topic revision: r5 - 2009-02-05 - JoelClosier
 