DIRAC3 Administration Procedure
Reboot of CERN VOBoxes
Rebooting a machine for a kernel upgrade
When a reboot is needed, this operation can be done by the VOC or AM as long as they have permission to use SMS and to reboot the machine. There are four basic steps needed to reboot a machine without raising any alarms:
Putting the machine into maintenance state
This can be done on lxvoadm.cern.ch:
sms set maintenance 'kernel upgrade' 'scheduled software upgrade' volhcb01
Rebooting the machine
This has to be done on the machine itself with superuser permissions (either as root or via sudo). The machine can be rebooted immediately or at a scheduled time:
# reboot the machine right away
sudo shutdown -r now
# reboot the machine at a specific time
# (useful if users are connected to the machine via ssh; they will be warned)
sudo shutdown -r 16:00
(If you log in with your username and get a message stating that the machine is in maintenance and that you are not allowed to connect, you can still log in by pressing Ctrl+C as soon as you get that message.)
Checking if the machine came back OK
After the machine reboots, it is absolutely necessary to check that there are no alarms before putting it into production again. Log in to the machine and execute:
sudo lemon-host-check
This command must report that there are no active exceptions. The machine takes a few minutes to finish booting; until then, the command will report:
[ERROR] No monitoring agent process running
If this is the case, please wait a couple of minutes and try again until you see something like this:
[INFO] lemon-host-check version 1.2.2 started by root at Thu May 22 15:40:46 2008 on volhcb01.cern.ch
[VERB] Exceptions: 0 - Running actuators: 0 - Disabled exceptions: 2 - State: Maintenance
If lemon-host-check reports any active exceptions, the maintenance state must not be cleared! Please contact vobox.support@cern.ch if you don't know how to solve the problem.
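The wait-and-retry check described above could be scripted roughly as follows. This is only a sketch: the function names (active_exceptions, host_is_clean, wait_until_clean) are hypothetical, and the parsing assumes the "[VERB] Exceptions: N - ..." line format shown in the sample output.

```shell
# Print the number of active exceptions found in lemon-host-check output
# read from stdin. Assumes the "[VERB] Exceptions: N - ..." format shown
# in the sample output; prints nothing if no such line is present
# (for example while the monitoring agent is not running yet).
active_exceptions() {
    sed -n 's/^\[VERB] Exceptions: \([0-9][0-9]*\) .*/\1/p' | head -n 1
}

# Succeed only if the output on stdin reports exactly zero active exceptions.
host_is_clean() {
    [ "$(active_exceptions)" = "0" ]
}

# Run the given check command up to $1 times, sleeping $2 seconds between
# attempts, until its output reports zero active exceptions.
# Hypothetical usage: wait_until_clean 10 120 sudo lemon-host-check
wait_until_clean() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" 2>&1 | host_is_clean; then
            exit 0
        fi
        # Only sleep if another attempt will follow.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    exit 1
)
```

A non-zero exit status from wait_until_clean means either that exceptions are still active or that the monitoring agent never came up; in both cases the maintenance state should be left in place.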
Putting the machine back into production
This can be done again on lxvoadm.cern.ch:
sms clear maintenance 'kernel upgrade' volhcb01
# and just to be sure
sms get volhcb01
VOLHCB06
A cron job runs every day to clean the /storage disk. It works as follows:
- if the disk usage is higher than 90%, the cleaning is activated
- in the SAM area, files older than 2 months are removed
- in the test area (used for test productions), files older than 4 days are removed.
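The logic of such a cleaning job might look like the sketch below. It is not the actual cron script: the function names are hypothetical, the SAM and test area paths must be supplied by the caller, and "2 months" is approximated as 60 days.

```shell
# Print the usage percentage (without the % sign) of the filesystem
# that holds the path given as $1.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Delete regular files under directory $1 that were last modified more
# than $2 whole days ago (find rounds file ages down to whole days).
clean_older_than() {
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}

# Clean the SAM area ($2) and the test area ($3) only when the disk
# holding $1 is more than 90% full.
# Hypothetical usage: clean_storage /storage /storage/sam /storage/test
clean_storage() (
    storage=$1
    sam_area=$2
    test_area=$3
    if [ "$(disk_usage_percent "$storage")" -gt 90 ]; then
        clean_older_than "$sam_area" 60   # assumption: 2 months = 60 days
        clean_older_than "$test_area" 4
    fi
)
```

Keeping the threshold test separate from the deletion makes it easy to try clean_older_than on a scratch directory before pointing it at real data.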
Creation of the WorkflowLib tarball
As soon as the WorkflowLib package is tagged with a tag name like wkf-vXrY, you can create the tarball with the following command:
dirac-distribution -W wkf-vXrY
Then you need to copy this tarball to an SE:
dirac-dms-add-file LFN:/lhcb/applications/WorkflowLib-wkf-vXrY.tar.gz /afs/cern.ch/lhcb/distribution/DIRAC3/WorkflowLib-wkf-vXrY.tar.gz CERN-disk
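Since the tag name appears three times in the two commands above, a small helper that checks the tag and prints the commands for review can reduce copy-and-paste mistakes. The sketch below is illustrative only: the function names are hypothetical, the tag pattern assumes X and Y are plain integers, and nothing is executed, the commands are only printed.

```shell
# Succeed if $1 looks like a WorkflowLib tag of the form wkf-vXrY.
# Assumption: X and Y are non-negative integers.
is_workflow_tag() {
    printf '%s\n' "$1" | grep -Eq '^wkf-v[0-9]+r[0-9]+$'
}

# Print (without running them) the two commands needed to build and
# upload the tarball for tag $1, using the paths given in this procedure.
workflow_release_commands() {
    is_workflow_tag "$1" || { echo "unexpected tag: $1" >&2; return 1; }
    tarball="WorkflowLib-$1.tar.gz"
    echo "dirac-distribution -W $1"
    echo "dirac-dms-add-file LFN:/lhcb/applications/$tarball /afs/cern.ch/lhcb/distribution/DIRAC3/$tarball CERN-disk"
}
```

For example, workflow_release_commands wkf-v9r12 (a made-up tag) prints the two command lines with the tag filled in, ready to be checked and run by hand.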