FTS web-service maintenance failover
What is it?
This is the procedure for taking down an FTS web-service node for maintenance.
It assumes you have more than one FTS web-service node behind a load-balanced production alias.
When to use it?
When you want to take down an FTS web-service node for maintenance.
Failover
Provided you use SMS and allow some time for the change to take effect, the failover is automatic: the node is taken out of the load-balanced DNS alias.
Note that, in a two-node cluster, you should expect to see double the load on the one remaining FTS web-service machine, since all user requests will now be served by that node. This may cause some performance degradation.
Procedure
The example uses the two-node load-balanced cluster at CERN. The SMS controls are part of the CERN CC environment; substitute appropriately for your site.
Take the node down for maintenance
For example, a kernel upgrade:
sms set maintenance "kernel upgrade" "Upgrade to new SLC3 kernel" fts103
Wait for the node to drop out of the load-balanced alias.
host prod-fts-ws.cern.ch
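Rather than re-running host by hand, the wait can be scripted. The following is a minimal sketch, assuming the alias publishes each member node as an IPv4 "has address" line in the host output. The function names, the 60-second default interval and the example IP address are illustrative and not part of the CERN tooling.

```shell
#!/bin/sh
# Succeeds if IPv4 address $1 appears in `host` output read from stdin.
# Matches whole fields, so 192.0.2.1 does not match 192.0.2.10.
alias_contains_ip() {
    awk -v ip="$1" '$2 == "has" && $3 == "address" && $4 == ip { found = 1 }
                    END { exit !found }'
}

# Poll the alias until the node's address is no longer listed.
# $1 = alias name, $2 = node IPv4 address, $3 = poll interval in seconds.
# A failed DNS lookup is treated as "unknown" and simply retried,
# so this loops forever if DNS stays broken.
wait_until_out_of_alias() {
    alias_name=$1
    node_ip=$2
    interval=${3:-60}
    while :; do
        if out=$(host "$alias_name"); then
            printf '%s\n' "$out" | alias_contains_ip "$node_ip" || return 0
        fi
        sleep "$interval"
    done
}
```

For example, wait_until_out_of_alias prod-fts-ws.cern.ch 192.0.2.10 would return once that (placeholder) address has left the alias. Substitute the real address of the node under maintenance.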
Do what you need to do
e.g. reboot the node.
If you reboot the node, the FTS web-service daemon (Tomcat5) will be started automatically in runlevels 3, 4 and 5.
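Before rebooting, it may be worth confirming that the daemon really is enabled in those runlevels. This is a sketch only: it assumes the init service is named tomcat5 (inferred from the daemon name above, not confirmed) and that chkconfig --list prints the usual "N:on" / "N:off" columns in an English locale. The function name is illustrative.

```shell
#!/bin/sh
# Succeeds if `chkconfig --list <service>` output on stdin shows
# runlevels 3, 4 and 5 all marked "on".
starts_on_boot() {
    awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^[345]:on$/) n++ }
         END { exit !(n == 3) }'
}
```

Typical use would be: chkconfig --list tomcat5 | starts_on_boot && echo "enabled in 3, 4 and 5".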
Re-enable the node
sms set production none "Kernel upgrade done" fts103
Wait for the host to come back into the load-balanced alias before doing anything to the other node.
host prod-fts-ws.cern.ch
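One way to decide that the alias is complete again is to count the addresses it returns and compare that with the number of nodes in the cluster (two in the CERN example). The sketch below assumes the alias lists every healthy member as an IPv4 "has address" line. The function names are illustrative.

```shell
#!/bin/sh
# Print the number of IPv4 "has address" lines in `host` output on stdin.
count_alias_addresses() {
    awk '$2 == "has" && $3 == "address" { n++ } END { print n + 0 }'
}

# Succeeds if the alias lists exactly $1 addresses (`host` output on stdin).
alias_is_full() {
    [ "$(count_alias_addresses)" -eq "$1" ]
}
```

For the two-node cluster: host prod-fts-ws.cern.ch | alias_is_full 2 && echo "safe to proceed with the other node".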
Last edit:
SteveTraylen on 2007-04-10 - 13:18
Maintainer:
GavinMcCance