FTS web-service maintenance failover

What is it?

This is the procedure for taking down an FTS web-service node for maintenance.

It assumes you have more than one FTS web-service node behind a load-balanced production alias.

When to use it?

When you want to take down an FTS web-service node for maintenance.


Provided you use SMS and allow some time for the change to take effect, the failover is automatic: the node is taken out of the load-balanced DNS alias.

Note that you should expect roughly double the load on the one FTS web-service machine that is left, since it now services all user requests. This may cause some performance degradation.


The example uses the two-node load-balanced cluster at CERN. The SMS controls are part of the CERN CC environment; substitute appropriately for your site.

Take the node down for maintenance

For example, a kernel upgrade:

sms set maintenance "kernel upgrade" "Upgrade to new SLC3 kernel" fts103

Wait for the node to drop out of the load-balanced alias. Repeat the lookup until the node's address is no longer listed:

host prod-fts-ws.cern.ch

Do what you need to do

e.g. reboot the node.

If you reboot the node, the FTS web-service daemon (Tomcat5) is started automatically in runlevels 3, 4 and 5.
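
To confirm before rebooting that the daemon really is configured for those runlevels, the output of chkconfig can be checked. The following is a minimal sketch, not part of the official procedure. It assumes the init script is registered under the service name "tomcat5" (inferred from the daemon name above; verify on your node), and starts_at_boot is a hypothetical helper name.

```shell
# Sketch only: check one line of `chkconfig --list <service>` output,
# e.g. "tomcat5  0:off 1:off 2:off 3:on 4:on 5:on 6:off".
# starts_at_boot is a hypothetical helper, not an existing tool.
starts_at_boot() {
    for level in 3 4 5; do
        case "$1" in
            *"${level}:on"*) ;;      # enabled in this runlevel, keep checking
            *) return 1 ;;           # disabled (or missing) in this runlevel
        esac
    done
    return 0
}
```

It might be used as: starts_at_boot "$(/sbin/chkconfig --list tomcat5)" || echo "tomcat5 not enabled in runlevels 3, 4 and 5"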

Re-enable the node

sms set production none "Kernel upgrade done" fts103

Wait for the host to come back into the load-balanced alias before doing anything to the other node. Repeat the lookup until the node's address is listed again:

host prod-fts-ws.cern.ch
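
The two waits in this procedure can be scripted rather than checked by hand. The following is a minimal sketch under stated assumptions: it relies only on the "host" lookup shown above, the function names are hypothetical, and the node's IP address must be supplied by you (the 192.0.2.x addresses in the comments are documentation placeholders, not the real addresses of the CERN nodes). The default of 30 tries at 60 second intervals is an arbitrary choice.

```shell
# Sketch only: poll a load-balanced DNS alias until a node's address is
# absent from (or present in) the answer returned by `host`.

# alias_has_ip HOST_OUTPUT IP
# Succeeds if some line of HOST_OUTPUT ends with exactly IP,
# e.g. "prod-fts-ws.cern.ch has address 192.0.2.11" (placeholder address).
alias_has_ip() {
    printf '%s\n' "$1" | awk -v ip="$2" '$NF == ip { found = 1 } END { exit !found }'
}

# wait_for_alias ALIAS IP present|absent [TRIES] [DELAY_SECONDS]
# Returns 0 once the alias is in the wanted state, 1 if TRIES run out.
wait_for_alias() {
    alias_name=$1
    ip=$2
    want=$3
    tries=${4:-30}
    delay=${5:-60}
    while [ "$tries" -gt 0 ]; do
        # A failed lookup tells us nothing, so only judge successful ones.
        if out=$(host "$alias_name" 2>/dev/null); then
            if alias_has_ip "$out" "$ip"; then
                state=present
            else
                state=absent
            fi
            if [ "$state" = "$want" ]; then
                return 0
            fi
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

After setting the node to maintenance, call wait_for_alias prod-fts-ws.cern.ch <node-ip> absent. After re-enabling it, call wait_for_alias prod-fts-ws.cern.ch <node-ip> present. A non-zero return means the alias did not reach the wanted state in time and should be investigated before touching the other node.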

