Last Page Update: GavinMcCance 2008-09-16
ps aux | grep glite-transfer-channel-agent
3. Set these channels Inactive, so that new transfers will not be started. The downtime starts here. e.g.
for i in `ps aux | grep glite-transfer-channel-agent | grep edguser | \
    awk '{print $11}' | sed 's/glite-transfer-channel-agent-urlcopy-//g'` ; \
    do glite-transfer-channel-set -S Inactive $i ; \
done
4. Wait until there are no jobs running, i.e. grep the process table for processes of the form CHANNEL-NAME__*:
ps aux | grep CERN-CERN__
5. Disable the agents such that they will not start on the next reboot:
ps aux | grep glite-transfer-channel-agent | grep edguser | awk '{print $11}' > /etc/glite-data-transfer-agents.disabled
6. Stop the agents:
service transfer-agents stop
7. Move to the backup machine. Check that the file /etc/glite-data-transfer-agents.disabled does not exist and start the agents:
rm -f /etc/glite-data-transfer-agents.disabled
service transfer-agents start
8. Set all the channels Active again. The downtime ends here.
for i in `ps aux | grep glite-transfer-channel-agent | grep edguser | \
    awk '{print $11}' | sed 's/glite-transfer-channel-agent-urlcopy-//g'` ; \
    do glite-transfer-channel-set -S Active $i ; \
done
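The two loops in steps 3 and 8 differ only in the target state, so they can be sketched as a pair of small helpers. This is an illustrative sketch, not part of the FTS tooling: the names list_channels, set_channels and DRY_RUN are hypothetical, and it assumes (as the loops above do) that the agents run as edguser and that the process name appears in column 11 of the ps aux output. Unlike the loops above, it only considers urlcopy agents.

```shell
#!/bin/sh
# Hypothetical helper: print the channel names served by the urlcopy agents.
# Reads `ps aux`-style text on stdin so it can be tried without live agents.
# Assumption: agents run as edguser and column 11 holds the process name.
list_channels() {
    awk '$1 == "edguser" && $11 ~ /^glite-transfer-channel-agent-urlcopy-/ {
        sub(/^glite-transfer-channel-agent-urlcopy-/, "", $11)
        print $11
    }'
}

# Hypothetical helper: set every channel named on stdin to the given state
# (Inactive or Active). With DRY_RUN=1 the commands are only printed.
set_channels() {
    state=$1
    while read -r channel; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "glite-transfer-channel-set -S $state $channel"
        else
            glite-transfer-channel-set -S "$state" "$channel"
        fi
    done
}
```

A cautious way to use it would be to run `ps aux | list_channels | DRY_RUN=1 set_channels Inactive` first, check the printed commands, and then repeat without DRY_RUN.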
ps aux | grep glite-transfer-channel-agent-urlcopy-CERN-CERN
3. Set the associated channel Inactive. The agent downtime starts here.
glite-transfer-channel-set -S Inactive CERN-CERN
4. Wait until all the transfer processes for this agent have finished, e.g.
ps aux | grep CERN-CERN__
5. Disable this agent by adding its name to the /etc/glite-data-transfer-agents.disabled file:
echo "glite-transfer-channel-agent-urlcopy-CERN-CERN" >> /etc/glite-data-transfer-agents.disabled
6. Stop the agent (the instance name is the full process name).
service transfer-agents stop --instance glite-transfer-channel-agent-urlcopy-CERN-CERN
7. Move to the backup machine and start the new agent, checking that the file /etc/glite-data-transfer-agents.disabled (if it exists) does not contain the name of the agent you are about to start.
grep glite-transfer-channel-agent-urlcopy-CERN-CERN /etc/glite-data-transfer-agents.disabled
service transfer-agents start --instance glite-transfer-channel-agent-urlcopy-CERN-CERN
8. Set the channel Active again. The downtime stops here.
glite-transfer-channel-set -S Active CERN-CERN
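The "wait" and "check the disabled file" steps above are manual greps; a minimal sketch of how they might be scripted follows. The function names count_transfers, wait_for_idle and agent_is_disabled are hypothetical, and the sketch assumes that transfer processes carry the CHANNEL-NAME__ prefix somewhere in their ps aux line and that the disabled file holds one agent name per line. The disabled-file check matches whole lines, so an agent name that is only a prefix of another entry does not count as a match.

```shell
#!/bin/sh
# Hypothetical helper: count transfer processes for one channel in
# `ps aux`-style text on stdin. Lines whose command (column 11) is grep or
# awk are skipped so the filter does not count itself.
count_transfers() {
    awk -v pat="${1}__" '
        $11 != "grep" && $11 != "awk" && index($0, pat) { n++ }
        END { print n + 0 }'
}

# Hypothetical helper: poll until no transfer process for the channel is left.
# The second argument is the polling interval in seconds (default 10).
wait_for_idle() {
    channel=$1
    interval=${2:-10}
    while [ "$(ps aux | count_transfers "$channel")" -gt 0 ]; do
        sleep "$interval"
    done
}

# Hypothetical helper: succeed if the agent name is listed, as a whole line,
# in the disabled file. A missing file means nothing is disabled.
agent_is_disabled() {
    name=$1
    file=${2:-/etc/glite-data-transfer-agents.disabled}
    [ -f "$file" ] && grep -qxF -- "$name" "$file"
}
```

For example, `wait_for_idle CERN-CERN` would return once the channel has drained, and on the backup machine `agent_is_disabled glite-transfer-channel-agent-urlcopy-CERN-CERN` should fail before the agent is started.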
ps aux | grep glite-transfer-channel-agent | grep edguser | awk '{print $11}' > /etc/glite-data-transfer-agents.disabled
2. Stop the agents:
service transfer-agents stop
3. Start the agents on the backup machine.
service transfer-agents start
service transfer-agents start
The startup script will block for 1 minute attempting to take the lock from the agent running on the primary machine.
2. Help it along by killing all the DB sessions from the primary machine, paying particular attention to the session of the agent that is currently trying to start.
To do this, log onto the DB session manager (see FtsProcedureDropDBLock20). There is only one DB session per agent. Soon after you kill the DB session from the primary machine, the agent should start OK on the backup.
3. Cleanly stop the agent on the primary machine as soon as possible. For as long as it is still running, it will keep trying to re-establish its connection to the DB.
service transfer-agents start
If an agent daemon does not start within a few seconds, follow the procedure above for "If you cannot log onto the machine, but it is still running".
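The "does not start within a few seconds" check can be sketched as a short polling loop. This is only an illustration: agent_running and wait_for_agent are hypothetical names, the 10 second default is an arbitrary choice, and the sketch assumes that the agent's process name appears exactly in column 11 of the ps aux output, as in the commands earlier on this page.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `ps aux`-style text on stdin contains a
# process whose name (column 11) is exactly the given agent name.
agent_running() {
    awk -v name="$1" '$11 == name { found = 1 } END { exit !found }'
}

# Hypothetical helper: poll once per second until the agent shows up in the
# process table, giving up after the timeout in seconds (default 10).
wait_for_agent() {
    name=$1
    timeout=${2:-10}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if ps aux | agent_running "$name"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

After `service transfer-agents start`, a call such as `wait_for_agent glite-transfer-channel-agent-urlcopy-CERN-CERN || echo "agent did not start"` would flag the case where the DB lock procedure is needed.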