FTS Daily Procedure
Follow this procedure every day to check the general health of the ongoing transfers and to spot problems early. It is also the procedure for rescuing jobs that the FTS has given up on.
List each of the channels:
glite-transfer-channel-list
For each channel, look for jobs in the Hold state. A job in the Hold state contains files that have been tried unsuccessfully 3 times:
glite-transfer-list -c CERN-RAL Hold
or:
glite-transfer-list -c CERN-RAL Hold | wc -l
to count them.
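The two steps above can be combined into a small loop that prints a Hold count for every channel. This is only a sketch. It assumes that glite-transfer-channel-list prints one channel name per line and that glite-transfer-list prints one job per line (as the wc -l usage above implies). The variables CHANNEL_LIST_CMD and TRANSFER_LIST_CMD and the function name count_hold_jobs are hypothetical, introduced here so the commands can be swapped out.

```shell
#!/bin/sh
# Sketch: count the jobs in Hold state on every channel.
# Assumptions (not confirmed by this page): the channel-list command prints
# one channel name per line, and the transfer-list command prints one job
# per line. The two variables below are hypothetical override hooks.
CHANNEL_LIST_CMD=${CHANNEL_LIST_CMD:-glite-transfer-channel-list}
TRANSFER_LIST_CMD=${TRANSFER_LIST_CMD:-glite-transfer-list}

count_hold_jobs() {
    $CHANNEL_LIST_CMD | while read -r channel; do
        # Skip empty lines in the channel listing.
        [ -n "$channel" ] || continue
        # Count the Hold jobs on this channel; strip the padding some wc versions add.
        n=$($TRANSFER_LIST_CMD -c "$channel" Hold </dev/null | wc -l | tr -d ' ')
        printf '%s %s\n' "$channel" "$n"
    done
}
```

Calling count_hold_jobs then prints one "channel count" pair per line, which makes it easy to see which channels need a closer look.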
You should sample some of the jobs and see what the problems were:
glite-transfer-status --verbose -l fb9080da-1a2e-11da-9a6c-c2709c5b2fc9
An example output:
Request ID: fb9080da-1a2e-11da-9a6c-c2709c5b2fc9
Status: Hold
Channel: CERN-GRIDKA
Client DN: /C=UK/O=eScience/OU=Edinburgh/L=NeSC/CN=andrew cameron smith
Reason: <None>
Submit time: 2005-08-31 14:53:25.000
Files: 1
Priority: 3
VOName: lhcb
Done: 0
Active: 0
Pending: 0
Canceled: 0
Canceling: 0
Failed: 0
Finished: 0
Submitted: 0
Hold: 1
Waiting: 0
CatalogFailed: 0
Source: srm://castorgridsc.cern.ch:8443/srm/managerv1?SFN=/castor/cern.ch/grid/lhcb/production/DC04/v2/00000849/DST/00000849_00000696_5.dst
Destination: srm://f01-015-103-e.gridka.de:8443/srm/managerv1?SFN=/pnfs/gridka.de/sc3/lhcb/unregistered/andrewstest/lhcb/production/DC04/v2/00000849/DST/00000849_00000696_5.dst
State: Hold
Retries: 3
Reason: Failed on SRM put: Failed SRM put call. Error is
RequestFileStatus#-2147415143 failed with error:[ GetStorageInfoFailed : file exists, cannot write ]
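When sampling several jobs, it can help to cut the verbose output down to the per-file failure details. The filter below is a minimal sketch that assumes the output is laid out like the example above, with each file entry starting at a "Source:" line. The function name summarize_failures is hypothetical.

```shell
#!/bin/sh
# Sketch: reduce verbose status output (read from stdin) to the per-file
# Source, State, Retries and Reason lines, including any continuation lines
# of the Reason. Assumes the layout shown in the example output above.
summarize_failures() {
    awk '
        /^[ \t]*Source:/      { in_file = 1; keep = 1 }  # a file entry starts here
        /^[ \t]*Destination:/ { keep = 0 }               # drop the destination URL
        /^[ \t]*State:/       { keep = 1 }               # resume at the file state
        in_file && keep       { print }
    '
}
```

It would be used as a filter on the command shown earlier, for example: glite-transfer-status --verbose -l fb9080da-1a2e-11da-9a6c-c2709c5b2fc9 | summarize_failures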
--
GavinMcCance - 30 Sep 2005