FTS Transfer Timeouts

This page explains how the timeout on a transfer is calculated. Only the transfer phase is covered: the actual gridftp transfer for urlcopy channels, or the srmCopy operation for srmcopy channels. All other timeouts (on get/put operations, http timeouts, etc.) are not considered.

In FTS versions <= 2.1, it was only possible to set:

  • for urlcopy channels:
    • an absolute value for the timeout
    • some extra transfer failure conditions based on transfer markers
  • for srmcopy channels:
    • an absolute value for the timeout, which was then multiplied by the number of files in the request
    • a failure condition based on the refresh interval between statusOfSrmCopyRequest operations

This model was inadequate when files of very different sizes were transferred over the same channel.

Starting with FTS 2.2, more complex timeouts are introduced for the transfer and copy operations.

See also FtsDbSchema.

Url copy channels

Transfer timeout = urlcopy_tx_to + tx_to_per_mb * file size in mb

Fail the transfer if:

  • urlcopy_txmarks_to is set to a value N (not null) and no transfer markers are received for more than N seconds (regardless of whether the markers indicate progress or not).
  • url_copy_first_txmark_to is set to a value N (not null) and the first non-zero transfer marker is not received within N seconds from the start of the transfer.
  • no_tx_activity_to is set to a value N (not null) and the transfer markers do not indicate any progress for more than N seconds.

If both urlcopy_tx_to and tx_to_per_mb are zero (or null), the transfer is considered to have no timeout.

urlcopy_tx_to acts as a lower limit on the transfer timeout, so that very small files do not fail merely because the calculated timeout was too short for the transfer even to start.
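The formula above can be sketched in a few lines of Python. This is only an illustration, not the FTS implementation: the function name is hypothetical, the timeouts are assumed to be expressed in seconds, and a null (None) setting is assumed to count as zero.

```python
from typing import Optional


def urlcopy_transfer_timeout(file_size_mb: float,
                             urlcopy_tx_to: Optional[float],
                             tx_to_per_mb: Optional[float]) -> float:
    """Return urlcopy_tx_to + tx_to_per_mb * file size in MB."""
    # Assumption of this sketch: a null (None) setting counts as zero.
    base = urlcopy_tx_to or 0.0
    per_mb = tx_to_per_mb or 0.0
    return base + per_mb * file_size_mb


if __name__ == "__main__":
    # Hypothetical channel settings: 600 s fixed part, 2 s per MB.
    print(urlcopy_transfer_timeout(1, 600, 2))     # 602.0
    print(urlcopy_transfer_timeout(4000, 600, 2))  # 8600.0
```

With these hypothetical settings a 1 MB file still gets roughly the full fixed part of the timeout, while a 4000 MB file gets a proportionally longer one, which is the lower-limit behaviour described above.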

urlcopy_txmarks_to and url_copy_first_txmark_to have the same meaning as before. The transfer is aborted if the first non-zero marker is not received within a certain time. Once the first non-zero marker has been received, the transfer is aborted if subsequent markers do not arrive at least every N seconds, regardless of whether they indicate progress (a transfer stuck at 50% is acceptable as long as markers keep arriving).

no_tx_activity_to is new in FTS 2.2 and introduces a check on the value of the progress reported by the markers: if the markers indicate no progress for more than a certain time, kill the transfer.
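The three marker-based conditions can be illustrated with the following Python sketch. It is not the FTS code: the function name, the representation of a marker as a (time, bytes) pair, and the choice to measure urlcopy_txmarks_to from the start of the transfer before any marker has arrived are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# One transfer marker: (seconds since the transfer started, bytes transferred so far).
Marker = Tuple[float, int]


def marker_failure(markers: Sequence[Marker],
                   now: float,
                   urlcopy_txmarks_to: Optional[float] = None,
                   url_copy_first_txmark_to: Optional[float] = None,
                   no_tx_activity_to: Optional[float] = None) -> Optional[str]:
    """Return the name of the first violated timeout, or None if none is violated.

    `markers` must be ordered by time; `now` is seconds since the transfer started.
    """
    # 1. The first non-zero marker must arrive within N seconds of the start.
    if url_copy_first_txmark_to is not None:
        first_nonzero = next((t for t, done in markers if done > 0), None)
        if first_nonzero is None:
            missed = now > url_copy_first_txmark_to
        else:
            missed = first_nonzero > url_copy_first_txmark_to
        if missed:
            return "url_copy_first_txmark_to"

    # 2. Some marker, progressing or not, must arrive at least every N seconds.
    #    Before any marker arrives the clock runs from the start (assumption).
    if urlcopy_txmarks_to is not None:
        last_marker = markers[-1][0] if markers else 0.0
        if now - last_marker > urlcopy_txmarks_to:
            return "urlcopy_txmarks_to"

    # 3. The reported byte count must increase at least every N seconds.
    if no_tx_activity_to is not None:
        last_progress, best = 0.0, 0
        for t, done in markers:
            if done > best:
                last_progress, best = t, done
        if now - last_progress > no_tx_activity_to:
            return "no_tx_activity_to"

    return None
```

For a transfer whose markers keep arriving but stay at the same byte count, for example `[(5, 100), (10, 500), (20, 500), (30, 500), (40, 500)]` evaluated at `now=45`, the call with only `urlcopy_txmarks_to=15` returns `None`, while adding `no_tx_activity_to=30` returns `"no_tx_activity_to"`.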


The following graph shows the behavior of some transfers failed because the transfer timeout was hit on the CERN-STAR channel on the T2 FTS service at CERN.


Please note that none of the above transfers reached 100%. Rather than having an extension grace period (as suggested in BUG:40947), it would probably be better to kill those transfers sooner by means of no_tx_activity_to.

Srm copy channels

Transfer timeout = srmcopy_to * number of files + tx_to_per_mb * (sum of the sizes of all files)

Fail the transfer if:

  • srmcopy_refresh_to is set to a value N (not null) and no status updates are received for more than N seconds.
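The srmcopy calculation can be sketched in the same way. Again this is an illustration under stated assumptions rather than the FTS implementation: the function names are hypothetical, timeouts are assumed to be in seconds, file sizes in MB, and a null (None) setting is assumed to count as zero in the formula.

```python
from typing import Optional, Sequence


def srmcopy_transfer_timeout(file_sizes_mb: Sequence[float],
                             srmcopy_to: Optional[float],
                             tx_to_per_mb: Optional[float]) -> float:
    """Return srmcopy_to * number of files + tx_to_per_mb * total size in MB."""
    # Assumption of this sketch: a null (None) setting counts as zero.
    per_file = srmcopy_to or 0.0
    per_mb = tx_to_per_mb or 0.0
    return per_file * len(file_sizes_mb) + per_mb * sum(file_sizes_mb)


def refresh_timed_out(seconds_since_last_status: float,
                      srmcopy_refresh_to: Optional[float]) -> bool:
    """True if srmcopy_refresh_to is set and no status update arrived in time."""
    return (srmcopy_refresh_to is not None
            and seconds_since_last_status > srmcopy_refresh_to)


if __name__ == "__main__":
    # Hypothetical request: two files of 10 MB and 2000 MB, 300 s per file, 1 s per MB.
    print(srmcopy_transfer_timeout([10, 2000], 300, 1))  # 2610.0
```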

Last edit: AkosFrohner on 2009-06-03 - 11:13

Maintainer: PaoloTedesco
