HTTP-TPC (COPY) protocol updates
The original HTTP COPY specification, as initially implemented, may not meet all our future requirements. This document is a guide for people who would like to propose improvements to the existing HTTP-TPC standard. This page should be used to collect information about all proposals, links to related meetings or presentations, and the final decision on whether and when to implement a new extension. Agreement to extend HTTP COPY involves storage developers, transfer tool providers (FTS, gfal2), and the experiments / communities that would like to benefit from the newly added functionality.
WLCG DOMA BDT meetings (indico) or the TPC mailing list (wlcg-doma-tpc AT cern.ch) should be used to discuss proposals. An update of the HTTP-TPC protocol draft is necessary for every extension that affects this protocol.
Protocol versioning
Not yet defined; all extensions must be backward compatible with the original HTTP COPY specification.
Proposed extensions
| # | Report Date | Status | Proposer | Short description | Affected components: protocol | Affected components: active | Affected components: passive | Affected components: client |
| 1 | 2022-Aug-23 | open | fts-devel | Pass client (FTS) identification to the passive party (DMC-1337) | | | | |
| 2 | 2022-Sep-06 | open | fts-devel | SCITAGS HTTP headers (DMC-1344) | | | | |
| 3 | 2022-Sep-21 | open | fts-devel | FTS IPv6 monitoring - perf marker on close (details) | | | | |
| 4 | 2022-Nov-09 | open | P. Vokac | Monitoring - transfer source and destination addresses (related) | | | | |
- Decide on features available in FTS/gfal2/SE for the GridFTP protocol that are not generally implemented for HTTP (multistream, TCP buffers, timeout for stalled transfers, ...)
List of TransferHeader headers in use
| Header | Status | Short description |
| TransferHeaderAuthorization | standard | Used to authorize active party HTTP request by passive party |
| TransferHeaderVia | proposed | see: extension #1 |
| TransferHeaderFlowExperiment | proposed | see: extension #2 |
| TransferHeaderFlowActivity | proposed | see: extension #2 |
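As an illustration of how the headers listed above are expected to travel, the following minimal Python sketch derives the headers that an active party would forward to the passive party from the headers of a COPY request. It assumes the usual HTTP-TPC convention that the `TransferHeader` prefix is stripped before forwarding. The function name, the URL and the header values are hypothetical and are not taken from any proposal.

```python
# Sketch only: assumes the active party forwards every "TransferHeader<Name>"
# header of the COPY request to the passive party as "<Name>".
PREFIX = "transferheader"


def headers_for_passive_party(copy_request_headers: dict[str, str]) -> dict[str, str]:
    """Return the headers the active party would send to the passive party."""
    forwarded = {}
    for name, value in copy_request_headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower().startswith(PREFIX) and len(name) > len(PREFIX):
            forwarded[name[len(PREFIX):]] = value
    return forwarded


# Hypothetical COPY request headers as a TPC client might send them.
copy_headers = {
    "Source": "https://source.example.org/path/file",
    "TransferHeaderAuthorization": "Bearer <token>",
    "TransferHeaderVia": "hypothetical-client-id",
}
print(headers_for_passive_party(copy_headers))
```

Headers without the prefix (here `Source`) are consumed by the active party itself and are not forwarded.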
Performance markers
| PerfMarker | Type | Status | Short description |
| Perf Marker ... End | | standard mandatory | Performance marker boundary |
| Timestamp | unix time | standard mandatory | Unix timestamp when active party generated performance marker |
| Stripe Index | int | standard mandatory | |
| Stripe Bytes Transferred | bytes | standard mandatory | How many bytes have been transferred |
| Total Stripe Count | int | standard mandatory | |
| RemoteConnections | list | standard optional | Comma separated network connections tcp:addr:port currently associated with transfer |
| State | int | dCache proprietary | A machine-readable description of the current status |
| State description | string | dCache proprietary | A human-readable description of the current status |
| Stripe Start Time | unix time | dCache proprietary | When the transfer was started |
| Stripe Last Transferred | unix time | dCache proprietary | When data was last sent or received |
| Stripe Transfer Time | seconds | dCache proprietary | How long the transfer has been running |
| Stripe Status | enum | dCache proprietary | Current status of the transfer |
| Stripe Source | addr:port | proposed extension #4 | Transfer source address for specific connection |
| Stripe Destination | addr:port | proposed extension #4 | Transfer destination address for specific connection |
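To make the field list above concrete, here is a minimal Python sketch of how a client could parse performance markers from the COPY response body. It assumes the plain-text layout in which each marker is a block of `Key: value` lines between `Perf Marker` and `End`. The sample body, addresses and function name are made up for illustration.

```python
# Hypothetical response body fragment; all values are invented.
SAMPLE = """\
Perf Marker
    Timestamp: 1668000000
    Stripe Index: 0
    Stripe Bytes Transferred: 238745
    Total Stripe Count: 1
    RemoteConnections: tcp:192.0.2.10:1094,tcp:[2001:db8::1]:8443
End
"""


def parse_perf_markers(body: str) -> list[dict[str, str]]:
    """Split a response body into one dict of raw string fields per marker."""
    markers = []
    current = None
    for raw in body.splitlines():
        line = raw.strip()
        if line == "Perf Marker":
            current = {}                      # start of a new marker
        elif line == "End":
            if current is not None:
                markers.append(current)       # marker boundary reached
            current = None
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return markers


for marker in parse_perf_markers(SAMPLE):
    print(marker["Stripe Index"], marker["Stripe Bytes Transferred"])
```

Values are kept as strings so that unknown or proprietary fields pass through untouched; a caller can convert the mandatory fields to integers as needed.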
Discussion / details about proposed extensions
#3: FTS IPv6 monitoring - perf marker on close
Description: although RemoteConnections is an optional field in the PerfMarker, existing implementations should guarantee that it is available on file close. Transfers of small files (or not-so-small files over fast networks) produce no performance markers with transfer progress details, because some implementations emit the first marker only after 5 s.
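The monitoring use case can be sketched as follows: given the `RemoteConnections` value of the final performance marker, a client determines which IP versions were used. This is only an illustration. It assumes that items have the form `tcp:addr:port` and that IPv6 addresses are bracketed; the function name and the addresses are hypothetical.

```python
import ipaddress


def ip_versions(remote_connections: str) -> set[int]:
    """Return the set of IP versions (4 and/or 6) seen in RemoteConnections."""
    versions = set()
    for item in remote_connections.split(","):
        item = item.strip()
        if not item:
            continue
        # Drop the leading "tcp:" and the trailing ":port".
        _proto, _, rest = item.partition(":")
        host, _, _port = rest.rpartition(":")
        # Assumption: IPv6 addresses are enclosed in square brackets.
        versions.add(ipaddress.ip_address(host.strip("[]")).version)
    return versions


print(ip_versions("tcp:192.0.2.10:1094,tcp:[2001:db8::1]:8443"))
```

If no marker with `RemoteConnections` arrives before the transfer ends, a check of this kind has nothing to work with, which is the gap this extension addresses.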
Accepted/rejected: ???? (date + link to meeting or details)
HTTP-TPC standard update pull request: ????
Storage developers plans / releases supporting this feature:
#4: Monitoring - transfer source and destination addresses
Description: in the majority of our storage implementations the active party first redirects the TPC client to the disknode, and only then does the HTTP-TPC transfer start. With dCache, however, the real IP address of the active party is hidden from the TPC client (FTS/gfal2), because the headnode internally asks one of the available disknodes to execute the HTTP-TPC transfer. For monitoring purposes (understanding problems with individual disknodes from FTS or central transfer monitoring) it would be useful to have the final addresses used during the data transfer in the PerfMarker. We need new optional PerfMarker fields called Stripe Source and Stripe Destination, carrying the source and destination addresses, including the port number, of the related connection (assuming that multistream transfers use a different Stripe Index for each new connection). The data format is the same as for one item of the RemoteConnections PerfMarker list, e.g.
Stripe Source: tcp:[2001:718:401:6017:2::28]:24081
Stripe Destination: tcp:[2001:1458:301:105::100:5]:8443
An implementation can choose to send Stripe Source and Stripe Destination in only one performance marker for a given Stripe Index.
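As a rough illustration of the proposed format, the Python sketch below shows how an active party could render the two fields from the endpoints of an established connection. The function names are hypothetical, and the bracketing of IPv6 addresses follows the example above rather than any agreed specification.

```python
import ipaddress


def format_endpoint(address: str, port: int, proto: str = "tcp") -> str:
    """Render one endpoint in the proto:addr:port form used by RemoteConnections."""
    ip = ipaddress.ip_address(address)
    # IPv6 addresses are bracketed so the port separator stays unambiguous.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"{proto}:{host}:{port}"


def stripe_address_lines(source: tuple[str, int], destination: tuple[str, int]) -> list[str]:
    """Build the two proposed PerfMarker lines for one stripe."""
    return [
        f"Stripe Source: {format_endpoint(*source)}",
        f"Stripe Destination: {format_endpoint(*destination)}",
    ]


for line in stripe_address_lines(
    ("2001:718:401:6017:2::28", 24081),
    ("2001:1458:301:105::100:5", 8443),
):
    print(line)
```

With a real socket, the two tuples could be taken from `getsockname()` and `getpeername()`; which of the two is the source depends on whether the transfer is a pull or a push.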
Accepted/rejected: ???? (date + link to meeting or details)
HTTP-TPC standard update pull request: ????
Storage developers plans / releases supporting this feature:
Discussion / meetings
-- PetrVokac - 2022-10-19