Transfer Watch Procedure
In general, we use the data transfer management system, PhEDEx, to do the transfer watch.
We usually do not follow T3 transfers. Below are the primary steps we follow during the watch:
<1a> Quality Plots :
<1b> T1 Disk Latency Overview :
- https://transferteam.web.cern.ch/transferteam/LatencyOverview/
- This page shows all subscriptions to T1_*_Disk endpoints
- Check the current subscriptions. If "Estimated Arrival Time" is NA, the transfer has a problem.
- To see what the problem is, click the problematic subscription; the Problem column gives the reason for each dataset's problem.
- Please note that preprocessing subscriptions are more important than other subscriptions to Disk endpoints. Please check the "notes" section on this page.
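The "Estimated Arrival Time is NA" check can be sketched as a small filter. This is only a minimal sketch: it assumes the subscription rows have already been copied by hand from the Latency Overview page into a list of dictionaries. The field names (`dataset`, `destination`, `eta`) and the sample rows are hypothetical and do not reflect the page's actual format.

```python
def find_problem_subscriptions(subscriptions):
    """Return the subscriptions whose estimated arrival time is NA."""
    problems = []
    for sub in subscriptions:
        # Hypothetical field name "eta"; a missing value is not treated as NA.
        eta = str(sub.get("eta", "")).strip().upper()
        if eta == "NA":
            problems.append(sub)
    return problems


# Made-up sample rows, for illustration only.
sample_subscriptions = [
    {"dataset": "/A/B/RAW", "destination": "T1_XX_Site_Disk", "eta": "2 days"},
    {"dataset": "/C/D/AOD", "destination": "T1_YY_Site_Disk", "eta": "NA"},
]

for sub in find_problem_subscriptions(sample_subscriptions):
    print(sub["destination"], sub["dataset"])
```

Each row returned still has to be opened on the page itself to read the Problem column; the filter only narrows down where to look.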
<2> Recent Errors
- https://cmsweb.cern.ch/phedex/prod/Activity::ErrorInfo
- We can filter the transfer errors of a problematic link by entering the site name on this page.
- Then check the detailed log. It usually (but not always) gives the name of the error and tells you at which site the errors occur.
- We then open a GGUS ticket to the problematic site and ask them to check. <6>
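The filtering step above can be illustrated with a short sketch that counts error messages on links involving one site. It assumes the error entries have already been collected into a list of dictionaries; the field names (`source`, `destination`, `error`), the site names, and the error strings are all hypothetical placeholders, not real PhEDEx output.

```python
from collections import Counter


def summarize_errors(error_records, site_name):
    """Count error messages on links where site_name is source or destination."""
    counts = Counter()
    for rec in error_records:
        if site_name in (rec["source"], rec["destination"]):
            counts[rec["error"]] += 1
    # Most frequent error first.
    return counts.most_common()


# Made-up records, for illustration only.
sample_errors = [
    {"source": "T1_XX_Site_Disk", "destination": "T2_YY_Site", "error": "example error A"},
    {"source": "T2_ZZ_Site", "destination": "T2_YY_Site", "error": "example error B"},
    {"source": "T1_XX_Site_Disk", "destination": "T2_ZZ_Site", "error": "example error B"},
    {"source": "T1_XX_Site_Disk", "destination": "T2_YY_Site", "error": "example error A"},
]

for message, count in summarize_errors(sample_errors, "T2_YY_Site"):
    print(count, message)
```

A summary like this only suggests which error dominates; the detailed log is still needed to decide which site the GGUS ticket should go to.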
<3> Rate
<4> Transfer Details
- https://cmsweb.cern.ch/phedex/prod/Activity::TransferDetails
- Check the transfer details at this moment.
- If files are stuck in the assigned state and we do not see a problem there, the link is a candidate for opening a ticket.
- If many files are transferring at this moment, we wait a bit and see how it goes. The site admins may have noticed the problem themselves and already fixed it, in which case everything should be fine.
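The decision described in this step can be written down as a small triage function. This is a sketch under stated assumptions: `assigned` is taken to be the number of files that have stayed in the assigned state, `transferring` the number of files currently in transfer, and `many_threshold` is a hypothetical cutoff for "many files" that the procedure itself does not define.

```python
def triage_link(assigned, transferring, many_threshold=10):
    """Suggest an action for one link based on its file counts."""
    if transferring >= many_threshold:
        # Many files are moving: wait and see whether the site has recovered.
        return "wait"
    if assigned > 0:
        # Files stuck in assigned with little or no activity.
        return "ticket candidate"
    return "ok"


print(triage_link(assigned=5, transferring=0))
print(triage_link(assigned=5, transferring=50))
```

The function only encodes the rule of thumb; whether to actually open a ticket remains a judgment call for the person on watch.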
<5> Routing
ASO Transfers
<1c> ASO Transfer Monitoring :
<6> GGUS Ticket
<7> Transfer Trouble Shooting
- This page helps us debug transfer issues. Although we are not responsible for debugging transfer problems, it helps us do the transfer watch.
- Some twiki pages for common transfer errors: