5.10 Transferring MC Sample/Data Files
Please note that PhEDEx was retired at the end of 2020 and was replaced with Rucio. This page is in the process of being replaced. In the meantime, please find details about how to use Rucio here:
https://twiki.cern.ch/twiki/bin/view/CMS/Rucio
Goals of this page:
This page is intended to familiarize you with making a Rucio rule and monitoring the progress of transfers to a site. In particular, you will learn:
- what Rucio is
- why you should transfer data to your T2
- how to make a Rucio rule at a site
- how to monitor Rucio transfers
What is Rucio?
Rucio is the CMS data placement tool. It sits above various grid middleware [SRM (Storage Resource Manager), FTS (File Transfer Service)] to manage large-scale transfers between CMS centres. Rucio consists of a series of daemons that submit the requested transfers to FTS, verify that the transfers have completed correctly, and keep a catalogue of what data is available for transfer.
A normal user does not interact with this machinery directly. Instead, they fill in a web form to request a data transfer to a site. Once the request is approved, this machinery takes over and makes sure that the transfer completes.
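As a rough illustration of what a transfer request and its monitoring look like when done programmatically, the sketch below builds the arguments for a Rucio replication rule and summarises a rule's progress. It assumes the Rucio Python client (`rucio.client.Client`) is installed and configured for CMS. The dataset name, the site name `T2_XX_Example`, the `cms` scope, the 30-day lifetime, and all helper function names are placeholders chosen for illustration, not recommendations.

```python
"""Sketch: requesting a dataset replica via a Rucio rule and checking progress."""


def build_rule_request(dataset, site, lifetime_days=30, scope="cms"):
    """Return keyword arguments for Client.add_replication_rule().

    The scope "cms" and the default lifetime are assumptions for illustration.
    """
    return {
        "dids": [{"scope": scope, "name": dataset}],
        "copies": 1,
        "rse_expression": site,
        "lifetime": lifetime_days * 24 * 3600,  # Rucio lifetimes are in seconds
        "comment": "Local copy for analysis",
    }


def summarise_rule(rule):
    """Turn a rule record (a dict) into a one-line progress summary."""
    done = rule["locks_ok_cnt"]
    total = done + rule["locks_replicating_cnt"] + rule["locks_stuck_cnt"]
    percent = 100.0 * done / total if total else 0.0
    return f"{rule['state']}: {done}/{total} files ({percent:.0f}%)"


def request_and_check(dataset, site):
    """Create the rule and report its state. Needs a configured Rucio client."""
    # Imported here so that the two helpers above work without Rucio installed.
    from rucio.client import Client

    client = Client()
    rule_ids = client.add_replication_rule(**build_rule_request(dataset, site))
    return summarise_rule(client.get_replication_rule(rule_ids[0]))
```

With a valid grid proxy and Rucio configuration, a call such as `request_and_check("/Example/Dataset-v1/AODSIM", "T2_XX_Example")` (hypothetical names) would create one rule and return its current state. Whether the rule needs approval depends on the site and account, which this sketch does not handle.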
Why do I need to transfer MC sample or Data files?
For general analysis, CMS will use T2 centres, as the T0 and T1 sites will be busy carrying out reconstruction, re-reconstruction, skimming and AOD production. This means that MC sample or data files need to be transferred from the T1 centres out to the T2s. It is up to the people working at each T2 site to choose which MC sample or data files go to the site, and to make the appropriate PhEDEx request.
If you run a CRAB job and find that your MC sample or data is not located at a T2 centre, you can request to have it transferred there using PhEDEx.
You do not need to have your MC sample or data at "your" T2 to run an analysis; CRAB will run on it wherever it is located. However, you may find it useful to make a copy at your local T2, as this will increase the number of sites at which you can run your analysis.
For PhEDEx requests, if you are working with an analysis group, you can choose that group; however, that may mean that the files will be deleted in a regular cleanup. Otherwise, choose "local" for the User Group.
Copy files from other sites using gfal-copy command
Instructions on how to use gfal tools to find and copy files from another site's Storage Element are on the CRAB3 FAQ twiki.
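Once the source site's endpoint is known, a copy with gfal tools amounts to a single `gfal-copy SOURCE DESTINATION` call. The sketch below only assembles and optionally runs such a command. The endpoint prefix `davs://se.example.org:2880/pnfs/example` and the file path are made-up placeholders, and the helper names are hypothetical. Building the physical file name by appending the `/store/...` path to the endpoint prefix is an assumption that should be checked against the site's actual configuration. Running the command also assumes a valid grid proxy and installed gfal2 command-line tools.

```python
"""Sketch: assembling a gfal-copy command for one file."""
import shlex
import subprocess


def make_pfn(endpoint_prefix, lfn):
    """Join a site's endpoint prefix and a logical file name (/store/...).

    Simple concatenation is an assumption; sites may map names differently.
    """
    return endpoint_prefix.rstrip("/") + "/" + lfn.lstrip("/")


def gfal_copy_command(source_pfn, local_path):
    """Return the argument list for copying source_pfn to an absolute local path."""
    return ["gfal-copy", source_pfn, "file://" + local_path]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical endpoint and file; with dry_run=True nothing is transferred.
    pfn = make_pfn("davs://se.example.org:2880/pnfs/example",
                   "/store/user/someone/file.root")
    run(gfal_copy_command(pfn, "/tmp/file.root"))
```

Keeping the command as an argument list rather than a single string avoids shell-quoting problems if a path contains unusual characters.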
Instructions for FNAL-LPC
The copyfiles.py script can be used to copy single files or a directory of files from another site to T3_US_FNALLPC, using gfal-copy or xrdcp.
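For comparison with the gfal-copy route, an `xrdcp`-based copy of several files could be sketched as follows. This is not the code of copyfiles.py, only an illustration of the xrdcp alternative. The redirector `root://cmsxrootd.fnal.gov/` used as the default, the file names, the destination directory, and the function name are all assumptions for illustration.

```python
"""Sketch: building xrdcp commands for a list of files."""
import posixpath


def xrdcp_commands(lfns, dest_dir, redirector="root://cmsxrootd.fnal.gov/"):
    """Return one xrdcp argument list per logical file name (/store/...)."""
    commands = []
    for lfn in lfns:
        # XRootD URLs take the form root://host//store/... (note the double slash).
        source = redirector.rstrip("/") + "//" + lfn.lstrip("/")
        target = posixpath.join(dest_dir, posixpath.basename(lfn))
        commands.append(["xrdcp", source, target])
    return commands


if __name__ == "__main__":
    # Hypothetical file names; the commands are printed, not executed.
    for cmd in xrdcp_commands(["/store/user/someone/a.root",
                               "/store/user/someone/b.root"], "/tmp/out"):
        print(" ".join(cmd))
```

Each resulting list can be passed to `subprocess.run(cmd, check=True)` once a valid grid proxy is in place.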
The getSiteInfo.py script can be used to look up a site's endpoint information, which is needed to fetch a single file with gfal-copy. It is used by the copyfiles.py script above.
Review status
Substantial modifications due to deprecation of DBS. Instructions with snapshots for PhEDEx subscription using the DAS interface added.
Responsible: KatyEllis