Input provided by Miguel Branco, Dietrich Liko, Dario Barberis, Kors Bos
Storage Classes needed at various sites
Tape1Disk0, Tape1Disk1 and Tape0Disk1
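For reference, a minimal sketch of how these TapeXDiskY labels are usually mapped onto SRM v2.2 retention policy and access latency. The mapping below follows the common WLCG convention and is given for illustration only:

  # Common WLCG reading of the TapeXDiskY storage classes in SRM v2.2 terms.
  # Treat this as an illustrative convention, not an ATLAS prescription.
  STORAGE_CLASSES = {
      "Tape1Disk0": {"retention_policy": "CUSTODIAL", "access_latency": "NEARLINE"},
      "Tape1Disk1": {"retention_policy": "CUSTODIAL", "access_latency": "ONLINE"},
      "Tape0Disk1": {"retention_policy": "REPLICA",   "access_latency": "ONLINE"},
  }

  for name, props in STORAGE_CLASSES.items():
      print("%s -> retention=%s, latency=%s"
            % (name, props["retention_policy"], props["access_latency"]))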
Data flow between Tier0, Tier1, Tier2
??
Space reservation requirement
Static space reservation.
Space token descriptions per VO
ATLAS_PROD_ONLINE
ATLAS_PROD_ARCHIVE
ATLAS_PROD_REPROCESS
User files - ATLAS_USER
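As an illustration of how these token descriptions might be used by a transfer tool, the sketch below picks a token by the kind of data being written. The assignment of tokens to storage classes and data kinds is a hypothetical example, not the official ATLAS definition:

  # Hypothetical assignment of ATLAS space token descriptions to an assumed
  # storage class and data kind -- illustrative only.
  SPACE_TOKENS = {
      "ATLAS_PROD_ONLINE":    ("Tape0Disk1", "production data kept on disk"),
      "ATLAS_PROD_ARCHIVE":   ("Tape1Disk0", "custodial archive copies"),
      "ATLAS_PROD_REPROCESS": ("Tape1Disk0", "input for reprocessing"),
      "ATLAS_USER":           ("Tape0Disk1", "user analysis files"),
  }

  def token_for(data_kind):
      """Return the space token description whose assumed use matches data_kind."""
      for token, (storage_class, use) in SPACE_TOKENS.items():
          if data_kind in use:
              return token
      raise KeyError("no space token defined for %r" % data_kind)

  # A transfer tool would pass the returned description as the SRM space
  # token of the destination when writing the file.
  print(token_for("user analysis"))   # -> ATLAS_USER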
Special requirements
xrootd
Data access patterns
Data is accessed via Grid layers (SRM). Currently we copy each file over to the worker node (WN), but we may change to a POSIX-like interface. Which one to use is not a strict ATLAS requirement but a function of their performance and of the needs of each site/storage system. The DA group is currently running tests to determine the most suitable access pattern. Some more information from the DA group:
We assume that analysis jobs will use POSIX-I/O-like protocols to access the data.
We assume that a standard ATLAS analysis job will access something like 100 to 1000 files ....
A more difficult case will be analysis based on TAG navigation. In that case only a few events will be read from each file. This might increase the number of files read by one job and therefore increase the load on the infrastructure.
The I/O rate per worker node should not be too dramatic. Over the last year Athena was reported to read data at about 2 MB/s. This can then be scaled up according to the number of worker nodes reserved for analysis at a site (a worked scaling example is included in the sketch below).
SRM is needed to translate SURLs to TURLs on the LCG SEs [...].
At this point in time there are problems reading data from various SE types, as we use an older ROOT/POOL version in Athena. This problem is not fully solved yet and we cannot use all SE types for analysis at the moment.
xrootd should be evaluated and compared with the other protocols (rfio, dcap and/or GFAL).
We are currently defining some standard analysis jobs that should be used to perform these evaluations; a sketch of such a comparison is given below. As already mentioned, there are technical problems using POSIX-I/O protocols on some SE types, which we try to address by creating updated plugins for older ROOT versions.
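A minimal sketch of such a comparison, using PyROOT to time analysis-style reads of one file through different protocols. The TURLs, the tree name ("CollectionTree") and the assumed 200 analysis slots are placeholders, and each protocol only works where the corresponding ROOT I/O plugin is installed:

  # Sketch of a protocol comparison for analysis-style reads. Assumptions:
  # the TURLs below are placeholders for one and the same file, and the
  # needed ROOT I/O plugins (dcap, rfio, xrootd) are available on the WN.
  import time
  import ROOT

  TURLS = {
      "dcap":   "dcap://se.example.org/pnfs/example/atlas/aod/file.root",
      "rfio":   "rfio:///castor/example/atlas/aod/file.root",
      "xrootd": "root://se.example.org//atlas/aod/file.root",
  }

  def read_rate(turl, tree_name="CollectionTree", max_events=1000):
      """Read up to max_events and return the observed rate in MB/s, or None."""
      f = ROOT.TFile.Open(turl)
      if not f or f.IsZombie():
          return None
      tree = f.Get(tree_name)
      if not tree:
          f.Close()
          return None
      n_events = min(int(tree.GetEntries()), max_events)
      start = time.time()
      read_bytes = 0
      for i in range(n_events):
          read_bytes += tree.GetEntry(i)   # GetEntry returns the bytes read
      elapsed = time.time() - start
      f.Close()
      return read_bytes / elapsed / 1e6 if elapsed > 0 else None

  for protocol, turl in TURLS.items():
      rate = read_rate(turl)
      print(protocol, "unreachable" if rate is None else "%.1f MB/s" % rate)

  # Scaling the quoted ~2 MB/s per Athena job to a whole analysis farm:
  analysis_slots = 200                  # assumed number of slots at a site
  print("aggregate load ~ %d MB/s" % (2 * analysis_slots))

The last two lines give the back-of-the-envelope scaling mentioned above: at roughly 2 MB/s per job, a site with 200 analysis slots would put an aggregate read load of about 400 MB/s on its storage.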
Plans from 1st April 2007 till end of the year
We have 3 major commissioning activities before the summer: Tier-0 internal tests, T0-T1-T2 data distribution and the Calibration Data Challenge (CDC), plus the continued simulation production (increasing progressively in rate up to 8M events/week by the end of the year).
In the summer (July to the end of September/October) we are going to run the integration test of the FDR. The aim is to try to be ready for low-energy data taking at the end of October (one month ahead of the current schedule).
How much disk should Tier-1s and Tier-2s provide? Derived from the Megatable.
??
Of that amount of disk, how much must be set up for each of the storage classes T1D0, T1D1 and T0D1?
??
What network access is needed per instance?
??
Size of incoming/outgoing data buffers
Let's assume a 2-day buffer. Each site should therefore have a disk buffer large enough to hold about 2 days of data taking, plus at least as much again for reprocessing (which should run at the same rate as data taking). Of course this depends largely on the storage configuration and on the storage system's ability to write to tape! A worked sizing sketch is given below.
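A worked sizing sketch under stated assumptions: the incoming rate used below is a placeholder (the real per-site figure follows from the Megatable shares), and the buffer covers 2 days of data taking plus the same again for reprocessing:

  # Back-of-the-envelope buffer sizing. The incoming rate is an assumed
  # placeholder; the real per-site rate must come from the Megatable.
  incoming_rate_mb_s = 80.0            # assumed average rate into the site, MB/s
  buffer_days = 2                      # 2 days of data taking, as above
  seconds = buffer_days * 24 * 3600

  data_taking_tb = incoming_rate_mb_s * seconds / 1e6    # MB -> TB
  reprocessing_tb = data_taking_tb                       # reprocessing at the same rate
  total_tb = data_taking_tb + reprocessing_tb

  print("data-taking buffer : %.1f TB" % data_taking_tb)
  print("reprocessing buffer: %.1f TB" % reprocessing_tb)
  print("total disk buffer  : %.1f TB" % total_tb)

With the assumed 80 MB/s this gives roughly 14 TB for data taking and the same again for reprocessing, i.e. a buffer of just under 30 TB; the actual figure scales linearly with the site's Megatable rate.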