Issue: Resource Broker Disk Space requirements
Problem Statement
The resource broker stores sandboxes for the user's jobs (input and output).
A review of LHC capacity requirements follows. Using the data presented during the meeting on Monday, we arrived at:
- 10 MB input sandbox
- 10 MB output sandbox
- 14-day retention time for sandbox data
The current LSF batch throughput at CERN is 906 jobs/hour. Therefore, we would have 304,416 jobs during a 14-day period at the current load if all jobs were submitted through the grid. We are expecting a 4-5 times growth of CPU capacity for LHC, which would lead to at least 3 times more jobs, i.e. around 900,000 sandboxes.
With 20 MB per sandbox, this is 18 TB of disk space for all RBs.
While this seems high, I cannot find any errors in my calculation.
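As a cross-check, here is a minimal sketch of the worst-case arithmetic above (the figures are the ones quoted in this note; the variable names are purely illustrative):

<verbatim>
# Worst-case sandbox storage estimate: every sandbox kept for the full
# 14-day retention period. Figures are the ones quoted above.
jobs_per_hour  = 906              # current LSF batch throughput at CERN
retention_days = 14               # sandbox retention time
growth_factor  = 3                # at least 3x more jobs expected for LHC
sandbox_mb     = 10 + 10          # input + output sandbox per job

jobs_per_period = jobs_per_hour * 24 * retention_days   # 304,416 jobs
sandboxes       = jobs_per_period * growth_factor       # ~913,000 sandboxes
disk_tb         = sandboxes * sandbox_mb / 1e6          # ~18 TB

print(jobs_per_period, sandboxes, round(disk_tb, 1))
</verbatim>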
Maarten's views
Your figure corresponds to a worst-case scenario, in which no jobs are cleaned up during two weeks. I would expect a job normally to be cleaned up within a day after it has finished, in which case 2 TB would suffice.

However, RBs will also submit to other sites, so the amount of space for CERN jobs should be multiplied by some factor. On the other hand, it cannot be just the RBs at CERN that drive the grid: there will be RBs at other (big) sites too, probably scaling with their computing resources. All in all, some fudge factor is needed, but it need not be as high as 10. I think that 18 TB would be fairly conservative; half of that may suffice.
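For comparison, a sketch of the optimistic case described above, assuming roughly one day's worth of sandboxes is resident at any time (the one-day cleanup is Maarten's expectation, not a measured figure):

<verbatim>
# Optimistic case: jobs cleaned up within ~1 day of finishing, so only
# about one day's worth of sandboxes is resident at any time.
jobs_per_day       = 906 * 24           # current CERN throughput, jobs/day
resident_sandboxes = jobs_per_day * 3   # with ~3x growth for LHC
disk_tb            = resident_sandboxes * 20 / 1e6   # 20 MB per sandbox

print(round(disk_tb, 1))   # ~1.3 TB; the 2 TB figure adds headroom
</verbatim>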
--
TimBell - 16 Sep 2005