Site

Question | Response
Site and Endpoints
What is the site name? BNL
Which endpoint URLs do your archival systems expose? srm://dcsrm.usatlas.bnl.gov
How is tape storage selected for a write (choice of endpoint, specification of a spacetoken, namespace prefix)? We only serve ATLAS.
Queue
What limits should clients respect? Please send bulk requests; we prefer to do pre-staging.
---> Max number of outstanding requests in number of files or data volume In theory, unlimited. The largest we have observed is 245k requests, which were processed smoothly and took about 5 days to complete. (For my own reference: STAR, 2016-09-28.)
---> Max submission rate for recalls or queries  
---> Min/Max bulk request size (srmBringOnline or equivalent) in files or data volume Min: preferably no fewer than 1000 files. Max: unlimited, in theory; try sending us as many as possible.
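As an illustration of the bulk-request advice above, here is a minimal client-side sketch (Python) that batches a recall list into bulk staging requests of at least 1000 files. submit_bring_online() is a hypothetical helper standing in for whatever bulk staging call the client actually uses (for example a gfal2 bring-online request or an FTS staging job); it is not a BNL or dCache API.

    # Minimal sketch: batch a recall list into bulk staging requests.
    # submit_bring_online() is a hypothetical helper standing in for the real
    # bulk staging call (e.g. a gfal2 bring-online request or an FTS staging job).
    MIN_BULK = 1000  # the site prefers no fewer than 1000 files per request

    def submit_in_bulk(surls, submit_bring_online):
        # Cut the list into batches of MIN_BULK files; if the last batch would
        # fall below the preferred minimum, merge it into the previous one.
        batches = [surls[i:i + MIN_BULK] for i in range(0, len(surls), MIN_BULK)]
        if len(batches) > 1 and len(batches[-1]) < MIN_BULK:
            batches[-2].extend(batches.pop())
        for batch in batches:
            submit_bring_online(batch)  # one bulk request per batch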
Should clients back off under certain circumstances?  
---> How is this signalled to client?  
---> For which operations?  
Is it advantageous to group requests by a particular criterion (e.g. tape family, date)? Yes, we constantly see repeat mounts of ATLAS tapes. A tape might be re-mounted within less than 15 minutes, with over 20 remounts a day, which really should be avoided. We try not to delay any request, but we may have to implement a way to delay processing of such frequently mounted tapes.
---> What criterion? Please do the pre-staging: send us all requests at once, and send them fast. This is the best practice for handling sequential-access media.
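As a sketch of the grouping idea discussed above: a client could bucket its recall list by a key that approximates tape co-location before submitting, so that files likely to share a tape arrive in the same bulk request. The key used here (the parent directory, taken as a stand-in for a dataset or tape family) is an assumption for illustration only.

    # Illustrative sketch: bucket recall requests by a grouping key before
    # submission, so files likely co-located on tape are requested together.
    # The key (parent directory) is an assumed stand-in for the real criterion.
    from collections import defaultdict

    def group_by_parent(surls):
        groups = defaultdict(list)
        for surl in surls:
            key = surl.rsplit("/", 2)[-2]  # assume .../<dataset>/<filename>
            groups[key].append(surl)
        return groups

    def submission_order(surls):
        # Emit all files of one group before moving to the next, instead of
        # interleaving groups in arrival order.
        groups = group_by_parent(surls)
        for key in sorted(groups):
            for f in groups[key]:
                yield f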
Prioritisation
Can you handle priority requests? Yes, we can.
---> How is this requested? Any tape that has at least one request flagged as high priority will be placed at the front of the queue. A prioritized tape waits for, and gets, the next available drive. All priority tapes are processed with the same selection logic: by demand, FIFO, or LIFO.
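A toy model of the ordering described above (illustrative only, not the site's actual code): any tape carrying at least one high-priority request is served before all other tapes, and within each class the original FIFO order is preserved.

    # Toy model of the described policy, not the site's implementation:
    # a tape with at least one high-priority request is served before all
    # non-priority tapes; within each class, FIFO (arrival) order is kept.
    def order_tape_queue(tapes):
        # tapes: list of dicts like {"id": "T1", "requests": [{"priority": True}]},
        # already in arrival (FIFO) order.
        def has_priority(tape):
            return any(r.get("priority") for r in tape["requests"])
        prioritized = [t for t in tapes if has_priority(t)]
        normal = [t for t in tapes if not has_priority(t)]
        return prioritized + normal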
Protocol support
Are there any unsupported or partially supported operations (e.g. pinning) ?  
Timeouts
What timeouts do you recommend? Do not set timeouts. All staging operations are synchronous calls; every file will be processed sooner or later, so there is no need to re-submit. Multiple repeated requests may result in files being transferred multiple times, as we do not drop any requests.
Do you have hardcoded or default timeouts? Our tape storage system does not have any timeouts. Our system tracks every step for every file; all requests will be processed eventually.
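Given the advice above (no client-side timeouts, no re-submission), a client-side sketch would simply poll the staging status until completion rather than imposing its own deadline. poll_status() below is a hypothetical stand-in for the client's real status check (for example a gfal2 bring-online poll); the polling interval is an arbitrary choice.

    # Sketch: poll a staging request until it completes, with no client-imposed
    # deadline and no re-submission. poll_status() is a hypothetical helper that
    # should return True once the file has been staged.
    import time

    def wait_until_staged(surl, token, poll_status, interval=300):
        while True:
            if poll_status(surl, token):
                return
            # No client-side timeout and no re-submission: the site tracks every
            # request and will process each file eventually.
            time.sleep(interval)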
Operations and metrics
Can you provide total sum of data stored by VO in the archive to 100TB accuracy? Yes. We can provide the total sum to byte accuracy, so we can convert it to any granularity.
Can you provide space occupied on tapes by VO (includes deleted data, but not yet reclaimed space) to 100TB accuracy? Yes.
How do you allocate free tape space to VOs? In HPSS, we assign free tapes to a storage class.
What is the frequency with which you run repack operations to reclaim space on tapes after data deletion? Due to limited drive resources, we only do massive repacks as needed.
Recommendations for clients
Recommendation 1 Do pre-staging, to prevent data loss caused by unnecessary, excessive access. We need to use the tool in the right way, the way it was designed to be used.
---> Information required by users to follow advice Tape is designed for archiving, not for random access. Use it cautiously, or the media may eventually be damaged. Our intention is to protect the data.
Recommendation 2  
Buffer Management
Should a client stop submitting recalls if the available buffer space reaches a threshold? No, because the client does not know anything about the buffer space and its status in our dCache; it should not back off. Note: here, we assume the buffer refers to the buffer area in dCache (the dCache tape read pools), not the disk buffer of HPSS itself. That is, the data has already been staged from tape to the frontend dCache.
---> How can a client determine the buffer used and free space?  
---> What is the threshold (high water mark)?  
---> When should the client restart submission (low water mark)?  
If the client does not have to back off on a full buffer, and you support pinning, how is the buffer managed? We don't support pinning. The way it works here is: FTS sends a bringonline command to dCache, which then passes it to HPSS. Once the data has been staged from HPSS to the dCache buffer space, the bringonline command succeeds. FTS then sends a separate "transfer" command to move the file to its final destination, either within the same site on a different disk area or at a remote site. Our buffer (the dCache tape read pools) is always full; whenever new files come in, dCache purges files out of the buffer in a FIFO manner. So if the second FTS "transfer" command does not arrive fast enough, and a huge amount of data has been staged from HPSS in a short time, files can be purged before being transferred to their final destination (a toy illustration of this follows below).
---> Is data moved from buffer to another local disk, either by the HSM or by an external agent? By an external agent. As noted above, it is triggered by an FTS transfer request.
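The purge risk described above can be illustrated with a toy FIFO model of the dCache read pools (the buffer size, file count and transfer lag below are made-up numbers, used only to show why the follow-up transfer must arrive promptly):

    # Toy FIFO model of a full read-pool buffer (all numbers are made up).
    # Newly staged files push the oldest files out; any file purged before its
    # follow-up transfer arrives would have to be staged from tape again.
    from collections import deque

    def simulate(buffer_slots, staged_files, transfer_lag):
        # staged_files: file names in the order they land in the buffer.
        # transfer_lag: how many later stagings happen before a file is picked up.
        buffer = deque(maxlen=buffer_slots)  # oldest entries fall out automatically
        lost = []
        for i, name in enumerate(staged_files):
            buffer.append(name)
            j = i - transfer_lag  # the file staged transfer_lag steps ago is fetched now
            if j >= 0 and staged_files[j] not in buffer:
                lost.append(staged_files[j])  # purged before it could be transferred
        return lost

    # Example: 100-slot buffer, 1000 files staged quickly, transfers lagging by
    # 250 files -> most files are purged before the transfer command arrives.
    print(len(simulate(100, ["f%d" % i for i in range(1000)], 250)))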
Additional questions
Should any other questions appear in subsequent iterations of this survey?  
-- OliverKeeble - 2018-01-30