Site

Question / Response
Site and Endpoints  
What is the site name? PIC
Which endpoint URLs do your archival systems expose? srm://srm.pic.es, plus xrootd doors, which are typically accessed via the per-experiment xrootd redirectors (experiment = atlas, cms, or lhcb).
How is tape storage selected for a write (choice of endpoint, specification of a spacetoken, namespace prefix)? It depends on the VO: for ATLAS and LHCb it is selected by space token; for CMS it depends on the namespace area. (See the sketch below.)
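A minimal sketch of both write-routing styles, assuming the gfal2 Python bindings; the space token name PIC_ATLASTAPE, the local file and both destination paths are hypothetical illustrations, not PIC configuration.

    import gfal2

    ctx = gfal2.creat_context()

    # ATLAS/LHCb style: the write is routed to tape by a space token.
    params = ctx.transfer_parameters()
    params.dst_spacetoken = "PIC_ATLASTAPE"  # hypothetical token name
    params.create_parent = True
    ctx.filecopy(params,
                 "file:///data/local/file.root",
                 "srm://srm.pic.es/pnfs/pic.es/data/atlas/tape/file.root")

    # CMS style: no space token; writing under a namespace area that is
    # mapped to tape (hypothetical path) is what selects tape storage.
    cms_params = ctx.transfer_parameters()
    cms_params.create_parent = True
    ctx.filecopy(cms_params,
                 "file:///data/local/file.root",
                 "srm://srm.pic.es/pnfs/pic.es/data/cms/tape/file.root")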
Queue
What limits should clients respect? For read access, the requests should come in large bulks, if possible.
---> Max number of outstanding requests in number of files or data volume No general limit, but if the requests come through SRM there is a limit of 15k queued requests per VO.
---> Max submission rate for recalls or queries  
---> Min/Max bulk request size (srmBringOnline or equivalent) in files or data volume Anything from a single request upwards is allowed, with no upper limit, but we recommend grouping requests into bulks of >= 1000 files (see the sketch below).
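A minimal sketch of the recommended bulk submission, assuming the gfal2 Python bindings and their bulk bring_online form returning an (errors, token) pair; the pin lifetime is an illustrative assumption, the bulk size follows the recommendation above.

    import gfal2

    BULK_SIZE = 1000          # recommended minimum grouping (see above)
    PIN_LIFETIME = 7 * 86400  # seconds; illustrative assumption
    TIMEOUT = 10 * 86400      # seconds; matches the 10-day HSM timeout below

    ctx = gfal2.creat_context()

    def submit_in_bulks(surls):
        # Submit srmBringOnline requests in bulks of BULK_SIZE files;
        # each bulk returns a request token for later polling.
        tokens = []
        for i in range(0, len(surls), BULK_SIZE):
            bulk = surls[i:i + BULK_SIZE]
            errors, token = ctx.bring_online(bulk, PIN_LIFETIME, TIMEOUT, True)
            tokens.append((bulk, token))
        return tokens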
Should clients back off under certain circumstances? Yes. The system is dimensioned to work well, taking into account the PIC Tier-1 size and the experiments' expectations of the site. If the load is very high, problems might appear.
---> How is this signalled to client? If the requests come through SRM, they are refused once the queue reaches 15k. This is an SRM limit that protects the service. Reaching the limit is an exceptional situation that rarely happens.
---> For which operations? If the 15k SRM limit is reached, both reads and writes are affected. (A back-off sketch follows below.)
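A minimal client-side back-off sketch matching the behaviour described above, assuming the gfal2 Python bindings; treating any gfal2.GError raised at submission as a refusal (e.g. the 15k per-VO queue limit) is our assumption, since no specific error code is documented here.

    import time
    import gfal2

    def submit_with_backoff(ctx, bulk, pintime, timeout,
                            max_retries=8, base_delay=60):
        # Retry a bulk bring-online with exponential back-off on refusal.
        delay = base_delay
        for attempt in range(max_retries):
            try:
                return ctx.bring_online(bulk, pintime, timeout, True)
            except gfal2.GError as err:
                # Assumed to mean the SRM refused the request; back off.
                print("refused (%s); retrying in %d s" % (err, delay))
                time.sleep(delay)
                delay = min(delay * 2, 3600)  # cap back-off at one hour
        raise RuntimeError("bulk submission still refused after retries")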
Is it advantageous to group requests by a particular criterion (e.g. tape family, date)? Yes. For writes, the disk servers are configured to send batches of files per tape family, which reduces tape re-mounts. For reads this also helps reduce the number of tape re-mounts, since datasets are stored on tape according to predefined tape families (see the sketch after the next item).
---> What criterion? By tape family.
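A minimal grouping sketch. The idea that a file's tape family can be approximated by its dataset directory is our assumption (PIC stores datasets on tape according to predefined families); a real client would use whatever family mapping it actually knows. The SURLs and the submit_in_bulks helper (from the earlier sketch) are illustrative.

    import os
    from collections import defaultdict

    all_surls = [  # hypothetical examples
        "srm://srm.pic.es/pnfs/pic.es/data/atlas/dsetA/f1.root",
        "srm://srm.pic.es/pnfs/pic.es/data/atlas/dsetB/f2.root",
    ]

    def group_by_family(surls):
        # Group SURLs by parent directory, used here as a stand-in for
        # the tape family (assumption, see lead-in).
        groups = defaultdict(list)
        for surl in surls:
            groups[os.path.dirname(surl)].append(surl)
        return groups

    # One bulk per group, so each bulk touches as few tapes as possible.
    for family, bulk in group_by_family(all_surls).items():
        submit_in_bulks(bulk)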
Prioritisation
Can you handle priority requests? Yes, Enstore allows modifying the priority of a specific request.
---> How is this requested? This is only available for admin purposes. VOs typically use the tape system at a single priority level.
Protocol support
Are there any unsupported or partially supported operations (e.g. pinning)? Pinning is supported.
Timeouts
What timeouts do you recommend? We recommend using high timeouts (more than 48h), or no timeouts at all; the requests will be processed sooner or later. Duplicated requests cause the same files to be processed multiple times, creating unnecessary overload. (A polling sketch without client-side expiry follows below.)
Do you have hardcoded or default timeouts? The timeout for the HSM script is 864000 seconds (10 days). SRM and FTS timeouts are typically that high.
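A minimal polling sketch consistent with the advice above, assuming the gfal2 Python bindings: a long server-side timeout is passed once at submission and the client polls patiently instead of cancelling and resubmitting. The SURL, pin lifetime and polling interval are illustrative assumptions, and the exact return conventions of bring_online/bring_online_poll may vary between gfal2 releases.

    import time
    import gfal2

    TIMEOUT = 10 * 86400   # 10 days, matching the HSM script timeout above
    PIN_LIFETIME = 86400   # illustrative assumption

    ctx = gfal2.creat_context()
    surl = "srm://srm.pic.es/pnfs/pic.es/data/atlas/dset/f1.root"  # hypothetical

    status, token = ctx.bring_online(surl, PIN_LIFETIME, TIMEOUT, True)

    # Poll until staged; never abort and resubmit, since duplicated
    # requests are processed again and only add load (see above).
    while ctx.bring_online_poll(surl, token) == 0:  # 0 = still staging
        time.sleep(300)  # illustrative 5-minute polling interval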
Operations and metrics
Can you provide total sum of data stored by VO in the archive to 100TB accuracy? Yes.
Can you provide space occupied on tapes by VO (includes deleted data, but not yet reclaimed space) to 100TB accuracy? Yes.
How do you allocate free tape space to VOs? After a tape purchase, we allocate the new free space to the VOs according to the pledges for that year. We monitor whether a VO is close to exhausting its assigned tapes. We also keep a pool of free tapes, used for tape migrations or when an experiment needs some extra space. On top of this, we add 10% to the LHC experiments' pledges to ease tape operations (repacks).
What is the frequency with which you run repack operations to reclaim space on tapes after data deletion? We monitor this; in particular, a weekly digest summarises all of the tapes subject to repack and recycle. We take action as soon as there is a non-negligible amount of space to be reclaimed. Typically, several recycling/repacking campaigns are run throughout the year (more for CMS).
Recommendations for clients
Recommendation 1 Send read requests in bulk, to help the site pre-stage data.
---> Information required by users to follow advice  
Recommendation 2 A better description of how to dimension the disk buffers for the LHC experiments would be desirable. We suspect that the disk buffers at PIC are over-dimensioned (disk buffer = disk in front of tape for reads/writes), since we are somewhat conservative and want to avoid operational troubles. (A back-of-the-envelope sizing sketch follows below.)
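A back-of-the-envelope sketch of one common way to dimension a disk buffer in front of tape (buffer capacity ~ sustained rate x residency time x headroom); every number below is an illustrative assumption, not a PIC figure.

    # Rough sizing: the buffer must absorb the sustained tape<->disk
    # rates for as long as files stay resident before being drained.
    recall_rate_gbit = 4.0    # sustained tape-read rate into the buffer
    write_rate_gbit = 2.0     # sustained archival-write rate
    residency_hours = 24.0    # time a file sits in the buffer
    safety_factor = 1.5       # headroom for bursts and draining stalls

    rate_gb_per_s = (recall_rate_gbit + write_rate_gbit) / 8.0
    buffer_tb = rate_gb_per_s * residency_hours * 3600 * safety_factor / 1000
    print("suggested buffer size: %.0f TB" % buffer_tb)  # ~97 TB here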
Buffer Management
Should a client stop submitting recalls if the available buffer space reaches a threshold?  
---> How can a client determine the buffer used and free space?  
---> What is the threshold (high water mark)?  
---> When should the client restart submission (low water mark)?  
If the client does not have to back off on a full buffer, and you support pinning, how is the buffer managed?  
---> Is data moved from buffer to another local disk, either by the HSM or by an external agent?  
Additional questions
Should any other questions appear in subsequent iterations of this survey?  

-- OliverKeeble - 2018-01-30
