
Archival Site Survey Conclusions

A first attempt to draw some conclusions from the survey.

Main Message

  • Submit recalls as far in advance as possible
    • Keep the queue as full as possible
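As a concrete illustration of the message above, the sketch below submits an entire recall campaign up front in bulk chunks instead of trickling requests in as processing capacity appears. The submit_bring_online argument is a hypothetical stand-in for whatever client is actually in use (gfal2, FTS, an experiment framework); nothing here is tied to a particular implementation.

<verbatim>
# Minimal sketch: submit the whole campaign immediately, in bulk chunks,
# so the site's recall queue is kept as full as possible.
# "submit_bring_online" is a hypothetical callable supplied by the caller.

def submit_campaign(files, submit_bring_online, chunk_size=1000):
    """Submit every recall request now rather than waiting for processing
    capacity; return the request tokens to be polled later."""
    tokens = []
    for i in range(0, len(files), chunk_size):
        tokens.append(submit_bring_online(files[i:i + chunk_size]))
    return tokens
</verbatim>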

Campaign Planning

  • Group recall requests by creation time or tape family if possible (see the sketch after this list)
  • Inform the site with as much warning as possible about recall plans
    • Allows synchronisation with local activities such as repack
  • Understand how priority requests are handled
    • Submitting priority requests will degrade throughput
    • Withholding recall submissions to keep latency down will degrade throughput
  • Synchronise data use with recalls to avoid purge/recall loops
  • The client should delete a staged file from the disk buffer once the workflow requiring the retrieval has completed.
  • Do not wait for the last byte to be recalled before processing
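The grouping advice in the first bullet needs nothing more than a dictionary or a sort, as in the sketch below. The tape_family_of and creation_time_of callables are hypothetical stand-ins for whatever metadata source is available (file catalogue, a site-provided dump); when tape families are not known, grouping by creation time is the fallback.

<verbatim>
from collections import defaultdict

# Sketch: group a recall campaign by tape family before submission so that
# files likely to share a tape are requested together and each tape is
# mounted as few times as possible. "tape_family_of" is hypothetical.

def group_by_tape_family(files, tape_family_of):
    groups = defaultdict(list)
    for f in files:
        groups[tape_family_of(f)].append(f)
    return list(groups.values())   # submit one bulk request per group

def group_by_creation_time(files, creation_time_of, chunk_size=1000):
    # Fallback when tape families are not known: files written at similar
    # times tend to end up on the same tapes.
    ordered = sorted(files, key=creation_time_of)
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]
</verbatim>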

Client Behaviour

  • Consider the request queue size to be effectively unlimited; known per-site limits:
    • FNAL, PIC - 15k per VO
    • KIT - 2k per pool
    • NIKHEF-SARA - 1k (?)
  • Back off on SRM_INTERNAL_ERROR and SRM_FILE_BUSY
  • Use bulk requests
    • The best bulk recall size is unknown. 1k files is the reference; some sites want more, some fewer.
  • Interaction rates under 10 Hz are typically acceptable
  • Run with no timeouts, or at least 48 hours
  • Ignore disk buffer occupancy
    • Exception: CNAF
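A polling loop consistent with the points above might look like the following sketch. poll_status is a hypothetical helper returning an SRM-style status string per request token; the constants are illustrative, not site recommendations.

<verbatim>
import time

# Sketch: poll outstanding bulk requests while respecting the advice above:
# back off on SRM_INTERNAL_ERROR / SRM_FILE_BUSY, keep the interaction rate
# well under 10 Hz, and apply no aggressive client-side timeout.
# "poll_status" is a hypothetical callable returning a status string.

def wait_for_staging(tokens, poll_status, poll_interval=60.0):
    pending = set(tokens)
    backoff = poll_interval
    while pending:
        time.sleep(backoff)                        # no overall timeout: recalls may take hours
        for token in list(pending):
            time.sleep(0.2)                        # keep the per-call rate under ~5 Hz
            status = poll_status(token)
            if status in ("SRM_INTERNAL_ERROR", "SRM_FILE_BUSY"):
                backoff = min(backoff * 2, 3600)   # back off instead of retrying immediately
            elif status == "SRM_SUCCESS":
                pending.discard(token)
                backoff = poll_interval            # recover the normal polling cadence
</verbatim>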

Discussion points

Writing strategy

Should we make recommendations on writing strategy? For example, selecting particular pools or resources for particular types of data, or taking account of the probability of future deletion? Perhaps a writing strategy is not possible, as repack will eventually destroy locality.

Approach

The survey exposes the significant diversity in these systems. What should the strategy be?

  • Produce some "lowest common denominator" advice - do the following; it may help, and will never hurt
  • Enumerate a small number of basic site characteristics and classify each site individually

Next steps

  • Experiments have to join the conversation
    • Which new scenarios for exploiting tape are the experiments actually considering?
      • Carousels – R&D started at BNL
      • What else?
  • Understand what actions, if any, need to be taken client side
    • FTS configurations can be updated easily (see the sketch after this list)
      • e.g. size of a bulk request, number of outstanding requests
    • What about developments in FTS or in experiment-specific clients?
  • Track progress using the reported metrics
  • Finalise the advice to clients in a “mode d’emploi” for tape systems
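For the FTS point above, some of the knobs (e.g. the number of outstanding staging requests per endpoint) live in the FTS server configuration, while others can be chosen per job submission. The sketch below assumes the fts3 "easy" Python bindings; the endpoint, URLs and parameter values are placeholders chosen only to match the numbers quoted elsewhere on this page (a bulk size of ~1k files, a 48-hour staging window).

<verbatim>
import fts3.rest.client.easy as fts3

# Hedged sketch of an FTS staging submission; endpoint and URLs are placeholders.
source_dest_pairs = [
    ('srm://tape-endpoint.example.org/path/file0001',
     'davs://disk-endpoint.example.org/path/file0001'),
    # ... up to ~1k files per job, the reference bulk size
]

context = fts3.Context('https://fts3-instance.example.org:8446')
transfers = [fts3.new_transfer(src, dst) for src, dst in source_dest_pairs]
job = fts3.new_job(
    transfers,
    bring_online=48 * 3600,    # generous staging timeout, per the client advice
    copy_pin_lifetime=3600,    # pin lifetime of the staged replica on the disk buffer
)
job_id = fts3.submit(context, job)
</verbatim>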

-- OliverKeeble - 2018-05-07
