SLS Tape Metrics Review - OBSOLETE (newer version here)


This page contains documentation related to the WLCG SLS tape questionnaire and review presented at the WLCG Management Board meeting in November 2010. The goal is to review the existing SLS tape metrics and to ensure that they are suited for reporting how well and how efficiently a site is performing, that they provide appropriate information from the user (experiment) perspective for improving operations, and that they allow meaningful comparisons between tape MSS sites.

SLS tape metrics questionnaire

| Link | Description |
| MB-112010 | Initial presentation to the WLCG MB, November 2010 |
| t0-t1-tape-monitoring-questionnaire.txt | Questionnaire sent out to WLCG T0/T1 sites and WLCG VO's |
| MB-022011 | WLCG presentation on questionnaire results, February 2011 |
| MB-112011 | WLCG MB presentation on final specification and implementation status, November 2011 |

Questionnaire Results

Of the 17 sites/VO's contacted, 16 returned a completed questionnaire. A handful of conclusions can be drawn from the answers received:

  1. All T0/T1 sites have their own internal tape monitoring in place, which is used for daily operations. The SLS tape metrics are used for "exporting" availability / efficiency metrics rather than for internal use; for several sites, it is not clear who the target audience is supposed to be.
  2. Most sites look at the SLS plots periodically, but tend to look only at their own SLS graphs; this is mostly done to check that the SLS export is working correctly (see previous point).
  3. Among the 3 VO's that responded to the questionnaire, interest in the SLS tape metrics is currently moderate: one VO does not look at them, and the other two do so only occasionally.
  4. There are differences between the metrics considered relevant by the VO's and by the sites:
    • Among the core metrics, VO's are more interested in availability, average file size and overall data transfer rate. Sites pay attention to tape repeat mounting and volume per mount, which the VO's do not consider important.
    • Among the other existing metrics, VO's are particularly interested in total volume read/written and tape access (queueing) time.
    • As for new metrics that could be added, the VO wish-list consists of the file queue length, the disk cache failure rate, and tapes containing inactive (not recently accessed) data. Drive transfer efficiency and drives used for housekeeping are seen as interesting only by the sites.

Initial proposal

From the questionnaire, it seems that the SLS tape views can be made more attractive by clearly setting the VO's as the target audience. Metrics could be readjusted to fit what the VO's want to see. This would allow the current metric set to be simplified significantly, dropping the many metrics which are not considered relevant or are internal to the sites, and concentrating on the half a dozen metrics the VO's are really interested in following.

The core metrics proposed to be retained are:

  • Availability (definition to be reviewed)
  • Average file size
  • Average data transfer rate
  • Average access queueing time
  • Fraction of inactive data
  • Total data stored

The proposed metric set is not tape-specific and could also be used for non-tape-based archival storage (e.g. cloud-based storage). The frequency and exact specification of each metric can be worked out once the metric set has been agreed on.
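
To make the six proposed metrics concrete, the following is a minimal Python sketch of how they might be derived from a batch of request records. It is not the agreed specification: exact definitions are deferred to the specification topic, and every name here (ArchiveRequest, CoreMetrics, summarise, and all field names) is hypothetical. The formulas are assumptions chosen for illustration, e.g. availability as uptime over the reporting window, and average transfer rate as total bytes over total transfer time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArchiveRequest:
    """One archive read/write request (hypothetical record layout)."""
    size_bytes: int          # size of the file transferred
    queue_seconds: float     # time spent waiting before the transfer started
    transfer_seconds: float  # time spent actually moving data


@dataclass
class CoreMetrics:
    """The six proposed core metrics, under assumed definitions."""
    availability: float                      # fraction of the window, 0..1
    average_file_size_bytes: float
    average_transfer_rate_bytes_per_s: float
    average_queue_seconds: float
    inactive_fraction: float                 # inactive bytes / stored bytes
    total_stored_bytes: int


def summarise(requests: List[ArchiveRequest],
              up_seconds: float,
              window_seconds: float,
              inactive_bytes: int,
              total_stored_bytes: int) -> CoreMetrics:
    """Compute the metric set for one reporting window (assumed formulas)."""
    if not requests:
        raise ValueError("at least one request is needed")
    if window_seconds <= 0 or total_stored_bytes <= 0:
        raise ValueError("window and stored volume must be positive")

    total_bytes = sum(r.size_bytes for r in requests)
    total_transfer = sum(r.transfer_seconds for r in requests)
    if total_transfer <= 0:
        raise ValueError("total transfer time must be positive")

    return CoreMetrics(
        availability=up_seconds / window_seconds,
        average_file_size_bytes=total_bytes / len(requests),
        average_transfer_rate_bytes_per_s=total_bytes / total_transfer,
        average_queue_seconds=sum(r.queue_seconds for r in requests) / len(requests),
        inactive_fraction=inactive_bytes / total_stored_bytes,
        total_stored_bytes=total_stored_bytes,
    )


# Example with made-up numbers, purely to show the call shape.
example = summarise(
    [ArchiveRequest(1000, 10.0, 2.0), ArchiveRequest(3000, 30.0, 2.0)],
    up_seconds=90.0, window_seconds=100.0,
    inactive_bytes=250, total_stored_bytes=1000,
)
```

Nothing in the sketch refers to tapes, mounts or drives, which reflects the point above that the metric set is not tape-specific; the same records could in principle come from any archival storage back end.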

Excel summary sheet

| Link | Description |
| questionnaire-evaluation.xlsx | Summary of questionnaire answers |

Per-site / VO questionnaire responses

  • VO's

| Link | VO | From |
| ALICE.eml | ALICE | Latchezar Betev |
| (none) | ATLAS | Kors Bos |
| CMS.eml | CMS | Ian Fisk |
| LHCb.eml | LHCb | Roberto Santinelli |

  • T0/T1 sites

| Link | Site | From |
| ASGC.eml | ASGC | Jhen-Wei Huang |
| BNL.eml | BNL | Michael Ernst |
| CC-IN2P3.eml | CC-IN2P3 | Pierre-Emmanuel Brinette |
| CERN.eml | CERN | Vladimir Bahyl |
| CNAF.eml | CNAF | Luca dell'Agnello |
| KIT.eml | KIT | Jos van Wezel |
| FNAL.eml | FNAL | Jon Bakken |
| LHCb.eml | LHCb | Roberto Santinelli |
| NDGF.eml | NDGF | Mattias Wadenstein |
| NL-T1.eml | NL-T1 | Mark van de Sanden |
| PIC.eml | PIC | Gonzalo Merino |
| RAL.eml | RAL | Matthew Viljoen |

Proposal specification

A specification for the above proposal can be found here: TapeMetricsSpec

Proposal implementation status

See TapeMetricsImplementation for the current implementation status for each T0/T1 site.

More information

| Link | Description |
| SLS-Tape | T0/T1 tape metrics as reported in SLS |
| MssEfficiency | Old definition of core tape metrics set |

-- GermanCancio

Topic revision: r18 - 2018-01-19 - VladimirBahyl
