DRAFT

WLCG Critical Services (Dev page!)

Introduction

This page lists, for each LHC experiment, the services that are:

  • not operated by its own personnel, and
  • deemed critical for the successful operation of
    its grid workflows and for related activities.

Most of those services are hosted and operated by CERN-IT, while several Tier-1 sites and other partners also provide some.

For every relevant service, each experiment has indicated the effects of the service being unavailable. The impact indicates the effect on operations or people if the service were unavailable for a few days. The urgency indicates how quickly that impact would be reached. The criticality is defined as the product of urgency and impact. On the right-hand side there are columns for the maximum criticality of a service across the experiments, the sum of the criticalities across the experiments, and the weighted maximum criticality. The weighted maximum ranks services with identical maximum criticalities according to their respective sums of criticalities. Each numeric column can be sorted in ascending order by clicking once on its header, or in descending order by clicking twice.
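
As a minimal sketch of the definitions above, the following Python computes the criticality per experiment together with the maximum and sum columns. The function names and the dictionary layout are illustrative assumptions, not part of this page. The page does not give the formula behind the weighted maximum, so the sketch substitutes a (max, sum) tuple as a sort key, which reproduces only the described ranking behavior. The ratings used are those of the Px-CC network and Oracle online rows in the CERN-IT table.

```python
from typing import Dict, Optional, Tuple

# (urgency, impact) per experiment; None means the experiment gave no rating.
Rating = Optional[Tuple[int, int]]


def criticality(urgency: int, impact: int) -> int:
    """Criticality is defined as the product of urgency and impact."""
    return urgency * impact


def summarize(ratings: Dict[str, Rating]) -> dict:
    """Return per-experiment criticalities plus the max and sum columns."""
    crit = {vo: criticality(*r) for vo, r in ratings.items() if r is not None}
    max_crit = max(crit.values(), default=0)
    sum_crit = sum(crit.values())
    return {
        "crit": crit,
        "max": max_crit,
        "sum": sum_crit,
        # Assumption: a (max, sum) tuple stands in for the weighted maximum.
        # It orders services by maximum first and breaks ties by the sum.
        "wtd_key": (max_crit, sum_crit),
    }


services = {
    "Px-CC network": summarize(
        {"ALICE": (7, 10), "ATLAS": (7, 10), "CMS": (4, 10), "LHCb": (10, 10)}
    ),
    "Oracle online": summarize(
        {"ALICE": (10, 10), "ATLAS": (10, 10), "CMS": (10, 10), "LHCb": (10, 10)}
    ),
}

# Both services have a maximum of 100; the sum decides their relative rank.
ranked = sorted(services, key=lambda name: services[name]["wtd_key"], reverse=True)
```

Here both services share the same maximum criticality, so the one with the larger sum (Oracle online) ranks first.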

Impact on operations and/or people

Level Definition
10 ops/VO severely affected
7 ops/VO notably affected
4 ops/VO moderately affected

Urgency levels

Level Definition
10 full impact reached within 6 hours
7 full impact reached within 1 day
4 full impact reached within 2 days
1 full impact reached after 2 days
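
The urgency table can be read as a simple threshold lookup. The sketch below is illustrative only: the function name `urgency_level` and the choice to treat each boundary as inclusive (for example, exactly 6 hours maps to level 10) are assumptions, since the page does not specify how boundary cases are handled.

```python
def urgency_level(hours_to_full_impact: float) -> int:
    """Map the time until full impact is reached to an urgency level."""
    if hours_to_full_impact < 0:
        raise ValueError("time to full impact cannot be negative")
    if hours_to_full_impact <= 6:    # within 6 hours
        return 10
    if hours_to_full_impact <= 24:   # within 1 day
        return 7
    if hours_to_full_impact <= 48:   # within 2 days
        return 4
    return 1                         # after 2 days
```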

Criticality levels

As a visual aid, three criticality ranges have been defined with distinct colors.
For a given experiment and for the maximum across the experiments, the ranges are as follows:

top %bv{ color="%b1%" val="70-100" }%
high %bv{ color="%b2%" val="40-69" }%
moderate %bv{ color="%b3%" val="0-39" }%

For the sum of the criticalities across the experiments:

top %bv{ color="%b1%" val="210-400" }%
high %bv{ color="%b2%" val="120-209" }%
moderate %bv{ color="%b3%" val="0-119" }%

The colors for the weighted maximum values correspond to those of the maximum values across the experiments.
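
The ranges above can be expressed as a small classification helper. This is a sketch under stated assumptions: the name `criticality_range` and the `is_sum` flag are hypothetical, and the function returns the range name rather than a color, because the actual color values are not given on this page.

```python
def criticality_range(value: int, is_sum: bool = False) -> str:
    """Classify a criticality value as 'top', 'high' or 'moderate'.

    With is_sum=False the thresholds for a single experiment (or for the
    maximum across experiments) apply; with is_sum=True the thresholds
    for the sum across the experiments apply.
    """
    top, high, upper = (210, 120, 400) if is_sum else (70, 40, 100)
    if not 0 <= value <= upper:
        raise ValueError(f"value {value} outside the range 0-{upper}")
    if value >= top:
        return "top"
    if value >= high:
        return "high"
    return "moderate"
```

For example, a per-experiment criticality of 70 falls in the top range, while a sum of 209 falls in the high range.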

Purpose of the tables

These tables are meant to clarify how much attention each service requires in its implementation and operation, so that the effects of service unavailability on the experiments are minimized to the extent feasible. For example, a highly critical service should, if possible, be implemented and monitored in a more robust way than a less critical service. High-availability (HA) deployment methods, load balancing and/or hot-standby setups should be considered for such cases.

These tables do not make any promises about the level of support that can be expected for a given service. Unless a specific arrangement was made for a particular service, support is best-effort, though in practice it is usually compatible with the actual criticalities of the service. If it is not, the implementation and operation of the service can be reviewed.

Draft SNow support units

The (draft, suggested) SNow Functional Element (FE) or Service Element (SE) behind each service has now been added to the table.

For services that should be contacted through SNow instead of GGUS, the background of the FE or SE is set as in this %go{ example. }%

We will ask IT groups to validate these soon.

CERN-IT services

Service   SNow FE/SE   ALICE: urg imp crit   ATLAS: urg imp crit   CMS: urg imp crit   LHCb: urg imp crit   max crit   sum crit   wtd max
Px-CC network %FE{ Datacenter-Network }% 7 10 %x_alice% 7 10 %x_atlas% 4 10 %x_cms% 10 10 %x_lhcb%   %max% %sum% %wmax%
LHC-OPN / LHC-ONE / GPN %FE{ Datacenter-Network }% 7 10 %x_alice% 7 10 %x_atlas% 7 10 %x_cms% 7 10 %x_lhcb%   %max% %sum% %wmax%
Oracle online %FE{ oracle-database }% 10 10 %x_alice% 10 10 %x_atlas% 10 10 %x_cms% 10 10 %x_lhcb%   %max% %sum% %wmax%
Oracle offline (inc. streaming) %FE{ oracle-database }% 4 7 %x_alice% 10 10 %x_atlas% 7 10 %x_cms% 10 10 %x_lhcb%   %max% %sum% %wmax%
DB-on-Demand %FE{ db-on-demand }%     %x_alice% 7 10 %x_atlas% 4 10 %x_cms% 10 10 %x_lhcb%   %max% %sum% %wmax%
CTA %SE{ CTA-service }% 4 7 %x_alice% 7 7 %x_atlas% 4 7 %x_cms% 4 7 %x_lhcb%   %max% %sum% %wmax%
EOS %SE{ eos-service }% 7 10 %x_alice% 7 7 %x_atlas% 7 10 %x_cms% 7 7 %x_lhcb%   %max% %sum% %wmax%
FTS %FE{ FTS }%     %x_alice% 10 10 %x_atlas% 4 7 %x_cms% 4 10 %x_lhcb%   %max% %sum% %wmax%
Global xrootd redirector %SE{ eos-service }%     %x_alice%     %x_atlas% 7 7 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
Ceph %FE{ Ceph }%     %x_alice% 10 10 %x_atlas% 4 7 %x_cms% 10 10 %x_lhcb%   %max% %sum% %wmax%
CVMFS Stratum-0 %FE{ cvmfs }% 7 10 %x_alice% 7 10 %x_atlas% 4 7 %x_cms% 4 10 %x_lhcb%   %max% %sum% %wmax%
CVMFS Stratum-1 %FE{ cvmfs }% 4 7 %x_alice% 7 4 %x_atlas% 4 7 %x_cms% 7 10 %x_lhcb%   %max% %sum% %wmax%
Frontier and Squid %FE{ cvmfs }%     %x_alice% 7 7 %x_atlas% 7 10 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
Batch service %FE{ LXBATCH }% 7 7 %x_alice% 7 7 %x_atlas% 4 7 %x_cms% 4 7 %x_lhcb%   %max% %sum% %wmax%
Dedicated batch %FE{ LXBATCH }%     %x_alice% 7 7 %x_atlas% 10 7 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
CE %FE{ LXBATCH }% 7 7 %x_alice% 7 7 %x_atlas% 4 4 %x_cms% 4 7 %x_lhcb%   %max% %sum% %wmax%
VOMS %FE{ VOMS }% 4 10 %x_alice% 7 10 %x_atlas% 4 10 %x_cms% 7 10 %x_lhcb%   %max% %sum% %wmax%
MyProxy %FE{ MyProxy }% 4 10 %x_alice% 4 4 %x_atlas% 4 10 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
CRIC %FE{ cric }% 1 4 %x_alice% 7 7 %x_atlas% 4 4 %x_cms% 1 4 %x_lhcb%   %max% %sum% %wmax%
WAU / WSSA %FE{ WLCG-WAU }% %FE{ WLCG-WSSA }% 1 4 %x_alice% 1 4 %x_atlas%     %x_cms% 1 4 %x_lhcb%   %max% %sum% %wmax%
BDII %FE{ BDII }%     %x_alice%     %x_atlas%     %x_cms% 1 4 %x_lhcb%   %max% %sum% %wmax%
Monit %FE{ monitoring }% 1 4 %x_alice% 7 7 %x_atlas% 7 7 %x_cms% 4 4 %x_lhcb%   %max% %sum% %wmax%
SiteMon %FE{ WLCG-Experiment-Probe-Submission }% 1 4 %x_alice% 4 4 %x_atlas% 7 7 %x_cms% 4 4 %x_lhcb%   %max% %sum% %wmax%
AI cloud services %FE{ cloud-infrastructure }% 4 7 %x_alice% 10 10 %x_atlas% 7 7 %x_cms% 10 10 %x_lhcb%   %max% %sum% %wmax%
Kubernetes %FE{ cloud-infrastructure }%     %x_alice% 10 10 %x_atlas% 7 7 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
Lxplus %FE{ LXPLUS }% 4 7 %x_alice% 7 7 %x_atlas% 7 7 %x_cms% 10 7 %x_lhcb%   %max% %sum% %wmax%
AFS %FE{ AFS }%     %x_alice% 7 7 %x_atlas% 7 10 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
GitLab %go{ %FE{ version-control }% }% 7 7 %x_alice% 7 4 %x_atlas% 7 7 %x_cms% 7 7 %x_lhcb%   %max% %sum% %wmax%
JIRA %go{ %FE{ JIRA-ITS }% }% 4 4 %x_alice% 7 4 %x_atlas% 4 4 %x_cms% 4 7 %x_lhcb%   %max% %sum% %wmax%
Twiki %go{ %FE{ twiki }% }% 1 4 %x_alice% 7 4 %x_atlas% 7 7 %x_cms% 4 4 %x_lhcb%   %max% %sum% %wmax%
Indico %go{ %FE{ indico }% }% 1 4 %x_alice% 7 7 %x_atlas% 4 7 %x_cms% 7 7 %x_lhcb%   %max% %sum% %wmax%
Video conf %go{ %FE{ zoom }% }%     %x_alice% 7 7 %x_atlas% 7 7 %x_cms% 7 7 %x_lhcb%   %max% %sum% %wmax%
Windows terminal service %go{ %FE{ windows-terminal }% }% 1 4 %x_alice% 1 4 %x_atlas%     %x_cms%     %x_lhcb%   %max% %sum% %wmax%

Services at other sites

Service    ALICE: urg imp crit   ATLAS: urg imp crit   CMS: urg imp crit   LHCb: urg imp crit   max crit   sum crit   wtd max
GOCDB   1 4 %x_alice% 4 4 %x_atlas% 4 4 %x_cms% 7 7 %x_lhcb%   %max% %sum% %wmax%
MyOSG       %x_alice% 4 4 %x_atlas% 4 4 %x_cms%     %x_lhcb%   %max% %sum% %wmax%
GGUS   1 4 %x_alice% 4 4 %x_atlas% 7 7 %x_cms% 7 4 %x_lhcb%   %max% %sum% %wmax%
FTS       %x_alice% 10 10 %x_atlas% 4 7 %x_cms% 4 10 %x_lhcb%   %max% %sum% %wmax%
Stratum-1   4 7 %x_alice% 7 4 %x_atlas% 4 7 %x_cms% 7 10 %x_lhcb%   %max% %sum% %wmax%
Accounting Portal   1 4 %x_alice% 1 4 %x_atlas%     %x_cms% 1 4 %x_lhcb%   %max% %sum% %wmax%

Previous versions

Topic revision: r4 - 2021-07-01 - MaartenLitmaath
 