DRAFT
WLCG Critical Services (Dev page!)
Introduction
This page lists, for each LHC experiment, the services that are:
- not operated by its own personnel, and
- deemed critical for the successful operation of its grid workflows and related activities.
Most of those services are hosted and operated by CERN-IT,
while several Tier-1 sites and other partners also provide some.
For every relevant service, each experiment has indicated the effects
of the service being unavailable. The impact indicates the effect on
operations or people if the service were unavailable for a few days.
The urgency indicates how quickly that impact would be reached. The
criticality is defined as the product of urgency and impact. On the
right-hand side there are columns for the maximum criticality of a
service across the experiments, the sum of the criticalities across
the experiments and the weighted maximum criticality. The latter ranks
services with identical maximum criticalities according to their
respective sums of criticalities.
Each numeric column can be sorted in ascending (descending) order
by clicking once (twice) on its header.
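As an illustration of how the derived columns relate to the ratings, the following Python sketch computes the criticality per experiment, the maximum and the sum across the experiments, and then orders services by maximum with the sum as tie-breaker. The function names, the data layout and the treatment of an empty rating as criticality 0 are assumptions made for this example. The exact formula behind the weighted maximum is not given on this page, so the sort key `(max, sum)` only mimics the ranking it is described to produce. The sample ratings are taken from the CERN-IT table.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: (urgency, impact) per experiment; None stands for an
# empty cell, i.e. the experiment gave no rating for that service.
Rating = Optional[Tuple[int, int]]

SERVICES: Dict[str, Dict[str, Rating]] = {
    "Px-CC network": {"ALICE": (7, 10), "ATLAS": (7, 10), "CMS": (4, 10), "LHCb": (10, 10)},
    "Oracle online": {"ALICE": (10, 10), "ATLAS": (10, 10), "CMS": (10, 10), "LHCb": (10, 10)},
    "DB-on-Demand": {"ALICE": None, "ATLAS": (7, 10), "CMS": (4, 10), "LHCb": (10, 10)},
}


def criticality(rating: Rating) -> int:
    """Criticality = urgency * impact; an empty rating counts as 0 (assumption)."""
    if rating is None:
        return 0
    urgency, impact = rating
    return urgency * impact


def summarize(ratings: Dict[str, Rating]) -> Dict[str, int]:
    """Return the maximum and the sum of the criticalities across experiments."""
    crits = [criticality(r) for r in ratings.values()]
    return {"max": max(crits), "sum": sum(crits)}


def rank(services: Dict[str, Dict[str, Rating]]) -> List[str]:
    """Order services by maximum criticality, breaking ties by the sum."""
    def key(name: str) -> Tuple[int, int]:
        s = summarize(services[name])
        return (s["max"], s["sum"])
    return sorted(services, key=key, reverse=True)


if __name__ == "__main__":
    for name in rank(SERVICES):
        print(name, summarize(SERVICES[name]))
```

All three sample services share a maximum criticality of 100, so their order is decided by the sums alone.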
Impact on operations and/or people
| Level | Definition |
| 10 | ops/VO severely affected |
| 7 | ops/VO notably affected |
| 4 | ops/VO moderately affected |
Urgency levels
| Level | Definition |
| 10 | full impact reached within 6 hours |
| 7 | full impact reached within 1 day |
| 4 | full impact reached within 2 days |
| 1 | full impact reached after 2 days |
Criticality levels
As a visual aid, 3 criticality ranges have been defined with distinct colors.
For a given experiment and for the maximum across the experiments,
the ranges are as follows:
| top | %bv{ color="%b1%" val="70-100" }% |
| high | %bv{ color="%b2%" val="40-69" }% |
| moderate | %bv{ color="%b3%" val="0-39" }% |
For the sum of the criticalities across the experiments:
| top | %bv{ color="%b1%" val="210-400" }% |
| high | %bv{ color="%b2%" val="120-209" }% |
| moderate | %bv{ color="%b3%" val="0-119" }% |
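The two sets of ranges can be expressed as a simple lookup. The sketch below is only an illustration: the names `PER_EXPERIMENT_BOUNDS`, `SUM_BOUNDS` and `classify` are made up for this example, and the upper limits (100 and 400) are not enforced.

```python
from typing import Sequence, Tuple

# Lower bounds of each range, highest first, as listed in the tables above.
PER_EXPERIMENT_BOUNDS = ((70, "top"), (40, "high"), (0, "moderate"))
SUM_BOUNDS = ((210, "top"), (120, "high"), (0, "moderate"))


def classify(value: int, bounds: Sequence[Tuple[int, str]]) -> str:
    """Return the label of the first range whose lower bound the value reaches."""
    for lower, label in bounds:
        if value >= lower:
            return label
    raise ValueError("criticality values must be non-negative")


if __name__ == "__main__":
    print(classify(70, PER_EXPERIMENT_BOUNDS))  # urgency 7 * impact 10
    print(classify(280, SUM_BOUNDS))            # sum over four experiments
```

The same function serves for the maximum across the experiments, which uses the per-experiment ranges.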
The colors for the weighted maximum values correspond to
those of the maximum values across the experiments.
Purpose of the tables
These tables are meant to clarify which services require which level
of attention in their implementation and operation, in order to
minimize the effects of service unavailability on the experiments, to
the extent feasible. For example, a highly critical service should, if
possible, be implemented and monitored in a more robust way than a less
critical service. HA deployment methods, load-balancing and/or
hot standby setups should be considered for such cases.
These tables do not make any promises about the level of support that
can be expected for a given service: unless a specific arrangement was
made for a particular service, the support level is best-effort for any
service, though in practice it usually is compatible with the actual
criticality of the given service. If it is not, the implementation and
operation of the service can be reviewed.
Draft SNow support units
The (draft, suggested) SNow Functional Element (FE) or Service Element (SE)
behind each service has now been added to the table.
For services that should be contacted through SNow instead of GGUS,
the background of the FE or SE is highlighted.
We will ask IT groups to validate these soon.
CERN-IT services
| Service | SNow FE/SE | urg | imp | crit | urg | imp | crit | urg | imp | crit | urg | imp | crit | | max | sum | wtd |
| | | ALICE ||| ATLAS ||| CMS ||| LHCb ||| | crit | crit | max |
| Px-CC network | %FE{ Datacenter-Network }% | 7 | 10 | %x_alice% | 7 | 10 | %x_atlas% | 4 | 10 | %x_cms% | 10 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| LHC-OPN / LHC-ONE / GPN | %FE{ Datacenter-Network }% | 7 | 10 | %x_alice% | 7 | 10 | %x_atlas% | 7 | 10 | %x_cms% | 7 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Oracle online | %FE{ oracle-database }% | 10 | 10 | %x_alice% | 10 | 10 | %x_atlas% | 10 | 10 | %x_cms% | 10 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Oracle offline (inc. streaming) | %FE{ oracle-database }% | 4 | 7 | %x_alice% | 10 | 10 | %x_atlas% | 7 | 10 | %x_cms% | 10 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| DB-on-Demand | %FE{ db-on-demand }% | | | %x_alice% | 7 | 10 | %x_atlas% | 4 | 10 | %x_cms% | 10 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| CTA | %SE{ CTA-service }% | 4 | 7 | %x_alice% | 7 | 7 | %x_atlas% | 4 | 7 | %x_cms% | 4 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| EOS | %SE{ eos-service }% | 7 | 10 | %x_alice% | 7 | 7 | %x_atlas% | 7 | 10 | %x_cms% | 7 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| FTS | %FE{ FTS }% | | | %x_alice% | 10 | 10 | %x_atlas% | 4 | 7 | %x_cms% | 4 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Global xrootd redirector | %SE{ eos-service }% | | | %x_alice% | | | %x_atlas% | 7 | 7 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| Ceph | %FE{ Ceph }% | | | %x_alice% | 10 | 10 | %x_atlas% | 4 | 7 | %x_cms% | 10 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| CVMFS Stratum-0 | %FE{ cvmfs }% | 7 | 10 | %x_alice% | 7 | 10 | %x_atlas% | 4 | 7 | %x_cms% | 4 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| CVMFS Stratum-1 | %FE{ cvmfs }% | 4 | 7 | %x_alice% | 7 | 4 | %x_atlas% | 4 | 7 | %x_cms% | 7 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Frontier and Squid | %FE{ cvmfs }% | | | %x_alice% | 7 | 7 | %x_atlas% | 7 | 10 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| Batch service | %FE{ LXBATCH }% | 7 | 7 | %x_alice% | 7 | 7 | %x_atlas% | 4 | 7 | %x_cms% | 4 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Dedicated batch | %FE{ LXBATCH }% | | | %x_alice% | 7 | 7 | %x_atlas% | 10 | 7 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| CE | %FE{ LXBATCH }% | 7 | 7 | %x_alice% | 7 | 7 | %x_atlas% | 4 | 4 | %x_cms% | 4 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| VOMS | %FE{ VOMS }% | 4 | 10 | %x_alice% | 7 | 10 | %x_atlas% | 4 | 10 | %x_cms% | 7 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| MyProxy | %FE{ MyProxy }% | 4 | 10 | %x_alice% | 4 | 4 | %x_atlas% | 4 | 10 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| CRIC | %FE{ cric }% | 1 | 4 | %x_alice% | 7 | 7 | %x_atlas% | 4 | 4 | %x_cms% | 1 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| WAU / WSSA | %FE{ WLCG-WAU }% %FE{ WLCG-WSSA }% | 1 | 4 | %x_alice% | 1 | 4 | %x_atlas% | | | %x_cms% | 1 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| BDII | %FE{ BDII }% | | | %x_alice% | | | %x_atlas% | | | %x_cms% | 1 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Monit | %FE{ monitoring }% | 1 | 4 | %x_alice% | 7 | 7 | %x_atlas% | 7 | 7 | %x_cms% | 4 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| SiteMon | %FE{ WLCG-Experiment-Probe-Submission }% | 1 | 4 | %x_alice% | 4 | 4 | %x_atlas% | 7 | 7 | %x_cms% | 4 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| AI cloud services | %FE{ cloud-infrastructure }% | 4 | 7 | %x_alice% | 10 | 10 | %x_atlas% | 7 | 7 | %x_cms% | 10 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Kubernetes | %FE{ cloud-infrastructure }% | | | %x_alice% | 10 | 10 | %x_atlas% | 7 | 7 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| Lxplus | %FE{ LXPLUS }% | 4 | 7 | %x_alice% | 7 | 7 | %x_atlas% | 7 | 7 | %x_cms% | 10 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| AFS | %FE{ AFS }% | | | %x_alice% | 7 | 7 | %x_atlas% | 7 | 10 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| GitLab | %go{ %FE{ version-control }% }% | 7 | 7 | %x_alice% | 7 | 4 | %x_atlas% | 7 | 7 | %x_cms% | 7 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| JIRA | %go{ %FE{ JIRA-ITS }% }% | 4 | 4 | %x_alice% | 7 | 4 | %x_atlas% | 4 | 4 | %x_cms% | 4 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Twiki | %go{ %FE{ twiki }% }% | 1 | 4 | %x_alice% | 7 | 4 | %x_atlas% | 7 | 7 | %x_cms% | 4 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Indico | %go{ %FE{ indico }% }% | 1 | 4 | %x_alice% | 7 | 7 | %x_atlas% | 4 | 7 | %x_cms% | 7 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Video conf | %go{ %FE{ zoom }% }% | | | %x_alice% | 7 | 7 | %x_atlas% | 7 | 7 | %x_cms% | 7 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Windows terminal service | %go{ %FE{ windows-terminal }% }% | 1 | 4 | %x_alice% | 1 | 4 | %x_atlas% | | | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
Services at other sites
| Service | | urg | imp | crit | urg | imp | crit | urg | imp | crit | urg | imp | crit | | max | sum | wtd |
| | | ALICE ||| ATLAS ||| CMS ||| LHCb ||| | crit | crit | max |
| GOCDB | | 1 | 4 | %x_alice% | 4 | 4 | %x_atlas% | 4 | 4 | %x_cms% | 7 | 7 | %x_lhcb% | | %max% | %sum% | %wmax% |
| MyOSG | | | | %x_alice% | 4 | 4 | %x_atlas% | 4 | 4 | %x_cms% | | | %x_lhcb% | | %max% | %sum% | %wmax% |
| GGUS | | 1 | 4 | %x_alice% | 4 | 4 | %x_atlas% | 7 | 7 | %x_cms% | 7 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
| FTS | | | | %x_alice% | 10 | 10 | %x_atlas% | 4 | 7 | %x_cms% | 4 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Stratum-1 | | 4 | 7 | %x_alice% | 7 | 4 | %x_atlas% | 4 | 7 | %x_cms% | 7 | 10 | %x_lhcb% | | %max% | %sum% | %wmax% |
| Accounting Portal | | 1 | 4 | %x_alice% | 1 | 4 | %x_atlas% | | | %x_cms% | 1 | 4 | %x_lhcb% | | %max% | %sum% | %wmax% |
Previous versions