with some other Stratum-1 outside CERN, why is the CERN one considered critical at all?
A: fail-over to a remote Stratum-1 may not be as transparent as expected (e.g. RAL recently failed to take over from CERN); ATLAS and CMS have similar values for Stratum-1
OpenStack, Puppet: add with urgency 3 and impact 10
As batch
Aren’t these internals of CC operations?
We would like to understand this a bit more. If Puppet fails, machines can exceptionally be configured manually, so it is not a blocker. For the OpenStack cloud, once you run load-balanced, clustered services, the failure of a VM should not be a problem either.
A: more and more services (for ALICE, e.g. the validation cluster and the CAF) will depend on the OpenStack infrastructure (Cinder/Glance/CEPH/Keystone/Nova/...): if any of that breaks, a lot could go down with it...
GIT: add with urgency 5 and impact 9
What is stored on GIT that is critical?
A: the daily analysis tag and the weekly core revisions. The analysis is agile: if an important improvement cannot be committed, a lot of analysis may stop; this can only be tolerated for a while. If the service stops at a 'bad time', viz. at the weekly core release, this may also stop ongoing reconstruction/MC.
JIRA: add with urgency 5 and impact 9
Why?
A: the production is driven by tickets in JIRA. If that service is down, new productions cannot be started.
CERN Oracle Tier-0: urgency 4 and impact 6
Terminal servers: add with urgency 3 and impact 2
Still used, but several workarounds exist if they don’t work
What are these used for? How are experiment operations affected if there are no terminal servers? We see it as similar to lxplus: if lxplus is not critical, why should the TS be?
A: the urgency and impact are low, but the servers are used, just like lxplus.
NICE AD servers: urgency 3 and impact 2
Credentials cached
DNS: urgency 3 and impact 2
Local caches, impact only for new devices
Could you give us more details on this one?
A: the DNS is of course a critical service! The urgency and impact are mitigated by local caches.