EGEE emergency procedure for the CERN site in case of an Important but not Major Incident.

An important incident that is not a major incident is typically an outage of a single service, or of several closely related services.

Examples are: failure of a number of AFS machines, important network routers malfunctioning, important external Internet links unavailable, severe database problems, etc.

During extended working hours 8:00-18:00 Mon-Fri

Either the CERN ROC or the Manager On Duty (MOD) will take care of declaring the downtime in the GOCDB or sending an EGEE broadcast, as appropriate. If a broadcast is sent, the IT manager on duty and the CERN ROC (if not included as a recipient already) should be copied as well.

Details on how to declare a downtime in GOCDB can be found on

The link below should bring you directly to the form to declare a downtime for CERN-PROD. To access the GOCDB, a certificate must be loaded in the browser.

Alternatively, the CERN-PROD site page has a link at the bottom to declare a downtime.

Outside of extended working hours including weekends and holidays

There is nothing foreseen at the moment.

EGEE emergency procedure for the CERN site in case of a Major Incident.

By Major Incident we mean an incident where, for instance, the network is down, there is an electrical problem in the Computing Centre (CC), or a large part of the CERN infrastructure is unavailable.

Other examples are loss of ventilation, fire, flooding, a serious electrical problem, a major network outage, and any incident that involves serious personal injuries.

CERN is referred to in the EGEE jargon as the site "CERN-PROD".

During extended working hours 8:00-18:00 Mon-Fri

The MOD (Manager On Duty) will send a broadcast using the CIC portal and declare the downtime in the GOCDB (details provided above).

To send an EGEE broadcast, one should have a CERN CA certificate loaded in the browser. Please refer to the CERN CA website if you need assistance in requesting a certificate or loading it into your browser.

The website address of the EGEE broadcast is

The broadcast targets should be:

  • WLCG Tier-1 contacts
  • CIC on duty
  • All ROC managers
  • All production site admins
  • All VO managers

The options used for the broadcast sent for the latest CERN-wide power cut are:

  • Screen shot of a typical broadcast in case of a power cut:

Outside of extended working hours including weekends and holidays

The CC Ops will deal with the incident. Their job is to notify by phone the contacts of sites that have 24h support, which can then send a broadcast on behalf of CERN.

A list of contacts can be found here. It is extracted from the site contacts provided in the GOC database (GOCDB).

Provisionally this list will be maintained by the CERN ROC. In the future this list will be moved to either the CIC portal or the GOCDB itself and automatically updated there.
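
The four cases described above (important or major incident, inside or outside extended working hours) can be summarised as a small lookup. The following is only an illustrative sketch and not part of the procedure itself: the function names, the `is_holiday` flag and the treatment of 18:00 as already outside extended hours are assumptions made for this example, and the times are assumed to be CERN local time.

```python
from datetime import datetime

# Who acts in each case, paraphrased from the procedure text above.
# Keys are (severity, within_extended_hours).
RESPONDER = {
    ("important", True): "CERN ROC or MOD: declare downtime in GOCDB or send EGEE broadcast",
    ("important", False): "nothing foreseen at the moment",
    ("major", True): "MOD: send broadcast via CIC portal and declare downtime in GOCDB",
    ("major", False): "CC Ops: phone contacts at sites with 24h support",
}


def in_extended_hours(when, is_holiday=False):
    """Return True for Mon-Fri 8:00-18:00 (assumption: 18:00 itself is excluded)."""
    if is_holiday:  # holidays must be supplied by the caller (hypothetical flag)
        return False
    return when.weekday() < 5 and 8 <= when.hour < 18


def responder(severity, when, is_holiday=False):
    """Look up who handles an incident of the given severity at the given time."""
    return RESPONDER[(severity, in_extended_hours(when, is_holiday))]
```

For example, `responder("major", datetime(2008, 6, 7, 3, 0))` falls on a Saturday night and therefore selects the CC Ops case.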

Still to be clarified

- A list of services for which a notification should be sent during working hours if they are down or affected, and to what degree (e.g. Remedy down for >30 mins).

- Any additions or modifications to the list of services provided to the operators for case 1 above.

-- DianaBosio - 06 Jun 2008

Topic revision: r2 - 2008-06-06 - DianaBosio