LHCOPN backup tests

Backup connectivity in the LHCOPN must be regularly tested.

Regular tests

Every site is responsible for verifying that its backup connectivity works correctly.
It is recommended to test all available backup paths at least once a year. If possible, they should also be tested every time a major router configuration change happens.
The Tier1s have to run tests 1, 2, 3 and 5 described below.
The Tier0 has to run tests 4 and 5 described below.
The results of the past year are discussed at the first LHCOPN meeting.

Process

  • Call for a maintenance window following the procedure, at least 10 (ten) working days before the date. The engineer in charge of running the tests has to provide an email address, telephone number and instant messaging username in order to be quickly reachable.
  • If the active collaboration of the engineers of a peering site is needed, they must be contacted explicitly.
  • Execute all the necessary tests.
  • Report the results in the table provided here.
  • Take responsibility for fixing all the issues that have been detected.
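
The 10-working-day notice period can be checked with a few lines of code. The following is only a sketch: the function name `latest_announcement_date` is made up for this example, working days are assumed to be Monday to Friday, and public holidays are ignored.

```python
from datetime import date, timedelta


def latest_announcement_date(test_date: date, working_days: int = 10) -> date:
    """Return the last day on which the maintenance can still be announced.

    Walks backwards from test_date and counts only Monday-Friday.
    Assumption: public holidays are not taken into account.
    """
    current = test_date
    remaining = working_days
    while remaining > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


if __name__ == "__main__":
    # Hypothetical test date: Wednesday 2024-03-20
    print(latest_announcement_date(date(2024, 3, 20)))  # 2024-03-06
```

Since 10 working days span two calendar weeks, the announcement falls on the same weekday two weeks before the test whenever no holiday is involved.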

Recommended Tests

1 - Backup connectivity to the Tier0

  • Do not stop normal data transfers during the maintenance
  • Shut down the primary link to the Tier0
  • Verify that the Tier0 is still reachable from all of the site's own LHCOPN prefixes.
  • Verify the impact on the running data transfers
  • Restore normal connectivity
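
The reachability check in test 1 could be scripted along the following lines. This is a sketch under assumptions, not a prescribed tool: it assumes a Linux host with the iputils `ping` command (where `-I` selects the source address) that owns one address in each local LHCOPN prefix. The function names and the addresses, taken from documentation ranges, are hypothetical.

```python
import subprocess
from typing import Callable, Dict, Iterable


def ping_from(source: str, target: str) -> bool:
    """Send 3 echo requests to target, sourced from the given local address.

    Assumption: Linux iputils ping (-c count, -W timeout in seconds,
    -I source address or interface).
    """
    result = subprocess.run(
        ["ping", "-c", "3", "-W", "2", "-I", source, target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_reachability(
    sources: Iterable[str],
    target: str,
    probe: Callable[[str, str], bool] = ping_from,
) -> Dict[str, bool]:
    """Probe the target once from every source address and collect the results."""
    return {src: probe(src, target) for src in sources}


if __name__ == "__main__":
    # Hypothetical addresses: one per local LHCOPN prefix, and one Tier0 host
    local_addresses = ["192.0.2.1", "192.0.2.129"]
    results = check_reachability(local_addresses, "198.51.100.1")
    for src, ok in results.items():
        print(src, "reachable" if ok else "UNREACHABLE")
```

The `probe` parameter allows the ping to be replaced by another check (for example a TCP connection to a transfer service) without changing the loop.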

2 - For sites with multiple backup paths to the Tier0

  • Verify the symmetry of the traffic.
  • Keeping the primary link down, shut down the first backup link and verify the connectivity with the Tier0 again
  • Repeat for all the other possible backup paths
  • Restore all the links
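
One possible way to check traffic symmetry is to compare the path seen in each direction. The sketch below assumes that the forward and return paths have already been collected (for example with a traceroute run at both ends) and reduced to node labels such as router or site names. Raw hop IP addresses usually differ per direction and would have to be mapped to such labels first. The function name and the labels are hypothetical.

```python
from typing import Sequence


def is_symmetric(forward: Sequence[str], reverse: Sequence[str]) -> bool:
    """True when the return path visits the same nodes in the opposite order."""
    return list(forward) == list(reversed(reverse))


if __name__ == "__main__":
    # Hypothetical node labels for a Tier1 -> Tier0 path and its return path
    forward_path = ["tier1-router", "backup-provider-a", "tier0-router"]
    return_path = ["tier0-router", "backup-provider-a", "tier1-router"]
    print(is_symmetric(forward_path, return_path))  # True
```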

3 - Tier1-Tier1 Backup connectivity via Tier0

  • Do not stop normal data transfers during the maintenance
  • Shut down the direct link between the two Tier1s
  • Verify the connectivity among all the LHCOPN prefixes of the two Tier1s
  • In case of multiple backup paths, verify symmetry
  • Restore normal connectivity
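
Checking connectivity "among all the LHCOPN prefixes" of two Tier1s means covering every source/destination combination in both directions. A minimal sketch of how the list of combinations could be built is shown here. The function name is made up and the prefixes are documentation ranges, not real LHCOPN prefixes.

```python
from ipaddress import ip_network
from itertools import product


def prefix_pairs(site_a, site_b):
    """Return every (source, destination) prefix pair, in both directions."""
    a = [ip_network(p) for p in site_a]
    b = [ip_network(p) for p in site_b]
    return list(product(a, b)) + list(product(b, a))


if __name__ == "__main__":
    # Hypothetical prefixes of the two Tier1s
    tier1_a = ["192.0.2.0/25", "192.0.2.128/25"]
    tier1_b = ["198.51.100.0/24"]
    for src, dst in prefix_pairs(tier1_a, tier1_b):
        print(f"test {src} -> {dst}")
```

With m prefixes at one site and n at the other, this yields 2 × m × n combinations to verify.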

4 - Backup connectivity in case of failure of one CERN router

  • Do not stop normal data transfers during the maintenance
  • Power off one of the two CERN LHCOPN routers
  • Verify that all the Tier1 prefixes are still reachable from the CERN ones.
  • Verify the impact on the running data transfers
  • Power on the router again and repeat with the other router.
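
The page does not define how the impact on the running data transfers is measured. One simple option, shown here purely as an illustration, is to compare the mean throughput sampled before the failure with the mean throughput sampled while the link or router is down. The function name and the sample values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def throughput_drop(baseline_gbps: Sequence[float], during_gbps: Sequence[float]) -> float:
    """Relative drop (0.0 to 1.0) of mean throughput compared to the baseline."""
    base = mean(baseline_gbps)
    if base <= 0:
        raise ValueError("baseline throughput must be positive")
    # Clamp at 0.0 so that a higher rate during the test is not reported as negative
    return max(0.0, 1.0 - mean(during_gbps) / base)


if __name__ == "__main__":
    # Hypothetical samples in Gbps, before and during the simulated failure
    before = [8.0, 8.0]
    during = [6.0, 6.0]
    print(f"throughput drop: {throughput_drop(before, during):.0%}")  # throughput drop: 25%
```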

5 - Additional tests

  • Verify that the monitoring tools (MDM, E2Emon, local monitoring) show the faults correctly and send correct notifications of the ongoing issues

Test reports

Backup Link test status

Site            Date of last backup test report    Report within the last year?
CA-TRIUMF       2012-??-??                         OK
CH-CERN         2012-11-22                         OK no report
DE-KIT          2012-03-13                         OK
ES-PIC          2012-10-23                         OK
FR-CCIN2P3      2010-03-08                         KO
IT-INFN-CNAF    2008-04-09                         KO
NDGF            2008-04-09                         KO
NL-T1           2009-02-10                         KO
TW-ASGC         2010-12-28                         KO
UK-T1-RAL       2010-08-24                         KO
US-FNAL-CMS     2008-04-24                         KO
US-T1-BNL       2008-03-27                         KO
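
The OK/KO column follows from the rule that backup paths should be tested at least once a year. A minimal sketch of that rule follows. It assumes "one year" means 365 days, and the reference date used in the example is hypothetical, since the page does not state when the table was evaluated. Entries with a partially known date (such as 2012-??-??) would need to be handled manually.

```python
from datetime import date, timedelta


def report_status(last_report: date, today: date) -> str:
    """Return "OK" if the last report is at most 365 days old, otherwise "KO"."""
    return "OK" if today - last_report <= timedelta(days=365) else "KO"


if __name__ == "__main__":
    reference = date(2012, 12, 1)  # hypothetical evaluation date
    print(report_status(date(2012, 3, 13), reference))  # OK
    print(report_status(date(2010, 3, 8), reference))   # KO
```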

Topic revision: r16 - 2014-01-17 - BrunoHoeft
 