Storage intervention severely affecting OVM infrastructure


A clean-up action on the storage affected several production volumes used by the production OVM pool.


  • All DBoD instances and application servers running on that OVM pool were affected.

Timeline of the incident

  • Fri Feb 22 09:34:04 CET: apps_ovm_g3 set offline
  • Fri Feb 22 09:36:05 CET: apps_ovm2gen3a, apps_ovm2gen3b set offline
  • ~ Fri Feb 22 09:42:00 CET: volumes set back online
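
The offline window per volume follows directly from the timestamps above. The short Python sketch below only does that arithmetic. The year 2013 is taken from the topic revision date, the back-online time is the approximate value from the timeline, and the function name is illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from the timeline (all CET, so naive datetimes are enough).
offlined = {
    "apps_ovm_g3": datetime.strptime("2013-02-22 09:34:04", FMT),
    "apps_ovm2gen3a": datetime.strptime("2013-02-22 09:36:05", FMT),
    "apps_ovm2gen3b": datetime.strptime("2013-02-22 09:36:05", FMT),
}
# Approximate time at which the volumes were set back online.
back_online = datetime.strptime("2013-02-22 09:42:00", FMT)


def offline_seconds(volume: str) -> int:
    """Return how long the given volume was offline, in whole seconds."""
    return int((back_online - offlined[volume]).total_seconds())


for name in offlined:
    secs = offline_seconds(name)
    print(f"{name}: ~{secs // 60} min {secs % 60} s offline")
```

The volumes themselves were offline for roughly six to eight minutes, which is shorter than the downtime reported for the affected services.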


For safety reasons, a clean-up operation on the storage consists of two steps:
  • offlining the volumes: NFS access is blocked.
  • destroying the volumes a few days later.

Destruction is always done a few days later, and only if no unexpected impact has been detected. Unfortunately, due to a misunderstanding, the admin in charge selected the wrong volumes for the clean-up.
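
The two-step procedure can be pictured as a simple guard, sketched here in Python. This is not the actual storage tooling: the `Volume` class, the function names and the three-day grace period are all assumptions made for illustration (the text above only says "a few days").

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed value; the procedure only specifies "a few days".
GRACE_PERIOD = timedelta(days=3)


@dataclass
class Volume:
    """Hypothetical record for a volume scheduled for clean-up."""
    name: str
    offlined_at: Optional[datetime] = None
    impact_reported: bool = False


def offline(volume: Volume, now: datetime) -> None:
    """Step 1: take the volume offline (NFS access blocked) and note when."""
    volume.offlined_at = now


def can_destroy(volume: Volume, now: datetime) -> bool:
    """Step 2 guard: allow destruction only after the grace period,
    and only if no unexpected impact was reported in the meantime."""
    if volume.offlined_at is None:
        return False
    if volume.impact_reported:
        return False
    return now - volume.offlined_at >= GRACE_PERIOD
```

In this incident the offline step did its job as a safety net: the impact showed up within minutes and the volumes could be set back online because nothing had been destroyed yet.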

Follow up

Two incidents were opened to follow up on this issue.

Downtime for the affected services varied from 30 to 60 minutes.

Topic revision: r2 - 2013-02-22 - RubenGaspar