Message forwarded by Maria Girone:

PVSS2COOL incident 27-6

Incident report and affected services:

On Sunday afternoon (27-6) Viatcheslav Khomutnikov (Slava) from Atlas reported to the Physics DB service that the online reconstruction had stopped because of an error returned by the PVSS2COOL application (on the Atlas offline DB). The error first appeared on Saturday evening (26-6).

Issue analysis and actions taken:

The error stack reported by Atlas indicated that the error was generated by a 'drop table' operation being blocked by the custom trigger that Atlas had set up to prevent 'unwanted' segment drops. The trigger has been operational for several months. Physics DB services fed this information back to Atlas on Sunday evening.

On Monday morning Atlas reported that the issue was still blocking. On further investigation they could not determine which table the application (PVSS2COOL) was trying to drop, and hence what was causing the blocking error, because the error appeared in a block of code responsible for inserting data.

Physics DB service, in collaboration with Atlas DBAs, then ran 'logmining' on the failed drop operation and found that the application was indeed trying to drop some segments in the recycle bin of the schema owner (ATLAS_COOLOFL_DCS). Further investigation with SQL trace by the DBAs showed that Oracle attempted to drop objects in the recycle bin when PVSS2COOL tried to bulk insert data. This operation was blocked by the custom Atlas trigger that blocks drops in production, hence the error message originally reported. Metalink note "265253.1" further clarified that the issue was a side effect of an expected behaviour of Oracle's space reclamation process.
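The chain of events described above can be illustrated with a small toy model. This is only a sketch: it is not Oracle code, and the class `Tablespace`, the function `block_drop_trigger`, the sizes, and the segment name are all hypothetical names and values chosen for illustration. It shows how an insert can fail with a 'drop blocked' error even though the application never issues a drop itself.

```python
class DropBlocked(Exception):
    """Raised by the simulated trigger when a segment drop is attempted."""


def block_drop_trigger(segment_name):
    # Stand-in for the custom trigger: it rejects every drop, whatever its origin.
    raise DropBlocked(f"drop of {segment_name} blocked")


class Tablespace:
    """Toy model of free space plus a recycle bin (assumed behaviour, simplified)."""

    def __init__(self, free_mb, recycle_bin):
        self.free_mb = free_mb
        # Maps the name of a dropped segment to its size in MB.
        self.recycle_bin = dict(recycle_bin)

    def _purge_one(self):
        name, size = next(iter(self.recycle_bin.items()))
        block_drop_trigger(name)  # the purge is itself a drop, so the trigger fires
        del self.recycle_bin[name]
        self.free_mb += size

    def bulk_insert(self, needed_mb):
        # Space reclamation: if free space is short, try to purge recycle bin segments.
        while self.free_mb < needed_mb and self.recycle_bin:
            self._purge_one()
        if self.free_mb < needed_mb:
            raise RuntimeError("out of space")
        self.free_mb -= needed_mb


# Hypothetical numbers: 10 MB free, 20 MB needed, one 50 MB segment in the bin.
ts = Tablespace(free_mb=10, recycle_bin={"BIN$example==$0": 50})
try:
    ts.bulk_insert(20)
except DropBlocked as exc:
    print("insert failed:", exc)
```

In this model the reclamation path is only entered when free space is insufficient for the insert; with enough free space the recycle bin is never touched and the trigger never fires.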

Issue resolution and expected follow-up:

On the evening of 29-6 Physics DB support, in collaboration with Atlas DBAs, extended the datafile of the PVSS2COOL application to circumvent this space reclamation issue. Atlas has reported that this fixed the problem. Further discussions on the role of the recycle bin and on possible improvements to the Atlas 'block drop trigger' are currently in progress to avoid further occurrences of this issue.
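As an illustration only, one conceivable direction for a trigger improvement is to exempt recycle bin objects from the block. This report does not say which improvements are being discussed, so the sketch below is an assumption, not the agreed solution. It relies on the fact that Oracle gives recycle bin objects system-generated names starting with `BIN$`; the function names and the example table name are hypothetical.

```python
def is_recycle_bin_object(name: str) -> bool:
    # Oracle recycle bin objects carry system-generated names starting with "BIN$".
    return name.upper().startswith("BIN$")


def should_block_drop(name: str) -> bool:
    # Hypothetical policy: keep blocking drops of regular segments,
    # but let purges of recycle bin objects through.
    return not is_recycle_bin_object(name)


print(should_block_drop("SOME_PRODUCTION_TABLE"))  # regular segment
print(should_block_drop("BIN$example==$0"))        # recycle bin object
```

Whether such an exemption is acceptable depends on how the recycle bin is meant to be used in production, which is part of the discussion mentioned above.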

-- OlofBarring - 30 Jun 2009

Topic revision: r1 - 2009-06-30 - OlofBarring
 