Ongoing work to get the latest MoU pledges into a DB for use by the other monitoring applications. Updating with the latest numbers post RRB.
15 October 2008
Nothing
17 September 2008
Regular (intermittent) failures of messaging brokers during the night. Investigating. Hampered by a bug which makes recovery take a long time. The effect is missing OSG tests (due to the 2 hour Gridview timeout), which will need resummarisation.
CMS will set up an elog service on their vobox. We will provide expertise in the quattor and elog configuration for them.
SAM running ok (last outage 9th Sept)
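A minimal sketch of how the results lost to the 2 hour Gridview timeout mentioned in this entry could be picked out for resummarisation. The record layout (`site`, `executed`, `received`) and the sample data are hypothetical and are not taken from the Gridview or SAM schema; only the 2 hour limit comes from the note above.

```python
from datetime import datetime, timedelta

# The 2 hour limit is from the note above; everything else is illustrative.
TIMEOUT = timedelta(hours=2)


def late_results(results, timeout=TIMEOUT):
    """Return the records that arrived more than `timeout` after the test ran."""
    return [r for r in results if r["received"] - r["executed"] > timeout]


# Hypothetical sample records, not real test data.
sample = [
    {"site": "SITE_A",
     "executed": datetime(2008, 9, 16, 23, 0),
     "received": datetime(2008, 9, 16, 23, 10)},
    {"site": "SITE_B",
     "executed": datetime(2008, 9, 16, 23, 0),
     "received": datetime(2008, 9, 17, 2, 30)},
]

# Sites whose results would have been dropped and need resummarising.
sites_to_resummarise = [r["site"] for r in late_results(sample)]
print(sites_to_resummarise)
```

A result delayed by exactly 2 hours is treated as on time here; whether the real timeout is inclusive is an assumption.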
9 July 2008
Tue 2nd July - CERN network problem prevented the SAM BDII from reading site BDIIs (1 hour)
Wed 3rd July - All services in an 'ERROR' status due to host-cert tests failing - package missing on SAM UI
Some results corrected for June in Gridview (22-23, 6th) for outages. New general procedure being put in place to 'mask' results
25 June 2008
SAM - CERN hit the limit of 100 nodes over the weekend, which stopped the gstat tests working.
DB issues in GV to be covered in meeting with DB Devs.
Deployed new messaging-based gridftp producers on all CERN disk servers. Testing the message-based L&B reporting system. It will be sent to certification in the next few days. When it is deployed outside of CERN we'll turn off R-GMA and WS based publication at the same time.
monb001 - R-GMA box to be shut off.
30 Apr 2008
SAM - DB intervention yesterday (Tue) to fix some tables in the GV schema which had a problem last time. All went ok - SAM turned off for 2 test submission cycles (2 hours)
elog - moved from VM to a 'real' machine for duration of CCRC'08 phase 2