--
HarryRenshall - 15-Jun-2010
WLCG Service Open Issues
- This is a list of items raised at the daily WLCG Operations meeting that need to be followed up in some way.
Q3 2010
Subject |
Site(s) |
Date raised |
Status |
Action on |
Report |
Client BDII timeouts |
Many |
14 June 2010 |
Open |
SCOD |
Laurence Field has looked into the reported ATLAS BDII client timeouts and concluded that they were mostly individual glitches. The RAL incident happened because the CERN BDII exceeded 5 MB after some software tags were added, and RAL had not yet applied the increase to 10 MB that was in the last release of the BDII configuration. Laurence and Maarten have suggested that the GlueLocation object, which takes 1.5 MB, may not be necessary because its information is duplicated in the SoftwareRunTimeEnvironment. Would experiments please check whether they use this object and let us know (wlcg-scod@cern.ch)? Removal of this object could then be scheduled. |
25 June 2010: Laurence has released a useful BDII troubleshooting guide.
See https://tomtools.cern.ch/confluence/display/IS/DM_Troubleshooting
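For experiments or sites that want a rough check of the sizes discussed above, the following is a minimal sketch, not an official tool. It assumes the BDII content has already been dumped to LDIF text by some other means, with entries separated by blank lines. The function names `size_by_objectclass` and `exceeds_limit` are hypothetical, and the sample entries are made up for illustration.

```python
from collections import defaultdict

MB = 1024 * 1024


def size_by_objectclass(ldif_text):
    """Return {objectClass: bytes} for an LDIF dump.

    An entry with several objectClass values is counted once under each
    of them, so the per-class totals may add up to more than the dump size.
    """
    sizes = defaultdict(int)
    for entry in ldif_text.split("\n\n"):
        entry = entry.strip()
        if not entry:
            continue  # skip empty chunks caused by extra blank lines
        entry_bytes = len(entry.encode("utf-8"))
        classes = {
            line.split(":", 1)[1].strip()
            for line in entry.splitlines()
            if line.lower().startswith("objectclass:")
        }
        for cls in classes:
            sizes[cls] += entry_bytes
    return dict(sizes)


def exceeds_limit(ldif_text, limit_mb):
    """True if the whole dump is larger than limit_mb megabytes."""
    return len(ldif_text.encode("utf-8")) > limit_mb * MB


# Made-up sample data, for illustration only.
SAMPLE = """dn: GlueLocationLocalID=example-tag,GlueSubClusterUniqueID=sc1,o=grid
objectClass: GlueLocation
GlueLocationName: example-tag

dn: GlueSubClusterUniqueID=sc1,o=grid
objectClass: GlueSubCluster
GlueHostApplicationSoftwareRunTimeEnvironment: example-tag
"""

if __name__ == "__main__":
    for cls, size in sorted(size_by_objectclass(SAMPLE).items()):
        print(f"{cls}: {size} bytes")
    print("over 5 MB:", exceeds_limit(SAMPLE, 5))
```

Comparing the `GlueLocation` total with the overall size gives an idea of how much headroom removing that object would free under a given limit.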
SAM to Nagios switchover |
All |
15 June 2010 |
Open |
SCOD |
The SAM to Nagios switchover planned for 15 June has been postponed for further Management Board (MB) consideration. Planning information needs to be circulated across the WLCG. A secondary question raised by RAL is whether access to the results database will change. Harry to follow up. |
From mail of 21 June: Due to the migration of EGI from SAM to Nagios, and the agreement of
the WLCG Management Board to use Nagios test results for site
availability and reliability, we will stop OPS job submission by the
old SAM framework on Wednesday 23rd June 2010.
Please note the following details:
1) VO testing of sites using SAM will not be affected; these tests
will continue for the foreseeable future. Migration to Nagios will be
planned individually with each VO using SAM.
2) The existing SAM Programmatic Interface will continue to be used to
provide Nagios results. There is no change in the output format; the
only change is in the list of test names that are considered critical
(these will now be Nagios names, not SAM test names).
We believe there should be no changes for users of the SAM user
interfaces, Gridview or the programmatic interface. If you notice any degradation
in tools using the SAM interface, please submit a GGUS bug to the SAM
support unit.
The SAM Team.
Stoppage of LCG-CE |
CERN |
7 June 2010 |
Open |
Harry |
PES propose to stop the LCG-CE at the end of 2010. As raised at the IT-ATLAS meeting, the ability of ATLAS to switch to the CREAM-CE needs to be checked with the experiment. |