Sites supporting CMS are requested to deploy CREAM CEs in production, or to enable CMS on their existing CREAM CEs if it is not yet enabled. This supports the CMS CREAM pilot, which has decided to focus its activity on a new instance of WMS-ICE deployed at CNAF pointing to production CREAM CEs.
T1s and T2s are invited to request wider testing of glexec and SCAS in the production environment, to be done through the pilot installation; contact egee-pps-pilot-scas@cernNOSPAMPLEASE.ch
VO-related information published by YAIM for CEs may be incorrect; this affects new installations (new CEs) only. Read the release notes for update 45 to gLite 3.1.
Sites without 64-bit capability have to stay with gLite 3.1 for now. There are no plans at the moment to deprecate gLite 3.1; this depends on how long it needs to be supported before all hardware at the sites moves to 64 bits.
Attendance
EGEE
Asia Pacific ROC: Jason Shih
Central Europe ROC: Malgorzata Krakowian
OCC / CERN ROC: John Shade, Antonio Retico, Nick Thackray, Steve Traylen, Diana Bosio, Maite Barroso
French ROC: Pierre Girard
German/Swiss ROC: Angela Poschlad, Wen Mei
Italian ROC:
Northern Europe ROC: Vera Hansper, Ron Trompet, Gert Svensson
On 30 April, a restricted checkpoint meeting (CMS and supporting sites) decided to focus activity on a new instance of WMS-ICE deployed at CNAF pointing to production CREAM CEs. Sites supporting CMS in production have only ¼ of the CREAM CEs in production. Sites supporting CMS are therefore requested to deploy CREAM CEs in production, or to enable CMS on their existing ones if it is not yet enabled.
Angela reported that when FZK published their CREAM CE on the production BDII, they had issues with CMS because it was interfering with their production, so CMS VO Support suggested that they stop publishing the CMS queues from CREAM. However, adapting the CMS production framework (including CMS monitoring) to cope with and exploit the agreed layout of the pilot is definitely on the pilot task force's agenda. In addition, as one can read in the minutes of the checkpoint meeting (https://twiki.cern.ch/twiki/bin/view/LCG/PPIslandFollowUp2009x04x30 ), we ran a test together with CMS on the production system and already saw ~10 queues for CMS currently available in production, without CMS reporting any problems with those sites. Therefore we confirm the request to the sites supporting CMS to enable CMS queues on their CREAM CEs.
glexec/SCAS: a mail from John Gordon was circulated to the GDB last week, inviting T1s and T2s to request wider testing of glexec and SCAS in the production environment, to be done through the pilot installation (no official release yet); contact egee-pps-pilot-scas@cernNOSPAMPLEASE.ch. Patch 2973 fixes an incompatibility with CREAM; sites are strongly advised to start with this version.
All details are in the wiki linked to the agenda.
gLite Release News
Two versions of YAIM core were released at the same time, as it was complicated to do otherwise. The latest version of YAIM core was required by GFAL.
Known issue, reported on the main release page: VO-related information published by YAIM for CEs may be incorrect, for new installations (new CEs) only; read the release notes.
Installed capacity? Yes, the changes are included in YAIM core 4.0.7, which is now released. You can look at the April EGEE league table to check the KSI2K values and find out which sites are currently publishing correctly according to the new guidelines.
Coming releases: DPM and VOMS. These will affect nearly all grid nodes because of changes in the VOMS clients. The DPM client has a dependency on a new version of the VOMS API. The VOMS API is used everywhere, in all node types, so all node types will be affected by this release. It is up to sites whether to upgrade everywhere or not; it is not high priority for the rest of the node types. A preview will be made available tomorrow, with production at the beginning of next week.
EGEE Items From ROC Reports
France ROC: IN2P3-CC has been down since Sunday 3 May 19:00 due to an air cooling failure. Most of the grid services were restarted this morning (4 May).
An unscheduled downtime is still active until tomorrow afternoon for CEs and SEs.
SEE ROC: Are there any developments, plans or ideas from the development team towards a high-availability mechanism for the LFC service?
From the developers: LFC can be deployed in an HA setup, as it does not hold internal state apart from the database. One can deploy multiple front-ends pointing to the same database back-end, in a load-balanced or fail-over configuration.
Of course in this case the database is a single point of failure, which one can mitigate by deploying on Oracle RAC or having a multi-tier LFC setup: https://twiki.cern.ch/twiki/bin/view/LCG/LfcConceptDeploymentUsage
In this case there is one master LFC service, which updates read-only replicas via Oracle streams database level replication.
In theory one can also think about MySQL based replication (tested for VOMS, but not for LFC) and also about multi-master Postgres DB, depending on the actual requirements coming from the sites.
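The fail-over configuration mentioned by the developers can be illustrated with a minimal client-side sketch. It assumes two front-ends sharing one database back-end; the host names, the port number and the helper names `tcp_probe` and `pick_frontend` are hypothetical and are not part of any LFC tool or API.

```python
import socket

# Hypothetical front-end hosts and port, for illustration only.
FRONTENDS = [("lfc1.example.org", 5010), ("lfc2.example.org", 5010)]


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_frontend(frontends, probe=tcp_probe):
    """Return the first front-end that answers, in list order (fail-over)."""
    for host, port in frontends:
        if probe(host, port):
            return host, port
    raise RuntimeError("no front-end reachable")
```

Because the front-ends hold no state of their own, a client may use whichever one answers. The database behind them remains the single point of failure, as noted above.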
gLite 3.2: SL5, 64 bits. No plans for 32 bits.
Pierre Girard, France: 32-bit compatibility libraries are available for the WNs; is 32-bit Python needed with SL5?
Mario, SWE: what about sites without 64-bit capability? They have to stay with gLite 3.1 for now. There are no plans at the moment to deprecate gLite 3.1; this depends on how long it needs to be supported before all hardware at the sites moves to 64 bits.
SEE ROC: What is the current status of the top-BDIIs? Tests we made within the HellasGrid infrastructure showed us that many of the problems in the current version of the top-BDII are solved in top-BDII 5.0, which is being certified.
A pilot is being defined for this new BDII version to make sure that all the interactions with this version are fine. Expect it to take weeks before it is released to production.
WLCG Items
WLCG issues coming from ROC reports
Item 1
Upcoming WLCG Service Interventions
Grid Service Interventions: there are many these days; check the agenda or GOCDB.
Helene: the query from the CIC portal to extract alarms for CEs takes a very long time, as reported in previous meetings. John: it is being solved in SAM, where the query is being corrected; there is an additional problem with the responsiveness of the DB.
Next Meeting
The next meeting will be Monday, 11 May 2009 14:00 UTC (16:00 Swiss local time).
Attendees can join from 13:45 UTC (15:45 Swiss local time) onwards.
The meeting will start promptly at 14:00 UTC (16:00 Swiss local time).