WLCG-OSG-EGEE Ops' Minutes Mon 18 Aug 2008
Summary
Quiet week, with a little discussion of the relative merits of upgrading the
FTS service to SL4.
CMS introduced their Summer08 production.
Attendance
No attendance list was made, so the list below has many omissions.
EGEE
- Asia Pacific ROC:
- Central Europe ROC:
- OCC / CERN ROC: Nick Thackray, Steve Traylen, Dianna
- French ROC: Helen
- German/Swiss ROC:
- Italian ROC:
- Northern Europe ROC: Gert Svensson, Vera Hansper
- Russian ROC:
- South East Europe ROC:
- South West Europe ROC:
- UK/Ireland ROC: Jeremy Coles
- GGUS:
WLCG
- WLCG Service Coordination: Harry Renshall, Jamie Shiers
WLCG Tier 1 Sites
- ASGC:
- BNL:
- CERN site:
- FNAL:
- FZK:
- IN2P3:
- INFN:
- NDGF:
- PIC:
- RAL: Derek Ross
- SARA/NIKHEF:
- TRIUMF:
LHC Experiments
- ATLAS:
- LHCb:
- CMS: Daneille
- ALICE:
Feedback on Last Week's Minutes
None was given.
EGEE Items
Grid Operator on Duty Handover
| | Primary Team | Secondary Team |
|---|---|---|
| From | France | Central Europe |
| To | CERN | Germany/Switzerland |
- A problem with Gstat caused many false alarms related to BDII tests in every ROC. The failed BDII tests were caused by a transient ASGC network outage lasting 20 minutes, from 06:05 to 06:25 on 11-Aug-2008.
PPS Reports
None given.
EGEE Items From ROC Reports
None.
gLite Release News
gLite3.1 Update 28. The release contains:
- glite-CONDOR_utils for lcg-CE (PATCH:1856)
- New version of gsoap plugin with a vulnerability fix (affecting LB, WMS, UI, WN, VOBOX, CE) (PATCH:1846)
- Several bug fixes on WMS and clients (PATCH:1780)
- New Short Lived Credential Service (SLCS), which issues a short-lived personal certificate based on a Shibboleth AAI identity (PATCH:1693)
- MyProxy version 1.6.1-7 (fixes build issue related to globus flavour, already deployed in production) (PATCH:1978)
- Various improvements on lcg-extra-jobmanagers (CE) (PATCH:1942)
- GFAL and lcg_util update with new function gfal_removedir and several bug fixes
- FTS SL4 release (32 and 64 bit). This version has a critical bug and should not be installed. The RPMs have been removed from the repository.
gLite3.1 Update 29 in preparation. The release contains:
WLCG Items
FTM Endpoints.
The list of FTM end-points we have so far is:
NDGF commented that they will follow up with the missing entry.
Upcoming WLCG Service Interventions
- CNAF [OUTAGE]
- CASTOR upgrade. From Tuesday, 19 August, 09:00 UTC+2 to Wednesday, 20 August, 20:00 UTC+2. Affected nodes:
- castorgrid.cr.cnaf.infn.it
- srm-v2.cr.cnaf.infn.it
- srm-v2-cms.cr.cnaf.infn.it
- castorsrm.cr.cnaf.infn.it
- DESY [at risk]
- One poolnode will move its location. Some files in dq2 and user directories will not be available. From: Tuesday, 19 August, 10:00 UTC+2 to Thursday 21 August 21:00 UTC+2. Affected nodes:
- CSCS [OUTAGE]
- Replacement of a faulty DIMM on storage pool node. From: Tuesday 19 August, 11:30 UTC+2; To: Tuesday 19 August, 13:30 UTC+2. Affected nodes:
- storage01.lcg.cscs.ch
- ce01.lcg.cscs.ch
- GRIF [OUTAGE]
- Electrical maintenance. From: Thursday 21 August 23:11 UTC+2; To: Wednesday 27 August 21:117:30 UTC+2. Affected nodes:
- apcse01.in2p3.fr
- apcce01.in2p3.fr
FTS on SL4 at T1/2s
Is it needed by the experiments? Only CMS was present, and there was no comment. Danielle: please take great care with
any upgrade at this time.
Availability Metrics
Correction of availability metrics due to incorrect setting of LFC write test to Critical
The Gridview team are recalculating the metrics and the correct data should be available within 1-2 days.
CMS Service
- CRUZET-4
- CRUZET-4 is off to a slow start at the moment (today is day 1). HCAL and DT are in; Tracker may join in the afternoon. DAQ is currently addressing some issues. From the computing standpoint, regular data operations shifts are in place and operational, focusing mostly on T0 workflows. We are using the CRUZET-4 exercise to implement the recently defined general computing shift design, which is meant to complement and integrate the DataOps approach and extend it to monitor the overall infrastructure, interfacing with Grid Ops and the distributed facilities.
- Summer08 production
- More details will follow from the DataOps team. The most urgent information needed by the T1 sites was already provided to them at the end of last week (they need it to prepare tape families on their MSS systems). Current storage needs are estimated as follows:
- ASGC: 27.0 TB (RAW) + 13.5 TB (RECO) = 40.5 TB
- CNAF: 26.5 TB (RAW) + 13.25 TB (RECO) = 39.75 TB
- FNAL: 64.6 TB (RAW) + 32.3 TB (RECO) = 96.9 TB
- FZK: 58.8 TB (RAW) + 29.4 TB (RECO) = 88.2 TB
- IN2P3: 22.0 TB (RAW) + 11.0 TB (RECO) = 33.0 TB
- PIC: 8.4 TB (RAW) + 4.2 TB (RECO) = 12.6 TB
- RAL: 23.9 TB (RAW) + 11.95 TB (RECO) = 35.85 TB
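
As a cross-check of the per-site figures above, the sums can be recomputed with a short script. This is only a sketch: the dictionary name `NEEDS_TB` and the helper `site_totals` are illustrative and not part of any CMS tool, and the overall total it produces is derived from the list rather than quoted from DataOps.

```python
from decimal import Decimal

# (RAW, RECO) storage estimates in TB, copied from the list above.
# Kept as strings so Decimal arithmetic stays exact.
NEEDS_TB = {
    "ASGC": ("27.0", "13.5"),
    "CNAF": ("26.5", "13.25"),
    "FNAL": ("64.6", "32.3"),
    "FZK": ("58.8", "29.4"),
    "IN2P3": ("22.0", "11.0"),
    "PIC": ("8.4", "4.2"),
    "RAL": ("23.9", "11.95"),
}


def site_totals(needs):
    """Return RAW + RECO per site as exact Decimal values (TB)."""
    return {
        site: Decimal(raw) + Decimal(reco)
        for site, (raw, reco) in needs.items()
    }


totals = site_totals(NEEDS_TB)
grand_total = sum(totals.values(), Decimal("0"))

if __name__ == "__main__":
    for site, total in totals.items():
        print(f"{site}: {total} TB")
    print(f"All listed T1 sites: {grand_total} TB")
```

The per-site results agree with the totals listed above; summed over the seven sites this comes to 346.8 TB.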
Action Items
Newly Created Action Items
Review of Open Action Items
- 238
- Maite has some news.
- 239
- No slow response has been seen; close for now and reopen if it recurs.
- 241
- Should be closed.
- 243
- Check MB minutes.
Open Action Items
Id | Submitter | Description | Creation | Due | Assigned To | |
---|
Actions Closed in Last 20 Days
Id | Submitter | Description | Creation | Due | Assigned To | Closed | |
---|
Next Meeting
The next meeting will be Monday, dd mmm 2007 15:00 UTC (16:00 Swiss local time).
- Attendees can join from 14:45 UTC (15:45 Swiss local time) onwards.
- The meeting will start promptly at 15:00 UTC (16:00 Swiss local time).
- The WLCG section will start at the fixed time of 15:30 UTC (16:30 Swiss local time).
- To dial in to the conference:
- Dial +41227676000
- Enter access code 0157610