GD Group Report for C5-09-May-2008
==================================

LCG deployment:
---------------
- Total number of Sites (1): 282
- Status -> Num. Sites (1):
    ok       -> 179
    degraded -> 22
    down     -> 79
    na       -> 2
- Software -> Num. Sites (2):
    gLite-3_1_0 -> 191
    gLite-3_0_2 -> 44
    gLite-3_0_0 -> 1
    LCG-2_7_0   -> 3
- Average of concurrently running jobs during this week (3): ~32k

(1) Sites that are Certified, in Production and that have been monitored by SAM
    during the last week under OPS credentials.
    SAM is available at: https://lcg-sam.cern.ch:8443/sam/sam.py
    To see this page one needs a grid certificate loaded in the browser.
    The calculation of the Site availability (Status) is described at:
    https://cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf
(2) Software version is coming from the 'CE-sft-softver' CE test.
    Sites not supporting SAM 'CE' service, or not having sent results for this
    particular test during the last week, are not counted.
(3) Job statistics taken from GStat: http://goc.grid.sinica.edu.tw/gstat/

EGEE Pre-Production Service Coordination:
-----------------------------------------
* Release of gLite 3.1 Update22 in preparation.
  The update, to be released today, will contain:
  - lcg-CE
    . SGE Engine enabled on lcg-CE
    . fix for DENY tags to lcg-info-dynamic-scheduler
  - dcache
    . Dcache 1.8.0.12.p6 (First dcache 1.8 release)
  - MPI_utils
    . Rebuild MPI_utils mpich RPM with Fortran wrappers
  - gLite-PX
    . first version of the dynamic service publisher, replacing the previous
      static configuration
  - 64 bit WNs + recent updates to GFAL (already deployed for 32 bit)
  - VOMS core (affecting clients)
    . new VOMS core 1.8.3-4 (affecting VOMS servers and clients on
      UI WN VOBOX CE SE_dpm LFC WMS LB)
    . Many bug fixes. Fully backward compatible.
    . fix to trustmanager install script
  - client tools
    . lcg-infosites: new option to query for the wms and the lb associated
      to a certain VO. The -f option to filter based on the site name is
      also available.
    . bug fixes for edg-gridftp-client
* Pilot of WMS3.1 at CNAF and CERN-PROD in progress with Atlas and CMS
* Pilot of AMGA at CERN_PPS in progress with LHCb

CERN GRID Pre-Production Site (CERN_PPS):
-----------------------------------------
* Set-up of AMGA service for LHCb in progress

EGEE CERN Regional Operations Centre (ROC):
-------------------------------------------
* Request to join received from the site 'Uniandes'
  "Universidad de los Andes" Bogota - Colombia
  Certification in progress

SAM Unavailabilities:
---------------------
From:                 29-04-2008 (Tue) 09:00
To:                   29-04-2008 (Tue) 11:00
Impact:               Two sets of tests skipped for all sites
Problematic Service:  SAM/GridView Database
Symptoms:             Test submission & publishing stopped for 2 hours during
                      scheduled DB intervention
Reason:               DB downtime at 10:00-10:20 plus checks
Solution:             Framework restarted normally

Grid User Support:
------------------
A User Support Advisory Group (USAG) meeting took place on May 6th with VO and ROC participation.
It clarified the plan for direct routing of GGUS tickets submitted by LHC experiment experts on shift and the Grid Sites.
Notes in: http://indico.cern.ch/getFile.py/access?resId=0&materialId=minutes&confId=33010

Grid Operational Security:
--------------------------
There will be three security talks at HEPiX on Friday between 11:15 and 12:45,
which all are welcome to attend.

gLite 3.x Integration & Build:
------------------------------
- Certification repository
  gLite 3.0 ---------------------------------------
  .. Presently
     .. 0 in preparation
     .. 0 in configuration
     .. 2 in certification
  gLite 3.1 ---------------------------------------
  .. Presently
     .. 3 in preparation
     .. 1 in configuration
     .. 16 in certification

- PPS repository
  gLite 3.0 ---------------------------------------
  .. No new release (latest release 3.0.2 PPS Update 48)
  .. Next set of patches scheduled for release to PPS:
     #1810 R3.0 lcg-vomscerts-5.0.0 adds next cert for vo.racf.bnl.gov
     #1811 R3.0 WMS lcg-vomscerts-5.0.0 adds next cert for vo.racf.bnl.gov
  gLite 3.1 ---------------------------------------
  .. 3.1 PPS Update 26
     #1800 New vdt_globus_jobmanager_common to fix globus-cass-cache..
  .. Next set of patches scheduled for release to PPS:
     #1782 VOMS Admin Server 2.0.14.1 & VOMS Admin Client 2.0.7.1 & VOMS Admin Interface 2.0.2.1
     #1787 VOMS server configuration update (multiple bug fixes)
     #1809 New JobManager version for SGE

- Production repository
  gLite 3.0 ---------------------------------------
  .. No new release (latest: 3.0.2 Update 42)
  .. Next set of patches scheduled for release to Production:
     #1810 R3.0 lcg-vomscerts-5.0.0 ...
     #1811 R3.0 WMS lcg-vomscerts-5.0.0 ...
  gLite 3.1 ---------------------------------------
  .. 3.1 Update 21
     #1800 New vdt_globus_jobmanager_common to fix globus-cass-cache..
  .. Next set of patches scheduled for release to production:
     #1219 fix for DENY tags to lcg-info-dynamic-scheduler
     #1278 Service Information Provider
     #1474 Patch to enable Sun Grid Engine support for the lcg-CE
     #1542 WN 3.1 for sl4 64bits
     #1645 R3.1/SLC4/x86_64: GFAL/lcg_util update
     #1663 lcg-infosites (patch 1646 revisited)
     #1680 R3.1/SLC4/x86_64: GFAL 1.10.8
     #1683 Dcache 1.8.0.12.p6 (First dcache 1.8 release)
     #1713 VOMS Core + logging Fix v2
     #1721 edg-gridftp-client-1.2.8 fixes bugs 33205, 27274
     #1723 Rebuild MPI_utils mpich RPM with Fortran wrappers
     #1729 APEL working with external log4j and BC
     #1759 R3.1/x86_64/SLC4: GFAL & lcg_util update
     #1788 Trustmanager fix for install script

gLite 3.x Testing & Certification:
----------------------------------
* Patches certified:
  #1782: VOMS Admin Server 2.0.14.1 & VOMS Admin Client 2.0.7.1 & VOMS Admin Interface 2.0.2.1
  #1787: VOMS server configuration update (multiple bug fixes)
  #1809: New JobManager version for SGE
* Testing of FTS, first release on SL4.
* Work continues on Glue 2.0 and gstat 2.
* Debugging of segfault issues on the lcg-CE; problem is understood and a fix is forthcoming.

ETICS:
------
Nothing reported.

The full GD report can be consulted on:
---------------------------------------
https://twiki.cern.ch/twiki/bin/view/LCG/GDC5Reports

---Zdenek