GD Group Report for C5-07-Dec-2007
==================================

LCG deployment
==============

- Total number of Sites (*): 245

- Software -> Num. Sites (*):
    gLite-3_1_0 -> 94
    gLite-3_0_2 -> 128
    gLite-3_0_0 -> 3
    LCG-2_7_0   -> 5
    unknown     -> 15

- Status -> Num. Sites (*):
    ok       -> 177
    degraded -> 11
    down     -> 58

- Average of concurrently running jobs during this week (+): ~14k

(*) Sites that are Certified _and_ Production _and_ Monitored by SAM:
    https://lcg-sam.cern.ch:8443/sam/sam.py
    To see this page one needs a grid certificate loaded in the browser.
    The calculation of the Site availability (Status) is described at:
    http://goc.grid.sinica.edu.tw/gocwiki/SAM_Metrics_calculation
    Software version is coming from the 'CE-sft-softver' CE test.
    Sites not supporting SAM 'CE' service, or not having sent results
    for this particular test during the last week, are counted as 'unknown'.

(+) Job statistics taken from GStat:
    http://goc.grid.sinica.edu.tw/gstat/
    http://goc.grid.sinica.edu.tw/gstat/total/GIISQuery_Usage_cpu_.html
    For the time being we do not report CPU numbers:
    1. Not all the reported CPUs are actually available for grid jobs.
    2. Sites with multiple CEs may have their CPUs double-counted.
    3. GStat includes sites that are not considered by the SFTs.

CERN Grid Operations managed by GD:
-----------------------------------

* Patch #1491 installed on one of the CMS gLite WMS nodes (wms102.cern.ch).
* Current status of the gLite WMS, gLite LB and LCG RB nodes can be found at
  https://twiki.cern.ch/twiki/bin/view/LCG/CurrentStatusWMSLBNodesCERN

CERN GRID Pre-Production Site (CERN_PPS):
-----------------------------------------

* Upgrade of the site to gLite 3.1 PPS-update10: In progress
    - AFS UI: done
    - DPM: done
    - LFC: In progress

PPS Coordination:
-----------------

* EGEE Pre-Production Service Coordination:
  After the pre-deployment test, the PPS sites are upgrading to
  gLite3.1.0-PPS-UPDATE10.
* A number of new services for SL4 (32 bit) were introduced by this update.
    - glite-AMGA_postgres
    - glite-LFC_mysql
    - glite-LFC_oracle
    - glite-PX
    - glite-SE_dpm_disk
    - glite-SE_dpm_mysql
    - glite-VOMS_mysql
    - glite-VOMS_oracle
* Release of gLite3.1 Update07 to production in preparation
  (to be announced today). The release contains:
    - glite-VOMS_mysql metapackage for gLite 3.1 and SL(C)4
    - glite-VOMS_oracle metapackage for gLite 3.1 and SL(C)4
    - Bug fixes for UI and WN
* gLite3.1.0-PPS-UPDATE11 was released to PPS and is currently in the
  pre-deployment testing phase.
  This new version of the middleware contains several minor patches,
  plus a new version of GFAL/lcg_util and the glite-VOBOX service
  for SL4 (32 bit).
  A new version of glite-yaim-core has also been released, which
  requires all meta-packages to be updated.
* Within the study for the re-organisation of PPS, a discussion was
  initiated with Diligent about the possibility of migrating a subset of
  the PPS sites/services currently supporting the Diligent production
  into a "pilot" service in the production grid.

WLCG Transfer Service:
----------------------

* Transfer rates ranged from 80 to 520 MB/s, with a daily average of
  around 220 MB/s.
* Involving all major T1 sites.
* Mostly CMS traffic, with less from ATLAS.
* 0 open tickets in total.
* Throughput plots: http://gridview.cern.ch/GRIDVIEW/

gLite 3.x Build & Integration:
------------------------------

* Certification repository

  gLite 3.0
  ---------------------------------------
  .. Presently
     .. 0 in preparation
     .. 0 in configuration
     .. 18 in certification

  gLite 3.1
  ---------------------------------------
  .. Presently
     .. 4 in preparation
     .. 7 in configuration
     .. 19 in certification

* PPS repository

  gLite 3.0
  ---------------------------------------
  ..
     Next set of patches scheduled for release to PPS
     #1245 tool to renew proxies with VOMS extension via myproxy server (3.0)
     #1457 New package glite-sd2cache for FTS and FTA
     #1530 R-GMA yaim module for gLite 3.0
     #1555 voms 1.7.24 + gSOAP 2.7
     #1369 SLC3/i386/R3.0 DPM/LFC 1.6.7-2
     #1517 glite-yaim-core 4.0.3 for the 3.0 repository

  gLite 3.1
  ---------------------------------------
  .. 3.1.0 PPS Update 10
     #1349 glite-LFC_mysql metapackage for gLite 3.1/SLC4
     #1350 glite-SE_dpm_disk metapackage for gLite 3.1/SLC4
     #1352 glite-SE_dpm_mysql metapackage for gLite 3.1/SLC4
     #1541 glite-LFC_oracle metapackage for gLite 3.1/SLC4
     #1420 glite-PX metapackage for gLite 3.1/SLC4
     #1472 glite-AMGA_postgres metapackage for gLite 3.1/SLC4
     #1501 glite-VOMS_oracle metapackage for gLite 3.1/SLC4
     #1540 glite-VOMS_mysql metapackage for gLite 3.1/SLC4

  .. 3.1.0 PPS Update 11 in preparation
     #1389 R3.1/SLC4/i386: GFAL and lcg_util update [7 - High]
     #1458 New package glite-sd2cache for FTM Node.
     #1512 3.1 VOBOX [5 - Normal]
     #1513 glite-yaim-clients 4.0.2 for the 3.1 repository
     #1516 glite-yaim-core 4.0.3 for the 3.1 repository [5 - Normal]
     #1521 Updated glite-info-templates [5 - Normal]
     #1531 Updated glite-info-generic [5 - Normal]
     #1544 patch for bug 29600
     #1545 glite-yaim-lcg-ce 4.0.2-1 for gLite 3.1 [5 - Normal]
     #1546 glite-yaim-torque-utils 4.0.2-1 for gLite 3.1 [5 - Normal]
     #1552 lcg-info-dynamic-software [5 - Normal]
     #1568 new DPM configuration

* Production repository

  gLite 3.0
  ---------------------------------------
  .. 3.0.2 Update 37 released on 29.11.07
     #1368 R3.0/SLC3: FTS cancellation (3.0.2 PPS Update 41)
     #1433 Updated Torque (2.1.9-4) and Maui (3.2.6p19-4) (3.0.2 PPS Update 41)
     #1498 MySQL server/client update (3.0.2 PPS Update 42)
     #1499 MySQL server update (3.0.2 PPS Update 42)

  .. Next set of patches scheduled for release to production: None.

  gLite 3.1
  ---------------------------------------
  ..
     Next set of patches to be released (no release date planned yet)
     #1257 On glite 3.1 UI is stopped the python libraries missing
     #1403 R3.1 edg-mkgridmap-2.9.1 fixes dependency
     #1423 WN and UI R-GMA patch which adjusts dependencies
     #1500 R3.1 updated a1_grid_env.sh script
     #1501 glite-VOMS_oracle metapackage for gLite 3.1 and SL(C)4
     #1540 glite-VOMS_mysql metapackage for gLite 3.1 and SL(C)4

gLite 3.x Certification & testing:
----------------------------------

* Certification
  Patches certified:
    #1245: tool to renew proxies with VOMS extension via myproxy server (3.0)
    #1457: New package glite-sd2cache for FTS and FTA
    #1458: New package glite-sd2cache for FTM Node.
    #1513: glite-yaim-clients 4.0.2 for the 3.1 repository
    #1517: glite-yaim-core 4.0.3 for the 3.0 repository
    #1530: R-GMA yaim module for gLite 3.0
    #1555: voms 1.7.24 + gSOAP 2.7
    #1568: new DPM configuration
    #1569: Adding {LFC,DPM}-interfaces to 3.1 metapackages

* gLite 3.1 / SL4
    - 32bit: glite-VOBOX released to PPS
    - 64bit: nothing to report

* Configuration:
  Updated configuration for gLite3.1/SL4 WMS/LB.

Gridview:
---------

* New release of gridview frontend containing
    - Fix for Reliability Displays
    - Fix for XSS vulnerability
  This is currently in the pre-production system.

* New release of gridview summarizer containing
    - fix for handling backdated change of downtime in GOCDB
    - fix for handling change in set of critical tests for a service
  This is deployed in production.

Grid Data Management:
---------------------

Nothing to report.

SAM:
----

* From 2007-11-29 00:00 until 13:00
    Problem:  Job submission failures (missing tests) for some sites
    Reason:   High load on RB
    Solution: Removed the overloaded RB from the SAM RB "cluster" for the
              time being

* From 2007-11-27 23:00 until 2007-11-28 13:00
    Problem:  Missing test results
    Reason:   SAM DB overloaded
    Solution: Temporarily disabled the overloading queries, fixed the table
              and queries that were involved.

Grid User Support:
------------------

* The "Big Bang" GGUS 6.0 Release of 2007-11-29 was successful.
  Release notes in: https://gus.fzk.de/pages/owl.php
* The next GGUS Release will take place on 2007-12-13 and will contain
  a dramatically different submission form. The prototype for evaluation,
  still available for some days, is in:
  https://iwrgustrain.fzk.de/pages/ticket_60.php
* A talk on LHC VO User Support was given on 2007-12-05 to the GDB.
  Slides in:
  http://indico.cern.ch/materialDisplay.py?sessionId=4&materialId=1&confId=8508
* Notes and GGUS enhancement requests from the 2007-11-29 meeting with ALICE:
  http://cern.ch/dimou/lcg/UserSupport/alice/2007-11-29

Grid Authentication & Authorization Services:
---------------------------------------------

* The problem which appeared on Thursday last week on the CERN VOMS servers
  was due to a human error (not a software bug). It was fixed within a
  few hours.
* Migration to SLC4 and a major VOM(R)S update are scheduled for
  Dec. 10th 2007
    - New DB schema
    - Service interruption between 8.00 and 11.00 UTC
* Given the major, announced, VOM(R)S upgrade of 2007-12-10, worries about
  the number of VOMS code are addressed to the EMT in:
  https://twiki.cern.ch/twiki/bin/view/LCG/VomsWG#2007_12_05_Notes

Grid Operational Security:
--------------------------

Nothing to report.