GD Group Report for C5-23-Nov-2007 (full)
=========================================

LCG deployment:
---------------
- Total number of Sites (*): 247

- Software -> Num. Sites (*):
    gLite-3_1_0 -> 85
    gLite-3_0_2 -> 143
    gLite-3_0_0 -> 3
    LCG-2_7_0   -> 6
    unknown     -> 10

- Status -> Num. Sites (*):
    ok       -> 181
    degraded -> 15
    down     -> 51

- Average of concurrently running jobs during this week (+): ~20k

(*) Sites that are Certified _and_ Production _and_ Monitored by SAM:
    https://lcg-sam.cern.ch:8443/sam/sam.py
    To see this page one needs a grid certificate loaded in the browser.
    The calculation of the Site availability (Status) is described at:
    http://goc.grid.sinica.edu.tw/gocwiki/SAM_Metrics_calculation
    The software version is taken from the 'CE-sft-softver' CE test.
    Sites that do not support the SAM 'CE' service, or that have not sent
    results for this particular test during the last week, are counted
    as 'unknown'.

(+) Job statistics taken from GStat:
    http://goc.grid.sinica.edu.tw/gstat/
    http://goc.grid.sinica.edu.tw/gstat/total/GIISQuery_Usage_cpu_.html
    For the time being we do not report CPU numbers:
    1. Not all the reported CPUs are actually available for grid jobs.
    2. Sites with multiple CEs may have their CPUs double-counted.
    3. GStat includes sites that are not considered by the SFTs.

CERN Grid Operations managed by GD:
-----------------------------------
* CAs upgraded on all gLite WMS, gLite LB and LCG RB nodes.
* Update 36 done on all the gLite WMS and LB nodes.
* Current status of the gLite WMS, gLite LB and LCG RB nodes can be found at
  https://twiki.cern.ch/twiki/bin/view/LCG/CurrentStatusWMSLBNodesCERN

ROC CERN:
---------
* Certification: Ongoing certification of the BEIJING-CNIC-LCG2-IA64 site
* ROC Team: Farida Naz joined the CERN ROC Team

CERN GRID Pre-Production Site (CERN_PPS):
-----------------------------------------
* Upgrade of the site to gLite 3.1 PPS-update09: CEs, glite-WNs, site-BDII
* PPS AFS-UI(3.1) upgraded to PPS-update09
* Version v1.18-1 of CAs tested and deployed
* Set-up of the LSF-based CE lxb2090, implementing the new (uncertified)
  version of the jobpriority configuration for Atlas and CMS. We now have
  two CEs (Torque- and LSF-based) ready to run the jobpriority tests for
  Atlas and CMS.

Grid Data Management:
---------------------
* LFC 1.6.7 certified for SL4.

Gridview:
---------
* Quattor templates for the Gridview archiver module written and tested.
* The gridview NCM component was also enhanced to configure gridview
  archiver modules; this was tested successfully on a test node.

WLCG Transfer Service:
----------------------
* Transfers ranging from 20 to 680 MB/s, averaging around 320 MB/s per day.
* Involving all major T1 sites.
* Mostly traffic from CMS.
* 0 open tickets in total.
* Throughput plots: http://gridview.cern.ch/GRIDVIEW/

SAM:
----
* New software releases:
  - Submission Framework: lcg-sam-client 1.2.0
  - Sensors (standard set): lcg-sam-client-sensors 1.4.0,
    including upgraded tests for CA release 1.18-1
  - Database scripts: lcg-sam-server-db 1.1.0
  - Web Services: lcg-sam-server-ws 0.10.0
  See the release twiki page
  https://twiki.cern.ch/twiki/bin/view/LCG/SamReleaseActivity
  for more information.
* Unavailability: 21 Nov 10:00-17:00
  Reason: lcg-sam-client-sensors was upgraded, but the new version
  contained a critical bug. The bug was not discovered during testing
  because one configuration parameter differs between the SAM validation
  setup and the production instance. This difference was discovered only
  when rolling back to the previous version.
  See more details in the description on the SAM Unavailabilities wiki:
  https://twiki.cern.ch/twiki/bin/view/LCG/SAMProdServUnavail

gLite 3.x Build & Integration:
------------------------------
* Certification repository (patches):
  - gLite 3.0
    .. 0 in preparation
    .. 1 in configuration
    .. 15 in certification
  - gLite 3.1
    .. 6 in preparation
    .. 3 in configuration
    .. 19 in certification

* PPS repository
  - gLite 3.0
    .. Next set of patches scheduled for release to PPS:
       None.
  - gLite 3.1: 3.1.0 PPS Update 10 in preparation
       #1349 glite-LFC_mysql metapackage for gLite 3.1/SLC4
       #1350 glite-SE_dpm_disk metapackage for gLite 3.1/SLC4
       #1352 glite-SE_dpm_mysql metapackage for gLite 3.1/SLC4
       #1541 glite-LFC_oracle metapackage for gLite 3.1/SLC4
       #1420 glite-PX metapackage for gLite 3.1/SLC4
       #1472 glite-AMGA_postgres metapackage for gLite 3.1/SLC4
       #1501 glite-VOMS_oracle metapackage for gLite 3.1/SLC4
       #1540 glite-VOMS_mysql metapackage for gLite 3.1/SLC4

* Production repository
  - gLite 3.0
    .. Next set of patches to be released:
       #1368 R3.0/SLC3: FTS cancellation (3.0.2 PPS Update 41)
       #1433 Updated Torque (2.1.9-4) and Maui (3.2.6p19-4) (3.0.2 PPS Update 41)
       #1498 MySQL server/client update (3.0.2 PPS Update 42)
       #1499 MySQL server update (3.0.2 PPS Update 42)
  - gLite 3.1
    .. Next set of patches to be released:
       #1255 JobWrapper tests - new version with no R-GMA dependencies
             (3.1.0 PPS Update 08)

gLite 3.x Certification & Testing:
----------------------------------
* Certification
  Current GFAL/lcg_utils patches (#1389/#1390) rejected (details in the
  patch) - updated rpms expected very soon.
  Patches certified:
  .. #1349: glite-LFC_mysql metapackage for SLC4
  .. #1350: glite-SE_dpm_disk metapackage for SLC4
  .. #1352: glite-SE_dpm_mysql metapackage for SLC4
  .. #1420: glite-PX for gLite 3.1
  .. #1472: glite-AMGA_postgres metapackage for gLite 3.1
  .. #1541: glite-LFC_oracle metapackage for SLC4

* gLite 3.1 / SL4
  - 32bit: DPM, LFC, VOMS, AMGA and MyProxy services have been certified
    and are being prepared for release to PPS.
  - 64bit: WN has entered the standard release process (patch #1542).

* Configuration
  Further work on integration of CREAM configuration into the main yaim
  trunk.

* Other work
  - Improved batch system support is in progress, with patches for LSF,
    Condor and SGE now in certification.
  - Analysis of bug and patch statistics.

Grid User Support:
------------------
* GGUS 6.0 is a major release that will take effect on 2007-11-29.
  The page https://gus.fzk.de/pages/owl.php contains details on the
  service interruption and the release contents. Regular broadcasts will
  be published in due time.

Grid Authorization & Authentication Services:
---------------------------------------------
Nothing to report.

Grid Operational Security:
--------------------------
Nothing to report this week.

---Zdenek