LCG deployment
==============

- Total number of Sites (1): 259
- Status -> Num. Sites (1):
    ok -> 189
    degraded -> 10
    down -> 60
- Software -> Num. Sites (2):
    gLite-3_1_0 -> 223
    gLite-3_0_2 -> 23
    gLite-3_0_0 -> 1
- Average of concurrently running jobs during this week (3): 32.2k

(1) Sites that are Certified, in Production and that have been monitored by SAM during the last week under OPS credentials.
    SAM is available at:
    https://lcg-sam.cern.ch:8443/sam/sam.py
    To see this page one needs a grid certificate loaded in the browser.
    The calculation of the Site availability (Status) is described at:
    https://cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf

(2) Software version is coming from the 'CE-sft-softver' CE test. Sites not supporting SAM 'CE' service, or not having sent results for this particular test during the last week, are not counted.

(3) Job statistics taken from GStat:
    http://goc.grid.sinica.edu.tw/gstat/
    http://goc.grid.sinica.edu.tw/gstat/total/GIISQuery_Usage_job_.html

Operational Security
====================

The security incident affecting HEP sites (see report from 07 Aug) is still being investigated by the EGEE OSCT and the relevant academic CSIRTs.

EGEE Pre-Production Service Coordination
========================================

2008-09-04: an issue was found with the version of GFAL released with gLite3.1 Update30.
After the upgrade, some sites in various regions appear to be failing the SAM tests. The issue, still under analysis, appears to be due to LDAP searches no longer compatible with the gLite 3.0 top-level BDII. Affected sites (those failing the SAM tests) should make sure to be pointing to a 3.0 top-level BDII as LCG_GFAL_INFOSYS.
SA1 is preparing an official recommendation for the sites and interacting with the ROCs to make sure that all regional top-level BDIIs are running a compatible version.

2008-09-02: Pilot service of CREAM CE:
 * A checkpoint meeting was held with JRA1, SA1 and Alice
 * Progress in the usage of the services in preview was reported by Alice
 * The decision was made to extend the duration of the pilot in the current configuration until the 30th of September
 * The pilot will be slightly extended to include CREAM services running the version about to be released in production
 * Minutes (in preparation) will soon be available at http://indico.cern.ch/conferenceDisplay.py?confId=38527
 * Details about the single tasks can be found in the tracker https://savannah.cern.ch/task/?group=sa1dep specifically listing the subtasks of TASK:7143

2008-09-02: the EMT made the decision to delay the deployment of the CREAM CE (PATCH:1755).
This is because there are WMS instances of an unsupported version in production that can still accidentally match the CREAM CE and cause submission failures. The number of WMS instances potentially affected by this issue has been estimated to be of the order of 4/5 by an analysis of the published information. A confirmation is expected from the WLCG EGEE Operation meeting. In addition, the proxy renewal mechanism was not working properly because it required opening ports on the WNs. Until these issues are fixed by an incoming patch, CREAM will be released with a GlueCEServiceState different from 'production', to be changed again later. The release with the workaround, which is currently in certification, can be expected in about 2 weeks.

2008-09-02: The gLite release procedure https://twiki.cern.ch/twiki/bin/view/LCG/PPSReleaseProcedures#PPS_Production was changed to include the Release Testing as a formal step.
The new service is described in https://twiki.cern.ch/twiki/bin/view/LCG/PreProductionServiceDescription#Release_Testing
This is a complementary step of the release procedure: selected and specialised sites receive the updates slightly earlier than the other production sites, so that they can provide useful feedback to the release managers in case of any anomalies. This happens in parallel with the release preparation and is so far a non-blocking step. The sites in the initial team are all from the Central Europe region, namely:
 * WCSS - Pawel Dziekonski (Production)
 * GUP-CERTIF-TB - Martin Polak (PPS)
 * PPS-SiGNET - Dejan Lesjak (PPS)

2008-09-02: gLite3.1 Update30 was released to production
The update, meant to be released next Thursday, was delayed due to last-minute changes to the repository and further analysis of the impacts of bugs found in PPS. It is, however, going to be released today or tomorrow. The update will affect the vast majority of services. It contains, notably:
 * A patch to Globus VDT, fixing the issue raised with BUG:37563 (limit in proxy delegation chain)
 * dCache 1.8.0-15p5: minor bug fixes and inclusion of the Chimera filesystem, which can be configured through the new yaim-dcache module
 * GFAL/lcg_util bugfix release
 * gLite YAIM clients update
    o VOBOX specific variables are now distributed under services/glite-vobox and defaults/glite-vobox.pre
    o The AMGA client configuration function is now included in the UI, WN, TAR UI and TAR WN
    o The config_vomsdir function configuring the .lsc files under vomsdir is now included in the UI, WN and VOBOX. There is a known problem with config_vomsdir on the UI_TAR and WN_TAR.
    o Please check also the YAIM-Client Known Issues in https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide400

Testing and Certification
=========================

CERN hosted a three-day workshop on the adoption of BES/JSDL/GLUE standards. Participants included UNICORE developers, ARC developers and gLite developers.
The workshop was supported by OGF Europe.

ETICS
=====

Nothing to report this week. Activities are proceeding normally.

Statistics:
    Projects registered: 31
    Packages available in the repository: 64734
    Build/test reports total: 2729
    Build/tests executed this week: 612

EGEE CERN Regional Operations Centre (ROC)
==========================================

 * Business as usual with tickets

CERN GRID Pre-Production Site (CERN_PPS)
========================================

nothing to report