-- HarryRenshall - 25 Jan 2008

Week of 080128

Open Actions from last week: Castor operations to check if write access to the CERN ATLAS CAF disk pool is now restricted. Castor operations to monitor the CMS instance running the new 2.1.6-7 software with a view to scheduling upgrades to the other experiments this week.

Monday:

See the weekly phone conference in Indico.

Tuesday:

Experiments
  • ATLAS: Restarting their tests for space tokens.
  • LHCb:
      • Waiting on twiki pages to be set up per site.
      • Problem with LFC replica @ RAL - LFC developers in the loop.
      • PIC site admins are ready to migrate from CASTOR tape to Enstore, so might need to run an LFC script to change the replicas. Need to check that the first migrated data is OK before PIC proceeds with the full migration.

Core Services

  • ATLAS Castor upgrade done. LFC upgrade (to 1.6.8/SLC4 64bit) for all VOs is still ongoing.

Databases

  • DB intervention at CNAF is done. Streams reconfigured to remove old DBs and add new ones. SARA is still out of the Streams configuration.

Monitoring:

  • RAS

Release Update:

  • RAS

Site Issues:

  • Michel Jouvin: Problem with old versions of lcg_utils against DPM (perhaps a gridftp2 problem). Fixed in latest versions. We probably need to specify a minimum version of UI/WN to be installed.

  • Michel Ernst: Recent upgrade of VOMS has broken the VOMS XML interface, which is affecting a mirror version of VOMS @ BNL. Contacted the developers and they agree to fix it in the next release. Patch is already available - would like to have it put in place @ CERN. Markus/Jamie notified last Friday.

  • Problem with the SAM tests failing at BNL (double publication via OSG) - James to follow up and involve Rob Quick/Arvind.

Wednesday

Experiments
  • Alice: RAS
  • ATLAS: SAM Criticality for SE tests changed.
  • LHCb: Some problems with network server on rb123. Under investigation. The possibility of running analysis on a large T2, e.g. GRIF, is interesting.
  • CMS:

Core Services

  • FTS - intervention ongoing.

Databases

  • Interruption of a few minutes on ATLAS RAC due to hardware problems on a storage node - should have been handled automatically by Oracle, but this didn't work. Follow up with Oracle.

  • Itrac315 - hosts monitoring apps for streams/OEM - hardware failure - host down. Availability information is still available via SLS.

  • Streams - lost connection to ASGC - likely due to an intervention (announced) on their side - DBA has restarted it.

  • Oracle client on Linux modifies the setting for FP rounding precision on the FPU - very minor, probably not an issue for physics calculations, but an issue for reproducibility. Workaround in place; raised to Oracle as a severity 1 bug to get a patch.

  • Applied Oracle CPU (Critical Patch Update) successfully on ATLAS, and the LCG integration RAC is now back again.

Monitoring:

  • Gridmap will be showing degraded for ATLAS SE due to a problem in SAM PI until old results time out (7 days). Need to follow up on the impact on the availability calculation in Gridview.

Release Update:

Site Issues:

  • GRIF: Down due to upgrading to gLite 3.1 for SE.
  • RAL/LHCb : Issue at RAL with LFC and streams following LHCb upgrade. Solved by restarting LFC daemon.

  • BNL: Markus following up on isolating the VOMS jar as a single fix.
  • BNL: Meeting with BNL tomorrow to follow up on SAM issues - possible solution identified - need to work out what is possible.

Thursday

Experiments
  • Alice: RAS
  • ATLAS: Issues to be covered in the FDR meeting later
  • LHCb: RAS
  • CMS: RAS

Core Services

  • RAS (no attendance)

Databases:

  • 3D streams and OEM monitoring are still down. FIO investigating the repeated hardware faults on this host. Two Oracle service requests filed, no update.

Monitoring:

Release Update:

Site Issues:

  • CERN restarted VOMS server with the new fix applied (by accident!). BNL to check if it makes things better for them.

AOB:

  • The Oracle problem will be mentioned at the Operations meeting, but there will be no action for sites, since the AA will take care of this for the experiments.

Friday

Experiment report(s):

Core services (CERN) report:

DB services (CERN) report:

Monitoring / dashboard report:

Release update:

Questions from sites/experiments:

AOB:

Topic revision: r5 - 2008-01-31 - JamesCasey
 