Week of 080818

Open Actions from last week:

Daily WLCG Operations Call details

To join the call, at 15.00 CE(S)T Monday to Friday inclusive (in CERN 513 R-068) do one of the following:

  1. Dial +41227676000 (Main) and enter access code 0119168, or
  2. To have the system call you, click here

General Information

See the weekly joint operations meeting minutes

Additional Material:

Monday:

Attendance: local(Simone, Jean-Philippe, Harry, Luca, Ricardo);remote(Gonzalo, Derek, Daniele, Michael, Jeff).

elog review:

Experiments round table:

ATLAS (SC): Had 3 rather critical problems over the weekend.

1) Many CASTOR errors when trying to export reprocessed data because the data were on tape, not disk. These data were garbage collected before older data, whereas we thought GC was a FIFO operation. We will follow up with CASTOR operations. Despite this, all cosmics were exported (except to RAL), though at low efficiency, and DAQ ran into CASTOR at 750 MB/s in 12-hour slots.

2) Affecting all Tier 1 sites: during the weekend the dataset type name embedded at the beginning of each file changed from data08_cos to data08_cosmag without the advance warning that would have allowed Tier 1 sites to switch to a new storage directory mapping. We will follow up this poor communication with the ATLAS management.

3) RAL had several problems over the weekend. On Saturday their SRM Oracle database was giving errors, then on Sunday they were not accepting FTS data from CERN though they were still exporting. We found the CERN channel set to 0%, and it would have been good to be told that this had been done. Today I can see RAL is in an unscheduled downtime. D.Ross explained that their CASTOR was down from 07.00 to 09.00 on Saturday; then on Sunday they had first an LSF disk full (which stops staging) and then a database disk full, which led to the unscheduled downtime. The expert called out set the CERN-ATLAS FTS channel share to zero as it was the only one active. H.Renshall said we should follow up on how sites could indicate such a configuration change (e.g. site statusTwiki).

CMS (DB): CMS have started the CRUZET4 run (cosmics, but with the magnet on) with a couple of subdetectors in so far. They have a dataops shift in place concentrating on Tier 0 workflows. They have daily meetings at 16.00 and will use the CCRC08 elog for general (unstructured) observations and their cruzet3 elog for more detailed reports. H.Renshall invited them to make relevant observations to these minutes.

Sites round table: Jeff (NL-T1) reported they are at risk today and tomorrow while they change network routers. The change will be transparent to applications unless a cable switchover exceeds the TCP timeout.

Core services (CERN) report:

DB services (CERN) report: - The apply process at ATLAS OFFLINE aborted on Friday afternoon while trying to replicate the statements issued to drop the tables from one schema. The problem is a known bug, reproduced on ATLAS after setting up a new parallel Streams configuration between the ONLINE and OFFLINE databases to replicate the PVSS schemas. The bug is assigned to Oracle development but progress is very slow. The workaround is to set up schema rules on the apply side. This change will be implemented this week.

- The corruption found on the ATLAS online server is not affecting services, but an intervention is scheduled on Wednesday from 14:00 to 14:45 to fix the issue by switching to a new database using Oracle Data Guard standby technology.

Monitoring / dashboard report:

Release update:

AOB:

Tuesday:

Attendance: local();remote().

elog review:

Experiments round table:

Sites round table:

Core services (CERN) report:

DB services (CERN) report:

Monitoring / dashboard report:

Release update:

AOB:

Wednesday:

Attendance: local();remote().

elog review:

Experiments round table:

Sites round table:

Core services (CERN) report:

DB services (CERN) report:

Monitoring / dashboard report:

Release update:

AOB:

Thursday:

Attendance: local();remote().

elog review:

Experiments round table:

Sites round table:

Core services (CERN) report:

DB services (CERN) report:

Monitoring / dashboard report:

Release update:

AOB:

Friday:

Attendance: local();remote().

elog review:

Experiments round table:

Sites round table:

Core services (CERN) report:

DB services (CERN) report:

Monitoring / dashboard report:

Release update:

AOB:

Topic revision: r2 - 2008-08-18 - HarryRenshall
 