March 2012 Reports


30th March 2012 (Friday)

  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

  • T0
  • T1
    • GRIDKA : Back in production

29th March 2012 (Thursday)

  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

28th March 2012 (Wednesday)

  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

  • T0
    • CERN : waiting for implementation ...(GGUS:79685)
  • T1

27th March 2012 (Tuesday)

  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

26th March 2012 (Monday)

  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

  • T0
    • CERN : waiting for implementation ...(GGUS:79685)
  • T1
    • GRIDKA : slow tape recall (GGUS:80589)
    • CNAF : (GGUS:80592) some CVMFS endpoint not accessible (wn-103-03-31-07-a :: df: `/cvmfs/lhcb.cern.ch': Transport endpoint is not connected)
    • IN2P3 : low number of running jobs (GGUS:80595)

23rd March 2012 (Friday)

  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

  • T0
  • T1
    • Had issues at GridKa this morning (and at several T2s) due to pilots failing after a failed proxy renewal from the WMS. This has happened before; we are not sure of the cause, but believe a fix is on its way.
    • Yesterday we increased the number of stripping & merging jobs at IN2P3. This seems to have hit a limit, as overnight a backlog of GridFTP transfers built up until there were 2700+ jobs running this morning. We went back to the previous limit and jobs are slowly transferring as they should.

22nd March 2012 (Thursday)

  • No significant issues from yesterday
  • Data reStripping and user analysis at Tiers1 going well
  • Stripping jobs (both 17b & 18) are ~complete at all sites except GridKa and IN2P3
  • The associated merging jobs are ongoing.
  • MC simulation at Tiers2 ongoing

  • T0
  • T1
    • Update on CVMFS issue at IN2P3 ([GGUS:80405]): It seems there were some 'dead' directories that were reported as if they did not exist. A forced cache refresh caused the client to die, so a corrupt cache is suspected. The CVMFS people have been notified.

21st March 2012 (Wednesday)

  • No significant issues from yesterday
  • Data reStripping and user analysis at Tiers1 going well (~2700 running jobs)
  • 17b Stripping will complete in ~1 week at current rate
  • MC simulation at Tiers2 ongoing

  • T0
  • T1

    • Reenabled SARA after DT. SAM tests and jobs working again.
    • IN2P3: Corrupt file issue and CVMFS issue are still under investigation

20th March 2012 (Tuesday)

  • No significant issues from yesterday
  • Data reStripping and user analysis at Tiers1 ongoing
  • MC simulation at Tiers2 ongoing

  • T0
  • T1

    • Banned SARA in DIRAC due to the DT today

    • Update to investigation into corrupt files at IN2P3 (https://ggus.eu/ws/ticket_info.php?ticket=80338)
      • LHCb jobs use lcg-cp and then register the file separately in the LFC
      • As the files seem consistent at the storage, one of the few remaining explanations is that lcg-cp did not transfer the file properly but still returned successfully
      • It is very unlikely that an overwrite occurred
      • More investigations are going on....
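
The failure mode suspected above (a copy that returns success although the data did not arrive intact) can be guarded against by verifying the destination before registering it. The following is only a minimal sketch on a local filesystem: `shutil.copyfile` stands in for the real transfer tool, `register` is a hypothetical callback standing in for the catalogue registration, and the use of Adler-32 is an assumption, not something stated in this report.

```python
import shutil
import zlib


def file_adler32(path, chunk_size=1 << 20):
    """Return the Adler-32 checksum of a file as an 8-digit lowercase hex string."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def copy_verify_register(source, destination, register, copy=shutil.copyfile):
    """Copy a file and register it only if the destination checksum matches.

    `register` is a hypothetical callback taking (destination, checksum);
    `copy` is the transfer function, here a local stand-in.
    """
    expected = file_adler32(source)
    copy(source, destination)
    actual = file_adler32(destination)
    if actual != expected:
        # The copy claimed success but the data differ: do not register.
        raise IOError("checksum mismatch after copy: %s != %s" % (actual, expected))
    register(destination, actual)
    return actual
```

With this ordering a silently truncated or corrupted copy raises an error instead of ending up in the catalogue with a checksum that does not match the stored file.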

19th March 2012 (Monday)

  • Data reStripping and user analysis at Tiers1
  • MC simulation at Tiers2
  • Latest productions using Stripping 18 are going well

  • T0
  • T1

    • Ongoing investigation into corrupt files at IN2P3 (https://ggus.eu/ws/ticket_info.php?ticket=80338)
      • dCache reports the correct checksum for the file as stored, but this differs from the checksum recorded in the LFC
      • Affects both MC & Data; ROOT can open the files, but will reach a bad event and crash
      • Jobs report a successful upload, but the pfn-metadata and lfn-metadata checksums are different

    • Minor config issue at CNAF affecting internal ARCHIVE/TAPE transfers was quickly fixed by Vincenzo
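
The pfn-metadata versus lfn-metadata comparison described in the IN2P3 investigation above can be illustrated with a small helper that lists the files whose two checksums disagree. This is a hedged sketch only: the two dictionaries (file name to hex checksum) are assumed to have been exported from the catalogue and the storage beforehand, and the function names and the example paths are hypothetical.

```python
def normalise(checksum):
    """Normalise a hex checksum: trim, lowercase, drop a '0x' prefix, pad to 8 digits."""
    text = checksum.strip().lower()
    if text.startswith("0x"):
        text = text[2:]
    return text.zfill(8)


def find_checksum_mismatches(catalogue, storage):
    """Return the sorted file names whose catalogue and storage checksums differ.

    A file that is present in the catalogue but missing from the storage
    listing is also reported.
    """
    mismatched = []
    for name, catalogue_sum in catalogue.items():
        storage_sum = storage.get(name)
        if storage_sum is None or normalise(catalogue_sum) != normalise(storage_sum):
            mismatched.append(name)
    return sorted(mismatched)


# Hypothetical example data, not taken from the ticket.
catalogue = {"/lhcb/a": "0011e603", "/lhcb/b": "ABCD1234", "/lhcb/c": "deadbeef"}
storage = {"/lhcb/a": "11e603", "/lhcb/b": "abcd1235"}
suspect = find_checksum_mismatches(catalogue, storage)
```

Normalising before comparing avoids false alarms from differences in case or leading zeros, so only genuine mismatches are flagged for follow-up.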

16th March 2012 (Friday)

  • Data reStripping and user analysis at Tiers1
  • MC simulation at Tiers2

  • T0
  • T1

    • RAL : unscheduled downtime

15th March 2012 (Thursday)

  • Data reStripping and user analysis at Tiers1
  • MC simulation at Tiers2

New GGUS (or RT) tickets

  • T0
  • T1

14th March 2012 (Wednesday)

  • Data reStripping and user analysis at Tiers1
  • MC simulation at Tiers2

New GGUS (or RT) tickets

  • T0
    • One ticket for pilots aborted at ce205 (GGUS:80190)
  • T1

13th March 2012 (Tuesday)

Experiment activities:

  • Data reStripping and user analysis at Tiers1
  • MC simulation at Tiers2

New GGUS (or RT) tickets

  • T0

  • T1
    • Gridka: queues were closed much earlier than the start of downtime. Site unusable for almost 1 week.

12th March 2012 (Monday)

Experiment activities:

  • Data reStripping and user analysis at Tiers1
  • MC simulation at Tiers2

New GGUS (or RT) tickets

  • T0
    • CERN : Castor upgrade this morning
  • T1
    • Gridka: queues were closed much earlier than the start of downtime. Site unusable for almost 1 week.
  • All sites:
    • The scratch space required on the WNs' local disk has been set to 20GB in the VO card (previously 10GB). A broadcast message will be sent to all sites.

9th March 2012 (Friday)

Experiment activities:

Restripping, MC, User analysis

New GGUS (or RT) tickets

  • T0
    • CERN : setting the variable TMPDIR : (GGUS:79685)

  • T1
    • SARA : Incorrect platform (GGUS:80048) Solved by "hot fix" at site
    • GRIDKA : Condition DB unavailable (GGUS:79800)
    • PIC : Request for space token migration (GGUS:79305)
    • GridKa : Request for space token migration (GGUS:79303) "nearly finished"
    • SARA : Request for space token migration (GGUS:79307)

  • T2

8th March 2012 (Thursday)

Experiment activities:

Restripping, MC, User analysis

New GGUS (or RT) tickets

  • T0
    • CERN : setting the variable TMPDIR : (GGUS:79685)

  • T2

7th March 2012 (Wednesday)

Experiment activities:

Restripping, MC, User analysis

New GGUS (or RT) tickets

  • T0
    • CERN : setting the variable TMPDIR : (GGUS:79685)

  • T1

  • T2

6th March 2012 (Tuesday)

Experiment activities:

MC, User analysis, Restripping

New GGUS (or RT) tickets

  • T0
    • CERN : setting the variable TMPDIR : (GGUS:79685)

  • T1
    • GRIDKA : Pilots aborted at cream-5-kit (GGUS:79914) Fixed
    • GRIDKA : Condition DB unavailable (GGUS:79800)
    • PIC : Request for space token migration (GGUS:79305)
    • GridKa : Request for space token migration (GGUS:79303) "nearly finished"
    • SARA : Request for space token migration (GGUS:79307)

  • T2

5th March 2012 (Monday)

Experiment activities:

MC, User analysis, Validation productions for restripping

New GGUS (or RT) tickets

  • T0
    • CERN : setting the variable TMPDIR : (GGUS:79685)

  • T2

2nd March 2012 (Friday)

Experiment activities:

MC, User analysis, Validation productions for restripping

New GGUS (or RT) tickets

  • T0
    • CERN : setting the variable TMPDIR : (GGUS:79685)

  • T1

  • T2

1st March 2012 (Thursday)

Experiment activities:

MC, User analysis, Validation productions for restripping

New GGUS (or RT) tickets

  • T0
    • CERN : CASTOR problem accessing some files : (GGUS:79629) Solved
    • CERN : setting the variable TMPDIR : (GGUS:79685)

-- JoelClosier - 01-Apr-2012

Topic revision: r2 - 2012-09-12 - JoelClosier
 