January 2014 Reports


30 Jan 2014 (Thursday)

  • Mostly simulation and user jobs. Smooth running over most of the grid.
  • T0: NTR
  • T1: Brief problem at SARA on 28th when two rogue worker nodes caused a lot of jobs to fail (GGUS:100576, GGUS:100577). Fixed quickly.
  • T2: Failed pilots at ARAGRID-CIENCIAS (Spain - GGUS:100625).

27 Jan 2014 (Monday)

  • Mostly simulation and user jobs. Smooth running over most of the grid.
  • T0: NTR
  • T1: Brief scheduled downtime of IN2P3 for "node reconfiguration"
  • T2: Downtime of CBPF (Brazil) due to a power cut. Admins still trying to bring up services there.

23 Jan 2014 (Thursday)

  • Mainly MC, few user jobs.
  • T0: Yesterday, all jobs failed at CERN ONLINE: "Failed to upload output data".
  • T1: NTR

20 Jan 2014 (Monday)

  • Mainly MC jobs (less than 3% with errors), few users.
  • T0: Ticket opened on Saturday (https://ggus.eu/ws/ticket_info.php?ticket=100368) concerning MC errors: ":[Errno 28] No space left on device". Problem has been identified and a fix is ongoing.
  • T1: NTR
  • T2:
    • CBPF (T2-D) SEs banned. FTS3 was stuck due to this site's SE issues.

16 Jan 2014 (Thursday)

  • Only MC and user jobs
  • T0: NTR
  • T1: NTR
  • T2:
    • Problems with transfers to CBPF (FTS3) partially understood (gridftp not returning performance markers -> disabled), but transfers longer than ~3650 seconds are still failing, although the timeout is 7200 s.

13 Jan 2014 (Monday)

  • Heavy ion reprocessing completed except for 2 files (events too complex, timing out)
  • MC and user jobs only: no tape recalls required, only disk access for user jobs and upload from MC jobs and MC-merging jobs
  • T0:
    • NTR
  • T1:
    • SRM instability at IN2P3. Downtime is over, but the SE is still under scrutiny
  • T2:
    • Problems with CBPF storage: number of concurrent transfers limited to 4

9 Jan 2014 (Thursday)

  • Main activities are Monte Carlo and User jobs
  • T0:
    • Move to the new SRM (SHA2 enabled) was not successful; switched back to the previous version.
  • T1:

6 Jan 2014 (Monday)

  • Reprocessing of ProtonIon collisions almost finished (GRIDKA & CERN)
  • At other sites main activities are simulation & user jobs
  • T0:
  • T1:
    • GRIDKA: problems with staging of files, issue resolved after vendor intervention (GGUS:99972)

-- JoelClosier - 31 Mar 2014

Topic revision: r1 - 2014-03-31 - JoelClosier