August 2014 Reports

28 August 2014 (Thursday)

  • MC and User jobs: average 15,000 concurrent jobs, peaks at 35,000
  • Data transfers: DM operations for cleaning dataset placement (transfers and removals)
  • T0: NTR
  • T1: we again have problems with file transfers for users with a Brazilian certificate at two sites: GridKa and PIC. Transfers work at other dCache sites. dCache developers are involved, as the problem seems related to the use of UTF-8 by the Brazilian CA. It would be worth comparing the dCache releases used e.g. at SARA or IN2P3 with those used at GridKa and PIC.
  • Services: no GOCDB email received since 7 August. This is problematic for the operations team.

25 August 2014 (Monday)

  • MC and User jobs: still low level due to holidays (few MC requests)
  • T0: Oracle intervention tomorrow morning. We shall do nothing but warn users. A few jobs may fail, but nothing worth taking drastic action over if the intervention is short (~2 min).
  • T1:

21 August 2014 (Thursday)

  • MC and User jobs
  • T0: Problem with CASTOR lhcbtape. Under investigation.
  • T1:

18 August 2014 (Monday)

  • MC and User jobs
  • T0:
  • T1: PIC found some jobs running on their site and accessing files at GridKa. Under investigation.
  • VAC: incident in Manchester

14 August 2014 (Thursday)

  • MC and User jobs
  • T0: Pilot submission problem (GGUS:107663). Fixed.
  • T1: SARA, GridKa: impossible to replicate a user's file (GGUS:107655). Brazilian certificate?

11 August 2014 (Monday)

  • MC and User jobs mostly
  • T0: Alarm ticket (GGUS:107587) EOSLHCB unavailable. SRM-EOSLHCB restarted.
  • T1:

7 August 2014 (Thursday)

  • MC and User jobs mostly
  • T0:
  • T1: Our ARC CE support is fixed, so RAL is in use again. Users have been reporting problems with timeouts when accessing files on dCache at T1s. Investigating one by one with the help of LHCb contacts at T1s. Will write tickets about any remaining problems.

4 August 2014 (Monday)

  • MC and User jobs mostly
  • Because more than one job manager was allowed, 55,000 jobs ran concurrently over the weekend.
  • T0:
  • T1: PIC CRLs problem fixed. More RRCKI DNS problems but again fixed. Unable to use RAL currently because of ARC CE client library fixes needed on our side.

-- JoelClosier - 31 Mar 2014

Topic revision: r2 - 2014-10-06 - JoelClosier