Jan 2018 GDB notes

Agenda

https://indico.cern.ch/event/651349/

Introduction (Ian Collier)

presentation

  • Helge: Upcoming events: the Condor workshop at RAL, 04-07 September, is still missing (Note: added now.)
  • Romain: Spectre and Meltdown:
    • Variants 1 and 3 of the vulnerabilities, as listed on the slide, are easily exploitable and are fixed by installing updated kernels, hence keep your systems up to date!
    • Variant 2 is more involved: it requires microcode updates, which are not yet available for all affected CPU types.
  • Mattias: On Debian and Ubuntu, microcode is not installed by default; what is the situation with RedHat/CentOS/SL?
    • Romain: It is installed by default on these distributions.
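
As a rough illustration of how a site could check the state of its nodes after applying the updates discussed above, the following is a minimal sketch. It assumes a Linux kernel that exposes the sysfs directory /sys/devices/system/cpu/vulnerabilities (not every kernel does; older ones lack it). It was not part of the presentation, and the function names are hypothetical. In that directory, spectre_v1, spectre_v2 and meltdown correspond to Variants 1, 2 and 3.

```python
from pathlib import Path

# Assumption: the running kernel exposes this sysfs directory.
# On kernels without it, the functions below simply report nothing.
SYSFS_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_cpu_vulnerabilities(sysfs_dir=SYSFS_DIR):
    """Return {vulnerability name: status text} as reported by the kernel."""
    base = Path(sysfs_dir)
    if not base.is_dir():
        return {}
    status = {}
    for entry in sorted(base.iterdir()):
        if entry.is_file():
            try:
                status[entry.name] = entry.read_text().strip()
            except OSError:
                # Skip entries that cannot be read rather than failing.
                continue
    return status


def unmitigated(status):
    """Return the names whose status text starts with 'Vulnerable'."""
    return sorted(
        name for name, text in status.items() if text.startswith("Vulnerable")
    )


if __name__ == "__main__":
    report = read_cpu_vulnerabilities()
    for name, text in report.items():
        print(f"{name}: {text}")
    print("Unmitigated:", ", ".join(unmitigated(report)) or "none reported")
```

An empty result only means that the kernel does not report this information; it does not show that the node is safe.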

CWP Status (Graeme Stewart)

presentation

  • Andrea S: Is the CWP a one-time effort, or will it be kept updated?
    • Graeme: Periodic updates, roughly every 2-3 years, have been proposed, but it will be a question of balancing effort against gain.
  • Andrea V: When will the author list of the CWP be closed?
    • Graeme: If we publish in CSBS, the submission would be a natural deadline; the question will be discussed further in the HSF meeting tomorrow.

Workload Management update (Erik Mattias Wadenstein)

presentation

  • Maarten: HTCondorCE is not yet ready for large-scale deployment due to issues concerning monitoring and accounting.
  • Ian C: Has a process been defined yet for addressing these issues? Perhaps something can be foreseen in the context of EOSCHub; the accounting task force should take ownership.
  • Helge: Monitoring issues were already discussed in HEPiX, and a working group was formed that did not really take off; we should take this up again in the HEPiX board.
    • Ian C: We should profit from the proximity (in space and time) of the spring HEPiX and the HTCondor week to address these issues; this requires a list of the issues to be prepared in advance.
  • Mattias: I'm not aware of any bug tracker holding all the strings together.
  • Tim B: Note that some large sites are using HTCondorCE just fine, avoiding a whole load of issues previously encountered with CreamCE.
  • Maarten: Will update the document with known issues.
  • Julia A: It is not clear whether solutions working at CERN and PIC are generic enough, in particular if they work with all batch systems.
  • Helge: This is not required: HTCondorCE is expected to be used with HTCondor only. (Julia: but people try nonetheless!)

Container WG report (Gavin McCance)

presentation

  • Latchezar: the analysis use cases must not slow down the
    convergence toward recommended Singularity configurations
  • Some discussion followed...
  • Answer: the analysis use cases do indeed have lower priority

WLCG Workshop forward look (Benedikt Hegner)

presentation

  • Alessandra D: the big room is also available on the last day now

  • Ian C:
    • In his presentation, Graeme highlighted the relevant CWP areas
    • The workshop will allow people to see what the CWP means in practice
      and may engage people to take part in specific work packages
    • GDB meetings will be used to report on their progress

ARC update (Erik Mattias Wadenstein)

presentation

  • Ian C: the support for container jobs should be made known to the Container WG

  • Ian C:
    • Might the ARC community take an example from the HTCondor community
      and have regular workshops where devs and users can interact?
  • Mattias, Ulf:
    • The yearly NorduGrid conference already has provisions exactly for that

OSG-NSF 2 factor & HPC issues (Edgar Fajardo Hernandez)

presentation

  • Maarten: shouldn't the funding agencies be informed about how
    these technical hurdles make life difficult for scientists?

  • Edgar:
    • Indeed, e.g. the Stampede2 deal had to be agreed between the directors of TACC and OSG
    • We would like to meet with NSF showing working examples
      • Compare different approaches
      • XSEDE sites are very different

  • Ian C: what numbers of cores are we talking about?
  • Edgar:
    • Sizable allocations on the various supercomputers
    • Their total numbers of cores can give an idea:
      • Comet: 50k
      • TACC: 100k+ KNL
      • Blue Waters: 250k
Topic revision: r2 - 2018-01-12 - IanCollier
 