WLCG SL6 Migration Task Force - 19th March 2013

Attendance

  • Alessandra Forti, Maarten Litmaath, Renaud Vernet, Rod Walker, Alessandro De Salvo, Emil Obreshkov, Joel Closier, Helge Meinhard, Di Qing, Peter Gronbech, Alessandra Doria, Andreas Petzold, Shawn McKee, Brian Bockelman, Christoph Wissing, Matt Doidge, Andrea Valassi, Ben Couturier, Marco Clemencic, Stefan Roiser

HEPOS_libs and other rpms

After a long discussion, both via email and during the meeting, we reached the following conclusions:

  1. Documentation for SL6 is now available on the twiki page. Any issues with it should be reported to Andrea or Fabrizio.
  2. Any problem with the rpm itself should also be reported to Andrea or Fabrizio, as described on the documentation twiki page.
  3. HEPOS_libs is going to be recompiled without the CASTOR dependencies. These were required only to compile the ATLAS CASTOR storage plugins, which normal users do not need.
  4. Andrea verified that the current rpm doesn't depend on curl-openssl, which is in the current repository.
  5. The rpm needs to be tested on SL6; one or more of the sites will do that.
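
A dependency check like the one in point 4 could be repeated by a testing site roughly as sketched below. This is only an illustration, not the procedure Andrea used: the helper names are ours, the substring match is deliberately simple (it will not catch shared-library requirements such as `lib*.so` entries), and the package name passed to `rpm` is assumed to be the one used in these minutes.

```python
import subprocess


def unwanted_requires(requires_text, unwanted):
    """Return the requirement lines whose capability name mentions an unwanted package."""
    hits = []
    for line in requires_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The capability name is the first whitespace-separated token,
        # e.g. "castor-devel" in "castor-devel >= 2.1".
        name = line.split()[0]
        if any(u in name for u in unwanted):
            hits.append(line)
    return hits


def query_requires(package):
    """Ask rpm for the requirements of an installed package (hypothetical helper)."""
    result = subprocess.run(
        ["rpm", "-q", "--requires", package],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On an SL6 node with the rpm installed, `unwanted_requires(query_requires("HEPOS_libs"), ["castor", "curl-openssl"])` would be expected to return an empty list once the package has been rebuilt without the CASTOR dependencies.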

We then discussed why the current location isn't ideal for sites outside CERN:

  1. It's a CERN "extra" repository, i.e. it is likely to contain CERN-specific rpms that might conflict when used at another site with a similar but not identical OS such as SL6(-C).
  2. We need a place for other experiment rpms and pieces of software that are currently out in the wild in different formats. At the moment these include the N2N xrootd libraries and the monitoring plugins used by ATLAS, which are also of interest to CMS. We need to track down what other software might be out there. This problem of software out in the wild has dragged on for many years, and we need to put in place some security standards for the software that is distributed.
  3. The ideal would be a common experiments repository at CERN. Maarten has an action to start a preliminary investigation to find the resources for this WLCG repository.

Experiments and Sites Status

Experiments

| Question | ALICE | ATLAS | CMS | LHCb |
| Can the experiment run on SL6? | yes, but tested only on a small scale | yes, but still some compilation problems with analysis releases | yes, in compatibility mode; working on native executables | yes, in compatibility mode; working on native executables |
| Can the experiment run on mixed clusters? | yes | no | yes, in compatibility mode, but would prefer different queues so there are no problems when SL6 native executables are introduced | yes, pilot framework can do the right thing |
| Big bang or gradual upgrade? | gradual, allows any problem to be found without risks | gradual, but can cope with big bang if sites need it | gradual | doesn't matter, as it can run on mixed clusters |
| Does the experiment already have SL6 sites? | a few small sites | SARA T1 and a US ATLAS site, Brunel (only prod) | half of USCMS and a few test sites in the EU (Brunel, DESY, CNAF) | some sites |
| Are there time constraints? | no | prefer sites NOT to upgrade before 1st June 2013 | no, though indicatively would prefer summer | no |
| Do the experiments have an upgrade procedure in place? | yes, but informal | yes, two linked from the TF page |  |  |
| Communication with sites | site contacts list, EGI ops | central ops and cloud support | local experts |  |
| Test procedure | yes | several sites with test queues | yes, if sites strongly require it | yes |

Sites

  • T0 has a schedule already established and reported in various other meetings, including WLCG ones. Some pointers are linked from the TF page for those who are interested, but in short: at the end of April 20% of lxplus will be moved to SL6 and the remaining SL5 machines will go under lxplus5. A fraction of lxbatch resources will also be moved to SL6, but SL5 will continue to be available for the next few months.
  • AGLT2, Lancaster, Oxford and Manchester are all thinking about setting up test queues; their major problem is that they are all moving to Puppet as their site configuration tool
  • Napoli and TRIUMF are also keen to start testing and don't have the configuration management problem
  • IN2P3 and KIT are keen to start moving resources but can wait until June

Conclusions that can be valid for all experiments (and work on shared clusters)

  1. Between now and the 1st of June: unless a site needs to upgrade, it is suggested that sites start testing SL6, especially if they already have test queues set up.
  2. After the 1st of June: sites will be encouraged by all experiments to upgrade.
    • Since the target date is 31st October, this gives us 5 months to move the bulk of resources.
  3. Most experiments prefer a gradual upgrade rather than a big bang; LHCb doesn't mind either way.
  4. The Task Force will track sites with one or more tables in the twiki.
  5. The Task Force will communicate this skeleton plan to sites.
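
The five-month figure in point 2 can be checked with a short calculation. The sketch below assumes both dates fall in 2013 (the minutes give the year only for 1st June), and the helper name is ours.

```python
from datetime import date


def calendar_months_spanned(start, end):
    """Count the calendar months touched from start to end, both inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Assumption: the target date of 31st October is in 2013.
start = date(2013, 6, 1)
target = date(2013, 10, 31)

months = calendar_months_spanned(start, target)
days = (target - start).days + 1  # +1 to count the target day itself

print(f"{months} months, {days} days")  # prints "5 months, 153 days"
```

Because the window starts on the first day of June and ends on the last day of October, the calendar months counted here are all full months.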

Actions

  • Maarten: investigate CERN internal resources for WLCG rpms repository.
  • Sites setting up SL6 test queues to test HEPOS_libs and report problems to Andrea/Fabrizio (and the TF)