WLCG MW Readiness WG 6th meeting Minutes - October 1st, 2014

Agenda

Attendance

  • Local: Alberto Aimar (CERN-IT/SDC management), Vincent Brillault (CERN Computer Security Group, Pakiti expert), David Cameron (ATLAS), Simone Campana (ATLAS), Lionel Cons (Monitoring expert, developer), Maria Dimou (chair & notes), Maarten Litmaath (ALICE & notes), Andrea Manzi (MW Officer, DPM expert), Alberto Peon (T0), Stefan Roiser (LHCb), Andrea Sciabà (CMS & WLCG Ops Coord).
  • Remote: Maria Alandes Pradillo (WLCG Ops Coord co-chairperson), Cristina Aiftimiei (EMI), Stephen Burke (UK), Joel Closier (LHCb), Jeremy Coles (GridPP), Mario David (LIP), Daniel Kouril (Pakiti expert), Joao Pina (EGI Staged Rollout manager).
  • Apologies: Massimo Sgaravatto (Legnaro).

Minutes of previous meeting

The minutes of the last (5th) meeting HERE were approved.

Summary

  • The experience gained from the Readiness verification of DPM, the CREAM CE and BDII at different Volunteer sites was well documented. It paves the way for testing the next DPM version and more products from our shortlist, starting with dCache, StoRM, xrootd and FTS3. This will require involving more Volunteer sites and completing the documentation of the relevant experiment workflows.
  • A database (DBoD for now) was designed by the MW Officer Andrea M. and the Package Reporter developer Lionel to store the verification results.
  • ATLAS asked the MW Readiness WG to be involved in the HTCondor testing for various CE types. CMS take care of such testing themselves.
  • LHCb will test the VOMS client on behalf of the MW Readiness WG.
  • The Tier0 will participate in the MW Readiness effort by testing EOS and FTS3.
  • The developer of the MW Package Reporter presented its design, the number of hosts and sites it now runs on, and the alternatives being examined for interoperability with Pakiti. Pakiti expert Daniel Kouril was also connected to the meeting.
  • The next meeting will take place on November 19th at 4pm CET.

MW Officer report

Slides on the agenda. Related discussion:
  • ATLAS populate atlas.cern.ch CVMFS area from tar balls. They will not use grid.cern.ch, as it would be tricky to integrate with the ATLAS SW.
  • LHCb do use grid.cern.ch. They commit to participate in the MW Readiness effort by testing the VOMS client.
  • The DPM pilot for the MW Readiness verification went well and its procedures can now be adapted for dCache and StoRM. The monitoring links for tracking the DPM verification worked OK.
  • The workflow for CREAM verification is not yet complete.
  • Only Edinburgh, Legnaro and GRIF have been active so far, but other Volunteer sites will naturally be involved for other products in the list.
  • The successful verification of a new version may have implications for the baseline of the given product.

WLCG Package Reporter

Slides on the agenda. Related discussion:
  • Lionel prefers option A2 for joining the Package Reporter to Pakiti.
  • Daniel has no preference yet; all 4 options will be evaluated, A1 first.
  • Different options have different implications on the long-term support by WLCG and the Pakiti team.
  • When the 2 projects are joined, each will depend on the other.
  • We want to avoid another support saga like we are having for Argus.
  • Different options may also have different scalability concerns.
  • While each worker node will get the MW via CVMFS, it still needs to report its rpms to Pakiti.
  • The rpm reporting frequency is daily.
  • CERN grid services want to have a single solution that is acceptable to the security team.
  • We need to continue with deployment of the current Package Reporter to gain operational experience, even if at some point we may have to replace it with a new version for the joint goals.
  • Lionel, Andrea and the Pakiti team will soon decide the chosen option.
  • The Package Reporter will allow us to see what is installed where, compare versions with the baseline, compare them with what has been verified by a given experiment, and produce various reports.
  • We will also be able to match operational issues with certain versions that are found at affected sites.
  • The visualization is being worked on.
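The baseline comparison mentioned above could look roughly like the following sketch. It is illustrative only and is not the Package Reporter's actual code: the function names, host names, package names and version numbers are all hypothetical, and versions are compared as simple tuples of integers, which is much cruder than real rpm version ordering.

```python
import re


def version_key(version):
    """Turn a version string such as '1.8.9' into a tuple of integers.

    Simplification: real rpm version comparison (epochs, releases,
    alphabetic parts) is more involved than this.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def compare_to_baseline(reports, baseline):
    """List (host, package, installed, required) for packages below baseline.

    reports  maps host name -> {package name: installed version}
    baseline maps package name -> minimum required version
    Packages without a baseline entry are ignored.
    """
    outdated = []
    for host, packages in sorted(reports.items()):
        for name, installed in sorted(packages.items()):
            required = baseline.get(name)
            if required is None:
                continue
            if version_key(installed) < version_key(required):
                outdated.append((host, name, installed, required))
    return outdated


# Hypothetical baseline and daily reports, invented for this example.
baseline = {"example-storage": "1.8.8", "example-infosys": "5.2.22"}
reports = {
    "se01.example.org": {"example-storage": "1.8.9", "example-infosys": "5.2.21"},
    "se02.example.org": {"example-storage": "1.8.7"},
}

for row in compare_to_baseline(reports, baseline):
    print(row)
```

The same structure could be reused to compare installed versions against those verified by a given experiment, by passing the verified versions in place of the baseline.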

Sites' feedback

Only a few sites were involved so far. They just install the MW as usual, possibly with some manual tweaking because of the special status of the affected services.

Discussion on HTCondor verification

  • ATLAS do not want to track releases of the different CE types: that should be done by the MW Officer
  • A small testbed is needed to allow for continuous testing of all the components involved: HTCondor-G (pilot factory), CREAM, ARC, HTCondor CE
  • An ATLAS expert will run the test pilot factory and upgrade HTCondor-G when a new release is announced e.g. by the MW Officer
  • At least one friendly ATLAS site is needed per CE type and it should upgrade its test CE when a new release is announced e.g. by the MW Officer
  • CMS have not expressed interest in a similar setup for them. Their position: the testing that OSG, and in particular the glideinWMS developers, do is enough. CMS experts discuss with them which version should be deployed on the pilot factories, etc. From this point of view, HTCondor is seen as an experiment service, because of course its usage as a batch system is not tested by CMS or the glideinWMS team. So, the feeling is that the current interaction between CMS and the HTCondor team is good enough and there is no strong motivation to set up any new testing system as we do for the MW Readiness verification of other services.

Actions

  • 20141001-01: Andrea M. to enroll in the condor-announce mailing list (htcondor-world@csNOSPAMPLEASE.wisc.edu) and to inform Napoli, which tests the CREAM CE, to install HTCondor and test the two together. NEW!
  • 20140702-06 Andrea M & Lionel Discuss the visualization of testing results. On-going
  • 20140702-05 Volunteer Sites Install the WLCG MW Package Reporter and report on the clarity of the instructions. Done. Feedback was given and reflected in the code.
  • 20140702-04 Andrea M. Present the status of DPM Readiness verification exercise at the 20140724 WLCG Ops Coord Meeting. Done. Report HERE.
  • 20140702-03 David C. (ATLAS) Clarify the ATLAS position on the CVMFS use and the exact location for clients’ candidate releases. Done. Documented HERE.
  • 20140702-02 Joel (LHCb) Discuss in the LHCb collaboration and document in their workflow page, linked from the WG twiki, if and which sites will participate in the DPM Readiness verification exercise. Done. LHCb will only be involved with VOMS client verification for now.
  • 20140702-01 Andrea S. (CMS) & David C. (ATLAS) Decide internally if USATLAS or USCMS can take ownership of HTCondor new versions’ validation, via test instances of pilot factories, also validating against CREAM and ARC CEs. Done. See the HTCondor row in the Product Table.

 

Topic revision: r5 - 2014-10-15 - MariaDimou
 