WLCG MW Readiness WG 7th meeting Minutes - November 19th 2014

Agenda

Attendance

  • Local: Vincent Brillault (CERN Computer Security Group, Pakiti expert), David Cameron (ATLAS), Lionel Cons (Monitoring expert, developer), Maria Dimou (chair & notes), Maarten Litmaath (ALICE & notes), Andrea Manzi (MW Officer, DPM expert), Alberto Rodriguez Peon (T0), Andrea Sciabà (CMS & WLCG Ops Coord), Markus Schulz (CERN-IT/SDC leader).
  • Remote: Jeremy Coles (GridPP), Catherine Biscarat (IN2P3), Sven Gabriel (EGI Security leader).
  • Apologies: Joel Closier (LHCb)

Minutes of previous meeting

The minutes of the last (6th) meeting HERE were approved with one correction: Action 20141001-01 should not mention HTCondor. The new text is at the end of this page.

Summary

  • The discovery of the DPM 1.8.9 bug via the MW Readiness verification process proved that this effort is needed, useful and working as it should.
  • When a MW package version has been proven to work via the workflow of a given experiment and a newer version is out for verification, experiments that started later should go directly to the most recent version available.
  • The validation of a new version will (need to) have a deadline in practice, beyond which the affected MW may (need to) get deployed anyway e.g. to fix issues experienced by sites.
  • FTS3 is listed as a desired product to verify for Readiness for ATLAS and CMS, but it already has a well-established validation process in close partnership with the experiments and the few sites that need to run the service for WLCG. In fact, the MW Readiness paradigm was inspired by the experience gained with the FTS pilot service model over many years. The FTS3 team should instead be asked to publish their validation exercises in the MW Readiness realm as well.
  • The validation of new Condor-G versions for ATLAS requires one pilot factory for testing. The MW Officer will follow up with ATLAS.
  • New versions of Xrootd will see significant coverage by the validation of new versions of EOS, but standalone Xrootd instances may require further testing. For now only OSG sites are concerned and they are covered by validation processes in OSG, US-ATLAS and US-CMS.
  • As CMS jobs take a few common grid clients from the WN (not CVMFS), it makes sense to have new versions of the WN validated for CMS.
  • Following technical discussions between the developers of the MW Package Reporter and Pakiti and those responsible for WLCG and EGI Security, a commonly agreed technical solution was adopted by which each site will be given the option to enable Pakiti only, the Package Reporter only, or both. This addresses the security concerns and respects site independence. A release along these lines is expected during the first quarter of 2015.
  • Our JIRA tracker (alternative Dashboard view) contains up-to-date progress information, on all fronts, at all times.
  • The Experiment Workflows linked from our twiki show the Products, Volunteer Sites and applications used for the Readiness verification.

MW Officer report

Andrea M. gave a presentation of the activities carried out since our Oct 1st meeting. This is the list of Products verified for Readiness so far:
  • DPM, pioneer product & Proof of Concept, i.e. bug discovered thanks to the Readiness verification process.
  • CREAM CE, helped enhance our Guidelines
  • dCache, v.2.6.35 ok. Now tackling v.2.11.0
  • StoRM, completing the set-up for ATLAS
  • EOS, completing the set-up for CMS
  • VOMS client, the LHCb participation
  • FTS3, de facto done in production, as it runs at only a few sites and bugs are caught early and fixed fast.
  • Coming up: HTCondor, xrootd. Check here our full list of candidate products.

Here is a summary of the Volunteer Sites active so far (or intending to act a.s.a.p.):

  • Edinburgh, verifies DPM for ATLAS, runs the Package reporter.
  • Napoli, verifies CREAM CE for ATLAS, runs the Package reporter.
  • Triumf & NDGF, verify dCache for ATLAS, do not run the Package reporter.
  • QMUL & CNAF, verify StoRM for ATLAS, do not run the Package reporter.
  • GRIF, verifies DPM for CMS, runs the Package reporter.
  • Legnaro, verifies CREAM CE for CMS, runs the Package reporter.
  • PIC, verifies dCache for CMS, does not run the Package reporter.
  • CERN, verifies FTS3 for ATLAS & CMS and EOS for CMS, does not run the Package reporter.
  • Check here the whole Volunteer Sites list.

WLCG Package Reporter

None of the alternative technical solutions presented at our last meeting on 2014/10/01 and minuted here will be adopted. Technical discussions between the developers of the MW Package Reporter and Pakiti, as well as those responsible for WLCG and EGI Security, led to a commonly agreed technical solution by which each site will be given the option to enable Pakiti only, the Package Reporter only, or both. This addresses the security concerns and respects site independence. A release along these lines is expected during the first quarter of 2015.
  • The integration option with Pakiti has been chosen: code will be shared but data will not be shared
  • Sites will have full control over where their package information goes: EGI security or MW readiness or both (or none)
  • The package reporter will be made more generic and will be submitted to EPEL
  • Work on the package database has progressed: a proof-of-concept REST API is being tested
  • Work on the reporting will follow next
  • The rollout plan may to some extent depend on Pakiti development timelines.
  • The reporter client should be in EPEL in ~Jan.
  • The current version is still good for scale tests of the DB, hence we will try to get it deployed on the WN of volunteer sites.
  • Question by Sven Gabriel, EGI Security officer: do sites know what is logged and who has access?
    • At this time only the few volunteer sites are concerned and there is documentation.
    • When we have the combined client ready for large-scale rollout, we will make that information very clear.
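
The opt-in model described in this section can be pictured with a minimal sketch like the one below. It is not the Package Reporter or Pakiti client code: the option names ("pakiti", "reporter", "both", "none"), the recipient labels and both function names are hypothetical, chosen only to illustrate that the client code is shared while each recipient gets its own copy of the data.

```python
# Illustrative sketch only: not the actual Package Reporter or Pakiti client.
# The option names and recipient labels below are hypothetical.

# Each site picks one option; the option decides who receives its package list.
SITE_OPTIONS = {
    "pakiti": {"egi-security"},
    "reporter": {"mw-readiness"},
    "both": {"egi-security", "mw-readiness"},
    "none": set(),
}


def destinations_for(option):
    """Return the set of recipients for a site's chosen option."""
    try:
        return set(SITE_OPTIONS[option])
    except KeyError:
        raise ValueError(f"unknown option: {option!r}") from None


def dispatch(packages, option):
    """Map each recipient to its own copy of the package list.

    The same client code serves every recipient, but each one gets a
    separate copy, so no data is shared between the two services.
    """
    return {dest: list(packages) for dest in destinations_for(option)}
```

With this sketch, a site that chooses "none" sends nothing, and a site that chooses "both" sends one independent copy to each recipient.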

Sites' feedback

  • CERN:
    • FTS3 instances are being optimized.
    • An EOS test endpoint is being set up for CMS.

Actions

Action items completed in past meetings can be found HERE.

  • 20141119-04: Andrea M. to discuss with Legnaro their current installation of the MW Package Reporter. New(?) given it is done as per JIRA:14?
  • 20141119-03: Andrea M. to contact the GRIF site to proceed with WN testing via the CMS workflow. NEW!
  • 20141119-02: ATLAS, CMS, LHCb to contact Andrea M. if interested in the Data Management Client (DMC) libs available in the grid.cern.ch CVMFS area. NEW!
  • 20141119-01: Maria D. to update the Guidelines. Done
  • 20141001-02: Lionel, Andrea M. and the Pakiti team to decide the MW Reporter-Pakiti join option. Done
  • 20141001-01: Andrea M. to enroll in the condor-announce mailing list (htcondor-world@cs.wisc.edu) and ask Napoli to set up a test CREAM CE (that should get tested by HTCondor at some point). Done
  • 20140702-06: Andrea M. & Lionel to discuss the visualization of testing results. On-going

AOB

Topic revision: r20 - 2018-02-28 - MaartenLitmaath
 