WLCG MW Readiness WG 10th meeting Minutes - May 6th 2015
Agenda
Attendance
- Local: Alberto Aimar (CERN IT/SDC mgnt), David Cameron (ATLAS), Lionel Cons (Monitoring expert, developer), Maria Dimou (chair & notes), Maarten Litmaath (ALICE & notes), Andrea Manzi (MW Officer), Andrea Sciaba (CMS), Alberto Rodriguez Peon (T0), Vincent Brillault (security).
- Remote: Alessandra Doria (Napoli), Steve Jones (Liverpool), Raul Lopes (Brunel Univ.), Gen Kawamura (GoeGrid), Jeremy Coles (GridPP), Matt Doidge (Lancaster), Pepe Flix (PIC), Sam Skipsey (Glasgow), Vincenzo Spinoso (EGI Ops Officer).
- Apologies: Sven Gabriel (EGI Security)
Minutes of previous meeting
The minutes of the last (9th) meeting HERE were approved.
Summary
- After one year of full Readiness Verification activity, the WG is making a check-point of goals and priorities.
- ATLAS and CMS were invited to review their workflow twikis for possible changes in the MW products to verify.
- LHCb and ALICE are invited to declare if and for which products they plan to contribute to the MW Readiness WG.
- Possible stress tests were discussed for products already verified from within experiment workflows. The decision was to leave this to the MW Officer, the experiment involved and the site on a case-by-case basis, as per the original definition in the WG documentation, point 2.5.
- EOS test for CMS at CERN is ready to start.
- ARGUS testbed at CERN is set up and ready to start.
- NDGF, PIC, CNAF and Triumf are reminded to install the Pakiti client (see Instructions).
- The presentation on software being developed for our activities intends to show the product versions under test and their matching to the relevant rpms and, in addition, an automated way to display Baseline versions instead of the current, manually updated table.
- The next Vidyo MW Readiness WG meeting will take place on Wednesday June 17th at 4pm CEST.
MW Officer report
Please follow Andrea M's slides here for full information.
- ALICE Xrootd 4 validation
- document at least which sites are running it in production
- remind ALICE of MW Readiness goals, infrastructure etc.
- new DM protocols are interesting for ATLAS
- cf. HTTP deployment task force
- typically tested via HammerCloud
- SAM: new tests will only come at the end, when a new protocol or MW has been established and deemed important
- stress tests
- by default only functionality is tested
- stress tests may occasionally be done depending on the product, use case, experiment
- MW Readiness setup (e.g. VMs) at sites may not support stress tests
- CMS: testing EOS at CERN currently hampered by a permission or mapping problem
- Argus
- test instance has been created, but shared gridmapdir currently missing
- needs dedicated NFS volume
- more effort should be available in June
- Pakiti client: PIC was busy with dCache upgrades, will install it soon
- Xrootd 4
- ALICE: see above
- ATLAS: not urgent
- CMS: depends on AAA plans
- PIC: still using Xrootd 3, also in test setup
- Brunel: tested DPM + Xrootd 4 + IPv6 OK
WLCG MW Software Status
Please follow Lionel's slides here for full information.
- Pakiti access is normally given per site to the admins named in the GOCDB plus the relevant NGI
- the same approach may need to be implemented for the MW Readiness Pakiti
- SSB is only relevant for sites sending Pakiti data
- the current SSB view shows mock data; it is to be integrated with the rest of the MW Readiness infrastructure
- the SSB view must not mix the data from MW Readiness test hosts with data from production hosts at the site
- the test hosts should send Pakiti data with a well-defined tag
- not "test" but rather "MW-Readiness" or so to avoid confusion
- the test and production hosts will then appear in separate views
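
As an illustration of the tagging approach above, here is a minimal sketch of how reports could be split into separate test and production views. The record layout (`site`, `host`, `tag`) and the function name `split_views` are assumptions made for this example and do not reflect the actual Pakiti or SSB data model; the tag value is the one suggested in the discussion and may still change.

```python
from collections import defaultdict

# Tag value suggested in the discussion ("MW-Readiness" or similar); not final.
MW_READINESS_TAG = "MW-Readiness"


def split_views(reports, test_tag=MW_READINESS_TAG):
    """Group host reports per site into separate 'test' and 'production' views."""
    views = {"test": defaultdict(list), "production": defaultdict(list)}
    for report in reports:
        # Only hosts carrying the well-defined tag end up in the test view.
        view = "test" if report.get("tag") == test_tag else "production"
        views[view][report["site"]].append(report["host"])
    return {name: dict(sites) for name, sites in views.items()}


# Made-up example records, not real Pakiti output.
reports = [
    {"site": "SITE-A", "host": "fts-test01", "tag": "MW-Readiness"},
    {"site": "SITE-A", "host": "wn042", "tag": "production"},
    {"site": "SITE-B", "host": "se-test01", "tag": "MW-Readiness"},
]
views = split_views(reports)
```

Any host that lacks the tag is treated as production here, so a misconfigured test host would show up in the production view rather than be dropped silently.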
Sites' feedback
- PIC
- new dCache 2.12.8
- Xrootd 3 + local redirector for CMS
- 10 TB
- working with CMS for PhEDEx config etc.
- on upgrades the dCache activity history is deleted, while the file data is kept: is that OK?
- CERN
- Pakiti client deployed on FTS hosts
- Argus progress
- Napoli
- got hit by auto update of Torque from EPEL
- like a number of other sites in EGI/WLCG
- Torque now taken from a repo with precedence over EPEL
- GridPP
- Edinburgh: MW Readiness effort kept alive with remote help from Glasgow
- QMUL: Daniel Traynor has taken over from Chris Walker
- Brunel: OK for ARC
Actions
Action items Done from past meetings can be found HERE.
- 20150506-04: CNAF to participate in the StoRM Readiness verification NEW
- 20150506-03: NDGF, Triumf, CNAF, PIC to install the Pakiti client NEW
- 20150506-02: Joel and Stefan to state if and how they wish to participate in the MW Readiness verification effort. NEW
- 20150506-01: Maarten to check with ALICE which sites use which Xrootd version and if they wish to participate in the MW Readiness verification effort. NEW
- 20150318-05: Pepe to proceed with the MW Readiness set-up at PIC In Progress
- 20150318-04: Raul to proceed with the ARC-CE testing at Brunel, as soon as it is released next week DONE
- 20150318-03: Alessandra to coordinate with Andrea M. the CREAM-CE testing. DONE
- 20150318-02: Ben to set up the ARGUS testbed at the T0 DONE
- 20150318-01: Manuel to communicate to EOS and FTS managers the reminder of the Pakiti client installation instructions here.
- 20141119-03: Andrea M. to contact the GRIF site to proceed with WN testing via the CMS workflow POSTPONED
- 20140702-06: Andrea M. & Lionel to discuss the visualization of testing results. On-going
AOB & Next meeting
- EGI: support of EPEL-7 is becoming important for sites
- depending on the product
- discussed at Operations Management Board meeting on Apr 30
- products released on that platform will need to be verified
- for the UMD
- in MW Readiness testing where relevant
- WLCG sites need to provide their WN capacity on an SL6-compatible OS
- readiness for RHEL-7 derivatives would make things easier for sites and allow access to more resources
- verification by the experiments will be a lot of effort and thus cannot be expected to happen any time soon
--
MariaDimou - 2015-03-17