January 2017 GDB notes



Introduction (Ian Collier)

Slides: https://indico.cern.ch/event/578983/contributions/2463643/attachments/1409303/2154949/GDB-Introduction-2017-02-08.pdf

AFS phase out (Jan Iven)

Slides: https://indico.cern.ch/event/578983/contributions/2463101/attachments/1408377/2154415/NOAFS_GDB_20170208.pdf

  • Background to the AFS Disconnection test - 2017-02-15 09:00 CET lasting 24h
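Ahead of a disconnection test like this, sites typically want to find configuration files that hard-code AFS paths, since those will break while /afs is unreachable. A minimal sketch of such a pre-check (the directory and file contents below are a self-contained demo, not CERN's actual layout):

```shell
# Hypothetical pre-check before the AFS disconnection test:
# scan a directory of config files for hard-coded /afs paths.
tmp=$(mktemp -d)
printf 'PATH=/afs/cern.ch/user/j/jdoe/bin:$PATH\n' > "$tmp/profile"  # depends on AFS
printf 'PATH=/usr/local/bin:/usr/bin\n'            > "$tmp/clean"    # does not

# Files that would break while /afs is disconnected:
afs_users=$(grep -rls '/afs/' "$tmp")
echo "AFS-dependent files: $afs_users"
rm -rf "$tmp"
```

In practice one would also check crontabs and running processes (e.g. with lsof) for open files under /afs.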

Benchmarking F2F Report & Discussion (Andreas Petzold, Domenico Giordano, Manfred Alef)

Slides: https://indico.cern.ch/event/578983/contributions/2463103/attachments/1409363/2154992/pre-GDB_benchmarking_summary.pdf

  • CMS stated that they will get involved and provide their findings next time.
  • It seems clear that DB12 provides better information about the job running on the node. We need to look into how publishing is affected.
  • It was suggested in discussion - particularly by LHCb and ALICE, based on the way DB12 scales for their applications - that a proposal go to the MB to start the process of migrating away from HS06 for all purposes, including pledges.
  • Will have a further discussion at the April GDB with the aim of making a proposal to the MB.
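To illustrate the kind of fast, per-core benchmark being discussed, here is a toy stand-in in the spirit of DB12 (timing a fixed floating-point workload and reporting a throughput score). This is not the real DB12 implementation; the workload and normalisation are arbitrary choices for the sketch:

```python
import random
import time

def fast_bench(iterations=1_000_000):
    """Time a fixed floating-point workload and return a throughput score.

    A toy stand-in for a DB12-style per-core benchmark; the workload and
    the units of the score are illustrative, not the real DB12 ones.
    """
    random.seed(42)  # fixed seed so the workload is reproducible
    start = time.process_time()
    total = 0.0
    for _ in range(iterations):
        total += random.normalvariate(10, 1)
    elapsed = time.process_time() - start
    return iterations / elapsed / 1e6  # score in millions of iterations per CPU second

score = fast_bench(100_000)
print(f"toy benchmark score: {score:.2f}")
```

A benchmark of this style runs in seconds inside the job itself, which is what makes per-job publication (the point discussed above) feasible, unlike a full HS06 run.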

EOS Workshop Report (Andreas Joachim Peters)

Slides: https://indico.cern.ch/event/578983/contributions/2463106/attachments/1409434/2155132/EOS_Workshop_Summary_GDB.pdf

The 1st EOS Workshop (2-3 February 2017) was well attended. The slides presented at the GDB summarize the workshop in around 140 slides. The initiative hopes to continue and to extend the EOS team's collaboration with sites, users and developers. It is a useful forum for getting feedback, discussing new developments, gathering new requests from the community, and defining and maintaining joint projects.


(Q) Ian C.: Does CERN intend to provide EOS support to the community as a whole? (A) Dirk H.: From the development point of view, CERN already provides support to any community using EOS, but direct support cannot be guaranteed. CERN is always interested in people providing feedback, contributing changes and raising new requests, but the current team is small. Since EOS is an open-source project, contributions are welcome as well. That said, feature requests and support questions are currently minimal: people are confident using EOS, since CERN is using it.

CS3 Synchronising Data Workshop report (Luca Mascetti)

Slides: https://indico.cern.ch/event/578983/contributions/2463108/attachments/1409446/2155288/CS3_Summary.pdf

The workshop took place at SURFsara, Amsterdam, 30 January - 1 February 2017. It was the 3rd workshop of the series and a well-attended event (130 participants, from sites and some companies). Several sessions were scheduled (none in parallel), including a Dropbox keynote talk. Talks centred on application and user needs (SWAN, Earth Observation interactive analysis, a data management system for heart valve diseases). The key message: scaling and performance optimization. A CS3 satisfaction survey was circulated afterwards; the feedback collected so far is positive.


(Q) Ian C.: Many features are common across communities - would convergence make sense for the different tools/communities involved? (A) Luca M.: ownCloud and others already did that exercise in the past. This was discussed in the workshop, in the projects and collaboration session.

CERN Tape Archive (Steven Murray)

Slides: https://indico.cern.ch/event/578983/contributions/2463111/attachments/1409449/2155155/GDB_meeting_08_Feb_2017.pdf

This is a new tape backend project at CERN, attached to EOS and intended to replace CASTOR. CTA (CERN Tape Archive) is a natural evolution of CASTOR: a tape backend for EOS, a tape drive scheduler, and a clean separation between disk and tape. A single CTA instance will serve all of the VOs. When? The target is to be ready for the LHC experiments by the end of 2018. CTA will be flexible and could be used anywhere EOS is used. An Oracle database will be used, but CTA could be adapted to run on other databases.


(Q) Pepe F.: Why not use or join other initiatives in our community, such as Enstore, adapting them to EOS? (A) German C. / S. Murray: CASTOR's built-in tape know-how will be re-used; the tape backend is not going to be rewritten from scratch.

(Q) Pepe F.: What support will be provided to Tier-1s that adopt CTA? (A) S. Murray: this is not in the plan, but the project can become collaborative if other sites are interested in using it.

(Q) Peter G.: Will CASTOR be totally discontinued? Other sites are still using it, in particular the UK Tier-1. (A) S. Murray: support will remain available for the moment, with some flexibility in supporting it for the near future.

(comment #1) Ian C.: it would be really interesting if CTA could offer possibilities to other communities, as a sustainable product, including features they are missing.

(comment #2) S. Murray: CASTOR metadata/namespace operations are slow, unlike EOS.

Baseline for WLCG Stratum 1 Operations (Jakob Blomer)

CernVM-FS usage has expanded to other communities that have adopted it (OSG, EGI). This talk was intended to open a discussion on a baseline/guidelines for WLCG. The presentation details how the infrastructure is set up for WLCG, the most typical VO requirements, the handling of operational troubles, and a proposed baseline/guideline for WLCG Stratum 1 deployment.


(Request) Could a table of Stratum 1 URLs be kept in WLCG? Maarten L.: this will be sorted out.

(comment #1) Maarten L.: We might need more documentation - in the CVMFS area and then in the WLCG area - about configurations, etc.

(comment #2) It is expected that default configurations will be disabled.

(comment #3) CVMFS sometimes serves corrupted data or is slow, and this hurts. For example, during the DoS on the ASGC Stratum 1, port 80 was closed and the service degraded.

(comment #4) Mattias W.: We could provide an RPM for deployment, with the actual configuration coming from the site. We hit performance thresholds by not using more than one source of data. It is the client that makes the contact: the client reaches a Stratum 1 through a squid. Multiple routes, as in CDNs, could be useful. J. Blomer: agreed, though commercial CDNs are much more flexible.

(comment #5) There is room for further development work on selecting the Stratum 1 from the client side. Maybe not through squids, or via an HTTP proxy? If ASGC is slow, the client could switch and query another Stratum 1. This could solve the problem.

(Q) Mattias W.: Do we need more Stratum 1s, or fewer but more reliable ones? (A) Ian C.: we don't need more, but it is good to have one on each side of the world. The infrastructure as it stands is vulnerable if one Stratum 1 is unreliable. We should aim not to increase the number of Stratum 1s much, but rather to improve their reliability.
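For context on the failover behaviour discussed above: a CVMFS client already accepts a list of Stratum 1 URLs and fails over between them. A sketch of such a client-side domain configuration, where the hostnames and the proxy are illustrative examples only (@fqrn@ is the standard CVMFS placeholder for the repository name):

```
# /etc/cvmfs/domain.d/cern.ch.local  (illustrative example)
CVMFS_SERVER_URL="http://s1-a.example.org/cvmfs/@fqrn@;http://s1-b.example.org/cvmfs/@fqrn@"
CVMFS_HTTP_PROXY="http://squid.example.org:3128;DIRECT"
```

Semicolon-separated entries are alternatives the client can switch between, which is the mechanism the "query another Stratum 1" comments rely on.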

-- JosepFlix - 2017-01-12

Topic revision: r2 - 2017-02-28 - IanCollier