Status and Purpose

Minutes of the phone meeting on 20/02/2014: https://indico.cern.ch/event/303632/

This page summarizes the ideas for distributing configuration and keys for Stratum 0 / Stratum 1 installations beyond WLCG ("small VOs"). This becomes an issue as more and more Stratum 0 and Stratum 1 servers are hosted neither at CERN nor at Fermilab but still want to use grid resources. These additional servers are likely to develop in a mesh fashion rather than a strictly hierarchical fashion.

With the exception of the public cern.ch keys, keys and configuration are currently not shipped with the CernVM-FS client but have to be distributed through alternative channels. It might be beneficial, however, to work on a better integration.

Status of Non-WLCG Stratum 0 installations

  • CERN Stratum 0 only: atlas-nightlies.cern.ch
  • EPFL: bpp.epfl.ch (?)
  • DESY: ilc.desy.de (others?)
  • OSG: oasis.opensciencegrid.org (others?)
  • EGI (located at NIKHEF): vlemed.amc.nl
  • EGI (located at RAL): {mice, na62, hone, phys-ibergrid, wenmr, hyperk, cernatschool, biomed, glast, t2k}.gridpp.ac.uk
  • EGI - others?

CernVM-FS Components and Roles

Fully qualified repository name (FQRN)

A valid DNS name that uniquely identifies a repository. The FQRN, however, does not identify any network host; the domain part merely gives a hint as to where the repository is maintained.
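
For illustration, the FQRN also determines which files a client looks for under /etc/cvmfs (standard client layout assumed; exact file names, in particular for keys, can vary; the repository ilc.desy.de is simply taken from the list above):

    /etc/cvmfs/config.d/ilc.desy.de.conf   # repository-specific client configuration
    /etc/cvmfs/domain.d/desy.de.conf       # configuration shared by all repositories of a domain
    /etc/cvmfs/keys/desy.de.pub            # public master key(s) used to verify the whitelist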

Repository Installation Box

  • Single machine with cvmfs_server tools installed
  • ssh access given to repository maintainers
  • Holds the repository key and certificate that are used to sign the repository manifest (".cvmfspublished")
  • May have its own HTTP server or make its changes directly available to the Stratum 0 through a shared filesystem

Stratum 0

  • Authoritative web server used for distribution
  • Usually not queried by clients (exception: nightly builds)
  • Source of replication for Stratum 1

Stratum 1

  • HTTP mirror server
  • Serving purely static content (currently)
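
To illustrate the "purely static content": a Stratum 1 essentially mirrors a flat directory of signed and content-addressed files produced by the Stratum 0. A rough sketch of the layout under a repository's web root (file names taken from the current cvmfs layout; details may differ between versions):

    /cvmfs/ilc.desy.de/.cvmfspublished   # signed repository manifest
    /cvmfs/ilc.desy.de/.cvmfswhitelist   # whitelist, signed by the master key
    /cvmfs/ilc.desy.de/data/00/...       # content-addressed file data and catalogs
    /cvmfs/ilc.desy.de/data/ff/...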

Repository signing service

  • The whitelist (.cvmfswhitelist) contains the certificates that are allowed to sign the manifest
  • The whitelist is signed by a master key, whose public part has to be available on the clients and Stratum 1s
  • CERN's public master keys currently come with the sources
  • The .cvmfswhitelist file is resigned every month (adjustable) and can be updated independently of actual repository updates
  • Often hosted on the same box as the Stratum 0
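
To make the chain of trust explicit, the following Python sketch spells out the checks a client (or a Stratum 1 before replication) performs. It is a conceptual sketch, not the actual cvmfs code: the cryptographic primitives are stubbed out and the whitelist fields are simplified.

    import datetime

    def verify_signature(data, signature, key_or_certificate):
        """Placeholder for an RSA signature check with a real crypto library."""
        raise NotImplementedError

    def certificate_fingerprint(certificate):
        """Placeholder for the certificate fingerprint (e.g. a SHA-1 digest)."""
        raise NotImplementedError

    def repository_trusted(fqrn, master_pubkey, whitelist, whitelist_signature,
                           certificate, manifest, manifest_signature):
        # 1. The whitelist (.cvmfswhitelist) must carry a valid signature by the
        #    master key that the client already holds locally.
        if not verify_signature(whitelist["raw"], whitelist_signature, master_pubkey):
            return False
        # 2. The whitelist must not be expired (hence the periodic resigning)
        #    and must name this repository.
        if datetime.date.today() > whitelist["expires"] or whitelist["fqrn"] != fqrn:
            return False
        # 3. The certificate that signed the manifest must be listed in the whitelist.
        if certificate_fingerprint(certificate) not in whitelist["fingerprints"]:
            return False
        # 4. The manifest (.cvmfspublished) must carry a valid signature
        #    by that certificate.
        return verify_signature(manifest, manifest_signature, certificate)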

Note: Dave and Jakob published a paper at CHEP that sketches a PKI-based variation of the whitelist signatures. The whitelist would be signed by one or several certificates (instead of a key). All certificates are fully verified (including CRLs). While this allows for key revocation, I think it would not necessarily change the distribution problem: instead of a public master key, the DNs of the certificates that may sign the whitelist would need to be distributed.

Problems and questions

  • Problem: Configuration and key changes (content of /etc/cvmfs/...) are difficult to disseminate throughout the grid in a timely fashion
  • Question: can we use the same distribution methods that are used for grid CRLs and critical security updates? CRL distribution is unsuitable for cvmfs key distribution; a yum repository would work only if it is dedicated and set to auto-update.

  • Problem: the manual resigning of the whitelists is a single point of failure
  • Question: how well does distribution and resigning work with grid certificates in comparison to cvmfs keys?

  • Problem: the repository name gives no handle for determining its Stratum 0 / Stratum 1 servers.
  • Question: do we need a registry to solve the bootstrap problem? Desirable.

  • Problem: the configuration of site Squids is mostly local
  • Question: how far are we away from implementation of the recommendations of the Squid auto discovery task force? Dave: envisaged end of 2014, depends on squid monitoring task force and cvmfs implementation. Jakob: Implementation in cvmfs is ready except for round-robin DNS.

  • Problem: the order of Stratum 1 servers has to be selected manually by clients

  • Problem: some repositories might have non-standard configuration, e.g. non-shared cache, environment variables, ...
  • Question: shall we provide configuration hints in the repository that clients can apply?

Ideas

Distribution of the keys

  • Distribution of keys and config through RPMs
  • Pro: easy
  • Contra: Can we ensure up-to-date RPMs on grid sites?

  • Distribution of keys and configs using a special cvmfs repository
  • Pro: easy, with some cvmfs client additions (see the sketch after this list)
  • Contra: do we create a short circuit on clients? Can we live with the inter-repository dependency?

  • A few sites (e.g. CERN, OSG Global Operations Center) provide a key signing service for others
  • Pro: no key distribution problem; keys can be distributed with the sources as is done now
  • Contra: all small VOs need to run through the key-signing sites, key-signing sites need to trust small VOs.

  • A few sites provide the cvmfs repositories to all of the grid; installation boxes can be elsewhere but the domain name is fixed to those few sites.
  • Pro: easy, client already supports
  • Pro: establishes chain of responsibility for every repository
  • Pro: this is the current OSG plan
  • Contra: similar to key signing service, a bit more work to do by the few sites
  • Contra: can lead to misleading repository names
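
As a sketch of the "special cvmfs repository" idea above: a small client-side helper could periodically copy keys and configuration fragments from a mounted configuration repository into /etc/cvmfs. Everything below is hypothetical (such a mechanism does not exist in the client today); the repository name cvmfs-config.example.org and its layout are assumptions for illustration only.

    import os
    import shutil

    # Hypothetical configuration repository; name and layout are assumptions.
    CONFIG_REPO_ROOT = "/cvmfs/cvmfs-config.example.org"

    def sync_keys_and_configs(dry_run=True):
        """Copy key and configuration files from the config repository into /etc/cvmfs."""
        mapping = {
            os.path.join(CONFIG_REPO_ROOT, "keys"): "/etc/cvmfs/keys",
            os.path.join(CONFIG_REPO_ROOT, "domain.d"): "/etc/cvmfs/domain.d",
            os.path.join(CONFIG_REPO_ROOT, "config.d"): "/etc/cvmfs/config.d",
        }
        for src_dir, dst_dir in mapping.items():
            if not os.path.isdir(src_dir):
                continue
            for name in sorted(os.listdir(src_dir)):
                src = os.path.join(src_dir, name)
                dst = os.path.join(dst_dir, name)
                if dry_run:
                    print("would copy %s -> %s" % (src, dst))
                else:
                    shutil.copy2(src, dst)

    if __name__ == "__main__":
        sync_keys_and_configs()

The key and configuration for the configuration repository itself would still have to be bootstrapped through another channel (e.g. shipped with the client), which is exactly the inter-repository dependency mentioned above.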

Discovery and ordering of Stratum 1s

  • DNS-based registry of repository name --> Stratum 1 mapping
  • Pro: DNS is available everywhere
  • Pro: Commercial providers available
  • Contra: Someone has to operate/pay for such an additional service
  • Contra: provides only Stratum 1 host names, nothing else; in particular no chain of responsibility. If a PKI-based infrastructure and a well-known alias within a domain (such as cvmfs-authorizer.domain.name) were used, it could be shown that the repository really does come from that domain, but there would still need to be an authority somewhere saying that the domain is acceptable.

  • Geo-IP DNS that returns optimal order of Stratum 1s ("Wikipedia approach")
  • Pro: We would benefit from DNS caching; DNS is available everywhere
  • Pro: Commercial providers available
  • Contra: Requires PowerDNS or an expensive subscription (~500-1000 dollars a year)
  • Contra: does not work if the resolver is not close to the client (e.g. Google's 8.8.8.8)

  • Stratum 1s provide a REST interface that returns optimal order based on Geo-IP
  • Pro: We already have stable web servers and proxy caches
  • Pro: knowing about a single stratum 1 server would be enough; the repository maintainers can maintain the full list of all Stratum 1s.
  • Contra: What currently serves only static content becomes a web service
  • Contra: Requires getting a (non-ordered) list of Stratum 1s to the client by another means
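
As a sketch of this REST-interface idea (today the Stratum 1 order is fixed on the client in the semicolon-separated CVMFS_SERVER_URL list): the service would map the client IP to coordinates and return the Stratum 1 list sorted by distance. The Geo-IP lookup below is stubbed with a toy table and the host names are made up; a real service would consult a Geo-IP database such as MaxMind's.

    import math

    # Toy IP -> (latitude, longitude) table; a real service would use a Geo-IP database.
    GEOIP_TABLE = {
        "192.0.2.10": (46.2, 6.1),       # example client somewhere near Geneva
        "198.51.100.20": (41.8, -88.3),  # example client somewhere near Chicago
    }

    # Hypothetical Stratum 1 servers with their coordinates.
    STRATUM1_LOCATIONS = {
        "http://s1-europe.example.org/cvmfs": (46.2, 6.1),
        "http://s1-america.example.org/cvmfs": (41.8, -88.3),
        "http://s1-asia.example.org/cvmfs": (35.7, 139.7),
    }

    def great_circle_km(a, b):
        """Haversine distance between two (lat, lon) points in kilometres."""
        lat1, lon1, lat2, lon2 = map(math.radians, a + b)
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371 * math.asin(math.sqrt(h))

    def order_stratum1(client_ip):
        """Return the Stratum 1 URLs ordered by distance to the client."""
        client_pos = GEOIP_TABLE.get(client_ip)
        if client_pos is None:
            return sorted(STRATUM1_LOCATIONS)  # unknown client: fall back to a fixed order
        return sorted(STRATUM1_LOCATIONS,
                      key=lambda url: great_circle_km(client_pos, STRATUM1_LOCATIONS[url]))

    if __name__ == "__main__":
        print(order_stratum1("192.0.2.10"))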

Discovery of Squid proxies

This is essentially covered by the WLCG HTTP Proxy Discovery task force. It might be an advantage to also integrate dynamically provisioned Squid proxies, as done by Shoal.

-- JakobBlomer - 06 Feb 2014
