Host Details of the CERN IT CVMFS Service

There are currently two worlds of stratum 0s:

  • The old world uses the NetApp filer.
  • The new world uses Ceph volumes.

Hostgroup cvmfs
This cluster contains the following node types:
These are typically hosts named lxcvmfsXX. They are the nodes where maintainers of software in cvmfs log in to install their releases and publish them. There is exactly one node for each repo, e.g. atlas or cms, corresponding to each of the cvmfs repositories. For all such hosts there is a DNS alias in landb, e.g. -> lxcvmfs37. cvmfs/lxcvmfs is for old world stratum zeros.
These are typically hosts named lxcvmfsXX. They are the nodes where maintainers of software in cvmfs log in to install their releases and publish them. There may be many repos on one box, e.g. atlas and cms could be on the same box. For all such hosts there is a DNS alias in landb, e.g. -> lxcvmfs55. cvmfs/lx is for new world stratum zeros.
Hosts that back up the ZFS filesystems of the stratum 0 machines.
The single backend node is the backend of the stratum one service; it pulls from the stratum 0 service.
These are multiple reverse squid proxies that serve data as the public endpoint. They pull files on demand from the backend of the stratum one service.
These are the simple stratum 0 webservers that serve data from the NetApp. Only the stratum ones at T1s connect to this service.
Hostgroup ourproxy
These are not detailed below; the proxy nodes are the normal forward proxies used by CERN batch workers and as such are not part of the central CvmFS service. They are all in the ourproxy hostgroup.

Determining if a Repository is Old World or New World.

A rough tally of progress is maintained on NetappToCephDiskServer, but to be definitive, check where e.g. cvmfs-atlas resolves to. If the NN in lxcvmfsNN is < 50 it is old world; if it is > 50 it is new world.
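The hostname rule above is easy to script. A minimal sketch (the `classify_world` helper is invented here; the cut-off of 50 and the example hostnames come from this page):

```shell
# Classify a stratum 0 host as old or new world from the numeric
# suffix of its lxcvmfsNN hostname (cut-off of 50, as described above).
classify_world() {
  n=${1#lxcvmfs}               # strip the prefix, leaving the number
  if [ "$n" -lt 50 ]; then
    echo "old world"
  else
    echo "new world"
  fi
}

classify_world lxcvmfs37   # old world
classify_world lxcvmfs55   # new world
```

In practice you would first resolve the DNS alias (e.g. with host or dig) to find the lxcvmfsNN name, then apply the rule.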

Old World Stratum 0s.

  • Contained in the hostgroup cvmfs/lxcvmfs.
  • One node per repository, e.g. ->
  • In puppet, cvmfs is configured by the now-deprecated class cvmfs::server, which is included from the hostgroup.
  • For each repo there is a NetApp volume. This is mounted:
    • rw on the stratum 0 node (lxcvmfs37).
    • ro on the stratum 0 webserver node, i.e. the cvmfs/zero hostgroup. It is from here that the content is served.

New world Stratum 0s.

  • Contained in the hostgroup cvmfs/lx
  • There can be many cvmfs repos per node, e.g. cvmfs-test and others may resolve to the same node.
  • In puppet there are two defined types per stratum 0:
    • cvmfs::zero - contains generic stratum 0 configuration that could be used on all sites everywhere.
    • hg_cvmfs::private::localzero - contains CERNisms, e.g. mounting up the file systems and such like.
  • Each lxcvmfs node has a configuration, e.g. under fqdn/ ; a create_resources call is made to load a list of cvmfs::zero and hg_cvmfs::private::localzero instances, one per repo on each node.
  • Each new world stratum 0 node runs apache as well. The hosts zero05 and zero06, which sit behind the service alias, both reverse-proxy requests back to the lxcvmfs node.

New world Stratum 0s Filesystem

  • Each repository typically has two Ceph volumes, named for example (for the ams repository):
    • 20150702-stratum0-ams - the live stratum 0 Ceph volume, mounted on the cvmfs-ams node.
    • 20150702-backup-stratum0-ams - the backup Ceph volume of the first.
  • Each Ceph volume holds a zpool containing one filesystem, which is the actual data.
  • Snapshots:
    • The ZFS filesystem is snapshotted via /etc/cron.*/zfs-snap-shot to produce hourly, daily and weekly snapshots. These are purged by the same scripts.
    • The ZFS filesystem is pushed incrementally using zrep to one of the backup machines in the hostgroup cvmfs/backup. The push is done via the script /etc/cron.hourly/ . The push is incremental and also carries all the snapshots that have been made by the above scripts.
    • On the backup server the snapshots from the zfs-snap-shot script are not purged, so another cron job, /usr/local/bin/ , runs on the backup server to purge them.
  • More detail on stratum 0s set up this way can be learnt by looking at the setup in NewRepo.
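The purge side of the snapshot scripts follows the usual keep-the-newest-N pattern. A sketch only, using GNU head and invented snapshot names; the real zfs-snap-shot scripts may differ:

```shell
# Keep the newest KEEP hourly snapshots and list the rest for destruction.
# The filesystem and snapshot names below are invented for illustration.
KEEP=2
snapshots="tank/repo@hourly-2015-07-01
tank/repo@hourly-2015-07-02
tank/repo@hourly-2015-07-03
tank/repo@hourly-2015-07-04"

# The list is oldest first; drop the newest KEEP lines, what remains is purged.
to_purge=$(printf '%s\n' "$snapshots" | head -n -"$KEEP")
printf '%s\n' "$to_purge"
# each remaining name would then be removed with: zfs destroy <name>
```

On a real system the candidate list would come from `zfs list -t snapshot` sorted by creation time rather than a literal string.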

Stratum 0 file system upsizing.

  • I tried detaching a Ceph volume and resizing it. ZFS was not happy afterwards, so until this is tested don't do it.
  • Instead read the man pages and google; this is a very standard ZFS operation:
    • Create a new, bigger Ceph volume and attach it to the node.
    • Attach it as a mirror to the existing Ceph volume's zpool: zpool attach virtio-SMALL virtio-BIG
    • Remember to set the myid property on the volume, as always.
    • Monitor zpool status to check how the mirroring is doing.
    • Once the mirror is complete, drop the old Ceph volume: zpool detach virtio-SMALL
    • Expand the pool. See zpool status to understand, then zpool online -e virtio-BIG
    • Increase the quota on the ZFS filesystem.
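The steps above can be collected into one sketch. The pool and device names follow the text but are placeholders for the real ones, and the commands are shown inside a function so nothing runs by accident; in practice each step is run by hand with checks in between:

```shell
# Grow a stratum 0 zpool by mirroring onto a bigger Ceph volume.
# Pool/device names are placeholders; this mirrors the manual steps above.
upsize_stratum0_pool() {
  pool=$1 small=$2 big=$3
  zpool attach "$pool" "$small" "$big"   # mirror onto the bigger volume
  zpool status "$pool"                   # watch until resilvering completes
  zpool detach "$pool" "$small"          # then drop the old, small volume
  zpool online -e "$pool" "$big"         # expand the pool to the new size
  # finally raise the quota on the zfs filesystem, e.g.:
  #   zfs set quota=... <pool>/<filesystem>
}
```

Remember the myid property step from the list above; it is deliberately left out of the sketch since its exact form is site-specific.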

Stratum 0 Failure Scenarios

Someone just removed the contents of a Stratum 0

Writers on the stratum 0 have permission to e.g. rm -rf /var/spool/cvmfs/ . In this case ZFS snapshots are basically your friend.
  • Find a suitable snapshot and rewind to it. See google.
  • To get backups working again you will probably need to delete snapshots from the destination as well.
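The rewind can be sketched as below. The helper name and the filesystem/snapshot arguments are invented; check the zfs man pages before doing this for real, since rollback destroys everything newer than the chosen snapshot:

```shell
# Sketch: list candidate snapshots of the damaged filesystem, then roll
# back to a chosen one. Nothing here names a real CERN filesystem.
rewind_to_snapshot() {
  fs=$1 snap=$2
  # candidates, oldest first by creation time:
  zfs list -H -t snapshot -o name -s creation | grep "^${fs}@"
  # roll back, destroying any snapshots and data newer than $snap:
  zfs rollback -r "${fs}@${snap}"
}
# e.g. rewind_to_snapshot tank/repo hourly-2015-07-04
```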

Losing a Stratum 0 Node.

This assumes you still have a Ceph volume with all the data.
  • Install a new node via NewRepo.
  • Attach the existing Ceph volume.
  • Run zpool import; it will complain about a bad shutdown and then you can force it. I have never tried this.
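The last step could look like the sketch below. The pool name is a placeholder and, as the text says, the procedure is untested:

```shell
# Sketch: recover the zpool from the surviving Ceph volume on a fresh node.
recover_pool() {
  pool=$1
  zpool import             # lists importable pools found on attached volumes
  zpool import -f "$pool"  # force the import past the unclean-shutdown warning
  zpool status "$pool"     # verify the pool and its filesystem look healthy
}
```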

zrep replication fails.

Particularly after a crash of the backup server or the sending node, zrep may stop doing replications and report:

sending to
cannot receive incremental stream: destination has been modified
since most recent snapshot

The fix here is to roll the destination filesystem back to the last good snapshot that was made, so ignoring the partial changes.

# zfs list -t snap | grep ams
(pick the last good ams snapshot from the output)

# zfs rollback                          

Hopefully after that a normal zrep should work again.

Stratum One

The stratum one is a stand-alone diskserver with, currently, 3 reverse squid proxies in front of it. They are deployed in the hostgroups:

  • cvmfs/one/backend
  • cvmfs/one/frontend/live

A script in it-cvmfs/stratum1/ exists for deploying frontend squid servers into cvmfs/one/frontend/spare.

Once deployed there are a couple of manual steps, which I must puppetize one day.

  • Stop squid: systemctl stop squid.service
  • Create the cache directories: squid -z
  • Reboot and run puppet. If all is good, put the node in cvmfs/one/frontend/live to have it join the alias.
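The manual steps above as a one-function sketch (the function name is invented; the squid service name and hostgroup are as given in the text):

```shell
# One-off initialisation of a freshly deployed frontend squid node.
init_squid_frontend() {
  systemctl stop squid.service   # stop the squid started at install time
  squid -z                       # create the cache directories
  # then reboot, let puppet run, and if all is well move the node into
  # cvmfs/one/frontend/live so it joins the service alias
}
```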

If you are mega unlucky and the cvmfs-backend fails, then quite frankly panic. CvmFS should keep working without it, since the other stratum ones should cope with the load. Find a big disk server and install it into the cvmfs/backend hostgroup. Even once the host is installed it will take days to be ready via cron jobs, and it will only be noticed by CvmFS clients once those cron jobs have finished.

Topic revision: r11 - 2015-07-27 - SteveTraylen