Stratum One Service Operations

For WLCG and friends of WLCG, the major CvmFS stratum 1 services are operated at the sites listed below.

Software version documented on this page: cvmfs-server >= 2.2.2, from https://ecsft.cern.ch/dist/cvmfs or from the cernvm-testing yum repository. Note that the Apache configuration further down assumes cvmfs-server >= 2.4.2.

Major Stratum Ones

These are the major stratum one URLs and the distribution domains they serve as configured in the default, egi, and osg configuration repositories:

Location   URL                                           Distribution
ASGC       http://cvmfsrep.grid.sinica.edu.tw:8000       CERN, EGI, HSF, OSG
BNL        http://cvmfs-s1bnl.opensciencegrid.org:8000   CERN, DESY (for osg), EGI (for osg), HSF, KEK, OSG
CERN       http://cvmfs-stratum-one.cern.ch:8000         CERN, DESY, HSF, LSST, RAL
DESY       http://grid-cvmfs-one.desy.de:8000            DESY, KEK
FNAL       http://cvmfs-s1fnal.opensciencegrid.org:8000  CERN, DESY (for osg), EGI (for osg), HSF, KEK, OSG
IHEP       http://cvmfs-stratum-one.ihep.ac.cn:8000      CERN, DESY, EGI, HSF, KEK, OSG
IN2P3      http://cclssts1.in2p3.fr                      LSST
KEK        http://cvmfs-stratum-one.cc.kek.jp:8000       KEK
NIKHEF     http://cvmfs01.nikhef.nl:8000                 EGI, OSG, RAL
RAL        http://cvmfs-egi.gridpp.rl.ac.uk:8000         CERN, DESY, EGI, HSF, KEK, OSG, RAL
Swinburne  http://cvmfs-s1.hpc.swin.edu.au:8000          CERN, DESY, EGI, HSF, KEK, OSG
TRIUMF     http://cvmfsrepo.lcg.triumf.ca:8000           EGI
UNL        http://cvmfs-s1goc.opensciencegrid.org:8000   CERN (for replicas), DESY (for osg), EGI (for osg), OSG

Not all repositories of each distribution are on all stratum 1s. For example, EGI and OSG repositories are replicated to each other's stratum 1s only on explicit request, made via their respective ticketing systems.

For full replications for purposes other than stratum ones:

Europe         http://cvmfs-stratum-zero-hpc.cern.ch:8000   CERN, DESY, HSF, LSST, RAL
North America  http://cvmfs-s1goc.opensciencegrid.org:8001  CERN, DESY, EGI, HSF, KEK, OSG
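One quick way to check that a server from the lists above serves a particular repository is to request that repository's .cvmfspublished manifest over HTTP. The sketch below only builds the URL. The helper name manifest_url and the choice of atlas.cern.ch are illustrative assumptions, and the /cvmfs/<repository>/ layout is taken from the URL prefix used in the Configuration section.

```shell
# manifest_url BASE REPO: print the URL of REPO's manifest on a stratum one.
# Hypothetical helper for illustration; it is not part of cvmfs-server.
manifest_url() {
    base=${1%/}     # drop one trailing slash, if present
    printf '%s/cvmfs/%s/.cvmfspublished\n' "$base" "$2"
}

# Example request (needs network access, so it is left commented out):
# curl -sfI "$(manifest_url http://cvmfs-stratum-one.cern.ch:8000 atlas.cern.ch)"
```

A successful HEAD request suggests that the replica exists on that server; it says nothing about how recent the snapshot is.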

Mailing lists

Stratum one providers have an email list for discussion, cvmfs-stratum-operations@cern.ch. Please join this list if you run a stratum one. Please also join cvmfs-stratum-alarm@cern.ch, the list for alarm emails from stratum ones.

Monitoring

Here are step-by-step instructions for installing a CVMFS Stratum1 including WLCG squid monitoring with MRTG and awstats.

Highly available stratum one

If you want to have a highly available stratum one service, see https://github.com/DrDaveD/cvmfs-hastratum1/wiki for one way to do it.

Configuration

To add a repository replica, follow the steps below. They use the atlas repository as an example and need to be repeated for every cern.ch repository. The main reason for the complication is that we still need to maintain backward compatibility with client configurations that used to look for repositories under /opt/ by their short name rather than under /cvmfs/ by their full .cern.ch name.

  1. Create a /srv/cvmfs/atlas.cern.ch symlink pointing to where you want the repository replica storage to be. For backward compatibility in the cern.ch distribution, the repositories must be reachable under their short names, so the real directory should be named with the short name, without the .cern.ch suffix. If you are creating the replica from scratch, this can be an empty directory. (If you don't pre-create the symlink, the next step will create an empty repository replica in /srv/cvmfs/atlas.cern.ch.)
    Alternatively, if you want your storage to be in /srv/cvmfs itself, then for every cern.ch repository you can instead create a symlink after the next step, pointing from the short name of the repository in /srv/cvmfs to the long name. You can then remove the RewriteRule in the Apache configuration below that strips .cern.ch.
  2. Run the following command to add the replica:
    # cvmfs_server add-replica -o root http://cvmfs-stratum-zero.cern.ch:8000/cvmfs/atlas.cern.ch /etc/cvmfs/keys/cern.ch/cern.ch.pub:/etc/cvmfs/keys/cern.ch/cern-it1.cern.ch.pub:/etc/cvmfs/keys/cern.ch/cern-it2.cern.ch.pub
  3. Remove the apache configuration file it created because we won't be using it:
    # rm /etc/httpd/conf.d/cvmfs.atlas.cern.ch.conf
  4. Install the initial snapshot:
    # cvmfs_server snapshot atlas.cern.ch
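Because the four steps above have to be repeated for every cern.ch repository, it can help to generate the commands and review them before running anything as root. The following is a minimal sketch under stated assumptions: print_replica_steps is a hypothetical helper, /storage/cvmfs is an assumed storage root, and the commands are only printed, never executed.

```shell
# print_replica_steps REPO STORAGE: print (do not run) the commands from the
# steps above for one cern.ch repository.
#   REPO    short repository name, e.g. atlas
#   STORAGE assumed storage root, e.g. /storage/cvmfs
print_replica_steps() {
    repo=$1
    storage=$2
    keys=/etc/cvmfs/keys/cern.ch
    # Step 1: real directory under the short name, symlink under the long name
    echo "mkdir -p ${storage}/${repo}"
    echo "ln -s ${storage}/${repo} /srv/cvmfs/${repo}.cern.ch"
    # Step 2: add the replica
    echo "cvmfs_server add-replica -o root http://cvmfs-stratum-zero.cern.ch:8000/cvmfs/${repo}.cern.ch ${keys}/cern.ch.pub:${keys}/cern-it1.cern.ch.pub:${keys}/cern-it2.cern.ch.pub"
    # Step 3: remove the per-repository Apache configuration file
    echo "rm /etc/httpd/conf.d/cvmfs.${repo}.cern.ch.conf"
    # Step 4: initial snapshot
    echo "cvmfs_server snapshot ${repo}.cern.ch"
}

print_replica_steps atlas /storage/cvmfs
```

Once the printed commands look right for your site, they can be run by hand or piped to a root shell.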

Set up /etc/httpd/conf.d/cvmfs.conf to be like the following to serve all replicas (this assumes a cvmfs-server version >= 2.4.2). Replace the two occurrences of /storage/cvmfs with wherever you have all the short names of the cern.ch repositories stored.

KeepAlive On

RewriteEngine on

# for debugging (Apache 2.4 replaced RewriteLog/RewriteLogLevel with LogLevel)
#LogLevel alert rewrite:trace3

# Point /opt to /cvmfs for backward compatibility of old client configs
RewriteRule ^/opt/(.*)$ /cvmfs/$1

# Point api URLs to the WSGI handler
RewriteRule ^/cvmfs/([^/]+)/api/(.*)$ /var/www/wsgi-scripts/cvmfs-server/cvmfs-api.wsgi/$1/$2

# Remove .cern.ch because CERN repos are stored with short name for 
#   backward compatibility to old client configs.
# Avoid for other distributions; they should instead store with full name.
RewriteRule ^/cvmfs/([A-Za-z0-9-]+)(\.cern\.ch)/(.*)$ /cvmfs/$1/$3

# point /cvmfs to where the storage is
RewriteRule ^/cvmfs/(.*)$ /storage/cvmfs/$1

<Directory "/storage/cvmfs"> 
    Options -MultiViews FollowSymLinks -Indexes
    AllowOverride All 
    Require all granted

    EnableMMAP Off 
    EnableSendFile Off

    <FilesMatch "^\.cvmfs">
        ForceType application/x-cvmfs
    </FilesMatch>

    Header unset Last-Modified
    RequestHeader unset If-Modified-Since
    FileETag None

    ExpiresActive On
    ExpiresDefault "access plus 3 days" 
    ExpiresByType text/html "access plus 15 minutes" 
    ExpiresByType application/x-cvmfs "access plus 61 seconds"
    ExpiresByType application/json "access plus 61 seconds"
</Directory>

WSGIDaemonProcess cvmfs-api threads=64 display-name=%{GROUP} \
    python-path=/usr/share/cvmfs-server/webapi
<Directory /var/www/wsgi-scripts/cvmfs-server>
    WSGIProcessGroup cvmfs-api
    WSGIApplicationGroup cvmfs-api
    Options ExecCGI
    SetHandler wsgi-script
    Require all granted
</Directory>
WSGISocketPrefix /var/run/wsgi
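To see what the chain of RewriteRule lines does to a request, the sketch below applies the same four patterns in the same order with sed. It is only an approximation for reasoning about the rules: map_url is a hypothetical helper, /storage/cvmfs is the placeholder path from the configuration, oasis.opensciencegrid.org is just an example of a non-cern.ch name, and Apache's actual processing involves more than these substitutions.

```shell
# map_url PATH: print the path that the rewrite rules above would produce
# for a request path. Illustration only; Apache does the real work.
map_url() {
    printf '%s\n' "$1" | sed -E \
        -e 's#^/opt/(.*)$#/cvmfs/\1#' \
        -e 's#^/cvmfs/([^/]+)/api/(.*)$#/var/www/wsgi-scripts/cvmfs-server/cvmfs-api.wsgi/\1/\2#' \
        -e 's#^/cvmfs/([A-Za-z0-9-]+)(\.cern\.ch)/(.*)$#/cvmfs/\1/\3#' \
        -e 's#^/cvmfs/(.*)$#/storage/cvmfs/\1#'
}

map_url /opt/atlas/.cvmfspublished
map_url /cvmfs/atlas.cern.ch/.cvmfspublished
map_url /cvmfs/oasis.opensciencegrid.org/.cvmfspublished
```

The first two calls both print /storage/cvmfs/atlas/.cvmfspublished, which is the backward-compatibility behaviour described above, while the third keeps the full repository name.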

Squid

It is recommended to also install frontier-squid on the same machine, in order to cache the api calls and to be able to participate in WLCG monitoring. See instructions for installing frontier-squid and frontier-awstats at https://opensciencegrid.org/docs/other/install-cvmfs-stratum1.

Cron jobs

The /etc/cron.d/cvmfs.cron entry for updates should be something like this:

1-59/5 * * * *    root    cvmfs_server snapshot -a

This cron job runs cvmfs_server snapshot on all created repositories, in order from oldest snapshot to newest. If it finds one that is still in progress from a previous run, it skips that one and moves on to the next. It is therefore possible to have a few repositories downloading in parallel, with one more added every 5 minutes. In practice the parallelism shouldn't get very high.

Logs will go into /var/log/cvmfs/snapshots.log. Because of the chance of parallelism, they will go first into a temporary file and then be appended to the log when the snapshot finishes. To instead keep a separate log for each repository, add the '-s' option.

This will run a snapshot on a repository even if its initial snapshot has not been done yet (as long as another snapshot of it is not already in progress). If you have a script that does both add-replica and snapshot in a single step, you can add the '-i' option to the cron entry to skip repositories that don't have their initial snapshot done.
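The '-s' and '-i' options mentioned above are simply appended to the snapshot command in the cron entry. As a small sketch, the hypothetical helper snapshot_cron_entry below prints the cron.d line for a given set of options so the variants can be compared side by side. The schedule and user are copied from the entry above.

```shell
# snapshot_cron_entry [OPTIONS...]: print a cron.d line for the snapshot job,
# appending any extra options after "-a". Hypothetical helper for illustration.
snapshot_cron_entry() {
    printf '1-59/5 * * * *    root    cvmfs_server snapshot -a%s\n' "${*:+ $*}"
}

snapshot_cron_entry          # default: shared /var/log/cvmfs/snapshots.log
snapshot_cron_entry -s -i    # separate logs, skip replicas without initial snapshot
```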

In addition, garbage collection of all garbage collectable repositories should be done weekly:

13 0 * * 0   root cvmfs_server gc -af

To do it daily instead, replace the second "0" above (the day-of-week field) with a star "*".

This will put logs in /var/log/cvmfs/gc.log. Snapshots will be updated

Rotate logs

Don't forget to rotate the logs. Here are suggested contents for /etc/logrotate.d/cvmfs:
/var/log/cvmfs/*.log {
    weekly
    missingok
    notifempty
}

Cleaning out temporary files

The 2.1 snapshot software is much better at cleaning out temporary files than 2.0 was, but it still occasionally leaves temporary files behind if it is aborted. The simplest way to keep them cleaned out is to periodically delete files that haven't been modified lately. This is a suggested cron.d entry that deletes, once a day, those not modified for more than two days (find's -mtime +2 counts whole 24-hour periods, so a file is removed once it is at least three days old):
0 9 * * * root   find /srv/cvmfs/*.*/data/txn -name "*.*" -mtime +2 2>/dev/null|xargs rm -f
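Before enabling a cron entry that deletes files, it may be worth listing what it would remove. The sketch below reuses the same find expression without the rm. list_stale_txn is a hypothetical helper, and the root directory is a parameter so it can be tried on a scratch tree as well as on /srv/cvmfs.

```shell
# list_stale_txn ROOT [DAYS]: list, without deleting, the temporary files that
# the cron entry above would remove. ROOT stands in for /srv/cvmfs and DAYS
# defaults to 2, matching "-mtime +2".
list_stale_txn() {
    root=$1
    days=${2:-2}
    find "$root"/*.*/data/txn -name "*.*" -mtime +"$days" 2>/dev/null
}

# Example: list_stale_txn /srv/cvmfs
```

If the listing contains only leftover transaction files, the deleting variant in the cron entry should be safe to enable.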

Apache and syslog

If you use syslog to track your Apache accesses (typically done for security reasons, by putting a pipe in the CustomLog keyword) then make sure that the writing of that output to a file is buffered. This can be done with syslog by putting a dash ('-') at the beginning of the filename in /etc/syslog.conf. Without that, syslog syncs the output to disk after every log entry.

ulimit -n

The maximum number of file descriptors for cvmfs_server snapshot sometimes has to be above the default 1024. You can put "ulimit -n 16384" in the cron job or add "* - nofile 16384" to /etc/security/limits.conf. Note that if you do the latter, in order for it to work under ssh the sshd_config UsePAM option has to be enabled.
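To confirm that the limit is in effect in the environment where snapshots run, a small preflight check can be placed at the top of a wrapper script. This is a sketch only: nofile_at_least is a hypothetical helper, and 16384 is the value suggested above.

```shell
# nofile_at_least N: succeed if the current soft limit on open files is
# at least N (or unlimited). Hypothetical preflight helper.
nofile_at_least() {
    cur=$(ulimit -n)
    [ "$cur" = "unlimited" ] || [ "$cur" -ge "$1" ]
}

if nofile_at_least 16384; then
    echo "open-file limit is sufficient: $(ulimit -n)"
else
    echo "open-file limit is $(ulimit -n); raise it before running snapshots"
fi
```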

Automatic replication

Many of the major stratum 1s use a tool for automatically adding replications as new repositories are added, to require less manual intervention by the system administrator. See cvmfs-manage-replicas.

Topic revision: r85 - 2022-09-07 - DaveDykstra
 