Stratum One Service Operations
For WLCG and friends of WLCG, the major CVMFS stratum 1 services are operated at the sites listed below.
Current software version documented on this page: cvmfs-server >= 2.2.2 from https://ecsft.cern.ch/dist/cvmfs or from the cernvm-testing yum repository.
Major Stratum Ones
These are the major stratum one URLs and the distribution domains they serve, as configured in the default, egi, and osg configuration repositories:
| Location | URL | Distribution |
| ASGC | http://cvmfsrep.grid.sinica.edu.tw:8000 | CERN, EGI, HSF, OSG |
| BNL | http://cvmfs-s1bnl.opensciencegrid.org:8000 | CERN, DESY (for osg), EGI (for osg), HSF, KEK, OSG |
| CERN | http://cvmfs-stratum-one.cern.ch:8000 | CERN, DESY, HSF, LSST, RAL |
| DESY | http://grid-cvmfs-one.desy.de:8000 | DESY, KEK |
| FNAL | http://cvmfs-s1fnal.opensciencegrid.org:8000 | CERN, DESY (for osg), EGI (for osg), HSF, KEK, OSG |
| IHEP | http://cvmfs-stratum-one.ihep.ac.cn:8000 | CERN, DESY, EGI, HSF, KEK, OSG |
| IN2P3 | http://cclssts1.in2p3.fr | LSST |
| KEK | http://cvmfs-stratum-one.cc.kek.jp:8000 | KEK |
| NIKHEF | http://cvmfs01.nikhef.nl:8000 | EGI, OSG, RAL |
| RAL | http://cvmfs-egi.gridpp.rl.ac.uk:8000 | CERN, DESY, EGI, HSF, KEK, OSG, RAL |
| Swinburne | http://cvmfs-s1.hpc.swin.edu.au:8000 | CERN, DESY, EGI, HSF, KEK, OSG |
| TRIUMF | http://cvmfsrepo.lcg.triumf.ca:8000 | EGI |
| UNL | http://cvmfs-s1goc.opensciencegrid.org:8000 | CERN (for replicas), DESY (for osg), EGI (for osg), OSG |
Not every repository of each distribution is on every stratum 1. For example, EGI and OSG repositories are replicated on each other's stratum 1s only on explicit request, made via their respective ticketing systems.
For full replications for purposes other than stratum ones:
Mailing lists
Stratum One providers have an email list for discussion, cvmfs-stratum-operations@cern.ch; please join this list if you run a stratum one. Please also join the list for alarm emails from stratum ones, cvmfs-stratum-alarm@cern.ch.
Monitoring
Here are step-by-step instructions for installing a CVMFS Stratum1 including WLCG squid monitoring with MRTG and awstats.
Highly available stratum one
If you want to have a highly available stratum one service, see https://github.com/DrDaveD/cvmfs-hastratum1/wiki for one way to do it.
Configuration
To add a repository replica, follow the recommended steps below. They use the atlas repository as an example and need to be repeated for every cern.ch repository. The main reason for the complication is that we still need to maintain backward compatibility with client configurations that used to look for /opt/ rather than /cvmfs/.cern.ch.
- Create a /srv/cvmfs/atlas.cern.ch symlink pointing to where you want the repository replica storage to be. For backward compatibility in the cern.ch distribution the repositories need to be available without the .cern.ch suffix, so it is recommended that the real directory use the short name (atlas in this example). If you're creating from scratch, this can be an empty directory. (If you don't pre-create the symlink, the next step will create an empty repository replica in /srv/cvmfs/atlas.cern.ch.)
Alternatively, if you want your storage to be in /srv/cvmfs instead of somewhere else, for every cern.ch repository you can instead make a symlink after the next step that points the short name of the repository in /srv/cvmfs to the long name. Then you can remove the RewriteRule in the apache configuration below that deletes .cern.ch.
- Run the following command to add the replica:
# cvmfs_server add-replica -o root http://cvmfs-stratum-zero.cern.ch:8000/cvmfs/atlas.cern.ch \
    /etc/cvmfs/keys/cern.ch/cern.ch.pub:/etc/cvmfs/keys/cern.ch/cern-it1.cern.ch.pub:/etc/cvmfs/keys/cern.ch/cern-it2.cern.ch.pub
- Remove the apache configuration file it created because we won't be using it:
# rm /etc/httpd/conf.d/cvmfs.atlas.cern.ch.conf
- Install the initial snapshot:
# cvmfs_server snapshot atlas.cern.ch
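The four steps above can be collected into a small helper, sketched here under some assumptions: the storage root /storage/cvmfs, the function names run and add_cern_replica, and the DRY_RUN switch are illustrative and not part of cvmfs-server. By default the helper only prints the commands it would run, so the sequence can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: perform the replica-creation steps for one cern.ch repository.
# STORAGE is an assumed location for the short-name directories.
STORAGE=/storage/cvmfs
KEYS=/etc/cvmfs/keys/cern.ch/cern.ch.pub:/etc/cvmfs/keys/cern.ch/cern-it1.cern.ch.pub:/etc/cvmfs/keys/cern.ch/cern-it2.cern.ch.pub
DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: set DRY_RUN=0 to really execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 0 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

add_cern_replica() {
    short=$1                 # short repository name, e.g. atlas
    fqrn=$short.cern.ch      # full repository name
    # Step 1: real directory under the short name, symlinked from /srv/cvmfs
    run mkdir -p "$STORAGE/$short"
    run ln -s "$STORAGE/$short" "/srv/cvmfs/$fqrn"
    # Step 2: register the replica
    run cvmfs_server add-replica -o root \
        "http://cvmfs-stratum-zero.cern.ch:8000/cvmfs/$fqrn" "$KEYS"
    # Step 3: drop the per-repository apache configuration
    run rm -f "/etc/httpd/conf.d/cvmfs.$fqrn.conf"
    # Step 4: initial snapshot
    run cvmfs_server snapshot "$fqrn"
}
```

Calling `add_cern_replica atlas` prints the five commands for the atlas repository; running the script with `DRY_RUN=0` as root would execute them.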
Set up /etc/httpd/conf.d/cvmfs.conf to be like the following to serve all replicas (this assumes a cvmfs-server version >= 2.4.2). Replace the two occurrences of /storage/cvmfs with wherever you have all the short names of the cern.ch repositories stored.
KeepAlive On
RewriteEngine on
# for debugging (Apache 2.2 syntax; on Apache 2.4 use "LogLevel alert rewrite:trace3" instead)
#RewriteLog "/var/log/httpd/rewrite.log"
#RewriteLogLevel 3
# Point /opt to /cvmfs for backward compatibility of old client configs
RewriteRule ^/opt/(.*)$ /cvmfs/$1
# Point api URLs to the WSGI handler
RewriteRule ^/cvmfs/([^/]+)/api/(.*)$ /var/www/wsgi-scripts/cvmfs-server/cvmfs-api.wsgi/$1/$2
# Remove .cern.ch because CERN repos are stored with short name for
# backward compatibility to old client configs.
# Avoid for other distributions; they should instead store with full name.
RewriteRule ^/cvmfs/([A-Za-z0-9-]+)(\.cern\.ch)/(.*)$ /cvmfs/$1/$3
# point /cvmfs to where the storage is
RewriteRule ^/cvmfs/(.*)$ /storage/cvmfs/$1
<Directory "/storage/cvmfs">
    Options -MultiViews FollowSymLinks -Indexes
    AllowOverride All
    Require all granted
    EnableMMAP Off
    EnableSendFile Off
    <FilesMatch "^\.cvmfs">
        ForceType application/x-cvmfs
    </FilesMatch>
    Header unset Last-Modified
    RequestHeader unset If-Modified-Since
    FileETag None
    ExpiresActive On
    ExpiresDefault "access plus 3 days"
    ExpiresByType text/html "access plus 15 minutes"
    ExpiresByType application/x-cvmfs "access plus 61 seconds"
    ExpiresByType application/json "access plus 61 seconds"
</Directory>
WSGIDaemonProcess cvmfs-api threads=64 display-name=%{GROUP} \
    python-path=/usr/share/cvmfs-server/webapi
<Directory /var/www/wsgi-scripts/cvmfs-server>
    WSGIProcessGroup cvmfs-api
    WSGIApplicationGroup cvmfs-api
    Options ExecCGI
    SetHandler wsgi-script
    Require all granted
</Directory>
WSGISocketPrefix /var/run/wsgi
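To see what the RewriteRule chain above does to a request path, the same four substitutions can be replayed with sed, as in the sketch below. It only models the string rewriting, in the same order as the rules, and not the rest of Apache's request handling. The function name rewrite_url and the sample paths are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: replay the four RewriteRule substitutions, in order, on one URL path.
rewrite_url() {
    printf '%s\n' "$1" | sed -E \
        -e 's#^/opt/(.*)$#/cvmfs/\1#' \
        -e 's#^/cvmfs/([^/]+)/api/(.*)$#/var/www/wsgi-scripts/cvmfs-server/cvmfs-api.wsgi/\1/\2#' \
        -e 's#^/cvmfs/([A-Za-z0-9-]+)(\.cern\.ch)/(.*)$#/cvmfs/\1/\3#' \
        -e 's#^/cvmfs/(.*)$#/storage/cvmfs/\1#'
}

# Old-style client path and new-style path end up at the same short-name directory.
rewrite_url /opt/atlas/.cvmfspublished
rewrite_url /cvmfs/atlas.cern.ch/.cvmfspublished
# A repository outside cern.ch keeps its full name.
rewrite_url /cvmfs/oasis.opensciencegrid.org/.cvmfspublished
```

The first two calls both print /storage/cvmfs/atlas/.cvmfspublished, while the third prints /storage/cvmfs/oasis.opensciencegrid.org/.cvmfspublished, which is why the rule that strips .cern.ch should not be copied for other distributions.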
Squid
It is recommended to also install frontier-squid on the same machine, in order to cache the API calls and to be able to participate in WLCG monitoring. See instructions for installing frontier-squid and frontier-awstats at https://opensciencegrid.org/docs/other/install-cvmfs-stratum1.
Cron jobs
The /etc/cron.d/cvmfs.cron entry for updates should be something like this, with the times according to the schedule above:
1-59/5 * * * * root cvmfs_server snapshot -a
This cron job runs cvmfs_server snapshot on all created repositories, in order from oldest snapshot to newest. If it finds one that is still in progress from a previous cron run, it skips that one and moves on to the next. So it is possible to get a few repositories downloading in parallel, with at most one more added every 5 minutes. In practice the parallelism shouldn't get very high.
Logs will go into /var/log/cvmfs/snapshots.log. Because of the chance of parallelism, they will go first into a temporary file and then be appended to the log when the snapshot finishes. To instead keep a separate log for each repository, add the '-s' option.
This will run a snapshot on a repository even if an initial snapshot is not done there (as long as another one is in progress). If you have a script that does both add-replica and snapshot in a single step, you can add the '-i' option to the cron to skip repositories that don't have their initial snapshot done.
In addition, garbage collection of all garbage collectable repositories should be done weekly:
13 0 * * 0 root cvmfs_server gc -af
To do it daily instead, replace the second "0" above (the day-of-week field) with a star "*".
This will put logs in /var/log/cvmfs/gc.log. Snapshots will be updated
Rotate logs
Don't forget to rotate the logs. Here are suggested contents for /etc/logrotate.d/cvmfs:
/var/log/cvmfs/*.log {
    weekly
    missingok
    notifempty
}
Cleaning out temporary files
The 2.1 snapshot software is much better at cleaning out temporary files than 2.0 was, but it still occasionally leaves temporary files behind if it is aborted. The simplest way to keep them cleaned out is to periodically delete files that haven't been modified lately. This is a suggested cron.d entry that runs daily and deletes files not modified for more than two days (find's -mtime +2 counts whole 24-hour periods, so in practice files at least 72 hours old):
0 9 * * * root find /srv/cvmfs/*.*/data/txn -name "*.*" -mtime +2 2>/dev/null|xargs rm -f
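The same cleanup can be written as a small function, which makes it easy to try on a scratch directory before pointing it at /srv/cvmfs. This is a sketch, not a replacement mandated by cvmfs-server: the function name clean_txn and its arguments are assumptions. It additionally restricts the match to regular files and passes names NUL-separated (supported by GNU and BSD find and xargs), so unusual file names are handled safely.

```shell
#!/bin/sh
# Sketch: delete stale files under ROOT/*.*/data/txn.
# Usage: clean_txn ROOT [DAYS]   (DAYS defaults to 2, as in the cron entry)
clean_txn() {
    root=$1
    days=${2:-2}
    # -mtime +N selects files whose age, in whole days, is greater than N.
    find "$root"/*.*/data/txn -type f -name "*.*" -mtime +"$days" -print0 2>/dev/null |
        xargs -0 rm -f
}

# Example (commented out so that sourcing this file deletes nothing):
#   clean_txn /srv/cvmfs
```

A cron.d entry could then call a script containing this function followed by `clean_txn /srv/cvmfs`.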
Apache and syslog
If you use syslog to track your Apache accesses (typically done for security reasons, by putting a pipe in the CustomLog directive) then make sure that the output is written to its file with buffering. With syslog this can be done by putting a dash ('-') at the beginning of the filename in /etc/syslog.conf. Without that, syslog syncs the output to disk after every log entry.
ulimit -n
The maximum number of file descriptors for cvmfs_server snapshot sometimes has to be above the default 1024. You can put "ulimit -n 16384" in the cron job or add "* - nofile 16384" to /etc/security/limits.conf. Note that if you do the latter, in order for it to work under ssh the sshd_config UsePAM option has to be enabled.
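One way to apply the first option is a small wrapper script that the cron entry calls instead of invoking cvmfs_server directly. The sketch below is illustrative: the function name raise_nofile and the wrapper itself are assumptions. It raises the limit only when the current value is lower, and warns instead of failing if the limit cannot be raised.

```shell
#!/bin/sh
# Sketch: raise the open-file limit before running the snapshot job.
NOFILE_WANTED=16384   # value suggested in the text above

# Raise the limit to $1 if the current soft limit is lower.
raise_nofile() {
    current=$(ulimit -n)
    if [ "$current" != "unlimited" ] && [ "$current" -lt "$1" ]; then
        ulimit -n "$1" 2>/dev/null ||
            echo "warning: could not raise nofile limit to $1" >&2
    fi
}

# In a real wrapper these two lines would follow:
#   raise_nofile "$NOFILE_WANTED"
#   exec cvmfs_server snapshot -a
```

Because cron runs the job as root here, raising the limit from within the job normally succeeds without any change to limits.conf.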
Automatic replication
Many of the major stratum 1s use a tool that automatically adds replicas as new repositories are created, which reduces the manual intervention required from the system administrator. See cvmfs-manage-replicas.