Installation of EOS MGM for federated cloud

Test log

Tested on

CentOS 6.7

-- EygeneRyabinkin

Installation process

Base system and repositories

Standard software repositories:

EOS repository (users of Scientific Linux and its derivatives should use alternative files, see below):
cat << 'EOF' > /etc/yum.repos.d/eos.repo
[eos-aquamarine]
name=EOS aquamarine, modern location
baseurl=https://dss-ci-repo.web.cern.ch/dss-ci-repo/eos/aquamarine/tag/el-$releasever/$basearch/
gpgcheck=0
enabled=1
priority=45

[eos-aquamarine-depends]
name=EOS aquamarine, dependencies
baseurl=https://dss-ci-repo.web.cern.ch/dss-ci-repo/eos/aquamarine-depend/el-$releasever-$basearch/
gpgcheck=0
enabled=1
priority=45
EOF

Scientific Linux and its derivatives report $releasever as major.minor (for example 6.7), so for these OS variants the major version must be hardcoded in the repo file:

cat << 'EOF' > /etc/yum.repos.d/eos.repo
[eos-aquamarine]
name=EOS aquamarine, modern location
baseurl=https://dss-ci-repo.web.cern.ch/dss-ci-repo/eos/aquamarine/tag/el-6/$basearch/
gpgcheck=0
enabled=1
priority=45

[eos-aquamarine-depends]
name=EOS aquamarine, dependencies
baseurl=https://dss-ci-repo.web.cern.ch/dss-ci-repo/eos/aquamarine-depend/el-6-$basearch/
gpgcheck=0
enabled=1
priority=45
EOF
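
The two variants differ only in how the release number appears in the URLs, so the file can also be generated from a release string. A minimal sketch, assuming the release string is passed as an argument; the function names el_major and eos_repo_file are illustrative and not part of EOS or yum:

```shell
# Print the major part of a release string: "6.7" -> "6", "6" -> "6".
el_major() {
    printf '%s\n' "${1%%.*}"
}

# Print an eos.repo file with the major version hardcoded.
# \$basearch is escaped so that yum, not the shell, expands it.
eos_repo_file() {
    major=$(el_major "$1")
    base=https://dss-ci-repo.web.cern.ch/dss-ci-repo/eos
    cat << EOF
[eos-aquamarine]
name=EOS aquamarine, modern location
baseurl=$base/aquamarine/tag/el-$major/\$basearch/
gpgcheck=0
enabled=1
priority=45

[eos-aquamarine-depends]
name=EOS aquamarine, dependencies
baseurl=$base/aquamarine-depend/el-$major-\$basearch/
gpgcheck=0
enabled=1
priority=45
EOF
}
```

For example, "eos_repo_file 6.7 > /etc/yum.repos.d/eos.repo" would write the Scientific Linux variant of the file.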

Use the yum-priorities plugin and give the EOS repository a higher priority than the EPEL one (the default priority is 99; a lower number means a higher priority).
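
To check the resulting ordering, the priority of every repository section can be listed. A minimal sketch, assuming the repo files use plain "priority=N" lines; the repo_priorities helper is hypothetical, and sections without a priority line are reported with the default of 99. On CentOS 6 the plugin itself is typically packaged as yum-plugin-priorities.

```shell
# Print "<section> <priority>" for every section of the given .repo files.
# Sections without an explicit priority get the plugin default, 99.
repo_priorities() {
    awk '
        /^\[.*\]$/ {
            if (name != "") print name, prio
            name = substr($0, 2, length($0) - 2)   # strip the brackets
            prio = 99
            next
        }
        /^[ \t]*priority[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")               # keep only the value
            prio = $0 + 0
        }
        END { if (name != "") print name, prio }
    ' "$@"
}
```

Running "repo_priorities /etc/yum.repos.d/*.repo | sort -k2 -n" lists the repositories from highest to lowest priority; the EOS sections should appear before EPEL.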

Packages

Install packages:

yum install -y eos-server eos-client eos-nginx eos-fuse eos-test eos-apmon eos-cleanup jemalloc nscd

Authentication between MGM and FST

Install the EOS keytab (it is provided by the central team), then fix its ownership and mode:

chmod 400 /etc/eos.keytab
chown daemon:daemon /etc/eos.keytab
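
The result can be verified before starting the daemons. A minimal sketch, assuming GNU stat (as shipped with CentOS); the check_keytab helper is hypothetical:

```shell
# Check that a keytab has mode 400 and the expected owner:group.
# Defaults match the commands above; both arguments are optional.
check_keytab() {
    file=${1:-/etc/eos.keytab}
    want_owner=${2:-daemon:daemon}
    # GNU stat: %a is the octal mode, %U and %G are owner and group names.
    actual=$(stat -c '%a %U:%G' "$file") || return 2
    if [ "$actual" = "400 $want_owner" ]; then
        echo "OK: $file is $actual"
    else
        echo "MISMATCH: $file is $actual, expected 400 $want_owner" >&2
        return 1
    fi
}
```

Calling check_keytab without arguments checks /etc/eos.keytab against daemon:daemon.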

Firewall

Firewall configuration:

  • MGM allows incoming connections to port 1094 from the world: it is the main client port for metadata and redirections
  • MGM allows incoming connections to ports 1096 and 1097 from all other MGMs and FSTs
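
On CentOS 6 these rules could be expressed with iptables. A minimal sketch that only prints the commands so they can be reviewed first; the mgm_firewall_rules helper is hypothetical, and the peer addresses in the usage example are placeholders from the documentation range, not real MGM or FST hosts:

```shell
# Print iptables commands for an MGM host.
# Arguments: addresses of the other MGMs and FSTs.
mgm_firewall_rules() {
    # Port 1094 is open to everyone (client metadata and redirections).
    echo "iptables -A INPUT -p tcp --dport 1094 -j ACCEPT"
    # Ports 1096 and 1097 are open only to the listed peers.
    for peer in "$@"; do
        for port in 1096 1097; do
            echo "iptables -A INPUT -p tcp -s $peer --dport $port -j ACCEPT"
        done
    done
}
```

For example, "mgm_firewall_rules 192.0.2.10 192.0.2.11" prints five rules; after review they can be piped to sh and saved with "service iptables save".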

EOS MGM configuration

MGM needs an X.509 certificate since it does GSI authentication. It also runs ALICE token authentication, so we must install the needed packages (the RDIG CA can be replaced with the whole lcg-CA package, which installs all IGTF trust roots):

yum install -y xrootd-alicetokenacc ca_RDIG

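and put the X.509 key and certificate in the proper place (copy hostcert.pem and hostkey.pem into the directory after creating it, before setting ownership and mode):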

mkdir -p /etc/grid-security/daemon
chmod 600 /etc/grid-security/daemon/hostcert.pem
chown daemon:root /etc/grid-security/daemon/hostcert.pem
chmod 600 /etc/grid-security/daemon/hostkey.pem
chown daemon:root /etc/grid-security/daemon/hostkey.pem

Edit the standard SysV init script configuration (here muon.grid.kiae.ru is the name of the MGM machine and we are running a single-head configuration):

cat << 'EOF' > /etc/sysconfig/eos
XRD_ROLES="mq sync mgm"
export EOS_MGM_ALIAS="muon.grid.kiae.ru"
export EOS_MGM_MASTER1="${EOS_MGM_ALIAS}"
export EOS_MGM_MASTER2="${EOS_MGM_ALIAS}"
export EOS_BROKER_URL="root://localhost:1097//eos"
export EOS_INSTANCE_NAME=eosalice
EOF

Create/edit MGM configuration file:

cat << EOF > /etc/xrd.cf.mgm
###########################################################
xrootd.fslib libXrdEosMgm.so
xrootd.seclib libXrdSec.so
xrootd.async off nosf
xrootd.chksum adler32
###########################################################

xrd.sched mint 8 maxt 256 idle 64
###########################################################
all.export /
all.role manager
###########################################################
oss.fdlimit 16384 32768

###########################################################
# UNIX authentication
sec.protocol unix
# SSS authentication
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
# GSI authentication
sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem -key:/etc/grid-security/daemon/hostkey.pem -gridmap:/etc/grid-security/grid-mapfile -d:0 -gmapopt:2 -vomsat:1 -moninfo:1

###########################################################
sec.protbind localhost.localdomain unix sss
sec.protbind localhost unix sss
sec.protbind * only gsi sss unix
###########################################################
mgmofs.fs /
mgmofs.targetport 1095
mgmofs.authlib /usr/lib64/libXrdAliceTokenAcc.so
mgmofs.authorize 1
alicetokenacc.noauthzhost localhost
alicetokenacc.noauthzhost localhost.localdomain
alicetokenacc.truncateprefix /eos/alice/grid

###########################################################
#mgmofs.trace all debug

# this URL will be overwritten by EOS_BROKER_URL defined in /etc/sysconfig/eos
mgmofs.broker root://localhost:1097//eos/

# this name will be overwritten by EOS_INSTANCE_NAME defined in /etc/sysconfig/eos
mgmofs.instance eosdev

# configuration, namespace, transfer and authentication export directory
mgmofs.configdir /var/eos/config
mgmofs.metalog /var/eos/md
mgmofs.txdir /var/eos/tx
mgmofs.authdir /var/eos/auth
mgmofs.archivedir /var/eos/archive

# report store path
mgmofs.reportstorepath /var/eos/report

# this defines the default config to load
mgmofs.autoloadconfig default

# this ensures that every change gets immediately stored to the active
# configuration - can be overwritten by EOS_AUTOSAVE_CONFIG defined in
# /etc/sysconfig/eos
mgmofs.autosaveconfig true

# this has to be defined if we have a failover configuration via alias -
# can be overwritten by EOS_MGM_ALIAS in /etc/sysconfig/eos
#mgmofs.alias eosdev.cern.ch

###########################################################
# Set the FST gateway host and port
mgmofs.fstgw someproxy.cern.ch:3001

###########################################################
EOF

Create mapfile for X.509 certificates of clients:

cat << EOF > /etc/grid-security/grid-mapfile
"/C=RU/O=RDIG/OU=users/OU=spbu.ru/CN=Andrey Zarochentsev" eosuser
"/C=RU/O=RDIG/OU=users/OU=pnpi.nw.ru/CN=Andrey Kiryanov" eosuser
"/C=RU/O=RDIG/OU=users/OU=grid.kiae.ru/CN=Eygene A. Ryabinkin" eosuser
"/C=RU/O=RDIG/OU=users/OU=grid.kiae.ru/CN=Igor Tkachenko" eosuser
EOF
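
Because subject DNs contain spaces, each DN in a grid-mapfile is enclosed in double quotes and followed by the local account name. A minimal sketch of a helper that formats such entries; the name gridmap_line is hypothetical:

```shell
# Print one grid-mapfile entry: the quoted subject DN, then the local user.
gridmap_line() {
    printf '"%s" %s\n' "$1" "$2"
}
```

For example, "gridmap_line '/C=RU/O=RDIG/OU=users/OU=grid.kiae.ru/CN=Igor Tkachenko' eosuser >> /etc/grid-security/grid-mapfile" appends one mapping.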

Create the local Unix user(s) onto which the external certificates are mapped:

groupadd -g 2016 eosuser
useradd -u 2016 -g 2016 eosuser

(Re)start EOS:

service eos restart

Turn on sss and gsi authentication:

eos -b vid enable sss
eos -b vid enable gsi

Create pool groups:

for i in $(seq 1 4); do eos -b group set default.$i on; done

The total number of groups must be greater than the number of EOS data filesystems on any single FST. Files are replicated within a single group, so if every data filesystem of an FST is placed in a different group, the replicas of a file always end up on different servers. Otherwise two replicas could land on different filesystems of the same server, and a single server outage would make more than one replica unavailable.

Groups can be added later, so the initial number can be chosen from the characteristics of the existing (or soon to be deployed) FST machines.
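
The sizing rule can be checked with a small calculation. A minimal sketch, assuming an inventory with one "host filesystem_count" line per FST; the helper names and the inventory format are hypothetical, and group_commands only prints the commands instead of running them:

```shell
# Read "host filesystem_count" lines on stdin, print the largest count.
max_fs_per_fst() {
    awk '$2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}

# Print the commands that enable groups default.1 .. default.N.
group_commands() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "eos -b group set default.$i on"
        i=$((i + 1))
    done
}
```

If the largest FST carries 6 data filesystems, the rule above asks for more than 6 groups, for example "group_commands 7 | sh".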

Create default space:

eos -b space define default
eos -b space set default on

Create filesystem space for federated cloud and tune its ACL:

eos -b mkdir /eosfedcloud
eos -b chown eosuser:eosuser /eosfedcloud

Typical problems

Can't write anything, but have attached FSTs

If "space ls" shows a non-zero "sum(capacity)" but "capacity(rw)" is zero,

EOS Console [root://localhost] |/> space ls
#------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# type # name # groupsize # groupmod #N(fs) #N(fs-rw) #sum(usedbytes) #sum(capacity) #capacity(rw) #nom.capacity #quota #balancing # threshold # converter # ntx # active #intergroup
#------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spaceview default 0 24 6 0 209.24 M 44.54 T 0 0 off off 20 off 2 0 off

we usually just need to enable the default space (or whichever space has this problem) again:

eos -b space set default on

Since EOS 4.1 geotags are mandatory: EOS refuses to write to an FST with an empty geotag.

Remarks

Debug setting

eos debug notice  

Replica setting

Global setting:

eos space config default space.geobalancer=on
eos space config default space.geobalancer.ntx=10
eos space config default space.geobalancer.threshold=5

For single-replica files (to place files according to geotags):

eos space config default space.geo.access.policy.write.exact=on

Show options:

~] eos space status default
# ------------------------------------------------------------------------------------
# Space Variables
# ....................................................................................
balancer                         := off
balancer.node.ntx                := 2
balancer.node.rate               := 25
balancer.threshold               := 20
converter                        := off
converter.ntx                    := 2
drainer.node.ntx                 := 2
drainer.node.rate                := 25
drainperiod                      := 86400
geo.access.policy.write.exact    := on
geobalancer                      := on
geobalancer.ntx                  := 10
geobalancer.threshold            := 5
geotagbalancer                   := off
geotagbalancer.ntx               := 10
geotagbalancer.threshold         := 5
graceperiod                      := 86400
groupbalancer                    := off
groupbalancer.ntx                := 10
groupbalancer.threshold          := 5
groupmod                         := 24
groupsize                        := 0
quota                            := off
scaninterval                     := 604800

IP list setting:

eos vid set geotag 85.143 MEPHI

Check IP list:

~] eos vid ls
geotag:"85.143" => "MEPHI"
gsi:"<pwd>":gid => root
gsi:"<pwd>":uid => root
sss:"<pwd>":gid => root
sss:"<pwd>":uid => root
sudoer                 => uids()

Directory setting

Set the standard replica settings:

eos attr -r set default=replica eos/fedcloud/zar/rep2

Check the replica settings of the directory:

~] eos attr ls eos/fedcloud/zar/rep2
sys.forced.blockchecksum="crc32c"
sys.forced.blocksize="4k"
sys.forced.checksum="adler"
sys.forced.layout="replica"
sys.forced.nstripes="2"
sys.forced.space="default"

Set the replica settings for a single copy (to store data by geotag without replication):

eos attr ls eos/fedcloud/zar/rep1
eos attr set sys.forced.nstripes="1" eos/fedcloud/zar/rep1
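
The number of copies in effect for a directory can be read back from the "eos attr ls" output in the format shown above. A minimal sketch; the forced_nstripes helper is hypothetical and simply filters text on standard input:

```shell
# Read "eos attr ls" output on stdin and print the forced number of stripes.
# Prints nothing if the sys.forced.nstripes attribute is not set.
forced_nstripes() {
    sed -n 's/^sys\.forced\.nstripes="\([0-9][0-9]*\)"$/\1/p'
}
```

For example, "eos attr ls eos/fedcloud/zar/rep1 | forced_nstripes" should print 1 once the attribute has been set as above.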

-- EygeneRyabinkin - 2016-03-15

Topic revision: r5 - 2019-01-08 - TusharKantiDas