LCG Disk Pool Manager (DPM) administrator's guide

We have moved all the DPM documentation to our trac page.

Note: this documentation is currently out of date, in particular with respect to the YAIM (automatic installation and configuration) description. gLite 3.1 now uses a later version of YAIM. See the configuring section of

https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310

for a brief YAIM description or

https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide400

for more YAIM details. The most significant difference with respect to the older YAIM scheme is that YAIM is now strictly used only for the configuration of a node, not the installation of software. Automated installation is now usually handled by YUM, followed by YAIM commands to configure the middleware.

The latest certified release of DPM is 1.6.11-3, which is somewhat newer than the version described and used in the examples below. However, most of the DPM description is still valid.

Authors: Jean-Philippe Baud, James Casey, Gilbert Grosdidier, Sophie Lemaitre

Main Developers:

Abstract: The Disk Pool Manager (DPM) is a lightweight solution for disk storage management. The DPM is easy to install and to configure. It is available with both Mysql and Oracle as database backends. This document describes how to install and administer the DPM for both backends.

Last certified version: 1.6.3

Support: helpdesk@ggusNOSPAMPLEASE.org (remove the NONSPAM !)


Important

DPM version 1.6.3 provides support for SRMv2.2.

See DPM with SRMv2.2 support for how to upgrade to DPM 1.6.3.

Note: YAIM handles all the steps needed for the upgrade/first installation.


Introduction

Description

The Disk Pool Manager (DPM) has been developed as a lightweight solution for disk storage management. A priori, there is no limitation on the amount of disk space that the DPM can handle.

The DPM offers an implementation of the Storage Resource Manager (SRM) specifications, for version 1.1, version 2 and version 2.2. For details about the SRM specifications, see http://sdm.lbl.gov/srm-wg.

The DPM handles the storage on disk servers. In fact, it handles pools: a pool is a group of file systems, located on one or more disk servers. The way file systems are grouped to form a pool is up to the DPM administrator.

The DPM can handle two different kinds of file systems:

  • volatile : the files contained in a volatile file system can be removed by the system at any time, unless they are pinned by a user.
  • permanent : the files contained in a permanent file system cannot be removed by the system.

The DPM is very easy to install: only a few RPMs need to be installed. It is also very easy to configure: for instance, with one command, the DPM administrator can set the threshold at which volatile files that are not pinned by a user are removed automatically.

The DPM is security enabled: the basic GSI security stack (Globus RPMs, pool accounts, etc.) has to be installed on the DPM server machines, as well as on the disk servers.

DPM status as of August 2007:

DPM_poster.JPG


DPM Architecture

Description

The DPM consists of :

  • DPM socket client,
  • SRM server (srmv1 and/or srmv2) : receives the SRM requests and passes them on to the DPM server,
  • DPM server : keeps track of all the requests,
  • DPM name server (DPNS) : handles the namespace for all the files under the DPM control,
  • DPM RFIO server : handles the transfers for the RFIO protocol,
  • DPM Gridftp server : handles the transfers for the Gridftp protocol.

A little bit of DPM terminology:

  • The DPM head node refers to a single machine hosting the DPM, DPNS and SRM servers. It can also host the DPM database server (Mysql or Oracle).
  • The DPM disk servers or pool nodes correspond to the machines where the data is actually stored. The DPM head node can also serve as a disk server and store data.

The following diagram shows the typical configuration of the deployment:


DPM-small.jpg

Requirements

How many machines ?

The DPM, DPNS and SRM servers can either be installed on the same machine or on different ones. In this guide, we cover both possibilities.

The Database can be installed on the same machine as the DPM/DPNS server or on a different one (recommended if your institute has its own Database service).

The DPM client has to be installed on the SRM servers machine, and should be installed on the Worker Nodes (WNs) and the User Interfaces (UIs). It should be installed on the DPM and DPNS machine(s) as well.

The DPM Gridftp and RFIO servers have to be installed on each disk server managed by the DPM.

What kind of machines ?

Apart from the special case of a purely test setup, we recommend installing the DPM central servers (DPM, DPNS and SRM) on hardware that provides (at least):

  • 2Ghz processor with 512MB of memory (not a hard requirement)
  • Dual power supply
  • Mirrored system disk
  • Database backups

There have to be backups in place, in order to be able to recover the DPM metadata (namespace) in case of hardware problems. This is considered the responsibility of the local site and is outside the scope of this document.

Network interfaces

Important note:

If your machine has 2 (or more) network interfaces, it is important that the primary interface corresponds to the name returned by gethostname.
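A quick way to verify this (a sketch; the exact `getent` output depends on your resolver configuration):

```shell
# Compare the gethostname() result with what the resolver returns for it;
# the address should belong to the primary network interface.
hn=$(hostname)
echo "gethostname: $hn"
getent hosts "$hn" || echo "warning: $hn does not resolve"
```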

Which ports need to be open ?

The following ports have to be open :

  • DPM server: port 5015/tcp must be open locally at your site at least (can be incoming access as well),
  • DPNS server: port 5010/tcp must be open locally at your site at least (can be incoming access as well),
  • SRM servers: ports 8443/tcp (SRMv1), 8444/tcp (SRMv2) and 8446/tcp (SRMv2.2) must be opened to the outside world (incoming access),
  • RFIO server: port 5001/tcp and data ports 20000-25000/tcp must be open to the outside world (incoming access), in case your site wants to allow RFIO access from outside,
  • Gridftp server: control port 2811/tcp and data ports 20000-25000/tcp (or any range specified by GLOBUS TCP PORT RANGE) must be opened to the outside world (incoming access).

The RFIO and Gridftp data port ranges can be specified with the following variables (the delimiter must be a comma ","):

  • $RFIO_PORT_RANGE="20000,25000"
  • $GLOBUS_TCP_PORT_RANGE="20000,25000"
  • $GLOBUS_TCP_SOURCE_RANGE="20000,25000"


DPM CLI and API

A simple and detailed description of the DPM Command Line Interface (CLI) and Application Programming Interface (API) can be found here : DPM CLI and API


DPNS Client Timeouts

From version 1.5.8 onwards, timeouts and retries are implemented at the DPNS client level.

3 environment variables can be used:

  • $DPNS_CONNTIMEOUT -> sets the connect timeout in seconds
  • $DPNS_CONRETRY -> sets the number of retries
  • $DPNS_CONRETRYINT -> sets the retry interval in seconds

The default is to retry for one week.
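For example, to fail fast instead of retrying for a week (the values are illustrative, and the worst-case formula is a rough estimate assuming one initial attempt plus the configured retries):

```shell
export DPNS_CONNTIMEOUT=10   # connect timeout, in seconds
export DPNS_CONRETRY=2       # number of retries
export DPNS_CONRETRYINT=5    # interval between retries, in seconds
# Rough worst case: (retries + 1) connect attempts plus the waits in between.
echo "worst case: $(( (DPNS_CONRETRY + 1) * DPNS_CONNTIMEOUT + DPNS_CONRETRY * DPNS_CONRETRYINT )) seconds"
```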


VOMS and ACLs

For details about VOMS and ACLs in LCG Data Management, check VOMS and ACLs in Data Management.


File Systems

File systems must be created on each disk server managed by the DPM. A file system means a separate partition on each disk server, dedicated to files managed by the DPM.

Permissions

All the file systems have to have the following permissions :

ls -ld /data01

drwxrwx--- 3 dpmmgr dpmmgr 4096 Jun 9 12:14 data01
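Creating a file system with these permissions takes only a few commands. A minimal sketch, assuming the dpmmgr user already exists and you run it as root on the disk server (/data01 is the example path from above):

```shell
# Prepare a partition as a DPM-managed file system: owned by dpmmgr,
# mode 770 (drwxrwx---), as required above.
prepare_fs() {
  fs=$1
  mkdir -p "$fs"
  chown dpmmgr:dpmmgr "$fs" 2>/dev/null || echo "warning: could not chown $fs"
  chmod 770 "$fs"
}
# Example (run as root on the disk server):
# prepare_fs /data01
```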

Restriction

The file systems cannot be called "/dpm".

Otherwise, the DPM gets confused between the actual physical names and the logical name as in the DPM Name Server (/dpm/domain.name/home/...).


Security

The DPM is GSI security enabled. This implies several things.

Globus RPMs / configuration

Globus should be installed and configured as in LCG-2. In particular:

  • the standard Globus RPMs should be installed.
  • /opt/globus/lib should appear in /etc/ld.so.conf, and ldconfig be run.

Host certificate / key

On the DPM, DPNS, SRM servers and on each disk server managed by the DPM, copies of the host certificate and key have to be installed in /etc/grid-security/dpmmgr, as shown below.

Pay attention to the permissions and the names of the files!

$ ll /etc/grid-security/dpmmgr | grep dpm
-rw-r--r--    1 dpmmgr   dpmmgr         5423 May 27 12:35 dpmcert.pem
-r--------    1 dpmmgr   dpmmgr         1675 May 27 12:35 dpmkey.pem

This step is handled automatically by YAIM.
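If you are configuring by hand instead, the copies can be made along these lines (a sketch reproducing the layout and permissions shown above; run as root, and the dpmmgr account must already exist for the chown to succeed):

```shell
# Copy the host certificate/key to the dpmmgr location, with the
# names and permissions shown above (dpmcert.pem 644, dpmkey.pem 400).
install_dpm_certs() {
  gs=${1:-/etc/grid-security}
  mkdir -p "$gs/dpmmgr"
  cp "$gs/hostcert.pem" "$gs/dpmmgr/dpmcert.pem"
  cp "$gs/hostkey.pem"  "$gs/dpmmgr/dpmkey.pem"
  chmod 644 "$gs/dpmmgr/dpmcert.pem"
  chmod 400 "$gs/dpmmgr/dpmkey.pem"
  chown -R dpmmgr:dpmmgr "$gs/dpmmgr" 2>/dev/null \
    || echo "warning: could not chown to dpmmgr"
}
# install_dpm_certs            # uses /etc/grid-security by default
```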

IMPORTANT : the host cert and key still have to be kept in their original place and must belong to root!!!

$ ll /etc/grid-security/ | grep host
-rw-r--r--    1 root   root       5423 May 30 13:58 hostcert.pem
-r--------    1 root   root       1675 May 30 13:58 hostkey.pem

grid-mapfile / gridmapdir

On each machine, there needs to be a gridmapdir directory and a grid-mapfile :

ls -al /etc/grid-security/ | grep grid
drwxrwxr-T 2 root dpmmgr 8192 Mar 31 18:05 gridmapdir
-rw-r--r-- 1 root root 59499 Mar 31 15:03 grid-mapfile

If the DPM servers are installed via YAIM, they will automatically be created.

If manually installed, set the permissions on gridmapdir as follows :

chmod 1774 /etc/grid-security/gridmapdir

Note: If the gridmapdir directory resides somewhere else, the /etc/sysconfig/service_name file must be changed :

# - gridmapdir location
GRIDMAPDIR=/other/path/to/gridmap-directory

The corresponding server then has to be restarted.

Important: The dpmmgr user needs to have write access to the $GRIDMAPDIR directory.


Virtual Ids / VOMS

The DPM supports virtual Ids and VOMS.

For more details, refer to this page : Virtual Ids / VOMS


On the client

The user needs to have a valid proxy (and to exist in the DPM server grid-mapfile):

grid-proxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Fri Apr 1 22:38:07 2005

Otherwise, this error will appear :

dpns-ls /
send2nsd: NS002 - send error : No valid credential found
/: Could not establish context


DPM Installation via YAIM

Note: the YAIM discussion and description in this guide are out of date. A newer version of YAIM is now used with the gLite middleware: see the note at the top of this guide!

If you are not familiar with the YAIM tool, please refer to these pages:

The YAIM functions to install the DPM are available within the LCG-2_7_0 release.

Before running the YAIM commands, you have to create the dedicated disk partitions on the disk servers.

The site-info.def file has to contain the following variables :

  • $MY_DOMAIN -> your domain, ex : "cern.ch"
  • $DPM_HOST -> the DPM hostname, ex : "dpm01.$MY_DOMAIN"
  • $DPMPOOL -> the name of the pool, ex : "my-pool"
  • $DPM_FILESYSTEMS -> the dedicated file systems on each disk server, ex : "$DPM_HOST:/storage my-disk.$MY_DOMAIN:/data"
  • $DPM_DB_USER -> the DPM database user, ex : dpm
  • $DPM_DB_PASSWORD -> the DPM database user password
  • $DPM_DB_HOST -> the Mysql database host, or the Oracle database SID
  • $DPM_FSIZE -> the space to be reserved by default for a file stored in the DPM (if the client doesn't specify the file size), ex : 200M
  • $SE_ARCH -> should be "multidisk"

Optionally you can set :

  • $RFIO_PORT_RANGE -> open port range for the RFIO DPM server. Ex. "20000,25000". By default, the $GLOBUS_TCP_PORT_RANGE value is taken

The VO_myvo_DEFAULT_SE and VO_myvo_STORAGE_DIR variables are not used in the DPM case.

Then, on the DPM head node, run:

cd /opt/glite/yaim/scripts
./install_node my-site-info.def glite-SE_dpm_mysql
./configure_node my-site-info.def glite-SE_dpm_mysql

These commands will :

  • install the whole security stack (globus RPMs, CA RPMs, pool accounts, etc.),
  • install, configure and start the DPM, DPNS, SRMv1, SRMv2, RFIO, DPM Gridftp servers,
  • install the DPM client,
  • configure a unique pool, with all file systems specified,
  • configure the DPM head node appropriately.

And on each disk server, run:

./install_node my-site-info.def glite-SE_dpm_disk
./configure_node my-site-info.def glite-SE_dpm_disk

These commands will:

  • install the RFIO and DPM Gridftp servers,
  • configure the DPM disk servers appropriately.

For more details, please refer to the YAIM guide.


DPM Installation via RPMs

The CERN Grid Deployment group provides RPMs for the Disk Pool Manager.

The RPMs are available for Scientific Linux 3.

By default, the DPM RPMs are installed in the /opt/lcg directory. So, INSTALL_DIR refers to /opt/lcg by default.

We recommend that you install the DPM-client on each DPM server node.

The RPMs available are:

  • lcg-dm-common-VERSION-1_sl3.i386.rpm
  • DPM-client-VERSION-1sec_sl3.i386.rpm
  • DPM-name-server-mysql-VERSION-1sec_sl3.i386.rpm / DPM-name-server-oracle-VERSION-1sec_sl3.i386.rpm
  • DPM-server-mysql-VERSION-1sec_sl3.i386.rpm / DPM-server-oracle-VERSION-1sec_sl3.i386.rpm
  • DPM-srm-server-mysql-VERSION-1sec_sl3.i386.rpm / DPM-srm-server-oracle-VERSION-1sec_sl3.i386.rpm
  • DPM-rfio-server-VERSION-1sec_sl3.i386.rpm
  • DPM-gridftp-server-VERSION-1sec_sl3.i386.rpm


Dependencies

lcg-dm-common

All the DPM RPMs depend on the lcg-dm-common RPM. Install it as root :

rpm -Uvh lcg-dm-common-version-1.i386.rpm


CGSI_gSOAP

The DPM RPMs depend on CGSI gSOAP version 2.6 >= 1.1.15-6 :

rpm -Uvh CGSI_gSOAP_2.6-1.1.15-6.slc3.i386.rpm

VOMS

The DPM RPMs depend on :

glite-security-voms-api-c-1.6.16-0
glite-security-voms-api-1.6.16-0

Mysql Servers (DPM, Name Server and SRM)

The DPM, DPNS and SRM servers for Mysql depend on the Mysql-client RPM version 4.0.20 or higher. Install it as root :

rpm -Uvh Mysql-client-4.0.21-0.i386.rpm

From the gLite 3.0 release onwards, the LFC RPMs are built against Mysql 4.1, but they remain compatible with Mysql 4.0.

Note: automatic reconnection to the database for the DPM server only works with Mysql 4.1 or higher, as it uses the DUAL table. When using an earlier version of Mysql, the following error message will appear in /var/log/dpm/log:

07/20 09:53:46 23106,24 dpm_pingdb: mysql_query error: Table 'dpm_db.DUAL' doesn't exist

Oracle Servers (DPM, Name Server and SRM)

The DPM, DPNS and SRM servers for Oracle depend on the oracle-instantclient-basic-lcg RPM.

This RPM can also be found in the Grid Deployment AFS directory. Install it as root :

rpm -Uvh oracle-instantclient-basic-lcg-10.1.0.3-1.i386.rpm

If you install the DPM components on the same machine as the Oracle database, do not install the Oracle Instant Client but install the DPM-server-oracle RPM with the --nodeps option.

SRM Servers

The SRM servers RPMs also depend on CGSI gSOAP.


DPM, DPNS and SRM Servers Installation

The DPM, DPNS and SRM servers are available for:

  • Mysql
  • Oracle

Depending on the database backend chosen, install the corresponding RPMs as root (here DPM-server-mysql for instance):

rpm -Uvh DPM-server-mysql-version-1sec.i386.rpm

Create the DPMMGR User

Create the dedicated DPM user on each machine (DPM servers and disk servers):

useradd -c "DPM manager" -r -m -d /home/dpmmgr dpmmgr

The dpmmgr user is automatically created if you install the DPM via YAIM.

Important : currently, the dpmmgr user should have the same uid/gid on the DPM servers and disk servers. Important: if you change the dpmmgr uid/gid, restart all the daemons afterwards.

DPM and DPNS Hosts in the Sysconfig Files

Before running any daemon, create the /etc/sysconfig/service_name file by copying /etc/sysconfig/service_name.templ and set the $DPM_HOST and $DPNS_HOST with the appropriate values.

See below the /etc/sysconfig/srmv1.templ file :

# - DPM host : please change !!!!!!
export DPM_HOST=DPM_hostname

# - DPM Name Server host : please change !!!!!!
export DPNS_HOST=DPNS_hostname

Oracle Environment Variables

Depending on whether the DPM, DPNS and SRM servers are running on the same machine as the Oracle database, different environment variables have to be set.

Also, if they run on the same machine, the /etc/sysconfig/service_name files have to be modified.

Oracle and DPM servers on different machines

Note: This setup is not tested for every release, and is not recommended if a proper central database service exists in your institute.

If the Oracle database SID is different from DPM, or if the $TNS_ADMIN directory is located somewhere else, you have to modify the /etc/sysconfig/service_name files as follows before running the daemons :

# - Oracle Home :
#ORACLE_HOME=/usr/lib/oracle/10.1.0.3/client

# - Directory where tnsnames.ora resides :
TNS_ADMIN=/another/tns_admin/directory

# - Database name :
TWO_TASK=ANOTHER_SID

Oracle and DPM servers on the same machine

If the Oracle database SID is different from DPM, or if the $TNS_ADMIN directory is located somewhere else, you have to modify the /etc/sysconfig/service_name files as follows before running the daemons. You should also probably modify the $ORACLE_HOME variable to be the proper one.

# - Oracle Home :
ORACLE_HOME=/another/oracle_home/

# - Directory where tnsnames.ora resides :
TNS_ADMIN=/another/tns_admin/directory

# If the DPM server is installed on the same box as the Oracle instance,
# use the $ORACLE_SID variable instead of the $TWO_TASK one :
ORACLE_SID=ANOTHER_SID


Database Setup

In this section, we assume that you have a Mysql or Oracle database instance running on a given host.

It is recommended to install the Oracle/Mysql server on a separate machine if your institute has a separate Database service that will take care of the backups, etc.

If you install the DPM with YAIM, the following steps are done automatically.

Mysql

DPM and DPNS tables

Create the DPNS and DPM tables :

mysql -u root -p < INSTALL_DIR/share/DPM/create_dpns_tables_mysql.sql
mysql -u root -p < INSTALL_DIR/share/DPM/create_dpm_tables_mysql.sql

DPM Mysql Database User

Create the DPM appropriate user, called for instance dpm, with the correct privileges on cns_db and dpm_db:

mysql -u root -p
use mysql
GRANT ALL PRIVILEGES ON cns_db.* TO 'dpm'@DPNS_hostname IDENTIFIED BY 'dpm_password' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON cns_db.* TO 'dpm'@localhost IDENTIFIED BY 'dpm_password' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON dpm_db.* TO 'dpm'@DPM_hostname IDENTIFIED BY 'dpm_password' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON dpm_db.* TO 'dpm'@localhost IDENTIFIED BY 'dpm_password' WITH GRANT OPTION;

Specify different database names

The DPM and DPNS databases are called cns_db and dpm_db by default. But you can specify any other name. In this case, /opt/lcg/etc/NSCONFIG and /opt/lcg/etc/DPMCONFIG should look like :

dpm_user/dpm_password@host/database_name

instead of :

dpm_user/dpm_password@host

Oracle

DPM Oracle Database User

Create a dedicated user for the DPM, called DPM_USER for instance.

DPM and DPNS tables

Create the DPM and DPNS database schemas :

sqlplus DPM_USER/dpm_password@DPM < INSTALL_DIR/share/DPM/create_dpns_tables_oracle.sql
sqlplus DPM_USER/dpm_password@DPM < INSTALL_DIR/share/DPM/create_dpm_tables_oracle.sql

Note: DPM is the Oracle database SID in this example.

tnsnames.ora

By default, we assume that the tnsnames.ora file on the DPM, DPNS and SRM servers is located in /home/dpmmgr/.tnsadmin. If this is different, please modify the /etc/sysconfig/service_name files and set $TNS_ADMIN with the appropriate value before starting the servers.

Oracle minimal number of connections

The Oracle database backend should allow for enough connections. The DPM_USER should accept at least as many connections as there are DPM/DPNS threads.

If you run the DPNS with 50 threads, you should allow for at least 50 connections.

The DPM is started with 20 threads (not configurable).


DPM Client Installation

The DPM-client is required on each disk server managed by the DPM. We also recommend installing it on each DPM server machine.

To install the DPM client, just install the RPM as root :

rpm -Uvh DPM-client-VERSION-1sec.i386.rpm

Test the DPM client (after having installed, configured and started the DPM and the DPNS servers) :

grid-proxy-init

export DPM_HOST=DPM_hostname
dpm-qryconf

export DPNS_HOST=DPNS_hostname
dpns-ls /

These commands shouldn't return any error.

If pools have already been created, they should appear with the dpm-qryconf command. If directories/files have already been created in the DPM, they should appear with the dpns-ls command.


DPM-RFIO and DPM-Gridftp Installation

Description

The DPM-enabled Gridftp server, as well as the DPM-enabled RFIO server if your site supports this protocol, have to be installed on each disk server managed by the DPM. Both RPMs depend on DPM-client.

The DPM-enabled Gridftp server is a standard Gridftp server with special processing for files belonging to the dpmmgr user. In other words, if the disk servers require a Gridftp server to be installed for something other than the DPM, the DPM Gridftp server can be used for both purposes.

The files on the disk servers are owned by the dpmmgr user and cannot be accessed directly by any (other) user. Only access via RFIO or Gridftp is possible (in order to check the authorization with the DPNS).

Note: the DPM-rfio-server RPM conflicts with the CASTOR-client RPM. So, the CASTOR-client RPM should be removed if it is present, since the disk servers should be managed by the DPM only.

As opposed to CASTOR-client RFIO, the DPM RFIO server is GSI-enabled.

Installation

Once the DPM-client is installed, install the DPM-enabled Gridftp and/or RFIO servers as root on each disk server:

rpm -Uvh DPM-gridftp-server-VERSION-1sec.i386.rpm
rpm -Uvh DPM-rfio-server-VERSION-1sec.i386.rpm


Installing the DPM Servers on Different Machines

If the servers are installed on different machines, the /etc/shift.conf file has to be created in the following cases.

Note: There has to be only one blank or tab between each word in /etc/shift.conf.

DPM TRUST

  • If the SRM server is not installed on the same machine as the DPM server, the SRM host has to be trusted by it. The disk servers also have to be trusted by the DPM. So, on the DPM machine, specify the disk servers and the SRM hosts as trusted :

less /etc/shift.conf | grep DPM

DPM TRUST SRM_short_hostname1 SRM_full_hostname1 SRM_short_hostname2 SRM_full_hostname2 disk_server_short_hostname1 disk_server_full_hostname1 disk_server_short_hostname2 disk_server_full_hostname2...

Note : The part DPM TRUST disk_server1_short_name disk_server1_long_name is needed, otherwise the Gridftp operations won't work. For instance, you might get :

globus-url-copy file:/etc/group gsiftp://lxb1904/dpm/cern.ch/home/dteam/tests.sophie

error: the server sent an error response: 553 553 /dpm/cern.ch/home/dteam/tests.sophie: Permission denied.

DPNS TRUST

  • If the DPM and/or the SRM servers are not installed on the same machine as the DPNS server, they have to be trusted by it. The disk servers also have to be trusted by the DPNS. So, on the DPNS machine, specify the disk servers and the DPM host(s) and/or SRM host(s) as trusted :

less /etc/shift.conf | grep DPNS

DPNS TRUST DPM_short_hostname DPM_full_hostname SRM_short_hostname1 SRM_full_hostname1 SRM_short_hostname2 SRM_full_hostname2 disk_server_short_hostname1 disk_server_full_hostname1 disk_server_short_hostname2 disk_server_full_hostname2...

Note : The part DPNS TRUST disk_server1_short_name disk_server1_long_name is needed, otherwise the Gridftp/RFIO operations won't work.

RFIO TRUST

  • If the RFIOD server is not installed on the same machine as the DPM server, it has to be trusted by it. So, on the RFIOD machines, specify the DPM host as trusted :

less /etc/shift.conf | grep RFIOD

RFIOD TRUST DPM_short_hostname DPM_full_hostname
RFIOD RTRUST DPM_short_hostname DPM_full_hostname
RFIOD WTRUST DPM_short_hostname DPM_full_hostname
RFIOD XTRUST DPM_short_hostname DPM_full_hostname
RFIOD FTRUST DPM_short_hostname DPM_full_hostname

Important: This is currently also required on the DPM/DPNS/SRM servers, if they act as disk server.

Note: If you want to allow direct access to the pool nodes via Gridftp (i.e. not always go through the DPM server), you have to add :

RFIOD TRUST DPM_short_hostname DPM_full_hostname disk_server1_short_name disk_server1_long_name...


Machines on different subnets/domains

The DPM supports disk servers spread over different subnets/domains.

When adding the file systems to the DPM pools, make sure you specify the fully qualified host names :

dpm-addfs --poolname MyPool --server se.my.domain --fs /dedicated_file_system


DPM Configuration

The configuration of the DPM is done via the Command Line Interface on the DPM server machine. No central configuration file is needed by the DPM. Only two files containing the database connection information are required.

Database Configuration Files

Configuration Files by default

By default, the database configuration files are :

  • /opt/lcg/etc/NSCONFIG
  • /opt/lcg/etc/DPMCONFIG

If you have a different setup, you have to modify the /etc/sysconfig/dpnsdaemon and/or /etc/sysconfig/dpm files. For instance :

# - DPM configuration file :

DPMCONFIGFILE="/another/dpm/config/file"

For Mysql

Create the DPM and DPNS configuration files.

On the DPM host :

echo DPM_username/DPM_password@Mysql_server_hostname > /opt/lcg/etc/DPMCONFIG

On the DPNS host :

echo DPNS_username/DPNS_password@Mysql_server_hostname > /opt/lcg/etc/NSCONFIG

For Oracle

On the DPM host :

echo DPM_username/DPM_password@Oracle_Database_SID > /opt/lcg/etc/DPMCONFIG

On the DPNS host :

echo DPNS_username/DPNS_password@Oracle_Database_SID > /opt/lcg/etc/NSCONFIG

Create the DPM Namespace

As root, on the DPNS machine, run the following commands for each supported VO, replacing domain.name and VO with the appropriate values :

export DPNS_HOST=DPNS_hostname

dpns-mkdir /dpm
dpns-mkdir /dpm/domain.name
dpns-mkdir /dpm/domain.name/home
dpns-mkdir /dpm/domain.name/home/VO

dpns-chmod 775 /dpm
dpns-chmod 775 /dpm/domain.name
dpns-chmod 775 /dpm/domain.name/home
dpns-chmod 775 /dpm/domain.name/home/VO

dpns-entergrpmap --group VO
dpns-chown root:VO /dpm/domain.name/home/VO

dpns-setacl -m d:u::7,d:g::7,d:o:5 /dpm
dpns-setacl -m d:u::7,d:g::7,d:o:5 /dpm/domain.name
dpns-setacl -m d:u::7,d:g::7,d:o:5 /dpm/domain.name/home
dpns-setacl -m d:u::7,d:g::7,d:o:5 /dpm/domain.name/home/VO

This is handled automatically by YAIM.
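The sequence above can be wrapped in a small helper and run once per VO. A sketch reusing exactly the commands shown (cern.ch and dteam in the usage line are placeholders):

```shell
# Create and configure the DPM namespace entries for one VO.
setup_vo() {
  domain=$1; vo=$2
  for d in /dpm "/dpm/$domain" "/dpm/$domain/home" "/dpm/$domain/home/$vo"; do
    dpns-mkdir "$d" 2>/dev/null || true     # ignore "already exists" errors
    dpns-chmod 775 "$d"
    dpns-setacl -m d:u::7,d:g::7,d:o:5 "$d"
  done
  dpns-entergrpmap --group "$vo"
  dpns-chown "root:$vo" "/dpm/$domain/home/$vo"
}
# Example (as root on the DPNS host): setup_vo cern.ch dteam
```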

Add Disk Pools

You have to create the disk pools as root. For instance, let’s assume you want to create a volatile pool and a permanent one :

export DPM_HOST=DPM_hostname

dpm-addpool --poolname Volatile --def_filesize 200M --def_pintime 600 --s_type V
dpm-addpool --poolname Permanent --def_filesize 200M --s_type P

With these commands, both pools will reserve 200MB of space for each file by default, in case the client application does not provide the estimated size (lcg-utils commands do provide this information). And the volatile pool will keep a file pinned for at least 600 seconds.

Note: the name of a pool can be at most 15 characters long.

If you want to modify the characteristics of a pool later on, just use the dpm-modifypool command.

For instance, with the following command, the DPM will start deleting volatile files when only 10% of the space is left, and stop when 20% of the space is free. And 100MB will be reserved per file, instead of 200MB.

dpm-modifypool --poolname Volatile --def_filesize 100M --gc_start_thresh 10 --gc_stop_thresh 20

For more details about these commands, refer to the corresponding man pages.

You then have to add file systems to the pools you created. Let’s assume you have the following file systems :

  • /data01 on the machine se001.my.domain, that should be volatile,
  • /data02 on the machine se002.another.domain, that should be permanent and read-only.

As already mentioned, all the file systems have to have the following permissions :

ls -ld /data01
drwxrwx--- 3 dpmmgr dpmmgr 4096 Jun 9 12:14 data01

You have to make the DPM server aware of them:

export DPM_HOST=DPM_hostname

dpm-addfs --poolname Volatile --server se001.my.domain --fs /data01
dpm-addfs --poolname Permanent --server se002.another.domain --fs /data02

  • Note 1 : disk servers in different domains are supported.

  • Note 2 : as from LCG-2_7_0 (i.e. DPM >= 1.4.2), you should use the fully qualified host names of the disk servers.

  • Note 3 : /data01 and /data01/ are currently considered as two different file systems.

To see the current configuration of the DPM, use the dpm-qryconf command. Here is an example of its output :

dpm-qryconf

POOL Volatile DEFSIZE 200.00M GC_START_THRESH 0 GC_STOP_THRESH 0 DEFPINTIME 600 
PUT_RETENP 86400 FSS_POLICY maxfreespace GC_POLICY lru RS_POLICY fifo GID 0 S_TYPE V
CAPACITY 17.73G FREE 17.73G (100%)
se001.my.domain /data01 CAPACITY 17.73G FREE 17.73G (100.0%)

And that’s all! The configuration of the DPM is done. You only need to publish the DPM in the Information System, and the users can start using it!

Restrict a pool to one or several VOs/groups

By default, a pool is generic: users from all VOs/groups will be able to write in it.

But it is possible to restrict a pool to one or several VOs/groups. See the dpm-addpool and dpm-modifypool man pages.

For instance:

  • Possibility to dedicate a pool to several groups
       $ dpm-addpool --poolname poolA --group alice,cms,lhcb
       $ dpm-addpool --poolname poolB --group atlas

  • Add groups to existing list
       $ dpm-modifypool --poolname poolB --group +dteam

  • Remove groups from existing list
       $ dpm-modifypool --poolname poolA --group -cms

  • Reset list to new set of groups (= sign optional for backward compatibility)
       $ dpm-modifypool --poolname poolA --group =dteam

  • Add group and remove another one
       $ dpm-modifypool --poolname poolA --group +dteam,-lhcb

IMPORTANT:

Secondary groups are not supported at the pool level, so that the VOs / groups who actually use the space "get the bill at the end of the month". This is the same behaviour as in UNIX.

In other words, only the primary virtual gid of the user matters when writing. Thus, to dedicate a pool to an entire VO, you should add the VO subgroups and roles to the pool.

For instance:

$ dpm-addpool --poolname Pool-Ops --group ops,ops/Role=lcgadmin

When a pool explicitly dedicated to a virtual gid is full, the generic pool is then used (provided there is one).


Run the Daemons

Sysconfig Files

Before starting a service, create the corresponding /etc/sysconfig/service_name file by copying /etc/sysconfig/service_name.templ. You have to set $DPM_HOST and $DPNS_HOST where required, and you can modify any values that don't match your setup.
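The copy step can be scripted. A sketch, where the service list follows the servers used in this guide and files that already exist are left untouched:

```shell
# Create /etc/sysconfig/<service> from its .templ file if missing.
init_sysconfig() {
  dir=${1:-/etc/sysconfig}
  for svc in dpnsdaemon dpm srmv1 srmv2 srmv2.2 rfiod dpm-gsiftp; do
    if [ -f "$dir/$svc.templ" ] && [ ! -f "$dir/$svc" ]; then
      cp "$dir/$svc.templ" "$dir/$svc"
      echo "created $dir/$svc - now set DPM_HOST/DPNS_HOST in it"
    fi
  done
}
# init_sysconfig               # uses /etc/sysconfig by default
```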

Start the Servers

In this order, start the daemons :

  • On the DPNS server machine :
service dpnsdaemon start

  • On each disk server managed by the DPM :
service rfiod start

  • On the DPM and SRM server machine(s) :
service dpm start

service srmv1 start

service srmv2 start

service srmv2.2 start

  • On each disk server managed by the DPM :
service dpm-gsiftp start

To check whether a given service is running : service service_name status
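A loop like the following checks them all at once (a sketch; the service list matches the start order above):

```shell
# Report the status of every DPM-related service on this node.
check_dpm_services() {
  for svc in dpnsdaemon rfiod dpm srmv1 srmv2 srmv2.2 dpm-gsiftp; do
    if service "$svc" status >/dev/null 2>&1; then
      echo "$svc: running"
    else
      echo "$svc: NOT running"
    fi
  done
}
# check_dpm_services
```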


Number of threads

DPNS

By default, the DPNS daemon is started with 20 threads. To specify more or fewer threads, modify /etc/sysconfig/dpnsdaemon before running the daemon :

# - Number of DPNS threads :
NB_THREADS=50

Important : increasing the number of threads doesn't necessarily improve performance. Most DPNS operations are fast and don't occupy a thread for long. And on a dual CPU machine, only 2 active threads will be served at the same time. So, increasing the number of threads is useful for instance if there are usually many threads waiting for the database to respond (and only if the database is on a different machine...)

DPM

For the DPM daemon, there are :

  • 20 threads for the asynchronous processing of slow requests like get/put/copy,
  • 20 threads for the fast requests,
  • 1 garbage collector thread per pool,
  • 1 thread to remove expired puts.


DPM Log Files

Log Files by Default

By default, the log files are :

  • /var/log/dpns/log
  • /var/log/dpm/log
  • /var/log/srmv1/log
  • /var/log/srmv2/log
  • /var/log/rfiod/log
  • /var/log/messages (GridFTP server)

Except for the DPM GridFTP server, you can specify a different log file for each service. To do so, modify the corresponding /etc/sysconfig/service_name file. For instance:

# - RFIO log file :

RFIOLOGFILE=/another/rfio/log

Log Rotation

To prevent the log files from filling up the disk and stopping the DPM servers, the log files are automatically rotated every day. The logrotate configuration files are /etc/logrotate.d/service_name.
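As an illustration, a logrotate file of this kind typically looks like the sketch below. The exact contents shipped with your DPM version may differ; the rotation count and options here are assumptions, not the distributed defaults.

```
/var/log/dpm/log {
    daily
    rotate 30
    compress
    missingok
    notifempty
}
```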

Example of Logs

DPM and DPNS Logs

The DPM and DPNS logs are similar. Below we describe only the DPM log.

Each line contains:

  • a timestamp,
  • the process id of the daemon and the number of the thread taking care of the request,
  • the name of the method called,
  • the kind of request (put, get or copy),
  • the error number (POSIX error numbers),
  • useful information about the request (token/request id, file, etc.)

Here is an example of the DPM log:

05/24 22:10:41 3572,23 dpm_srv_inc_reqctr: DP092 - inc_reqctr request by /C=CH/O=CERN/OU=GRID/40.cern.ch
05/24 22:10:41 3572,23 dpm_serv: incrementing reqctr
05/24 22:10:41 3572,23 dpm_serv: msthread signalled
05/24 22:10:41 3572,23 dpm_srv_inc_reqctr: returns 0
05/24 22:10:41 3572,2 msthread: calling Cpool_assign_ext
05/24 22:10:41 3572,2 msthread: decrementing reqctr
05/24 22:10:41 3572,2 msthread: calling Cpool_next_index_timeout_ext
05/24 22:10:41 3572,2 msthread: thread 2 selected
05/24 22:10:41 3572,2 msthread: calling Cthread_mutex_lock_ext
05/24 22:10:41 3572,2 msthread: reqctr = 0
05/24 22:10:41 3572,4 dpm_srv_proc_put: processing request 48
05/24 22:10:41 3572,4 dpm_srv_proc_put: calling Cns_stat
05/24 22:10:41 3572,4 dpm_srv_proc_put: calling Cns_creatx
05/24 22:10:41 3572,4 dpm_srv_proc_put: calling dpm_selectfs
05/24 22:10:41 3572,4 dpm_selectfs: selected pool: Permanent
05/24 22:10:41 3572,4 dpm_selectfs: selected file system: lxb1540:/storage
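For quick inspection, the fields described above can be extracted with standard text tools. The sketch below takes one line from the sample log and pulls out the process id, thread number and method name; the awk field positions are based purely on the format shown above.

```shell
# One line taken from the sample DPM log above.
line='05/24 22:10:41 3572,4 dpm_srv_proc_put: processing request 48'

# Field 3 is "pid,thread"; field 4 is the method name (trailing colon).
pid=$(echo "$line"    | awk '{split($3, a, ","); print a[1]}')
thread=$(echo "$line" | awk '{split($3, a, ","); print a[2]}')
method=$(echo "$line" | awk '{sub(":", "", $4); print $4}')

echo "$pid $thread $method"   # 3572 4 dpm_srv_proc_put
```

The same approach works for the DPNS log, since the two formats are similar.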

SRM Logs

The SRM (v1 and v2) logs contain:

  • a timestamp,
  • the process id of the daemon and the number of the thread taking care of the request,
  • the DN of the user,
  • the kind of request (PrepareToPut, PrepareToGet, etc.),
  • the SRM error number,
  • useful information about the request (GUID, SURL, etc.)

Here is an example of the SRMv2 log:

12/03 15:54:51 17419 srmv2: started
12/03 16:54:54 17419,0 PrepareToPut: request by /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268 (18119,2688)
12/03 16:54:54 17419,0 PrepareToPut: SRM98 - PrepareToPut 3 bfc51f99-17bd-4a45-b440-0fc33ec7a8d4
12/03 16:54:54 17419,0 PrepareToPut: SRM98 - PrepareToPut 0 srm://lxb0722.cern.ch:8444//dpm/dteam/

The SRM error numbers are defined by the SRM interface definition. For more information about the SRM standard, please visit http://sdm.lbl.gov/srm-wg.


Space Management

From DPM 1.6.0, space can be reserved by:

  • the DPM administrator (as root),
  • any user.

Space reservations can be referred to by either:

  • a space token: a UUID created internally by the DPM. Ex: fe869590-b771-4002-b11a-8e7430d72911, or
  • a user space token description: a description provided by a user or the DPM admin. Ex: myspace

Whereas a space token is unique, a user space token description can correspond to several space tokens.

Reserve Space

The DPM administrator can reserve space for any given group.

  • Ex1: reserve 20G for the atlas group for an infinite amount of time
$ dpm-reservespace --gspace 20G --lifetime Inf --group atlas --token_desc Atlas_ESD

  • Ex2: reserve 100M for the dteam/Role=lcgadmin group for one hour
$ dpm-reservespace --gspace 100M --lifetime 1h --group dteam/Role=lcgadmin --token_desc dteam_lcgadmin

A user can only reserve space for himself/herself:

$ grid-proxy-init
$ dpm-reservespace --gspace 200M --token_desc sophie_28Feb2007

We recommend specifying a token description (token_desc) every time, as:

  • it is easier to remember,
  • it is being used by the client tools (FTS, lcg_util and GFAL).

Notes:

  • if the same token description is used several times (e.g. Atlas_ESD), you can use the actual space token (e.g. fe869590-b771-4002-b11a-8e7430d72911) instead, to distinguish one reservation from another.
  • user space token descriptions are case-sensitive.

Update Space

A user can update the space he/she reserved. The DPM administrator can update any given space.

  • Ex1: update reserved space to be valid for one month
$ dpm-updatespace --space_token fe869590-b771-4002-b11a-8e7430d72911 --lifetime 1m

  • Ex2: update reserved space to 5G
$ dpm-updatespace --token_desc myspace --gspace 5G

Release Space

A user can release the space he/she reserved. The DPM administrator can release any space.

If force is not specified, the space will not be released if it has files that are still pinned in the space.

  • Examples:
$ dpm-releasespace --space_token fe869590-b771-4002-b11a-8e7430d72911
$ dpm-releasespace --token_desc myspace --force


Integration of the DPM

Publishing the DPM in the Information System

To use the DPM in the LCG-2 environment, you have to publish it in the Information System.

Here is a page to help you do this: How to publish the LFC/DPM in the Information System. However, we don't guarantee that this page is up to date. In case of problems, refer to the appropriate documentation.

Check that the DPM server appears in the BDII by using the lcg-infosites command on a UI: lcg-infosites --vo dteam se --is BDII_hostname

The --is option specifies the BDII to query, in case it is not defined by the $LCG_GFAL_INFOSYS environment variable. So, the previous command is equivalent to:

export LCG_GFAL_INFOSYS=BDII_hostname

lcg-infosites --vo dteam se

Testing the Integration

On a UI (where lcg_util, lcg-gfal and DPM-client are installed), test that the DPM is correctly integrated, for instance by trying to copy and register an existing file (for instance, /tmp/hello1.txt):

export DPNS_HOST=DPNS_hostname

dpns-mkdir /dpm/cern.ch/home/dteam/mydir

lcg-cr -v -d srm://SRMv1_server_hostname:8443/dpm/cern.ch/home/dteam/mydir/hello1.txt --vo dteam file:/tmp/hello1.txt

Note: lcg_utils now supports SRMv1 and SRMv2.2.

The file should then appear in the DPM namespace:

dpns-ls /dpm/cern.ch/home/dteam/mydir


RFIO Usage Examples

rfdir / rfrm

To use rfdir with the DPM, the recipe is:

$ export DPNS_HOST=<my_dpns_host>
$ rfdir /dpm/cern.ch/home/dteam/

To use rfrm, you need to set DPM_HOST and DPNS_HOST:

$ export DPNS_HOST=<my_dpns_host>
$ export DPM_HOST=<my_dpm_host>

$ rfrm -r /dpm/cern.ch/home/dteam/tests_sophie


Going from a Classic SE to the DPM

Turning your Classic SE into a DPM is easy: it does not require moving the data in any way. You only need to make the DPM server aware of the files already present on your Storage Element. In other words, this is purely a metadata operation; no file movement is required at all.


Troubleshooting

See the Troubleshooting page.

In case you have questions or need help, do not hesitate to contact helpdesk@ggusNOSPAMPLEASE.org (remove the NOSPAM!)


Developers' Documentation

The Developers' documentation is under construction at: Developers' documentation.

-- SophieLemaitre - 08 Nov 2007


Topic revision: r1 - 2010-06-03 - RicardoBritoDaRocha