How to migrate a DPM on SL3 to SL4

It is mandatory to read the whole page before performing any operations. You will have TO SCHEDULE A DOWNTIME. ALL THE SERVICES RELATED TO DPM WILL NEED TO BE STOPPED IN THE CORRECT ORDER. You may also need to stop your MySQL server if its host needs to be reinstalled with SL4.

1. Introduction

The objective of this page is to give guidelines on safely migrating a DPM installed on SL3 to SL4. This document is meant for DPM site administrators. A DPM installation has 3 main components, which may or may not be installed on the same machine:
  • the DPM head node on which the daemons (dpm, dpns, srm*) run.
  • the DPM MySQL server (DPM and DPNS databases).
  • the DPM disk servers where rfio/gridftp servers run.

2. Preamble

Before going further, you need a clear picture of your DPM installation and of which components will be affected by the move to SL4:
  • Are the MySQL server and the DPM server (or headnode) installed on the same machine?
  • What is the role of each machine involved in your DPM set up? Is it a DPM disk server? Is it the DPM headnode?
  • Which machines {MySQL server, DPM server, DPM disk servers} will be reinstalled with SL4?
  • Which DPM version is currently installed? Is the equivalent version available for SL4?
  • Even if it is not recommended, are there any other services on this machine (Torque server, etc.)? Keep in mind that this document only covers the DPM; for the other services, please ask their respective support.
  • If you have a shared area, use it to keep all your backups.

3. Structure of the document

The document describes the files which need to be backed up and saved for each DPM component. Depending on your DPM set up and on which machines will be reinstalled with SL4, some sections can be skipped. For instance, if only the machine which hosts the DPM headnode will move to SL4, you can skip the DPM disk server section (4.2). If this machine also hosts the MySQL server, you will have to read both the DPM headnode (4.1) and the DPM MySQL (4.3) sections. If you have an unusual DPM set up or if you are not sure of something, don't hesitate to send a mail to the DPM support.

4. What needs to be saved per component?

Please follow the order of the subsections below.

4.1 DPM headnode

In this section, the DPM head node means the set of machines on which the srmv2.2, srmv2, srmv1, dpm and dpnsdaemon daemons run.

4.1.1 Config files

On the DPM head node, most of the configuration files are generated by Yaim, so they will be regenerated automatically when you install your DPM via Yaim. However, a few files need to be backed up and put in a safe place, i.e. one not affected by the SL4 installation:
  • the host certificate (hostcert.pem and hostkey.pem, located under /etc/grid-security/) must be saved to avoid having to request a new one. The copy under /etc/grid-security/dpmmgr will be regenerated by Yaim. If your daemons run on different machines, you need to keep a copy of all the host certificates.
  • the log files of the daemons must be backed up: /var/log/dpm/* (where the DPM daemon runs), /var/log/dpns/* (where the DPNS daemon runs), /var/log/srmv1 (where srmv1 runs), /var/log/srmv2 (where srmv2 runs), /var/log/srmv2.2 (where srmv2.2 runs). As a reminder, log files have to be kept for 90 days. All these log files belong to dpmmgr (owner and group owner, with rights 644 for the files and 755 for the directories). Use tar cpf <archive_name>.tar <directory_name> to keep the permissions.
  • the /etc/passwd and /etc/group files (for all the distinct machines) must be kept so that you can afterwards retrieve the mapping between uid and username and between gid and groupname (useful for the reserved pool accounts and the dpmmgr account).
  • if the Yaim config files are located on your DPM head node, back up all the Yaim DPM config files you have customized, such as site-info.def, users.conf, groups.conf, etc.
Besides the configuration files, you need to check that the current DPM version matches the one you are going to install for SL4. It is important to verify this, especially if there is an update in the table schema.
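
The backup of the files listed above could be sketched as follows. This is only an illustration: the function name backup_headnode, the archive name and the second "root" argument (handy for trying it on a copy) are assumptions, and the path list must be adapted to the daemons that actually run on each machine.

```shell
#!/bin/sh
# Sketch: archive the head node files listed above into one tarball.
# backup_headnode and the archive name are hypothetical; adapt the path list to your site.

backup_headnode() {
    backup_dir=$1   # safe location, not touched by the reinstallation
    root=${2:-/}    # root of the tree to read from (normally /)
    archive="$backup_dir/dpm_headnode_$(uname -n).tar"

    # Relative paths, so that the archive can later be unpacked from /.
    paths=""
    for p in etc/grid-security/hostcert.pem etc/grid-security/hostkey.pem \
             etc/passwd etc/group \
             var/log/dpm var/log/dpns var/log/srmv1 var/log/srmv2 var/log/srmv2.2
    do
        # Skip paths that do not exist here (daemons may be split across hosts).
        [ -e "$root/$p" ] && paths="$paths $p"
    done

    [ -n "$paths" ] || { echo "nothing to back up under $root" >&2; return 1; }

    # -p keeps the permissions, -C makes the stored paths relative to $root.
    # $paths is intentionally unquoted: the list above contains no spaces.
    tar -cpf "$archive" -C "$root" $paths && echo "$archive"
}
```

For example, backup_headnode <safe_directory> run as root prints the name of the archive it created. The archive contains hostkey.pem, so keep it readable by root only.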

4.1.2 Stopping the services

You will need to stop the following services:
  • service srmv2.2 stop
  • service srmv2 stop
  • service srmv1 stop
  • service dpm stop
  • service rfiod stop
  • service dpnsdaemon stop
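
A minimal sketch of a helper that stops the head node daemons in one go is given below. It assumes the three SRM daemons named in section 4.1 (srmv2.2, srmv2, srmv1) followed by dpm, rfiod and dpnsdaemon; the function name and the DRY_RUN variable are hypothetical conveniences, not part of DPM.

```shell
#!/bin/sh
# Sketch: stop the head node daemons in a fixed order.
# Set DRY_RUN=1 to print the commands instead of running them.

stop_headnode_services() {
    rc=0
    for svc in srmv2.2 srmv2 srmv1 dpm rfiod dpnsdaemon; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "service $svc stop"
        else
            # Report a failure but carry on: a daemon may simply not be
            # installed on this machine if the head node is split across hosts.
            service "$svc" stop || { echo "could not stop $svc" >&2; rc=1; }
        fi
    done
    return $rc
}
```

Running it first with DRY_RUN=1 shows exactly which commands would be issued, which is a cheap check before the downtime starts.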

4.2 DPM disk servers

In common set ups, the rfio and dpm-gsiftp daemons run on the disk servers. You need to check that the partition affected by the OS reinstallation (to SL4) doesn't contain any files stored and managed by the DPM. If it does, go to the Handle Files stored/managed by DPM subsection (4.2.2).

4.2.1 Config files

Find below the list of files which need to be backed up and put in a safe place (not affected by the reinstallation, and secure, as some of the files may be sensitive). Use the tar command with the p option to keep the file permissions:
  • the log files located on all disk servers (/var/rfio/log, /var/log/dpm-gsiftp/, possibly /var/log/xrootd/ and those of any other transfer protocols installed). These log files belong to root (owner and group owner).
  • if the Yaim config files are located on your DPM disk servers, back up all the Yaim DPM config files you have customized, such as site-info.def, users.conf, groups.conf, etc.
  • the /etc/passwd and /etc/group files of all disk servers, to retrieve the mapping between uid and username and between gid and groupname.
  • the host certificates of all disk servers.

4.2.2 Handle Files stored/managed by DPM

If you have files which will be affected by the reinstallation because they are located on the same partition as the OS, one way to save them is to make tar files. The main advantages of the tar command are that it is recursive and that it keeps the file permissions when the p option is given (tar cpf <archive_name>.tar <directory_name>). Then transfer this tarball to a safe place (secure, as the data may be sensitive, and not affected by the reinstallation of SL4).
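
As an illustration, archiving a data directory and checking that nothing was left out might look like the sketch below. The function name and both paths are placeholders, and the entry count is only a basic sanity check, not a content comparison.

```shell
#!/bin/sh
# Sketch: archive a DPM data directory and check that every entry
# found on disk is also listed in the archive.

archive_and_verify() {
    src=$1       # directory holding DPM-managed files
    archive=$2   # destination tarball on a safe partition

    parent=$(dirname "$src")
    name=$(basename "$src")

    # c: create, z: gzip, p: keep permissions, f: archive file.
    # -C stores paths relative to the parent directory.
    tar -czpf "$archive" -C "$parent" "$name" || return 1

    # Compare the number of entries on disk with the number in the archive.
    on_disk=$(cd "$parent" && find "$name" | wc -l)
    in_tar=$(tar -tzf "$archive" | wc -l)
    if [ "$on_disk" -eq "$in_tar" ]; then
        echo "OK: $in_tar entries archived"
    else
        echo "MISMATCH: $on_disk on disk, $in_tar in archive" >&2
        return 1
    fi
}
```

Because the paths are stored relative to the parent directory, the archive has to be unpacked from that same parent directory after the reinstallation.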

4.2.3 Stopping the services

You will need to stop the following services:
  • service rfiod stop
  • service dpm-gsiftp stop

4.3 DPM MySQL

In this section we describe how to move the MySQL DB from SL3 to SL4 safely. This step needs to be performed if the MySQL server and the DPM headnode are located on the same machine, or if you plan to reinstall a MySQL server which is located on a different machine from the DPM head node. If your MySQL server runs on a different machine from the DPM head node and you don't plan to reinstall the OS of the MySQL host, you can skip this section.

4.3.1 A brief reminder

The DB is split into 2 parts :
  • DPM DB which contains information related to the DPM configuration, to the space management and to the handling of requests
  • DPNS DB, the name server database, which contains all information about the replicas and metadata. All paths to the replicas of a file are stored in this DB.

4.3.2 DB backup

N.B.: if your MySQL server is used by services other than DPM, you will also need to back up their databases. Please ask the support of the services involved. In this subsection, we focus only on the DPM service.
  • IT IS VERY IMPORTANT TO MAKE A BACKUP OF THE DPM AND DPNS DB BEFORE ANY REINSTALLATION.
  • For that, the DPM server (headnode) must be stopped, as for any DPM migration. See 4.1.2 for more details on how to stop your DPM properly.
  • Make a backup of the dpm and dpns DBs with: mysqldump --databases <dpm_db_name> <dpns_db_name> -u <username> -p > DPM_MySQL_BCKUP.sql
  • Compress the backup if it is too big.
  • Put the file DPM_MySQL_BCKUP.sql in a secure place or on a partition which will not be affected by the reinstallation.
  • Also put your /etc/my.cnf in a safe place if you have set specific parameters for MySQL.
  • Optional: if needed, you can keep the MySQL log file, usually named <hostname>.err.
  • Stop MySQL: service mysql stop
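
The compression step can be combined with a basic integrity check, as in the sketch below. The function name is hypothetical, and md5sum is used here only to detect corruption during copies; any checksum tool available on both SL3 and SL4 would do.

```shell
#!/bin/sh
# Sketch: compress a finished dump and record a checksum next to it.

pack_dump() {
    dump=$1   # e.g. DPM_MySQL_BCKUP.sql

    # An empty dump usually means that mysqldump failed
    # (wrong password, wrong database name, ...).
    [ -s "$dump" ] || { echo "$dump is missing or empty" >&2; return 1; }

    dir=$(dirname "$dump")
    base=$(basename "$dump")

    gzip -9 "$dump" || return 1      # replaces the file with $dump.gz
    gzip -t "$dump.gz" || return 1   # check that the compressed file is readable

    # Store the checksum with a relative file name, so that it can be
    # verified wherever the two files are copied together.
    ( cd "$dir" && md5sum "$base.gz" > "$base.gz.md5" )
}
```

After copying the files back, md5sum -c DPM_MySQL_BCKUP.sql.gz.md5 run in the directory holding them confirms that the archive is intact, and gunzip DPM_MySQL_BCKUP.sql.gz restores the plain SQL file before it is imported.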

5. SL4 Installation

Before and after the installation:
  • Before the installation, don't forget to tell the sysadmin not to touch the storage partitions or the ones which contain backups of config files.
  • After the installation, copy the right host certificates back onto the corresponding machines.
  • Use the saved /etc/passwd and /etc/group as a reference to recreate the service and user accounts; do not copy them directly over the new files. Warning: some service account names may have changed from SL3 to SL4. Also don't forget to run pwconv and grpconv to recreate the shadow files for users and groups.
  • Copy back and untar all the log files you backed up via tar on their corresponding machines. Make sure you run tar from the top of the tree that was archived, as tar preserves the hierarchy. For the log files, tar xvf has to be executed from /.
  • Copy back and untar all the files stored/managed by the DPM onto the right disk server. Make sure that the hierarchy is preserved and that the files belong to dpmmgr (user and group). If necessary, you can fix the ownership with the following command line: chown -R dpmmgr:dpmmgr <directory_to_update>. As for the log files, make sure that you untar at the proper level.
  • Copy the MySQL backup back to the machine which hosts the MySQL server.
  • Install the Java SDK before starting to use Yaim. See Yaim for more information.
  • If needed, copy the Yaim config files you have backed up back onto the right machines. Put them in the right place so that Yaim can find them.
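
To see which accounts need attention when recreating users from the saved files, a comparison along the following lines may help. It is a sketch only: the function name is hypothetical, and it merely reports differences without modifying anything.

```shell
#!/bin/sh
# Sketch: report accounts whose uid differs between the saved SL3 passwd
# file and the freshly installed SL4 one, or which are missing on SL4.

compare_passwd() {
    old=$1   # backed-up /etc/passwd from SL3
    new=$2   # current /etc/passwd on SL4

    awk -F: '
        NR == FNR     { uid[$1] = $3; next }   # first file: remember the SL4 uids
        !($1 in uid)  { print "missing: " $1 " (uid " $3 ")"; next }
        uid[$1] != $3 { print "changed: " $1 " " $3 " -> " uid[$1] }
    ' "$new" "$old"
}
```

Since /etc/group also keeps the name in field 1 and the numeric id in field 3, the same function can be pointed at the saved and the new group files (the "uid" in the output then stands for the gid). A changed id for dpmmgr or for pool accounts matters because files on the storage partitions keep the numeric ids they had on SL3.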

6. Installing/Configuring your DPM via YAIM

6.1 Installation

Normally you only have to change the repository to point to the one containing the SL4 RPMs. The other parameters are not affected. You may have to run the installation on the DPM head node and on the DPM disk servers separately. See the Yaim documentation for more details.

6.2 Importing the data into MySQL (if your MySQL server is not affected by the reinstallation to SL4, you can skip this section)

The MySQL server will be installed via Yaim. Note that in SL4 the MySQL service is called mysqld and not mysql. If the DPM version you want to install is higher than the one you ran on SL3, it is highly recommended to load the data into MySQL first, before configuring with Yaim. This matters when the table schema has been upgraded: Yaim will then properly upgrade the table schema you had in SL3 if necessary. To perform this operation:
   * service mysqld start
   * mysql -u root (no password is needed when it is done from the localhost)
    mysql> set password for 'root'@'localhost'=password('root_pwd');
    Query OK, 0 rows affected (0.00 sec)

   * mysql> source DPM_MySQL_BCKUP.sql 
At this point, you have set the root password, so that Yaim can configure MySQL, and you have restored the content of the DBs as it was on SL3.
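
Before sourcing the dump, it can be worth checking that it really contains both databases. The sketch below assumes the dump was produced with mysqldump --databases, which writes a USE statement for each database; the function name is hypothetical and the database names are whatever you used on SL3.

```shell
#!/bin/sh
# Sketch: check that a dump made with "mysqldump --databases" contains
# a USE statement for each expected database.

dump_has_databases() {
    dump=$1
    shift
    rc=0
    for db in "$@"; do
        # Accept the name with or without backquotes around it.
        if grep -Eq "^USE \`?$db\`?;" "$dump"; then
            echo "found: $db"
        else
            echo "not found: $db" >&2
            rc=1
        fi
    done
    return $rc
}
```

For example: dump_has_databases DPM_MySQL_BCKUP.sql <dpm_db_name> <dpns_db_name>. A non-zero exit status means that at least one database is absent from the file, in which case the import should not be attempted.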

6.3 Configuration

Normally there is nothing to change in site-info.def. See the Yaim documentation for more details. After the configuration, all the daemons should be up. At this point, you should have the same DPM and DPNS DB content as on SL3. You can also check that the partitions used for storage belong to dpmmgr (user and group owner).
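
The ownership check mentioned above could be done with find, as in this sketch. The function name is hypothetical; the user and group default to dpmmgr and can be overridden.

```shell
#!/bin/sh
# Sketch: list every entry under a storage directory that is not owned
# by the expected user and group (dpmmgr by default).

list_wrong_owner() {
    dir=$1
    user=${2:-dpmmgr}
    group=${3:-dpmmgr}
    # Print entries whose owner or group differs from the expected one.
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}
```

An empty output from list_wrong_owner <storage_directory> means that everything belongs to dpmmgr. Any path that is printed can then be fixed with chown, after checking why its ownership changed.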

7. Problems, questions, comments

Q1. I don't have enough space to make a tar on the machine. What can I do?

A1: You can type the following command (from the DPM head node / disk server / MySQL server), provided that you have an ssh client and a host where you can store the files:


tar czpf - <directory_to_tar> | ssh <login>@<host_where_to_put_the_archive> cat ">" <name_of_the_archive.tgz> 
It may prompt you for the password of the account. You can also specify the location of your ssh key via ssh -i, and the SSH protocol version via ssh -1 or ssh -2.

To get it back and untar it, you can type the following command (from the DPM head node/disk server/ MySQL server):

ssh <login>@<host_where_to_put_the_archive_stored> cat "<" <name_of_the_archive.tgz> |tar xfz -
Don't hesitate to report problems, ask questions or make comments to DPM support to improve the quality of this page.

-- LanaAbadie - 15 Oct 2007

Topic revision: r6 - 2007-12-14 - LanaAbadie
 