WARNING: This web is not used anymore. Please use PDBService.T3StorageArchitecture instead!
 

Veritas Volume Manager and T3 Arrays on the PDB cluster

Overview of the architecture

The PDB cluster is a Sun cluster with two nodes, sundb07 and sundb08.
Oracle runs on these two nodes in the Real Application Cluster (RAC) configuration.
Two mirrored T3 disk arrays are used for data storage, dbsct37 and dbsct38.
Each Sun node also has its own smaller local disk, where the O/S is installed.

Each T3 unit manages its 9 disks in the following way: eight disks are arranged in a RAID-5 configuration, while the remaining disk is available as hot standby.

The disk space for data storage on the T3 arrays is presented to Oracle through an intermediate layer of Veritas software products, the "Veritas Volume Manager (VxVM)" and the "Veritas Cluster Volume Manager (CVM)":

  • The Volume Manager provides advanced management of the disk space on the T3 disk arrays (on top of the functionality already offered by each T3): it makes it possible to divide the disk space into volumes, and it provides mirroring of data (with one copy on each T3 array).
  • The CVM extends the functionality of the Volume Manager to the two nodes in the cluster, so that both nodes can share the same volume resources (see the example after this list). The CVM runs on both sundb07 and sundb08: one node has the role of "Master" and the other that of "Slave".
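
As a hedged example (not taken from the original page), the disk groups known to VxVM on a node, and whether they are shared across the cluster by the CVM, can be listed as root with

vxdg list

where the disk groups managed by the CVM should be flagged as "shared" in the STATE column.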

Checking the configuration of each T3

The T3 disk arrays run a very simple UNIX-like O/S. It is possible to connect to them (as root) using telnet.
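
For example (a sketch using the array host names given above, assuming they resolve on the cluster nodes), the first array can be reached with

telnet dbsct37

and one can then log in as root when prompted.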

Checking the RAID-5 array configuration

As previously mentioned, each T3 is made up of nine disks, with eight arranged in a RAID-5 configuration and the ninth available as a hot standby. This can be checked with the following command

vol list
which will display
volume         capacity     raid  data      standby
v0             236.058 GB   5     u1d1-8    u1d9
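
The health of the individual components can also be checked (assuming the fru commands are available on the firmware installed on these arrays) with

fru stat

which reports the status of each disk drive and of the other field-replaceable units (controllers, power/cooling units, loop cards).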

RAID (Redundant Array of Inexpensive Disks) is a technique for combining several physical disks into larger logical units, with different trade-offs between capacity, performance and redundancy. There are various levels of RAID, starting with RAID-0 ("striping", where individual volumes can be created across multiple disks, but no redundancy is provided) and RAID-1 ("mirroring", where the same data is written in parallel to two disks, so that the redundancy provides higher availability).

The RAID-5 configuration implemented on our T3 disk arrays ensures that reading and writing to the data volumes defined on the eight disks remain possible even if one of the disks fails. For each stripe of data, seven of the disks hold data chunks while the eighth holds a "parity" chunk (the parity is distributed, so it is not always the same disk that holds it). The parity is the bitwise XOR of the seven data chunks, so any one chunk that becomes unavailable can be recomputed from the other six chunks and the parity.

In addition to the eight disks arranged in a RAID-5 array, the T3 keeps a ninth disk as a hot standby: if one of the eight disks fails, its contents are reconstructed using the RAID-5 parity and written to the standby disk, which then takes the place of the failed disk in a new RAID-5 array of eight disks.

Mirroring of data (RAID-1) is not set up internally in each T3: instead, data mirroring across the two T3 arrays is provided by the software layer sitting on top of the two arrays, the Veritas Volume Manager.

Checking the T3 volume slices

On each T3, the total data capacity of 236 GB from the RAID-5 array of eight disks is subdivided into two "slices". These are called "orahome" and "RACdg" and are used to store the Oracle software and the Oracle data respectively (the unnamed entry in the last line of the output below corresponds to a small amount of space left unallocated). This can be seen with the command

volslice list

which will display
Slice         Slice Num     Start Blk     Size Blks     Capacity      Volume
orahome       0             0             104876800       50.008 GB   v0
RACdg         1             104876800     390141696      186.033 GB   v0
-             -             495018496     34048            0.015 GB   v0

Using the Veritas Volume Manager

All operations using the Veritas Cluster Volume Manager should be performed on the "Master" node.

Finding out which node is the master

It is not possible to manually switch the Master and Slave roles between the two nodes: the CVM is configured automatically at every reboot, and it is not even possible to predict which node will be assigned the Master role. It is therefore necessary to find out which of the two nodes is the Master every time the CVM is used.

To do that, log in as root on the two cluster nodes sundb07 and sundb08 (you may want to start a bash shell instead of using the default sh shell). Let us assume in the following that sundb08 is the Master and sundb07 the Slave. The following commands

ssh root@sundb07
vxdctl -c mode
will return
mode: enabled: cluster active - SLAVE
while

ssh root@sundb08
vxdctl -c mode
will return
mode: enabled: cluster active - MASTER
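
As a small convenience (a sketch assuming ssh as root to both nodes works as in the examples above), the check can be run against both nodes in one go:

for node in sundb07 sundb08; do
  echo "${node}:"
  ssh root@${node} vxdctl -c mode
done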

Checking for errors

The full list of records in the Veritas volume setup can be printed using the command

vxprint | more
which will display a list like the following

TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
dg rootdg       rootdg       -        -        -        -        -       -
dm mirrordisk2  c1t1d0s2     -        71124291 -        -        -       -
dm rootdisk2    c1t0d0s2     -        71124291 -        -        -       -
v  global2      fsgen        ENABLED  615357   -        ACTIVE   -       -
pl global2-01   global2      ENABLED  615357   -        ACTIVE   -       -
sd rootdisk2-03 global2-01   ENABLED  615357   0        -        -       -
pl global2-02   global2      ENABLED  615357   -        ACTIVE   -       -
sd mirrordisk2-01 global2-02 ENABLED  615357   0        -        -       -
[...]
Unless there is a problem, all records should show one of the three values "-", "ACTIVE" or "LOGONLY" in the STATE column.
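
To spot problematic records more quickly, the output can be filtered. This is a minimal sketch, assuming the default column layout shown above (STATE is the seventh column):

vxprint | awk 'NF >= 7 && $1 != "TY" && $7 != "-" && $7 != "ACTIVE" && $7 != "LOGONLY"'

If this prints nothing, all records are in one of the healthy states.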

Using the graphical Volume Manager Storage Administrator

The easiest way to display and modify Veritas volumes is to use the graphical Volume Manager Storage Administrator (VMSA), which can be launched as root on the Master node using the following command.

vmsa &
After entering the root password for the Master node, this will open up a GUI with a button menu at the top, a navigator window on the left and a main window on the right. (NB: while many operations with the Veritas Volume Manager can be performed using sudo from the oracle account, the root password is needed to use the VMSA: a window asks for it as soon as the VMSA is launched.) Here is a snapshot of the VMSA GUI (see the attached vmsa_width700.gif):

The VMSA navigator window displays three "Disk groups" on the Master host: "RACdg", "homedg" and "rootdg". By clicking on the "Enclosures" tab it is possible to see that the first two disk groups are defined on the T3 disk arrays, while the third is defined on the local disk of the sundb08 node. In other words, the Veritas disk group and volume configuration is superimposed on top of the devices available to the cluster nodes, with its own naming schema (the use of "RACdg" for both a T3 slice and a Veritas disk group is a convenience, not a necessity).

At boot, the Solaris O/S on the two Sun nodes finds these devices and creates entries for them in /dev, whether or not Veritas is used. Veritas acts as a glue layer and defines its own additional special devices in /dev/vx on top of the standard T3 and local devices.
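
As a hedged check of the underlying devices (standard Solaris tooling, not specific to this setup), the disks and LUNs seen by the O/S can be listed as root with

format

where the list of available disk selections should include both the local disks and the LUNs exported by the T3 arrays; exit at the disk selection prompt (e.g. with Ctrl-C) without selecting anything.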

By clicking on the "File Systems" tab, it is possible to see that only the volumes in homedg and rootdg have mount points in the file system: those in RACdg, instead, are not mounted because they are used by Oracle as raw devices to define tablespaces.

The Veritas entities most relevant to Oracle are the Veritas "volumes" in the RACdg disk group. These can be displayed by clicking on the "Volumes" tab. In our setup, raw Veritas volumes are treated by Oracle just like datafiles. Each raw volume is associated with a different tablespace (although a tablespace can have several raw volumes, just like it can have several datafiles). Our convention is to give the raw volume a name starting with "pdb01_" and continuing with the tablespace name.
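
The same list can be obtained from the command line (a hedged alternative to the GUI) by running, as root on the Master node,

vxprint -g RACdg -v

which prints only the volume records of the RACdg disk group. Each such volume should correspond to a raw device node under /dev/vx/rdsk/RACdg, which is the path Oracle uses as a datafile.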

The "# copies" column in the table displayed in the window on the right is equal to 2 for all volumes in the RACdg disk group. This indicates that all volumes are mirrored across the two T3 arrays. More details about the layout of a volume can be obtained by right-clicking on the volume name and selecting the "Show Layout" tab. For the pdb_data01 volume, for instance, associated to the Oracle DATA01 tablespace, this will display the following layout:

This indicates that the pdb_data01 volume is composed of three Veritas "plexes". Two of these plexes, of equal size 1 GB (the size of the raw volume as seen by Oracle), are the mirrored data copies, one on each of the two T3 disk arrays. The third plex, only a few kB in size, is the Dirty Region Logging (DRL) plex, which Veritas uses to keep track of which regions of the volume have writes in progress, so that after a crash only those regions need to be resynchronised between the two mirrors.
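
The same layout can be inspected from the command line (a hedged equivalent of the GUI view) with

vxprint -g RACdg -ht pdb_data01

which lists the volume together with its plexes and subdisks in tabular form.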

Details on how to create new Veritas volumes and define Oracle tablespaces on them are given in the following section on "Veritas volume and Oracle tablespace creation on PDB01".
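
As a very rough illustration only (not the documented PDB01 procedure, which is described in the section referenced above), a 1 GB mirrored volume with a DRL log could in principle be created on the Master node with a command of this form, where pdb01_mytbs is a hypothetical name following the convention above:

vxassist -g RACdg make pdb01_mytbs 1g layout=mirror nmirror=2 logtype=drl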

(Page compiled by Andrea Valassi on 2004-02-04 from invaluable conversations with Magnus Lubeck)

Topic attachments
  • vmsa_width700.gif (snapshot of the VMSA GUI)
  • vmsa_layout_width700.gif (layout of the pdb_data01 volume)