Multipathing with QLogic HBAs - failover and LUN load balancing
This is a short guide to configuring multipathing with QLogic HBAs for the physics database services at CERN. The key software and hardware components are: Oracle 10gR2, RAC, ASM, Linux RHEL 3, dual-CPU Intel servers, SAN network, SATA disks in FC arrays (see also Reference Architecture). SAN and FC (2 Gb) infrastructure: QLogic dual-ported HBAs (QLA2342), 2 x QLogic SAN switches, dual-ported Infortrend FC arrays (A16F-S1211).
Note: tested with RHEL 3 U6, QLA2342 driver=7.05.00-RH1-f0, firmware=3.03.11, bios=1.47. SANsurfer=2.0.30b69
In this document, multipathing refers to a SAN configuration that provides path failover redundancy and load balancing:
- redundant paths to access the storage arrays
- failover at the LUN level when the path from the server to the SAN switch fails (improved HA)
- load balancing at the LUN level (improved scalability and performance)
*Note regarding RHEL 4*: The qla2xxx_conf module is missing from the 'RHEL 4 rpms', at least up to update 4. The module can be downloaded from the QLogic website and compiled (tested, it works). Without qla2xxx_conf, the multipathing described here cannot be configured on RHEL 4. Device mapper multipathing can be used instead (it is a completely different implementation, though; see the DB installation procedure in this wiki).
OS configuration files
Multipathing for QLogic HBAs is configured via /etc/modules.conf and /etc/qla2300.conf. It is also important to check that both the qla2300_conf and qla2300 kernel modules are loaded (e.g. lsmod | grep qla23).
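A plain grep for qla23 succeeds even when only one of the two modules is present. The sketch below checks for each module by name. It assumes the usual `lsmod` layout with the module name in the first column; `check_qla_modules` is a hypothetical helper name, not part of any QLogic tool.

```shell
# Read lsmod-style output on stdin and succeed only if both
# qla2300 and qla2300_conf appear as module names (first column).
check_qla_modules() {
    awk '
        $1 == "qla2300"      { drv = 1 }
        $1 == "qla2300_conf" { conf = 1 }
        END {
            if (!drv)  print "qla2300 is not loaded"
            if (!conf) print "qla2300_conf is not loaded"
            exit !(drv && conf)
        }
    '
}

# Usage on a live node:
#   lsmod | check_qla_modules && echo "both modules loaded"
```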
After LUN load balancing has been configured, /etc/modules.conf should contain the following entries:
alias scsi_hostadapter0 qla2300_conf
alias scsi_hostadapter1 qla2300
options qla2300 ConfigRequired=1 ql2xuseextopts=1 ql2xfailover=1 qlport_down_retry=1 ql2xretrycount=2
FC and SAN configuration
- Change the PID number of the storage array controllers: for each array, use the same PID for both controllers.
- Configure zoning in both SAN switches and make the storage arrays visible via both HBA ports of the Linux servers.
Qlogic HBA configuration
The /etc/qla2300.conf configuration can be done using Qlogic SANsurfer for HBA (downloadable from the Qlogic website).
- install the agent of SANsurfer for HBA on the nodes you need to configure
- modprobe qla2300 if needed
- disable autostartup of the qlogic agent
- chkconfig qlremote off
- /etc/init.d/qlremote stop
- rmmod qla2300; modprobe qla2300
- /etc/init.d/qlremote start
- Install and run the SANsurfer management console and connect to the node you want to configure
- Don't run the node configuration wizard
- Click the 'configure' icon
- if needed, enable failover by ticking the corresponding flag under the File menu
- DEVICE menu
- LUN menu
- Save the configuration (default_pass='config')
- Check the configuration in /etc/qla2300.conf and /etc/modules.conf.
/etc/init.d/qlremote stop
rmmod qla2300
rmmod qla2300_conf
modprobe qla2300
- Test the stability of the configuration across reboot
- Recreate the init ramdisk: rm /boot/initrd-2.4.21-37.ELsmp.img; mkinitrd /boot/initrd-2.4.21-37.ELsmp.img 2.4.21-37.ELsmp (customize with current kernel name)
- if you are using devlabel, reconfig the mapping before reboot
- reboot and check that the new configuration is being picked up
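The ramdisk step above hard-codes the kernel name. A minimal sketch that builds the same two commands for an arbitrary kernel version is shown below; it only prints them so they can be reviewed before being run as root. `initrd_commands` is a hypothetical helper name, and the image naming /boot/initrd-<version>.img is taken from the example above.

```shell
# Print the ramdisk rebuild commands for a given kernel version.
# With no argument, the running kernel (uname -r) is used.
initrd_commands() {
    kver=${1:-$(uname -r)}
    img="/boot/initrd-$kver.img"
    printf 'rm %s\n' "$img"
    printf 'mkinitrd %s %s\n' "$img" "$kver"
}

# Usage:
#   initrd_commands                  # commands for the running kernel
#   initrd_commands 2.4.21-37.ELsmp  # commands for a specific kernel
```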
Post install - sample qla2300.conf configuration
This is a sample /etc/qla2300.conf configuration file for a node configured with 3 disk arrays, using dual-ported HBAs on the server and dual-ported arrays. Note: extra new lines and comments have been added for clarity.
scsi-qla0-adapter-port=210000e08b1481b8\; -> WWN identifier of the first local HBA port (port 1)
scsi-qla0-tgt-0-di-0-node=200000d0230e0671\; -> WWNN: storage array controller (array N.1 of 3)
scsi-qla0-tgt-0-di-0-port=210000d0230e0671\; -> WWPN: storage array 1/3 controller port 1
scsi-qla0-tgt-0-di-0-pid=020cef\; -> ?
scsi-qla0-tgt-0-di-0-preferred=0000000000000000000000000000000000000000000000000000000000015555\;
-> command related to the activation LUN load balancing
scsi-qla0-tgt-0-di-0-control=00\;
-> storage array N.1 is bound to port 0 (but LUN load balancing provides
the desired alternate access path)
scsi-qla0-tgt-1-di-0-node=200000d02350000e\; -> WWNN array N.2/3
scsi-qla0-tgt-1-di-0-port=210000d02350000e\; -> WWPN array N.2/3
scsi-qla0-tgt-1-di-0-pid=020e33\;
scsi-qla0-tgt-1-di-0-preferred=0000000000000000000000000000000000000000000000000000000000015555\;
scsi-qla0-tgt-1-di-0-control=00\;
scsi-qla0-tgt-2-di-0-node=200000d0231e0279\; -> WWNN array N.3/3
scsi-qla0-tgt-2-di-0-port=210000d0231e0279\; -> WWPN array N.3/3
scsi-qla0-tgt-2-di-0-pid=020fe8\;
scsi-qla0-tgt-2-di-0-preferred=000000000000000000000000000000000000000000000000000000000000aaaa\;
scsi-qla0-tgt-2-di-0-control=00\;
scsi-qla1-adapter-port=210100e08b3481b8\; -> WWN identifier of the second local HBA port (port 2)
scsi-qla1-tgt-0-di-1-node=200000d0230e0671\; -> WWNN: same as above, array controller (array N.1 of 3)
scsi-qla1-tgt-0-di-1-port=220000d0230e0671\; -> WWPN: storage array 1/3 controller port 2
scsi-qla1-tgt-0-di-1-pid=010cef\; -> ?
scsi-qla1-tgt-0-di-1-preferred=000000000000000000000000000000000000000000000000000000000000aaaa\;
scsi-qla1-tgt-0-di-1-control=80\;
-> port 1 is failover for storage array N.1
scsi-qla1-tgt-1-di-1-node=200000d02350000e\;
scsi-qla1-tgt-1-di-1-port=220000d02350000e\;
scsi-qla1-tgt-1-di-1-pid=010e33\;
scsi-qla1-tgt-1-di-1-preferred=000000000000000000000000000000000000000000000000000000000000aaaa\;
scsi-qla1-tgt-1-di-1-control=80\;
scsi-qla1-tgt-2-di-1-node=200000d0231e0279\;
scsi-qla1-tgt-2-di-1-port=220000d0231e0279\;
scsi-qla1-tgt-2-di-1-pid=010fe8\;
scsi-qla1-tgt-2-di-1-preferred=0000000000000000000000000000000000000000000000000000000000015555\;
scsi-qla1-tgt-2-di-1-control=80\;
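The 'preferred' values in the sample look like hexadecimal LUN bitmaps: 15555 has bits 0, 2, 4, ... 16 set and aaaa has bits 1, 3, ... 15 set, which matches the even/odd split between the two HBA ports in the sample /proc output in this document. That reading is an assumption inferred from the sample, not something confirmed by QLogic documentation. Under that assumption, a mask can be decoded as sketched below (`preferred_luns` is a hypothetical helper name).

```shell
# Print the bit positions set in a hexadecimal mask, lowest first.
# Assumption: bit n set means LUN n is preferred on that path.
# The mask is handled digit by digit, so 64-digit values are fine.
preferred_luns() {
    printf '%s\n' "$1" | awk '
    {
        mask = tolower($0)
        n = length(mask)
        out = ""; sep = ""
        # Walk the hex digits from the rightmost (least significant) one.
        for (i = n; i >= 1; i--) {
            d = index("0123456789abcdef", substr(mask, i, 1)) - 1
            if (d < 0) exit 1          # not a hex digit
            for (b = 0; b < 4; b++) {
                if (int(d / (2 ^ b)) % 2 == 1) {
                    lun = (n - i) * 4 + b
                    out = out sep lun
                    sep = " "
                }
            }
        }
        print out
    }'
}

# Usage:
#   preferred_luns 0000000000015555
#   preferred_luns aaaa
```

With the sample values, the first call lists the even LUNs 0 to 16 and the second the odd LUNs 1 to 15.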
Post install - sample /proc output
A sample from the /proc/scsi/qla2300/1 file is reported below. The list of SCSI LUN information in /proc/scsi/qla2300/2 should be empty. The last column of the list reports, for each LUN, the HBA port that is in use: for example, '0:0:81' means port 1, while '1:1:81' means port 2. In case of failover these values change accordingly (see also below):
SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0: 1): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0: 2): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0: 3): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0: 4): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0: 5): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0: 6): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0: 7): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0: 8): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0: 9): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0:10): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0:11): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0:12): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0:13): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0:14): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
( 0:15): Total reqs 5, Pending reqs 0, flags 0x2, 1:1:81,
( 0:16): Total reqs 5, Pending reqs 0, flags 0x2, 0:0:81,
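To see at a glance how the LUNs are spread over the two paths, the last column of the listing above can be tallied. This is a sketch that assumes the line layout shown in the sample (the path is the last field, followed by a trailing comma); `luns_per_path` is a hypothetical helper name.

```shell
# Count LUN entries per path value (last column) in
# /proc/scsi/qla2300/N style output read from stdin.
luns_per_path() {
    awk '
        /Total reqs/ {
            path = $NF
            sub(/,$/, "", path)   # drop the trailing comma
            count[path]++
        }
        END {
            for (p in count) print p, count[p]
        }
    ' | sort
}

# Usage:
#   luns_per_path < /proc/scsi/qla2300/1
```

With balanced LUNs the two counts should be close to each other; after a failover all LUNs are expected to show up under a single path value.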
Post install - sample failover log
In case of LUN failover the following information is logged in the /var/log/messages file:
..... kernel: qla2x00: FAILOVER device 0 from 210000d0230e0671 -> 220000d0230e0671 - LUN 0b, reason=0x1
..... kernel: qla2x00: FROM HBA 0 to HBA 1
The failover is triggered only when the LUN on the failed path is accessed, while the failback is triggered as soon as the failed path is restored:
..... kernel: qla2x00: FAILBACK device 0 -> 200000d0230e0671 LUN 0b
..... kernel: qla2x00: FROM HBA 1 to HBA 0
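When testing failover, it can help to extract only these events from the log. The sketch below relies solely on the message format of the sample lines above, so other driver versions may log differently; `qla_path_events` is a hypothetical helper name.

```shell
# Print one line per FAILOVER/FAILBACK message found on stdin,
# in the form "<EVENT> LUN <id>".
qla_path_events() {
    awk '/kernel: qla2x00: (FAILOVER|FAILBACK) device/ {
        event = ""; lun = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "FAILOVER" || $i == "FAILBACK") event = $i
            if ($i == "LUN") lun = $(i + 1)
        }
        sub(/,$/, "", lun)   # the failover line has "LUN 0b," with a comma
        print event, "LUN", lun
    }'
}

# Usage:
#   qla_path_events < /var/log/messages
```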
Configuration changes
If a storage array needs to be replaced and/or the persistent configuration stored in /etc/qla2300.conf needs to be changed, follow this procedure:
/etc/init.d/qlremote stop
mv /etc/qla2300.conf /tmp/qla2300.conf.OLD
vi /etc/modules.conf -> CHANGE in the qla2300 line -> ConfigRequired=0
rmmod qla2300
rmmod qla2300_conf
modprobe qla2300
/etc/init.d/qlremote start
- use SANsurfer to configure /etc/qla2300.conf (load balancing will not be possible at this stage) and save the configuration (as explained above).
/etc/init.d/qlremote stop
vi /etc/modules.conf -> CHANGE BACK in the qla2300 line -> ConfigRequired=1
rmmod qla2300
rmmod qla2300_conf
modprobe qla2300
/etc/init.d/qlremote start
- use SANsurfer to configure /etc/qla2300.conf including load balancing and save the configuration.
/etc/init.d/qlremote stop
rmmod qla2300
rmmod qla2300_conf
modprobe qla2300
- Recreate the init ramdisk: rm /boot/initrd-2.4.21-37.ELsmp.img; mkinitrd /boot/initrd-2.4.21-37.ELsmp.img 2.4.21-37.ELsmp
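The two ConfigRequired edits in the procedure above can also be made without an interactive editor. The following is a sketch only: it works as a filter and assumes the "options qla2300 ..." line has the form shown earlier in this document. `set_config_required` is a hypothetical helper name.

```shell
# Filter: copy modules.conf from stdin to stdout, setting
# ConfigRequired to the given value (0 or 1) on the
# "options qla2300 ..." line only. Other lines are unchanged.
set_config_required() {
    case "$1" in
        0|1) ;;
        *) echo "usage: set_config_required 0|1" >&2; return 2 ;;
    esac
    sed "/^options[[:space:]][[:space:]]*qla2300[[:space:]]/s/ConfigRequired=[01]/ConfigRequired=$1/"
}

# Usage (review the result before replacing the real file):
#   set_config_required 0 < /etc/modules.conf > /tmp/modules.conf.new
#   diff /etc/modules.conf /tmp/modules.conf.new
```

Writing to a temporary file first keeps the original intact until the change has been checked.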
Update qla2300_conf and qla_opts utility
- the persistent LUN configuration (multipathing) is stored in /etc/qla2300.conf. When this file is changed manually, qla_opts needs to be run to update the kernel module. The same is true when the kernel version is changed or updated
- Example: ./qla_opts -w --file=/lib/modules/2.4.21-47.ELsmp/kernel/drivers/addon/qla2200/qla2300_conf.o qla2300_conf
- RHEL3_qla_opts: qla_opts for RHEL3
- qla_opts: qla_opts, qlogic utility to update qla2xxx module. Compiled for RHEL 4
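Since qla_opts has to be rerun after every kernel change, the module path in the example above can be derived from the kernel version. The sketch below only prints the command. It assumes that the module lives under the same kernel/drivers/addon/qla2200/ directory for every kernel, as in the example; that may not hold on other installations. `qla_opts_command` is a hypothetical helper name.

```shell
# Print the qla_opts invocation for a given kernel version.
# With no argument, the running kernel (uname -r) is used.
qla_opts_command() {
    kver=${1:-$(uname -r)}
    printf './qla_opts -w --file=/lib/modules/%s/kernel/drivers/addon/qla2200/qla2300_conf.o qla2300_conf\n' "$kver"
}

# Usage:
#   qla_opts_command 2.4.21-47.ELsmp
```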
History: Version 1, Luca.Canali - AT- cern.ch, February 2006
V 1.1 Added - how to change config, Luca, May 2006
V 1.2 Added - how to update qla2300_conf, Luca, Nov 2006