SAN Switch Replacement in AIX Environments
The purpose of this document is to describe the concepts and procedures used to replace SAN switches in an AIX Power environment. This includes direct attached or VIO attached storage, and with VIO both the VSCSI and NPIV cases. The article will first discuss dynamic tracking as this is important for making SAN changes. Then we’ll look at SAN switch replacement in direct attached storage, VIO VSCSI attached storage, and VIO NPIV attached storage environments, including single SAN fabric and dual SAN fabric environments. We’ll examine this from an MPIO environment, and consider how this applies to other multi-path code last.
Dynamic tracking and LUN configuration
In environments with SAN switches, one will normally want to set certain attributes for the fscsi devices, specifically the dyntrk and fc_err_recov attributes. By default, these are set to no and delayed_fail respectively and assume the server is not attached to a SAN switch. This is important because the procedures to replace a switch are quite different depending on these settings.
Without dyntrk=yes, you will have to remove disk devices and reconfigure them. This means that any hdisk attribute settings you have changed will be undone, and you’ll have to change them again. With dyntrk=yes, you do not have to remove the hdisk device definitions and you won’t lose changes to the disk attributes. Disk attributes that are often changed include the reserve_policy for SCSI reserves, and the queue_depth for performance.
Here’s how to look at these attributes:
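For example, for fscsi0 (run this for each fscsi device; the descriptions in the output may vary slightly by AIX level):

# lsattr -El fscsi0 -a dyntrk -a fc_err_recov
dyntrk       no           Dynamic Tracking of FC Devices        True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True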
You should set these as follows:
via this command if no disks are in use:
# chdev -l <fscsi#> -a dyntrk=yes -a fc_err_recov=fast_fail
or if the disks are in use:
# chdev -l <fscsi#> -a dyntrk=yes -a fc_err_recov=fast_fail -P
and then reboot for the changes to take effect. Thus, these changes are not dynamic for the LPAR. Preferably, these attributes are set as recommended when the LPAR is first installed and set up.
Note that at VIOSs (VIO Servers), if a LUN is mapped from the VIOS to a VIOC (VIO Client) as a VSCSI disk, then the disk is in use, even if the VIOC isn’t using the disk. So in a typical dual-VIOS environment, one would make this change to one VIOS, reboot it, then make the change to the other VIOS and reboot it.
Lacking these attribute settings, AIX includes information about the specific port on the switch, as part of the LUN configuration. Thus, to use a different port on the switch or another switch entirely, one will have to actually remove the disk definition (via a # rmdev -dl <disk>), move the cables to the new switch and run cfgmgr. This also means stopping use of the disk and applications using it.
For more details on these settings see the documentation at http://www-1.ibm.com/support/docview.wss?uid=isg1520readmefb4520desr_lpp_bos and in the information center at http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.kernelext/doc/kernextc/fcp_overview.htm and http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/fast_fail_dynamic_interaction.htm
Can SAN switch replacements be done dynamically?
Provided dyntrk and fc_err_recov are properly set, the answer is yes, as long as one ensures that at least one working path remains for each hdisk at all times. In some cases with only one path, we can also do this dynamically provided we move the cable fast enough; however, this is discouraged. Fast enough means that the time from unplugging the cable from a switch port, to plugging it into a port on the new switch, plus the time for the SAN fabric to recognize the new cabling, is less than 15 seconds, so the IOs don’t time out and fail. So when only one path to the disk exists, SAN switch replacements are preferably done during maintenance windows; if, for example, a cable isn’t properly seated or a port is defective, then IOs can fail, leading to problems.
Know your paths and the cables they use
As the previous paragraph makes clear, you want to make sure that a working path exists when dynamically migrating from one SAN switch to another. So you need to know what paths exist to your disks and the cables involved. To that end, it’s important to understand that a path is uniquely described by the host port and the storage port used by the path, and that ports are uniquely identified via a WWPN (World Wide Port Name), which is 16 hexadecimal digits. How one determines this depends on the multi-path code used for the storage, and this article initially focuses on MPIO environments (which includes storage using SDDPCM as the multi-path code, since SDDPCM uses MPIO under the covers). To list the paths for your disks with MPIO, use the lspath command as follows (here for hdisk2):
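For example (the storage WWPNs shown are those discussed in this example; the LUN ID portion of each connection value, here 1000000000000, is illustrative):

# lspath -l hdisk2 -F"status parent connection"
Enabled fscsi0 203900a0b848dda,1000000000000
Enabled fscsi0 20180a0b8478dda,1000000000000
Enabled fscsi1 203900a0b848dda,1000000000000
Enabled fscsi1 20180a0b8478dda,1000000000000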
This shows hdisk2 has 4 paths: two from fcs0 (the parent device of fscsi0) and two from fcs1, going to two separate ports on the storage, identified via the WWPN of the storage port: 203900a0b848dda or 20180a0b8478dda. From this we can also conclude that we’re only using one SAN fabric for this LUN (and probably for the LPAR as well, which can be verified by checking that the paths for all LUNs look similar), since both host ports connect to both storage ports. Thus, the cabling would look like:
In VIO VSCSI environments, you’d run the lspath commands on the VIOS (VIO Server) in the oem_setup_env shell, as we’re concerned about paths from the VIOS to the storage. In a VIO NPIV environment, one would run the lspath commands on the VIOC, plus one will need to know the vFC-to-real-FC adapter mapping.
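On the VIOS, one way to see the vFC-to-physical mapping is the lsmap command in the padmin restricted shell, which lists each vfchost server adapter, the client partition it serves, and the backing physical fcs port:

$ lsmap -all -npiv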
Identifying adapter and port locations
It will be important to know which cables connect to which ports on the storage, host, and SAN switches. Since this document discusses SAN switch replacement, perhaps the easiest method is to obtain the host and storage WWPNs for the ports and then, from the switch management interface, determine the ports to which they are connected. From the host side, you can list the port location code and WWPN via the following command:
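For example, for fcs0 (the location code and WWPN shown are illustrative; the WWPN appears in the Network Address field, and the output is abbreviated here):

# lscfg -vl fcs0
  fcs0   U78C0.001.XXXXXXX-P2-C2-T1   8Gb PCI Express Dual Port FC Adapter
        Network Address.............10000000C9XXXXXX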
Or for a description of the adapter and its location code:
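For example (location codes and serial numbers illustrative):

# lscfg | grep fcs
+ fcs0   U78C0.001.XXXXXXX-P2-C2-T1   8Gb PCI Express Dual Port FC Adapter
+ fcs1   U78C0.001.XXXXXXX-P2-C2-T2   8Gb PCI Express Dual Port FC Adapter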
These are dual port adapters, and by checking the Finding Parts Locations and Addresses manual for the specific system model (these manuals are available in the Power Hardware Information Center at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp) one can determine the specific slot location for the adapter. In this case the system model is a 9179-MHB, and the P2-C2 in the location field indicates that the adapter is in slot 2 of the system unit. Then T1 refers to the top port and T2 to the bottom port. These adapters also have an identify light, so one can go into the diagnostics menu and make the light flash to more easily locate the adapter.
Also note that it is possible that the fcs0 and fscsi0 don’t refer to the same port, so you can’t rely on the numbers for the devices. You can see the relationship via the location codes, e.g.:
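For example (location codes illustrative):

# lscfg -l fcs0
  fcs0     U78C0.001.XXXXXXX-P2-C2-T1   8Gb PCI Express Dual Port FC Adapter
# lscfg -l fscsi0
  fscsi0   U78C0.001.XXXXXXX-P2-C2-T1   FC SCSI I/O Controller Protocol Device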
This shows fcs0 is related to fscsi0 via the location codes.
Be aware if you are using an active/passive disk subsystem
An active/passive disk subsystem is one in which one controller of a pair is used to handle all IOs to a LUN except in failure conditions. Examples of active/passive IBM storage include the DS3000, DS4000, DS5000, SVC and Storwize V7000. The reason this is important, is that it’s preferable to not lose access to the primary controller for the LUNs during a switch replacement, or to at least be aware that LUNs will switch controllers if all paths to the primary controller are lost during the switch replacement. Usually half the LUNs have one storage controller as the preferred controller, with the other half of the LUNs using the other controller. So if only one cable is used per controller, this means that LUNs will fail over to the other controller if the server is doing IOs to the storage during the switch replacement.
It’s also possible to use RDAC for DS4000 storage, which requires that host adapterA is connected to storage controllerA, and host adapterB to storage controllerB, without any cross connections. Be aware that from AIX 6.1 onward, MPIO is strategic and preferred. One can choose the multi-path code used for DS3/4/5000 storage via the manage_disk_drivers command available in AIX (note that there are also requirements from the storage side for MPIO).
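A sketch of its use (the storage family and driver option strings vary by AIX level, so list the supported values first; the DS4K/AIX_APPCM values here are illustrative, and a reboot is then required for the change to take effect):

# manage_disk_drivers -l
# manage_disk_drivers -d DS4K -o AIX_APPCM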
Later when all paths are restored, one will normally want the storage administrator to switch the LUN back to the preferred controller. If all paths to one controller will be lost during a switch replacement, then it’s recommended that the storage administrator move all IO handling to the controller that will be accessible prior to moving a cable.
Should you use ISLs to facilitate SAN switch replacement?
Connecting two switches via Inter Switch Links (ISLs) joins two switches into a fabric. Before going into ISLs, it’s important to know that one should have dynamic tracking enabled prior to adding a switch to a fabric via an ISL as lacking that setting might cause IOs to be lost.
In the case where we have ISLs, we can move the cables in any order, and provided we do so quickly enough and properly seat the cables, IOs will at worst be slightly delayed. For the non-ISL environment, we have to use more care. First, we can’t move the cables in any order. We have to move a server cable, then a storage cable; otherwise, the server will lose access to the storage. When we move cable E, any IOs using that cable will fail, and we’ll have to rely on the multi-path code at the server to redirect the IO to use cable F. This delay will be longer, as the IO must time out. If we move cable G first, and fc_err_recov=fast_fail and the switch supports fast fail, then the switch will inform the adapter driver that the port no longer has access to the storage, and the multi-path code will immediately redirect the IOs to use cable H, so this will be less delay than when moving a server side cable.
Of course, if we stop the application and IOs, then we can move the cables without worrying about doing it quickly or regarding the order cables are moved. So we can see that ISLs facilitate switch replacement here, though the option of stopping IO entirely avoids some of the work required.
Why it’s better to disable paths prior to cable movement
While we can use the path availability facilities to handle lost paths during a switch replacement, it’s preferable to disable paths prior to moving cables, for two reasons. First, in-flight IOs will be delayed if we don’t disable the paths first. This delay might result in the application stalling while IOs time out and are re-initiated down available paths, or, with active/passive disk subsystems, while the storage moves IO processing from one controller to another. Second, and perhaps more importantly, the recovery portions of the code might have bugs which could result in IO failures. Given the matrix of multi-path code versions, storage firmware/microcode, and adapter firmware, it’s difficult to test all possible combinations of code and failures. If you’ve tested path failure yourself, observing failure detection, handling of in-flight IOs, and path recovery, then you can be much more confident that the code will work correctly.
How to disable and re-enable paths
The command to disable or enable paths for IO is the chpath command, e.g.:
# chpath -l <hdisk#> -p <parent> -w <connection> -s [enable|disable]
The lspath command previously mentioned will provide the parent and connection information. Alternatively, to enable or disable all paths through a specific port for an hdisk, one can use:
# chpath -l <hdisk#> -p <parent> -s [enable|disable]
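For example, to take the hdisk2 paths through fscsi0 out of use before moving that port’s cable, verify that the remaining paths are Enabled, and re-enable them afterward (device names illustrative):

# chpath -l hdisk2 -p fscsi0 -s disable
# lspath -l hdisk2
# chpath -l hdisk2 -p fscsi0 -s enable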
VIO VSCSI environments
Here are two diagrams of a VIO Client (VIOC) using VSCSI to access SAN attached storage through a pair of VIO Servers (VIOSs) in a dual SAN fabric environment showing two cabling strategies:
It’s important to realize there are 2 layers of multi-path code here. MPIO is always used at the VIOC for choosing a path to the VIOSs. The multi-path code at the VIOS depends on what the storage requires. From the VIOC, each LUN has two paths (one to each VIOS). From the VIOS, there are potentially 8 paths to a LUN in example 1, and potentially 4 paths to a LUN in example 2. Besides having more paths, there is an availability difference between the two diagrams. Example 1 can continue running after the failure of both a VIOS and a SAN fabric. Example 2 can survive such a double failure only if the right combination of VIOS and fabric fails. Thus, you’ll typically see cabling similar to example 1.
There is a difference in how one would disable paths in these two cases. For example 1, one preferably disables/enables paths at the VIOSs when replacing a switch. For example 2, one can simply disable the paths to the VIOS attached to the switch being replaced. And one can just disable all paths for a fibre channel port attached to the SAN switch being replaced, if the multi-path code provides this capability.
VIO NPIV environments
Here are two examples of a VIOC using NPIV through two VIOSs in a dual SAN fabric to access SAN attached storage:
Here there is only one layer of multi-path code, and it resides in the VIOC. In both examples, there are 8 potential paths per LUN. However, example 3 has superior availability characteristics in that a VIOS and a SAN fabric can fail without losing access to the storage, while in example 4 access could be lost if a VIOS and the SAN fabric the other VIOS uses both fail. All path management commands are therefore done from the VIOC. And in both cases one can just disable all paths for a fibre channel port attached to the SAN switch being replaced, if the multi-path code provides this capability.
Multi-path code other than MPIO
There are other multi-path code sets besides MPIO, and often MPIO isn’t an option, as the storage vendor dictates what must be used for their storage. Each multi-path code set has its own commands for handling path management, but the concepts previously discussed still apply.
For example, one can use SDDPCM (which is compliant with the MPIO architecture) and still use the MPIO commands; however, you may find using the pcmpath command to be easier to accomplish your objectives. SDD is another multi-path code set from IBM (though SDDPCM is strategic) and one can use the datapath command for path management. PowerPath is a common option for customers attaching EMC storage to Power, in which case one typically uses the powermt command for path management.