Archive for

June 22nd, 2012

...

SAN Switch Replacement in AIX Environments

Overview

The purpose of this document is to describe the concepts and procedures used to replace SAN switches in an AIX Power environment. This includes direct attached or VIO attached storage, and with VIO both the VSCSI and NPIV cases. The article will first discuss dynamic tracking as this is important for making SAN changes. Then we’ll look at SAN switch replacement in direct attached storage, VIO VSCSI attached storage, and VIO NPIV attached storage environments, including single SAN fabric and dual SAN fabric environments. We’ll examine this from an MPIO environment, and consider how this applies to other multi-path code last.

Dynamic tracking and LUN configuration

In environments with SAN switches, one will normally want to set certain attributes for the fscsi devices, specifically the dyntrk and fc_err_recov attributes. By default, these are set to no and delayed_fail respectively and assume the server is not attached to a SAN switch. This is important because the procedures to replace a switch are quite different depending on these settings.

Without dyntrk=yes, you will have to remove disk devices and reconfigure them. This means that any hdisk attribute settings you have changed will be undone, and you’ll have to change them again. With dyntrk=yes, you do not have to remove the hdisk device definitions and you won’t lose changes to the disk attributes. Disk attributes that are often changed include the reserve_policy for SCSI reserves, and the queue_depth for performance.

Here’s how to look at these attributes:
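The listing that originally followed is not reproduced in this copy. As a minimal sketch, the attributes can be displayed with lsattr and checked against the recommended values. The device name fscsi0 and the helper name check_fscsi_attrs are assumptions for illustration, and the helper assumes the usual lsattr -El column layout (attribute, value, description, user-settable).

```shell
# Hypothetical helper: reads `lsattr -El <fscsi#>` output on stdin and reports
# whether dyntrk and fc_err_recov carry the recommended values.
# Exit status: 0 = both as recommended, 1 = at least one differs,
# 2 = an attribute was not found in the input.
check_fscsi_attrs() {
  awk '
    $1 == "dyntrk"       { seen++; if ($2 != "yes")       { print "dyntrk=" $2 " (want yes)"; bad = 1 } }
    $1 == "fc_err_recov" { seen++; if ($2 != "fast_fail") { print "fc_err_recov=" $2 " (want fast_fail)"; bad = 1 } }
    END {
      if (seen < 2) { print "attribute(s) missing from input"; exit 2 }
      if (bad) exit 1
      print "ok"
    }'
}

# On AIX (fscsi0 is an assumed device name):
#   lsattr -El fscsi0 -a dyntrk -a fc_err_recov
#   lsattr -El fscsi0 | check_fscsi_attrs
```

Running the check for every fscsi device before the switch work starts gives a quick indication of which procedure applies to the LPAR.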

You should set these as follows:

dyntrk=yes
fc_err_recov=fast_fail

via this command if no disks are in use:

# chdev -l <fscsi#> -a dyntrk=yes -a fc_err_recov=fast_fail

or if the disks are in use:

# chdev -l <fscsi#> -a dyntrk=yes -a fc_err_recov=fast_fail -P

and then reboot to make the changes go into effect. Thus, these changes are not dynamic for the LPAR when disks are in use. Preferably these attributes are set as recommended when the LPAR is installed and set up.
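An LPAR usually has more than one FC port, and the change is needed on each fscsi device. The following is a minimal sketch that prints the chdev commands for review without running them. The helper name gen_fscsi_chdev is hypothetical, and it assumes device names arrive one per line, for example from lsdev -C -F name.

```shell
# Hypothetical helper: reads device names on stdin (one per line) and prints,
# without executing it, a chdev command for every fscsi device.
# Pass -P as the first argument to defer the change to the next reboot
# (the case where the disks are in use).
gen_fscsi_chdev() {
  flag=${1:-}
  grep '^fscsi[0-9][0-9]*$' |
  while read -r dev; do
    echo "chdev -l $dev -a dyntrk=yes -a fc_err_recov=fast_fail${flag:+ $flag}"
  done
}

# On AIX, review the generated commands before running any of them:
#   lsdev -C -F name | gen_fscsi_chdev -P
```

Printing rather than executing keeps the decision about when to reboot, and in which order for dual VIOS setups, with the administrator.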

Note that on a VIOS (VIO Server), if a LUN is mapped to a VIOC (VIO Client) as a VSCSI disk, then the disk is in use, even if the VIOC isn’t using it. So in a typical dual VIOS environment, one would make this change on one VIOS, reboot it, then make the change on the other VIOS and reboot it.

Lacking these attribute settings, AIX includes information about the specific port on the switch, as part of the LUN configuration. Thus, to use a different port on the switch or another switch entirely, one will have to actually remove the disk definition (via a # rmdev -dl <disk>), move the cables to the new switch and run cfgmgr. This also means stopping use of the disk and applications using it.

For more details on these settings see the documentation at http://www-1.ibm.com/support/docview.wss?uid=isg1520readmefb4520desr_lpp_bos and in the information center at http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.kernelext/doc/kernextc/fcp_overview.htm and http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/fast_fail_dynamic_interaction.htm

Can SAN switch replacements be done dynamically?

Provided dyntrk and fc_err_recov are properly set, the answer is yes, as long as one ensures that there will always be at least one working path for each hdisk. In some cases with only one path, we can also do this dynamically provided we move the cable fast enough; however, this is discouraged. Fast enough means that the time from unplugging a cable from a switch port to plugging it into a port on the new switch, plus the time for the SAN fabric to recognize the new cabling, is less than 15 seconds, so the IOs don’t time out and fail. If a cable isn’t properly seated or a port is defective, then IOs can fail, leading to problems. So when only one path to the disk exists, SAN switch replacements are preferably done during maintenance windows.

Know your paths and the cables they use

As the previous paragraph makes clear, you want to make sure that a working path exists when dynamically migrating from one SAN switch to another. So you need to know what paths exist to your disks and the cables involved. To that end, it’s important to understand that a path is uniquely described by the host port and the storage port it uses, and that ports are uniquely identified by a WWPN (World Wide Port Name), which is 16 hexadecimal digits. How one determines this depends on the multi-path code used for the storage, and this article initially focuses on MPIO environments (which includes storage using SDDPCM as the multi-path code, since SDDPCM uses MPIO under the covers). To list the paths for your disks with MPIO, use the lspath command as follows (here for hdisk2):
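The original listing is missing from this copy. As a minimal sketch, lspath can print the status, disk, parent and connection (storage WWPN and LUN) for each path, and a small filter can flag disks that would be left without an enabled path once a given host port is unplugged. The helper name disks_without_alternate_path and the field order passed to -F are choices made for this illustration.

```shell
# Hypothetical helper.
# stdin: one path per line in the form "status name parent connection"
# $1:    parent device whose cable is about to be moved (e.g. fscsi0)
# Prints each hdisk that would have no Enabled path left on another parent.
# Exit status is 1 if any such disk is found, otherwise 0.
disks_without_alternate_path() {
  awk -v moving="$1" '
    { seen[$2] = 1 }
    $1 == "Enabled" && $3 != moving { ok[$2] = 1 }
    END {
      rc = 0
      for (d in seen) if (!(d in ok)) { print d; rc = 1 }
      exit rc
    }'
}

# On AIX (hdisk2 and fscsi0 are example names):
#   lspath -l hdisk2 -F "status name parent connection"
#   lspath -F "status name parent connection" | disks_without_alternate_path fscsi0
```

This only looks at the host side; whether the remaining paths reach a storage port that stays cabled still has to be checked against the connection column.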

 

This shows hdisk2 has 4 paths, two from fcs0 (the parent device of fscsi0) and two from fcs1, going to two separate ports on the storage that are identified via the WWPN of the storage port: 203900a0b848dda or 20180a0b8478dda. From this we can also conclude that we’re only using one SAN fabric for this LUN, since both host ports connect to both storage ports. This is probably true for the LPAR as well, which can be verified by checking that the paths for all LUNs look similar. Thus, the cabling would look like:

In VIO VSCSI environments, you’d run the lspath commands on the VIOS (VIO Server) in the oem_setup_env shell, as we’re concerned about paths from the VIOS to the storage. In a VIO NPIV environment, one would run the lspath commands on the VIOC, and one will also need to know the vFC to real FC adapter mapping.

Identifying adapter and port locations

It will be important to know which cables connect to which ports on the storage, host, and SAN switches. Since this document discusses SAN switch replacement, perhaps the easiest method is to obtain the host and storage WWPNs for the ports and then, from the switch management interface, determine the ports to which they are connected. From the host side, you can list the port location code and WWPN via the following command:
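The command listing is missing from this copy. One common way to get this information, offered here as a sketch rather than as the author's original command, is lscfg -vl on the fcs device, where the WWPN appears in the Network Address field. The helper name wwpn_from_lscfg is hypothetical, and it assumes the field is printed as "Network Address" followed by a run of dots and the 16 hexadecimal digits.

```shell
# Hypothetical helper: reads `lscfg -vl <fcs#>` output on stdin and prints
# the WWPN found in the "Network Address" field (16 hexadecimal digits).
wwpn_from_lscfg() {
  sed -n 's/^[[:space:]]*Network Address\.*\([0-9A-Fa-f]\{16\}\).*$/\1/p'
}

# On AIX (fcs0 is an example name); the first line of the lscfg output
# also carries the location code of the port:
#   lscfg -vl fcs0
#   lscfg -vl fcs0 | wwpn_from_lscfg
```

The extracted WWPN can then be matched against the name server or port listing in the switch management interface.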

Or for a description of the adapter and its location code:

These are dual port adapters, and by checking the Finding Parts Locations and Addresses manual for the specific system model (these manuals are available in the Power Hardware Information Center at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp) one can determine the specific slot location for the adapter. In this case the system model is a 9179-MHB and the P2-C2 in the location field indicates that the adapter is in slot 2 of the system unit. Then T1 refers to the top port and T2 to the bottom port. These adapters also have an identify light, so one can go into the diagnostics menu and get the light to flash to more easily locate the adapter.

Also note that it is possible that the fcs0 and fscsi0 don’t refer to the same port, so you can’t rely on the numbers for the devices. You can see the relationship via the location codes, e.g.:

This shows fcs0 is related to fscsi0 via the location codes.

Be aware if you are using an active/passive disk subsystem

An active/passive disk subsystem is one in which one controller of a pair is used to handle all IOs to a LUN except in failure conditions. Examples of active/passive IBM storage include the DS3000, DS4000, DS5000, SVC and Storwize V7000. This is important because it’s preferable to not lose access to the primary controller for the LUNs during a switch replacement, or to at least be aware that LUNs will switch controllers if all paths to the primary controller are lost during the switch replacement. Usually half the LUNs have one storage controller as the preferred controller, with the other half of the LUNs using the other controller. So if only one cable is used per controller, LUNs will fail over to the other controller if the server is doing IOs to the storage during the switch replacement.

It’s also possible to use RDAC for DS4000 storage which requires that host adapterA is connected to storage controllerA, and host adapterB is connected to storage controllerB, without any cross connections. Please be aware that from AIX 6.1 and on, MPIO is strategic and preferred. One can choose the multi-path code used for the DS3/4/5000 via the manage_disk_drivers command available in AIX (note that there are also requirements from the storage side for MPIO).

Later when all paths are restored, one will normally want the storage administrator to switch the LUN back to the preferred controller. If all paths to one controller will be lost during a switch replacement, then it’s recommended that the storage administrator move all IO handling to the controller that will be accessible prior to moving a cable.

Should you use ISLs to facilitate SAN switch replacement?

Connecting two switches via Inter Switch Links (ISLs) joins two switches into a fabric. Before going into ISLs, it’s important to know that one should have dynamic tracking enabled prior to adding a switch to a fabric via an ISL as lacking that setting might cause IOs to be lost.  

In the case where we have ISLs, we can move the cables in any order, and provided we do so quickly enough and we properly seat the cables, IOs will be slightly delayed. For the non-ISL environment, we have to use more care. First, we can’t move the cables in any order. We have to move a server cable, then a storage cable; otherwise, the server will lose access to the storage. When we move cable E, any IOs using that cable will fail, and we’ll have to rely on the multi-path code at the server to redirect the IO to use cable F. This delay will be longer as the IO must time out. If we move cable G first, and fc_err_recov=fast_fail and the switch supports fast fail, then the switch will inform the adapter driver that the port no longer has access to the storage and the multi-path code will immediately redirect the IOs to use cable H, so this will be less delay than if moving a server side cable.

Of course, if we stop the application and IOs, then we can move the cables without worrying about doing it quickly or regarding the order cables are moved. So we can see that ISLs facilitate switch replacement here, though the option of stopping IO entirely avoids some of the work required.

Why it’s better to disable paths prior to cable movement

While we can use the path availability facilities to handle lost paths during a switch replacement, it’s preferable to disable paths prior to moving cables for two reasons. First, in-flight IOs will be delayed if we don’t disable the paths first. This delay might result in the application stalling while IOs time out and are re-initiated down available paths, or with active/passive disk subsystems, while the storage moves IO processing from one controller to another. Secondly, and perhaps more importantly, the recovery portions of the code might have bugs which could result in IO failures. Given the matrix of multi-path code versions, storage firmware/microcode, and adapter firmware, it’s difficult to test all possible combinations of code and failures. Only if you’ve tested path failure in your own environment, observing failure detection, handling of in-flight IOs, and path recovery, can you be assured that the code will work correctly.

How to disable and re-enable paths

The command to disable or enable paths for IO is the chpath command, e.g.:

# chpath -l <hdisk#> -p <parent> -w <connection> -s [enable|disable]

The lspath command previously mentioned will provide the parent and connection information. Alternatively to enable or disable all paths from a specific port for a hdisk, one can use:

# chpath -l <hdisk#> -p <parent> -s [enable|disable]
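When a host port serves many hdisks, the per-disk form above has to be repeated for each of them. The following is a minimal sketch that prints the chpath commands for every disk with a path through a given parent, so they can be reviewed before being run. The helper name gen_chpath_for_parent is hypothetical, and it assumes input lines of the form "name parent", as produced by lspath -F "name parent".

```shell
# Hypothetical helper.
# stdin: one path per line in the form "name parent"
# $1:    parent device (e.g. fscsi0)
# $2:    enable or disable
# Prints, without executing them, one chpath command per disk that has
# at least one path through the given parent.
gen_chpath_for_parent() {
  parent=$1
  action=$2
  case "$action" in
    enable|disable) ;;
    *) echo "usage: gen_chpath_for_parent <parent> enable|disable" >&2; return 2 ;;
  esac
  awk -v p="$parent" '$2 == p { print $1 }' | sort -u |
  while read -r disk; do
    echo "chpath -l $disk -p $parent -s $action"
  done
}

# On AIX (fscsi0 is an example name):
#   lspath -F "name parent" | gen_chpath_for_parent fscsi0 disable
#   ... move the cable, confirm the fabric sees the port ...
#   lspath -F "name parent" | gen_chpath_for_parent fscsi0 enable
```

The same list is used for the disable and the later enable, so no disk is left with a forgotten disabled path after the cable move.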

VIO VSCSI environments

Here are two diagrams of a VIO Client (VIOC) using VSCSI to access SAN attached storage through a pair of VIO Servers (VIOSs) in a dual SAN fabric environment showing two cabling strategies:

It’s important to realize there are 2 layers of multi-path code here. MPIO is always used at the VIOC for choosing a path to the VIOSs. The multi-path code at the VIOS depends on what the storage requires. From the VIOC, each LUN has two paths (one to each VIOS). From the VIOS, there are potentially 8 paths to a LUN in example 1, and potentially 4 paths to a LUN in example 2. Besides having more paths, there is an availability difference between the two diagrams. Example 1 can continue running with the failure of a VIOS and a SAN fabric. Example 2 can also, provided the right pair of VIOS and fabric fail. Thus, typically you’ll see cabling similar to the diagram on the left.

There is a difference in how one would disable paths in the two examples. For example 1, when replacing a switch, one preferably disables and re-enables paths at the VIOSs. For example 2, one can simply disable the paths to the VIOS attached to the switch being replaced. And one can just disable all paths for a fibre channel port attached to the SAN switch being replaced, if the multi-path code provides this capability.

VIO NPIV environments

Here are two examples of a VIOC using NPIV through two VIOSs in a dual SAN fabric to access SAN attached storage:

Here there is only one layer of multi-path code, and that is in the VIOC, so all path management commands are run from the VIOC. In both examples, there are 8 potential paths per LUN. However, example 3 has superior availability characteristics in that a VIOS and a SAN fabric can both fail without losing access to the storage, while in example 4 we could lose access if a VIOS fails along with the SAN fabric the other VIOS uses. In both cases one can just disable all paths for a fibre channel port attached to the SAN switch being replaced, if the multi-path code provides this capability.

Multi-path code other than MPIO

There are other multi-path code sets besides MPIO, and often MPIO isn’t an option because the storage vendor dictates what must be used for their storage. Each multi-path code set has its own commands for handling path management, and the concepts previously mentioned still apply.

For example, one can use SDDPCM (which is compliant with the MPIO architecture) and still use the MPIO commands; however, you may find using the pcmpath command to be easier to accomplish your objectives. SDD is another multi-path code set from IBM (though SDDPCM is strategic) and one can use the datapath command for path management. PowerPath is a common option for customers attaching EMC storage to Power, in which case one typically uses the powermt command for path management.  


IBM Power Systems Ethernet ports and AIX 6.1 EtherChannel for Oracle RAC private interconnectivity

This white paper documents the setup and configuration of IBM Power Systems™ 10 Gigabit Ethernet ports. The environment used for the tests documented in this paper consists of the Oracle Database 11g with Real Application Clusters (RAC) software to configure nodes with private network intercommunication in the AIX® operating system environment. Tests were run on a POWER6™ processor-based Power® 570 server. The 10 Gigabit Ethernet cards are an option for the Power 570. This document additionally covers the setup and configuration of AIX EtherChannel for Oracle RAC interconnectivity and the Ethernet switches.

Download PDF: IBM Power Systems Ethernet ports and AIX 6.1 EtherChannel for Oracle RAC private interconnectivity

Source: IBM


WPAR Tasks Roles & Responsibility

The following table lists the admin access role required to perform workload management tasks. It also indicates the management tool, such as Systems Director or WPAR Manager, that can be used for monitoring and managing the tasks.

Task Command Global Root WPAR Root Systems Director WPAR Manager Notes
Plug-ins:  VMControl, WPAR Mgr
Devices in WPARs
Deploying a device mkwpar -D or chwpar -D * * A device can be allocated to a WPAR when the WPAR is created or later
Allocating a device mkwpar -D or chwpar -D * * Process provides a storage device that can be used by the WPAR.
Configuring a device in a WPAR mkdev chdev rmdev lsdev cfgmgr lsattr lspath * * * Devices configured in the WPAR have an ODM entry in the WPAR.  WPAR root can configure allocated devices only.
Managing file systems for a device various  see website   <<== * * * Devices in a rootvg WPAR, commands used to create and manage volume groups, logical volumes, and file systems operate like in the global environment
 
Configuring system WPARs
Naming the system WPAR mkwpar -n * * * You must provide a name for the system WPAR.
Creating a system WPAR mkwpar * * * You can create a new system WPAR
Starting system WPARs startwpar * * * You can start a system WPAR from the global environment
Configuring directories and file systems for system WPARs
File system customization for system WPARs mkwpar + -M option * * * A WPAR may use namefs mounts from any type of file system which supports POSIX file system semantics when mounted with a namefs mount.
Creating a writable directory under a shared directory mkwpar * * You can create a writable directory beneath a shared directory using a symbolic link from the global environment.
Override the default of filesystems for system WPARS mkwpar + -d option * * * Override the default location of the file systems for a system WPAR
Configuring networks for system WPARs
Changing the host name for a system WPAR mkwpar -h  or chwpar * * * By default, the name for a system WPAR is used as its host name.
Removing a network from a system WPAR chwpar -K * * * Remove a network from a system WPAR
Configuring domain resolution for system WPARs mkwpar -r * * * You can configure the domain resolution for system WPARs
Configuring WPAR-specific routing mkwpar -i or wparexec or chwpar * * * You can configure a WPAR to use its own routing table
Configuring resource controls for system WPARs mkwpar -R or chwpar * * * You can configure the resource controls to limit the physical resources a system WPAR has access to
Using specification files for system WPARs mkwpar -f * * * You can create a WPAR with all of the options from a specification file
Using an image.data file for system WPARs mkwpar -L image_data= flag * * Use an image.data file to specify additional logical volume options and file system options when you create a system WPAR
start and stop processes SRC * * * *
Configuring application WPARs
Creating an application WPAR wparexec * * * You can create and configure an application WPAR
Configuring directories and file systems for application WPARs wparexec -M * * * Application WPARs share file systems with the global environment. configure directories and file systems.
Configuring networks for application WPARs wparexec -h -N or chwpar * * * You can configure the network for an application WPAR
Configuring resource controls for application WPARs wparexec -R or chwpar * * * You can configure the resource controls to limit the physical resources an application WPAR has access to
Working with specification files for application WPARs wparexec -f * * * You can create a specification file with all of the options
 © Copyright IBM Corporation, 2012
Managing WPARs
Listing WPARs lswpar * * * You can list summary data for system WPARs and application WPARs
Listing WPAR identifiers lparstat or uname * * You can list the identifiers for a WPAR using the lparstat command or the uname command.
Logging in to a WPAR clogin * * After you configure and activate a system WPAR that does not have a network connection, you can log in to it locally
Backing up WPARs savewpar, mkcd , or mkdvd * * * You can back up a WPAR
Restoring WPARs restwpar * * * You can restore a WPAR
Removing WPARs rmwpar * * * You can remove a WPAR
Stopping WPARs stopwpar * * * You can stop a WPAR from the global environment
Recovering incompatible detached WPARs syncwpar, inuwpar * * * It is possible that the system software in a detached workload partition (WPAR) might become incompatible with the levels of system software in the global environment. This occurs if software installation and maintenance tasks are performed independently in the global environment and the WPAR, or if a WPAR backup image from an incompatible system level was installed.
Managing software with detached WPARs syncwpar, inuwpar * * * System WPARs exist in two basic forms as either shared or detached (non-shared /usr) workload partitions, though the file system characteristics can vary.
Installing Apache in a WPAR rpm * Installing Apache allows you to take advantage of the portability and scalability of WPARs
Using the Advanced Accounting subsystem with WPARs
* You can use the Advanced Accounting subsystem to produce WPAR accounting reports
Using the trace facility with WPARs
mkwpar or chwpar -S privs+=PV_KER_RAS flag. * You can use the trace facility to isolate system problems by monitoring selected system events in a WPAR.  Enabled from Global Environment
Making software available to other WPARs
syncwpar or syncroot syncwpar only  syncroot only Software installed in the global environment is not always automatically available for use within your system WPAR
Modified and enhanced AIX commands for WPARs
various  see website   <<== * * Some commands have different or enhanced behavior in a WPAR environment.
Live Application Mobility  * * You can configure either WPAR type for mobility, which allows you to move running WPAR instances between physical systems using the AIX Workload Manager.