Debug and Fix RMC Connection Errors

Question

How do you debug and correct a “No RMC Connection” error message when using the HMC?

Cause

The “No RMC Connection” error message can occur on the HMC when attempting to dynamically configure AIX or VIOS, when attempting LPAR mobility, or when configuring virtual resources. RMC is an encrypted communication channel between the HMC and an LPAR that uses port 657 over both TCP and UDP. Changing IP addresses, cloning AIX LPARs, or a host of other administrative tasks can cause RMC to break down.

Answer

 

Some basic commands can be run to check the status of the RMC configuration. Which commands to use depends on the RSCT version: RSCT 3.1.x.x levels are the newest and are included in AIX 6.1 TL6 or higher, while RSCT 2.x.x.x levels are included in AIX 6.1 TL5 or lower. The following queries provide a quick method to assess RMC health.
- As root on the AIX LPAR

– If AIX 6.1 TL5 or lower

lslpp -l csm.client --> This fileset needs to be installed

– If AIX 6.1 TL6 or higher

lslpp -l rsct.core.rmc --> This fileset needs to be at the 3.1.0.x level or higher

– For all AIX versions

/usr/sbin/rsct/bin/ctsvhbac --> Are all IP and host IDs trusted?

– For AIX 6.1 TL5 or lower

lsrsrc IBM.ManagementServer --> Is the HMC listed as a resource?

– For AIX 6.1 TL6 or higher

lsrsrc IBM.MCP --> Is the HMC listed as a resource?

- On the HMC (as hscroot)

lspartition -dlpar --> Is the LPAR’s DCaps value non-zero?

If you answer no to any of the above then corrective action is required.
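
The choice between the two lsrsrc queries above can be captured in a small helper. This is a minimal sketch, not an IBM-provided tool: the function name rmc_resource_class is hypothetical, and it assumes you pass in the RSCT level as a dotted string (for example, the level shown by lslpp -l rsct.core.rmc).

```shell
#!/bin/sh
# Sketch only: choose which RMC resource class to query for a given RSCT level.
# rmc_resource_class is a hypothetical helper name, not part of RSCT.
# $1 is assumed to be a dotted level such as 3.1.0.2 or 2.5.5.0.
rmc_resource_class() {
    major=${1%%.*}        # text before the first dot
    rest=${1#*.}          # text after the first dot
    minor=${rest%%.*}     # second component of the level
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }; then
        echo IBM.MCP                 # RSCT 3.1.x.x or higher
    else
        echo IBM.ManagementServer    # RSCT 2.x.x.x
    fi
}

# Example use on the LPAR (left commented out so the sketch has no side effects):
# lsrsrc "$(rmc_resource_class 3.1.0.2)"
```

The helper only prints a class name; running lsrsrc and judging whether the HMC is listed remains a manual step.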

- Fix It Commands (run as root on LPAR, HMC, or both)

/usr/sbin/rsct/install/bin/recfgct
/usr/sbin/rsct/bin/rmcctrl -p

Running the above fix commands on the HMC requires a pesh password for your HMC.
Try the following command first as hscroot:

lspartition -dlparreset

If that does not help, you will need to request pesh passwords from IBM Support for your HMC so you can run the recfgct and rmcctrl commands listed above.

After running the above commands, it will take several minutes before the RMC connection is restored. The best way to monitor progress is to run the lspartition -dlpar command on the HMC every few minutes and watch for the target LPAR to show up with a non-zero DCaps value.
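
The DCaps check can be scripted when the lspartition output is captured somewhere with a full shell, such as a workstation. The following is a rough sketch under stated assumptions: the function names dcaps_for and dcaps_is_nonzero are hypothetical, and the parser assumes each partition is reported as a "Partition:<...>" header followed by a field of the form "DCaps:<0x...>". Verify that layout against your own HMC before relying on it.

```shell
#!/bin/sh
# Sketch only: inspect `lspartition -dlpar` output read from stdin.
# ASSUMPTION: each partition appears as a "Partition:<...>" header, followed
# (on the same or a later line) by a field such as "DCaps:<0x2c5f>".

# Print the DCaps value of the first partition whose header mentions $1.
dcaps_for() {
    awk -v name="$1" '
        /Partition:</ { hit = (index($0, name) > 0) }
        hit && match($0, /DCaps:<[^>]*>/) {
            # Skip the 7 characters of "DCaps:<" and drop the closing ">".
            print substr($0, RSTART + 7, RLENGTH - 8)
            exit
        }
    '
}

# Succeed when the value has any digit other than 0 after an optional 0x prefix.
dcaps_is_nonzero() {
    v=$(printf '%s' "${1#0x}" | tr -d '0')
    [ -n "$v" ]
}
```

For example, with the output saved to a file named lspartition.out (a hypothetical name), `dcaps_is_nonzero "$(dcaps_for lpar1 < lspartition.out)"` succeeds once the LPAR whose header contains lpar1 reports a non-zero DCaps value.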

- Things to consider before using the above fix commands or if the reconfigure commands don’t help.

Do not use the recfgct command on AIX LPARs that are running CSM, PLM, or RSCT High Performance Computing (HPC) related applications. HPC clusters are typically used in scientific computing rather than commercial applications, and a system administrator should know whether they are supporting an HPC cluster. If you are wondering what an HPC cluster is, you are probably safe to continue, but if you are unsure, ask a colleague first.

If the commands that reconfigure RSCT do not restore DLPAR functions, network configuration issues and perhaps even APAR issues might need to be addressed. Those issues require additional debug steps not covered in this tech note. However, some common network issues can prevent RMC communications, including the following.

- Firewalls blocking bidirectional RMC related traffic for UDP and TCP on port 657.
- Mix of jumbo frames and standard Ethernet frames between the HMC and LPARs.
- Multiple interfaces with IP addresses on the LPARs that can route traffic to the HMC.

Capturing Debug Output of padmin CLI

Question
How do I gather debug output for a failing padmin command?
This applies to VIOS 1.5 and 2.x.
Answer
Log in to the VIOS as padmin.
$ oem_setup_env
# script -a /home/padmin/<PMR#.Branch#>clidebug33.out
# uname -L
# su - padmin
$ ioslevel
$ export CLI_DEBUG=33
Run offending command to reproduce error
$ exit (padmin)
# exit (script)

Where to send the file
ftp testcase.software.ibm.com
login: anonymous
password: <your email address>
ftp> cd /toibm/aix
ftp> prompt
ftp> binary
ftp> put <filename>.out
ftp> quit

Using NIM to Capture a Boot Debug

Technote (FAQ)


Question

How do I use NIM to debug a boot process?

Answer

I. Using NIM to put a SPOT into debug mode

II. Capturing debug output with an HMC

III. Capturing debug output in a non-HMC environment



I. Using NIM to put a SPOT into debug mode



1. To put a SPOT into debug mode through the smit menus, you can run the following:

# smitty nim_res_op
-select your spot
-select the check option from the menu

Network Install Operation to Perform

Move cursor to desired item and press Enter.

showres = show contents of a resource
reset = reset an object's NIM state
cust = perform software customization
sync_roots = synchronize roots for all clients using specified SPOT
maint = perform software maintenance
lslpp = list LPP information about an object
fix_query = perform queries on installed fixes
showlog = display a log in the NIM environment
check = check the status of a NIM object
lppchk = verify installed filesets
update_all = update all currently installed filesets

-Change the ‘Build Debug Boot Images’ option to yes
-Change the ‘Force’ option to yes

2. To put a SPOT into debug mode directly from the command line, you can run the following:

# nim -Fo check -a debug=yes <SPOT_Name>
* Replace <SPOT_Name> with the name of the SPOT being used.


3. Attach a tty device to your client system (port 1)

4. To perform a network boot, you will want to boot the box up into SMS mode. When the box is booting up, you will see the following output on the console:
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

1 = SMS Menu 5 = Default Boot List
8 = Open Firmware Prompt 6 = Stored Boot List

Memory Keyboard Network SCSI Speaker
** Most new pSeries systems do not require you to hit the 'F1' key anymore, but you can hit both '1' and 'F1' alternately if you are unsure of which key your system requires you to hit in order to enter SMS. The firmware display should tell you which button you need to hit. **

5. At the SMS Main Menu, choose “Select Boot Options”, then “Select Install/Boot Device”, and then select “Network” as the device type. (Note: Make sure that Remote IPL is set up with the correct addresses.)

SMS 1.6 (c) Copyright IBM Corp. 2000,2005 All rights reserved.
-------------------------------------------------------------------------------
Main Menu
1. Select Language
2. Setup Remote IPL (Initial Program Load)
3. Change SCSI Settings
4. Select Console
5. Select Boot Options
-------------------------------------------------------------------------------
Navigation Keys:
X = eXit System Management Services
-------------------------------------------------------------------------------
Type menu item number and press Enter or select Navigation key:
-------------------------------------------------------------------------------
Multiboot
1. Select Install/Boot Device
2. Configure Boot Device Order
3. Multiboot Startup <OFF>
-------------------------------------------------------------------------------
Select Device Type
1. Diskette
2. Tape
3. CD/DVD
4. IDE
5. Hard Drive
6. Network
7. List all Devices

6. Once the box boots from the network boot image, a debug prompt appears. Enter the following at the debugger prompt:

mw enter_dbg <enter>
>42 <enter>
>. <enter>
>g <enter>

7. Once you have completed your boot debug, take the SPOT out of debug mode. If the boot images are left in debug mode, every client booted from them will stop and wait for a command at the debugger ">" prompt. If you use these debug-enabled boot images and no tty is attached to the client, the machine will appear to hang for no reason. The command to take the SPOT out of debug mode is:

# nim -Fo check <SPOTName>
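
Because forgetting to switch debug mode off leaves every later network boot waiting at the debugger prompt, it can help to keep the enable and disable commands side by side. The helper below is a minimal sketch: spot_debug_cmd is a hypothetical name, it only prints the two nim commands from this section (typed with plain ASCII hyphens) rather than running them, and the SPOT name is whatever you pass in.

```shell
#!/bin/sh
# Sketch only: print (do not run) the NIM command that switches a SPOT's
# boot images into or out of debug mode.
# spot_debug_cmd is a hypothetical helper name, not part of NIM.
spot_debug_cmd() {
    mode=$1
    spot=$2
    case "$mode" in
        on)  echo "nim -Fo check -a debug=yes $spot" ;;   # rebuild with debug
        off) echo "nim -Fo check $spot" ;;                # rebuild without debug
        *)   echo "usage: spot_debug_cmd on|off <SPOT name>" >&2
             return 1 ;;
    esac
}
```

Printing the command instead of running it keeps the sketch free of side effects; on a NIM master the output could be reviewed and then run as root.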


II. HMC Environments:

Using SSH session to capture console output

1. Configure an SSH client (e.g. PuTTY) to log session output to a local file. In PuTTY, open the configuration window and select “Logging” underneath the “Session” category on the left of the window. Select “All session output” and provide a “Log file name” to log to.

2. Open a connection to the HMC and login as user 'hscroot'.

3. Run the following command:

# vtmenu

4. At the vtmenu, select the server to which you desire a console session.

5. Select the LPAR from which you need boot debug.

6. Wait for "Open Completed" message (if LPAR was running you would get a Console: login)


III. Non-HMC environment


Setting up the hardware

During installation with debug mode enabled, debug output is sent to the S1 serial port of the machine. This output can then be captured through a tty or other serial connection. The preferred method is to have another system near the client machine that can interface with it via an RS232 serial connection. The following hardware is required:

* Two RS232 serial cables
* One gender changer
* One interposer (null modem)

Connect the hardware as shown in the schematic below:

[S1]--[X]----[R]----[I][G]----[R]----[X]--[Sx]

Note:
S1 = first serial port on the machine being installed
X = any extra cables needed to connect the DB25-RS232 cable to the serial port
R = DB25-RS232 cable
I = interposer
G = gender changer (female) to get the right connection to Sx
Sx = any serial port on the connecting AIX box

Setting up the interfacing system

1. Use the lslpp command to determine if bos.net.uucp is installed. Enter:

# lslpp -l bos.net.uucp

Note: If you do not have this installed, you must install it from the AIX installation media.

2. Set up the Sx port and create a tty on the port. To create the tty:
   a. Run smitty tty
   b. Select Add a TTY
   c. Select tty rs232 Asynchronous Terminal
   d. Select the Sx serial port
   e. Select the port number
   Note: Use the default values.

3. Create a uucp entry for the new tty by editing the file /etc/uucp/Devices with your preferred text editor. Add the following:

Direct tty0 - 9600 direct

4. Run the 'cu' command piped to the 'tee' command to capture the debug output. Enter:

# cu -ml tty0 | tee /tmp/debug1.log

5. Disconnect from the client and stop logging by doing the following:

1. Type ~. to end the cu session

2. Type exit

