How do you debug and correct a “No RMC Connection” error message on the HMC?
The “No RMC Connection” error message can occur on the HMC when attempting to dynamically configure AIX or VIOS, when attempting LPAR mobility, or when configuring virtual resources. RMC is an encrypted communication channel between the HMC and an LPAR that uses port 657 over both TCP and UDP. Changing IP addresses, cloning AIX LPARs, or a host of other administrative tasks can cause RMC to break down.
There are some basic commands that can be run to check the status of the RMC configuration, and which commands you use depends on the RSCT version. RSCT 3.1.x.x levels are the newest and are included in AIX 6.1 TL6 or higher; RSCT 2.x.x.x levels are included in AIX 6.1 TL5 or lower. The following queries provide a quick way to assess RMC health.
- As root on AIX LPAR
– If AIX 6.1 TL5 or lower
lslpp -l csm.client --> This fileset needs to be installed
– If AIX 6.1 TL6 or higher
lslpp -l rsct.core.rmc --> This fileset needs to be at the 3.1.0.x level or higher
– For all AIX versions
/usr/sbin/rsct/bin/ctsvhbac --> Are all IP addresses and host IDs trusted?
– For AIX 6.1 TL5 or lower
lsrsrc IBM.ManagementServer --> Is the HMC listed as a resource?
– For AIX 6.1 TL6 or higher
lsrsrc IBM.MCP --> Is the HMC listed as a resource?
- On HMC (as hscroot)
lspartition -dlpar --> Is the LPAR’s DCaps value non-zero?
If the answer to any of the above is no, corrective action is required.
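The version split in the checklist above can be captured in a small helper. The following is a minimal sketch, assuming the RSCT level is supplied as a dotted string such as the level that lslpp -l rsct.core.rmc reports. The function name rmc_class_for_level is hypothetical and not part of RSCT, and treating anything below 3.1 as the older case is an assumption drawn from the version ranges given earlier.

```shell
#!/bin/sh
# Hypothetical helper (not an RSCT command): map an RSCT level such as
# "3.1.0.2" to the resource class in which the HMC should be listed.
rmc_class_for_level() {
    # Require at least "major.minor".
    case $1 in
        *.*) ;;
        *) echo "unknown"; return 1 ;;
    esac
    major=${1%%.*}          # text before the first dot
    rest=${1#*.}            # text after the first dot
    minor=${rest%%.*}       # second component of the level
    # Both components must be plain non-empty digit strings.
    for part in "$major" "$minor"; do
        case $part in
            ''|*[!0-9]*) echo "unknown"; return 1 ;;
        esac
    done
    # Assumption: 3.1 and later use IBM.MCP, earlier levels use
    # IBM.ManagementServer.
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }; then
        echo "IBM.MCP"
    else
        echo "IBM.ManagementServer"
    fi
}
```

On an LPAR the result could then be passed to lsrsrc, for example lsrsrc "$(rmc_class_for_level 3.1.0.2)".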
- Fix It Commands (run as root on LPAR, HMC, or both)
Running the fix commands on the HMC requires a pesh password for that HMC.
You can try the following command first as hscroot:
If that does not help, you will need to request a pesh password for your HMC from IBM Support so that you can run the recfgct and rmcctrl commands on it.
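This note refers to recfgct and rmcctrl without showing their invocations. As a hedged sketch only: the sequence below is the one commonly used to reset RSCT on an AIX LPAR, and the paths and flags are assumptions that do not come from this note, so confirm them against the RSCT documentation for your level. The function names and the RMC_FIX_RUN variable are hypothetical. By default the script only prints what it would run. Read the cautions about recfgct below before running anything.

```shell
#!/bin/sh
# Assumed reset sequence (not taken from this note):
#   recfgct     reconfigures RSCT and generates a new node ID
#   rmcctrl -z  stops the RMC subsystem and its resource managers
#   rmcctrl -A  adds and starts the RMC subsystem
#   rmcctrl -p  enables remote client connections
rmc_fix_commands() {
    cat <<'EOF'
/usr/sbin/rsct/install/bin/recfgct
/usr/sbin/rsct/bin/rmcctrl -z
/usr/sbin/rsct/bin/rmcctrl -A
/usr/sbin/rsct/bin/rmcctrl -p
EOF
}

# Print each command; execute it only when RMC_FIX_RUN=1 is set.
rmc_fix() {
    rmc_fix_commands | while read -r cmd; do
        if [ "${RMC_FIX_RUN:-0}" = "1" ]; then
            echo "running: $cmd"
            # stdin is redirected so the command cannot consume the list.
            $cmd </dev/null || echo "failed: $cmd" >&2
        else
            echo "would run: $cmd"
        fi
    done
}
```

Calling rmc_fix with no environment set is a dry run; RMC_FIX_RUN=1 rmc_fix, run as root, would execute the sequence.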
After running the fix commands, it will take several minutes before the RMC connection is restored. The best way to monitor progress is to run the lspartition -dlpar command on the HMC every few minutes and watch for the target LPAR to show up with a non-zero DCaps value.
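The DCaps check can be automated when polling. The sketch below is a minimal example that assumes lspartition -dlpar prints a Partition:<...> line followed by a line containing DCaps:<0x...> for each LPAR. That layout is an assumption here, so compare it with real output from your HMC. The function name dcaps_report and the host name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read lspartition -dlpar output on stdin and print
# one tab-separated line per LPAR: status, DCaps value, partition details.
dcaps_report() {
    awk '
        /Partition:</ {
            # Keep the text between "Partition:<" and the closing ">".
            part = $0
            sub(/^.*Partition:</, "", part)
            sub(/>.*$/, "", part)
        }
        /DCaps:</ {
            # Keep the hex value between "DCaps:<" and the next ">".
            caps = $0
            sub(/^.*DCaps:</, "", caps)
            sub(/>.*$/, "", caps)
            # An all-zero value means no RMC connection to that LPAR.
            state = (caps ~ /^0x0+$/) ? "NO_RMC" : "OK"
            print state "\t" caps "\t" part
        }
    '
}

# Possible usage from a workstation with ssh access to the HMC
# (hscroot@hmc.example.com is a placeholder):
#   while :; do
#       ssh hscroot@hmc.example.com lspartition -dlpar | dcaps_report
#       sleep 180
#   done
```

Lines that start with NO_RMC identify the LPARs that have not yet re-established their connection.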
- Things to consider before using the above fix commands or if the reconfigure commands don’t help.
Do not use the recfgct command on AIX LPARs that are running CSM, PLM, or RSCT High Performance Computing (HPC) related applications. HPC clusters are typically used in scientific computing rather than commercial applications, and a system administrator should know whether they are supporting an HPC cluster. If you are wondering what an HPC cluster is, you are probably safe to continue, but if you are unsure, ask a colleague first.
If the commands that reconfigure RSCT don’t restore DLPAR functions, there may be network configuration issues, and perhaps APAR issues, that need to be addressed. Those issues require additional debug steps not covered in this tech note. However, some common network issues can prevent RMC communication, including the following.
- Firewalls blocking bidirectional RMC-related traffic for UDP and TCP on port 657.
- Mix of jumbo frames and standard Ethernet frames between the HMC and LPARs.
- Multiple interfaces with IP addresses on the LPARs that can route traffic to the HMC.