RC=0x0000006A=106 error during DB2 backup using Tivoli Storage Manager

Problem(Abstract)

Tivoli Storage Manager return code 106 seen in the db2diag.log during backup:

DATA #1 : TSM RC, PD_DB2_TYPE_TSM_RC, 4 bytes
TSM RC=0x0000006A=106 -- see TSM API Reference for meaning.

Cause

The file permissions on the error log file (dsierror.log/dsmerror.log) do not allow the user running the backup to write to it.

Resolving the problem

The db2diag.log file can contain error messages during a failed Tivoli Storage Manager for Mail for SAP backup. For Tivoli Storage Manager API specific return codes, search for the text “TSM RC=” then reference the messages manual for the API return code meaning.
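
A minimal sketch of that search, assuming a POSIX shell and a db2diag.log path that you supply. The function name list_tsm_rc is hypothetical and is not part of DB2 or Tivoli Storage Manager. It prints each distinct “TSM RC=” value together with the number of times it occurs:

```shell
# list_tsm_rc FILE
# Hypothetical helper: print each distinct "TSM RC=0x...=NNN" value in FILE
# (a db2diag.log path that you supply) followed by its occurrence count.
list_tsm_rc() {
    # sed keeps only the "TSM RC=<hex>=<decimal>" part of matching lines.
    sed -n 's/.*\(TSM RC=0x[0-9A-Fa-f]*=[0-9]*\).*/\1/p' "$1" |
        sort | uniq -c |
        awk '{ print $2, $3, "count=" $1 }'
}
```

For example, run list_tsm_rc against the instance's db2diag.log; the decimal value after the second “=” is the number to look up in the API return codes.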

The following error was seen in the db2diag.log file during the archive logs backup:
2009-08-22-16.50.58.161181-300 I5302549A362       LEVEL: Warning
PID     : 53798                TID  : 4114        PROC : db2sysc 0
INSTANCE: db2pbw               NODE : 000
EDUID   : 4114                 EDUNAME: db2logmgr (PBW) 0
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile,  probe:3108
MESSAGE : Started archive for log file S0082777.LOG.

2009-08-22-16.50.58.164580-300 E5302912A347       LEVEL: Error
PID     : 290858               TID  : 1           PROC : db2vend
INSTANCE: db2pbw               NODE : 000
EDUID   : 1
FUNCTION: DB2 UDB, database utilities, sqluvint, probe:374
DATA #1 : TSM RC, PD_DB2_TYPE_TSM_RC, 4 bytes
TSM RC=0x0000006A=106 -- see TSM API Reference for meaning.

2009-08-22-16.50.58.164968-300 E5303260A861       LEVEL: Error
PID     : 53798                TID  : 4114        PROC : db2sysc 0
INSTANCE: db2pbw               NODE : 000
EDUID   : 4114                 EDUNAME: db2logmgr (PBW) 0
FUNCTION: DB2 UDB, data protection services, sqlpInitVendorDevice, probe:1030

MESSAGE : ZRC=0x86100025=-2045771739=SQLP_MEDIA_VENDOR_DEV_ERR
          "A vendor device reported a media error."
DATA #1 : String, 29 bytes
Init failed!  Vendor rc info:
DATA #2 : Vendor RC, PD_DB2_TYPE_VENDOR_RC, 4 bytes
Vendor RC=0x0000000B=11 -- see DB2 API Guide for meaning.
DATA #3 : Hexdump, 48 bytes
0x0700000030698AD0 : 0000 006A 3337 3420 3130 3600 0000 0000
  ...j374 106.....
0x0700000030698AE0 : 0000 0000 0000 0000 0000 0000 0000 0000
  ................
0x0700000030698AF0 : 0000 0000 0000 0000 0000 0000 0000 0000
  ................

The API return code 106 means DSM_RC_ACCESS_DENIED and is referring to the error log file in the DSMI_LOG environment variable. To find out where this variable is pointing, use the following commands:

ps -elf | grep -i "db2sysc" | grep -i "<instance owner>"
ps eww <pid of db2sysc from the output above>

The output from the second command shows all the environment variables under which DB2 is running. Make sure that the error log has read/write permissions for the user running the backup. If the variable is set incorrectly and needs to be changed, DB2 must be restarted for the change to take effect.

Please note that if there is an ERRORLOGNAME specified in the dsm.sys then this will take higher precedence and is the error log that needs to have write permissions.

If there is no ERRORLOGNAME, then the DSMI_LOG variable is used and should point to a directory that has write permissions. A dsierror.log file would be created in this DSMI_LOG directory location. If DB2 cannot be restarted to pick up a new setting for the DSMI_LOG variable, the ERRORLOGNAME option can be set to a writeable location/filename to work around the failure.
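
The checks above can be sketched as two small shell functions. This is a sketch under assumptions, not a supported tool: both function names are hypothetical, the parsing assumes that no variable value contains spaces, and the write test is only meaningful when it is run as the user that performs the backup.

```shell
# extract_dsmi_vars
# Hypothetical helper: read "ps eww <pid>" output on stdin and print the
# DSMI_* variables (such as DSMI_LOG), one per line.
# Assumes that variable values contain no spaces.
extract_dsmi_vars() {
    tr ' ' '\n' | grep '^DSMI_'
}

# check_log_dir DIR
# Report whether the current user can write to directory DIR.
# Run this as the user that performs the backup.
check_log_dir() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "NOT writable: $1"
        return 1
    fi
}
```

A possible use is to pipe the output of the second ps command into extract_dsmi_vars, then pass the directory shown for DSMI_LOG to check_log_dir.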

Redefining TSM Library and Drives for UNIX OS

Question

Frequently, when hardware or firmware has changed, it is necessary to remove the tape library and drive definitions from the IBM Tivoli Storage Manager (TSM) Server, then re-define them.

Cause

Sometimes there are specific errors, such as:
ANR0523W Transaction failed - error on output storage device
ANR8300E I/O error on library (OP=xx, CC=xx, KEY=xx, ASC=xx, ASCQ=xx, SENSE=xx)
ANR8301E I/O Error on library
ANR8355E I/O error reading label for volume NNNNNN on drive XXXXX
ANR8359E Media fault detected on volume NNNNNN in drive XXXXX
ANR8441E Initialization failed for SCSI library
ANR8779E Unable to open drive XXXXX, error number=ZZZ
ANR8944E Hardware or media error on drive
ANR8963E Unable to find path to match the serial number defined for drive

Frequently the TSM Server can automatically rediscover devices when using “SANDISCOVERY ON” or by using “UPDATE PATH” with “AUTODETECT=YES” to refresh the values.

However, there are times when that may not be successful. For example, if a tape drive, tape library, fibre/SCSI HBA, or SAN has experienced changes (such as hardware, firmware or device drivers) it may require rebuilding the TSM “special files” to re-establish connectivity to the library and drives. To rebuild the “special files,” we must delete and re-define the hardware devices to the TSM Server (UPDATE does not rebuild).

 

Answer

Perform these tasks in this sequence to totally re-define the tape devices to TSM. These steps should be taken only if attempts to update the devices/paths using the autodetect features have failed:
1) Before deleting anything, gather the output from these commands, so you can use the same naming conventions when re-defining the tape devices:
  QUERY STATUS (get SERVERNAME value for “<tsm_server>”)
  QUERY DEVCLASS
  QUERY LIBRARY FORMAT=DETAIL
  QUERY DRIVE FORMAT=DETAIL
  QUERY PATH FORMAT=DETAIL

2) Run the appropriate OS command to produce a list of the configured HW ‘special file’ device names.
      AIX   ==>   lsdev -Cc tape           (-or- 'cfgmgr')
                  lsdev -Cc adsmtape       (for TSM devices)
                  lsdev -Cc library
   Solaris  ==>   ls -l /dev/rmt/*st       (-or- 'sysdef')
                  ls -l /dev/rmt/*smc
     HP-UX  ==>   /usr/sbin/ioscan -funC tape
                  (-or- 'ioscan -kfn')
     Linux  ==>   ls -l /dev/IBM*
                  ls -l /dev/tsmscsi/*
                  (-or- 'more /etc/sysconfig/hwconf')

If the tape devices are not defined to the OS, please work with your OS or SCSI/SAN hardware support to configure them. Until the OS can use the drives (can write to them, for example using ‘tar’ or ‘dd’) the tape devices cannot be defined to TSM.

3) From the ‘/dev’ directory, write down the OS-level device definitions for the library and drives:
                  AIX     Linux         Solaris    HP-UX
  TSM Drives      mt#     tsmscsi/mt#   rmt/#      rmt/tsmmt#
  IBM Drives      rmt#    IBMtape#      rmt/#st    rmt/#m
  TSM Library     lb#     tsmscsi/lb#   rmt/#lb    tsmchgr#
  358x Library    smc#    IBMchanger#   rmt/#smc   rmt/#chng
  3494 Library    lmcp#   3494lib       libmgrc#   libmgrc#

4a) First the drives and drive paths must be deleted. From a TSM Server admin commandline, for all the drives:
   DELETE PATH  <tsm_server>  <drive_name>  SRCTYPE=SERVER  DESTTYPE=DRIVE  LIBRARY=<library_name>

4b) Then delete all the TSM drive definitions:
   DELETE DRIVE  <library_name>  <drive_name>

5a) Next, delete the path for the tape library:
   DELETE PATH  <tsm_server>  <library_name>  SRCTYPE=SERVER  DESTTYPE=LIBR

5b) And finally delete the TSM library definition:
   DELETE LIBRARY  <library_name>

If the OS cannot access the tape drives at this point, stop. Check hardware, device drivers, update firmware, swap cables; consider power-cycling the tape library then deleting and re-defining to the OS. There is no point attempting to get TSM to write to the devices if they are not recognized by the OS; work with OS and/or hardware vendors to resolve HW issues before proceeding.

6a) Now the tape library and library path can be re-defined. Use the TSM QUERY outputs from “Step 1” as a guide for the library name and LIBTYPE; no additional parameters are necessary in the syntax below. Redefine the library:
   DEFINE LIBRARY  <library_name>  LIBTYPE=<library_type>  SERIAL=AUTODETECT

Note: If this TSM Server is hosting a tape library for other systems, for example any “TSM Server Library Clients” or “TSM Storage Agents” then you also need “SHARED=YES” on the “DEFINE LIBRARY”.

6b) Redefine the path to the library. For SCSI libraries, confirm the DEVICE value matches the latest OS-level info gathered from “Step 2”. For 3494, ACSLS, and other types of libraries using software configuration files, use the previous values from “Step 1” to redefine the DEVICE or ACSID, and so on:
   DEFINE PATH  <tsm_server>  <library_name>  SRCTYPE=SERVER  DESTTYPE=LIBRARY  DEVICE=</dev/lb#>

7a) Redefine the drives and drive paths. Redefine all the drives using names from “Step 1”, for example:
   DEFINE DRIVE  <library_name>  <drive_name>  SERIAL=AUTODETECT  ELEMENT=AUTODETECT

7b) Redefine paths to all drives, using the OS-level info gathered from “Step 2” for the DEVICE values. Keep in mind the OS-level DEVICE values may have changed since they were previously defined.
   DEFINE PATH  <tsm_server>  <drive_name>  SRCTYPE=SERVER  DESTTYPE=DRIVE  LIBRARY=<library_name>  DEVICE=</dev/mt#>

Note: If this TSM Server is hosting a tape library for other systems, for example any “TSM Server Library Clients” or “TSM Storage Agents” then in addition to the “TSM Server Library Manager” DRIVE PATH, you also need to define a new PATH for each drive for those systems, substituting the SERVERNAME (shown by “Q SERVER”) for the value of “<tsm_server>” and the local DEVICE value for the drive as seen by that other system.

8) Verify the library, drives, and paths are online:
  QUERY LIBRARY  <library_name>  FORMAT=DETAIL
  QUERY DRIVE   <library_name>  *  FORMAT=DETAIL
  QUERY PATH  *  *  FORMAT=DETAIL

9) Since the library is “new” to TSM, the volumes must be checked in again to re-create the inventory (AUDIT LIBRARY does not CHECKIN). Use *this* sequence, first SCRATCH, then PRIVATE:
  CHECKIN LIBVOL  <library>  SEARCH=Y STATUS=SCR CHECKL=BARC
  CHECKIN LIBVOL  <library>  SEARCH=Y STATUS=PRIV CHECKL=BARC

NOTE: For ACSLS libraries, use “CHECKLABEL=NO” on the CHECKIN commands, because “CHECKLABEL=BARCODE” is not supported for an ACSLS Library.
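
Steps 4 through 9 can be sketched as a shell function that only prints the admin commands in the required order, so that they can be reviewed before anything is run on the TSM Server. The function name tsm_redefine_cmds, its argument order, and the sample names in the usage line are hypothetical; the command text is taken from the steps above.

```shell
# tsm_redefine_cmds SERVER LIBRARY LIBTYPE LIBDEVICE DRIVE=DEVICE...
# Hypothetical helper: print (never run) the commands from steps 4 to 9.
# The body runs in a subshell so its variables do not leak to the caller.
tsm_redefine_cmds() (
    server=$1 library=$2 libtype=$3 libdev=$4
    shift 4
    # Steps 4a and 4b: delete every drive path, then every drive.
    for pair in "$@"; do
        echo "DELETE PATH $server ${pair%%=*} SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=$library"
    done
    for pair in "$@"; do
        echo "DELETE DRIVE $library ${pair%%=*}"
    done
    # Steps 5a and 5b: delete the library path, then the library.
    echo "DELETE PATH $server $library SRCTYPE=SERVER DESTTYPE=LIBR"
    echo "DELETE LIBRARY $library"
    # Confirm that the OS can access the devices before going further.
    # Steps 6a and 6b: redefine the library and its path.
    echo "DEFINE LIBRARY $library LIBTYPE=$libtype SERIAL=AUTODETECT"
    echo "DEFINE PATH $server $library SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=$libdev"
    # Steps 7a and 7b: redefine each drive and its path.
    for pair in "$@"; do
        drive=${pair%%=*} device=${pair#*=}
        echo "DEFINE DRIVE $library $drive SERIAL=AUTODETECT ELEMENT=AUTODETECT"
        echo "DEFINE PATH $server $drive SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=$library DEVICE=$device"
    done
    # Step 8: verify that everything is online.
    echo "QUERY LIBRARY $library FORMAT=DETAIL"
    echo "QUERY DRIVE $library * FORMAT=DETAIL"
    echo "QUERY PATH * * FORMAT=DETAIL"
    # Step 9: check in SCRATCH first, then PRIVATE.
    # For ACSLS libraries, use CHECKLABEL=NO instead (see the note above).
    echo "CHECKIN LIBVOL $library SEARCH=Y STATUS=SCR CHECKL=BARC"
    echo "CHECKIN LIBVOL $library SEARCH=Y STATUS=PRIV CHECKL=BARC"
)
```

With made-up names, a call such as tsm_redefine_cmds TSM1 LIB1 SCSI /dev/lb0 DRV1=/dev/mt0 DRV2=/dev/mt1 prints the full sequence for a library with two drives. The SHARED=YES option and the extra paths for library clients or storage agents are not covered and would need to be added by hand.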

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

If that doesn’t resolve the issue, the tape drive problem seems beyond the control of the TSM Server (software). Review the output from the OS-level logs for additional hardware error information:
                          Remove OS    Install
      OS   Diagnostics    Devices      OS Devices
 -------   ------------   ---------    ------------
     AIX   errpt -a       rmdev        cfgmgr
   Linux   dmesg                       /dev/MAKEDEV
 Solaris   mbin/prtdiag   rem_drv      drvconfig
   HP-UX   dmesg          rmsf         insf -e

If you cannot reach HW support immediately, you could take the additional action of power-cycling in this order:
1) Tape library.
2) SAN switch (if any).
3) Consider updating to latest device drivers and/or firmware.
4) Halt TSM and reboot system with TSM Server.
5) Re-define the tape device to the OS (see commands above).
6) If tape device definitions have changed, DELETE & re-DEFINE to TSM.

That is all that can be done from a software perspective. If errors persist, the problem lies at a layer that TSM cannot repair.

 

 

db2vend crash or generating trap during DB2 backup or restore

Problem(Abstract)

A DB2 backup that uses the Tivoli Storage Manager API client can crash or generate traps.

Cause

A problem with the bundled gskit code. The fix is in gskit 8.0.13.4, which the Tivoli Storage Manager API added in the 6.2.2 API. This resolves the crash described in APAR IC67672. With the fix for IC67672 applied, a DB2 backup or restore may still generate a trap file, because DB2 uses its own gskit and, to date, DB2 has not bundled gskit 8.0.13.4.

Environment

DB2 backups using Tivoli Storage Manager API client v6.2 to a Tape Storage.

Diagnosing the problem

A DB2 backup to, or restore from, Tivoli Storage Manager creates trap files. Example of a trap:


DB2 build information: DB2 v9.7.0.2 s100514 SQL09072
timestamp: 2010-11-02-10.48.28
Process name: db2vend (db2med - 66910 (TEDB1))
Signal #4
uname: S:AIX R:1 V:6 M:0005A5E3D600 N:HDQPS104
process id: 913410
thread id : 1 (0x1)
kthread id : 6377571
</Header>
<SignalDetails>
<Siginfo_t length="64">
00000004 00000000 0000001E 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
</Siginfo_t>
Signal #4 (SIGILL): si_addr is 0x0000000000000000, si_code is 0x0000001E
(ILL_ILLOPC:Illegal opcode.)

 

Resolving the problem

DB2 plans to add gskit 8.0.13.4 in DB2 9.7 FixPack 5, not yet released. The following option can be used with the Tivoli Storage Manager 6.2 client to work around the issue:

  • Add the following line to the “dsm.opt” file: testflag noaes
  • Note: The dsm.opt should be the one defined as “DSMI_CONFIG” or, by default, the one in:
    • the /<opt or usr>/tivoli/tsm/client/api/bin directory if DB2 is 32bit
    • the /<opt or usr>/tivoli/tsm/client/api/bin64 directory if DB2 is 64bit

Note that with the “testflag noaes” option, the Tivoli Storage Manager API client is not able to do 128-bit encryption or client deduplication.
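
To confirm that the workaround is present in the options file that DB2 actually uses, a check along these lines may help. It is a sketch only: the function name has_noaes is hypothetical, and matching the option case-insensitively is an assumption. Lines that start with anything other than the option itself, such as a commented-out entry, do not match.

```shell
# has_noaes FILE
# Hypothetical helper: succeed if FILE (a dsm.opt path that you supply)
# contains a line that starts with "testflag noaes".
# Assumption: the option name and value are matched case-insensitively.
has_noaes() {
    grep -i '^[[:space:]]*testflag[[:space:]][[:space:]]*noaes' "$1" >/dev/null
}
```

For example, has_noaes "$DSMI_CONFIG" && echo "workaround present" checks the file named by DSMI_CONFIG in the current environment, which is not necessarily the environment that DB2 is running under.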
