Queue manager (QMGR) recovery is needed after a disaster such as a disk failure, a server crash, a server restart without stopping the QMGR properly, or a human error such as an MQ administrator mistakenly deleting the QMGR active logs. The active logs can also become corrupted for some other reason. In all of these cases the QMGR cannot be started. In such scenarios a QMGR cold start is a useful technique for bringing MQ back after the failure.
Check the QMGR error logs and the FFDC (FDC) files carefully, and plan a cold start only after concluding that the active logs are corrupted for one of the reasons above. Otherwise, find the cause in the logs and fix it based on the error reported. Here we discuss the situation where the active logs are missing or corrupted.
This document explains how to recover a QMGR when its active logs are corrupted. Take a backup of the entire /var/mqm/ directory before making any changes; it should not take much space.
Starting the QMGR with strmqm PROD.QM1 fails with an error.
The FDC files and the QMGR error logs show the corresponding error.
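One way to collect this evidence is sketched below. It assumes the default Linux/UNIX locations, /var/mqm/errors for FDC files and the errors directory of the QMGR for AMQERR01.LOG. The function name show_mq_errors and its arguments are illustrative and not part of MQ.

```shell
# Sketch: list the newest FDC files and print the tail of the QMGR error log.
# show_mq_errors is a hypothetical helper; both directories are passed in as arguments.
show_mq_errors() (
    fdc_dir=$1       # e.g. /var/mqm/errors
    qmgr_err_dir=$2  # e.g. the errors directory under the QMGR data directory
    echo "== FDC files in $fdc_dir (newest first) =="
    ls -1t "$fdc_dir"/*.FDC 2>/dev/null | head -n 5
    echo "== last lines of $qmgr_err_dir/AMQERR01.LOG =="
    if [ -f "$qmgr_err_dir/AMQERR01.LOG" ]; then
        tail -n 40 "$qmgr_err_dir/AMQERR01.LOG"
    else
        echo "AMQERR01.LOG not found"
    fi
)
```

For example: show_mq_errors /var/mqm/errors /var/mqm/qmgrs/PROD.QM1/errors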
Take a backup copy of the QMGR data, the log files, and any error logs, FDCs, and dumps.
Also back up /var/mqm/log/.
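The backup could be taken with tar, as in the following sketch. It assumes a tar that supports gzip compression with -z (GNU tar does) and a destination directory that already exists. The name backup_tree and the example paths are assumptions for illustration only.

```shell
# Sketch: archive a directory tree into a timestamped tarball and print its path.
# backup_tree is a hypothetical helper; run it as a user that can read the MQ directories.
backup_tree() (
    src=$1   # e.g. /var/mqm
    dest=$2  # e.g. /backup (must already exist, ideally on a different disk)
    stamp=$(date +%Y%m%d_%H%M%S)
    archive="$dest/$(basename "$src")_$stamp.tar.gz"
    # -C makes the paths inside the archive relative to the parent of src
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || exit 1
    echo "$archive"
)
```

For example: backup_tree /var/mqm /backup, then backup_tree /var/mqm/log /backup if the logs live on a separate file system.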
Check the existing qm.ini file of the QMGR, /var/mqm/qmgrs/PROD.QM1/qm.ini, and note its log attributes: the logging type, the log file size, and the number of primary and secondary log files. We will create a TEMP QMGR with exactly the same attributes. We will not start this QMGR.
crtmqm -lc -lf 2048 -lp 100 -ls 50 TEMP
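The crtmqm flags can be derived from the Log stanza of qm.ini rather than typed by hand. The sketch below assumes the stanza contains the LogPrimaryFiles, LogSecondaryFiles, LogFilePages and LogType keys and that LogType is CIRCULAR or LINEAR. The helper name build_crtmqm is hypothetical, and it only prints the command so that it can be reviewed before running.

```shell
# Sketch: read the Log stanza values from a qm.ini and print a matching crtmqm command.
# build_crtmqm is a hypothetical helper; it prints the command and does not run it.
build_crtmqm() (
    ini=$1        # path to the qm.ini of the damaged QMGR
    temp_name=$2  # name of the temporary QMGR, e.g. TEMP
    # Print the value of the first "Key=value" line, tolerating leading whitespace
    get() { sed -n "s/^[[:space:]]*$1=[[:space:]]*//p" "$ini" | head -n 1 | tr -d '\r'; }
    primary=$(get LogPrimaryFiles)
    secondary=$(get LogSecondaryFiles)
    pages=$(get LogFilePages)
    log_type=$(get LogType)
    if [ -z "$primary" ] || [ -z "$secondary" ] || [ -z "$pages" ]; then
        echo "Log stanza values missing in $ini" >&2
        exit 1
    fi
    case $log_type in
        CIRCULAR) type_flag=-lc ;;
        LINEAR)   type_flag=-ll ;;
        *) echo "unsupported or missing LogType: '$log_type'" >&2; exit 1 ;;
    esac
    echo "crtmqm $type_flag -lf $pages -lp $primary -ls $secondary $temp_name"
)
```

For example: build_crtmqm /var/mqm/qmgrs/PROD.QM1/qm.ini TEMP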
Once the TEMP QMGR is created, do not start it. We only need to copy its active logs from /var/mqm/log/TEMP/active and its /var/mqm/qmgrs/TEMP/amqalchk.fil file to the original QMGR.
/var/mqm/log/TEMP/active/* -> /var/mqm/log/PROD.QM1/active
/var/mqm/qmgrs/TEMP/amqalchk.fil -> /var/mqm/qmgrs/PROD.QM1/
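The two copies can be scripted as in the sketch below. The helper name replace_logs is hypothetical and the directories are passed as arguments, since the on-disk directory name can differ from the QMGR name (MQ usually replaces a period with an exclamation mark, so it is worth checking the Directory entry in /var/mqm/mqs.ini). The sketch assumes a backup has already been taken and that it is run as the mqm user or as root.

```shell
# Sketch: overwrite the active logs and the log control file of the damaged QMGR
# with the fresh ones from the temporary QMGR.
# replace_logs is a hypothetical helper; take a backup before running it.
replace_logs() (
    temp_log=$1   # e.g. /var/mqm/log/TEMP
    temp_data=$2  # e.g. /var/mqm/qmgrs/TEMP
    prod_log=$3   # log directory of the damaged QMGR
    prod_data=$4  # data directory of the damaged QMGR
    [ -f "$temp_data/amqalchk.fil" ] || { echo "missing amqalchk.fil" >&2; exit 1; }
    [ -d "$prod_log/active" ] || { echo "missing $prod_log/active" >&2; exit 1; }
    # -p preserves mode and timestamps (and ownership when run as root)
    cp -p "$temp_log/active/"* "$prod_log/active/" || exit 1
    cp -p "$temp_data/amqalchk.fil" "$prod_data/" || exit 1
    # Afterwards, confirm the copied files are owned by the mqm user and group
)
```

For example: replace_logs /var/mqm/log/TEMP /var/mqm/qmgrs/TEMP /var/mqm/log/PROD.QM1 /var/mqm/qmgrs/PROD.QM1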
The corrupted logs have now been replaced with new logs.
Now start the QMGR with strmqm PROD.QM1.
Verify the status of the QMGR using dspmq.
Check whether any new FDC files have been written, and check the QMGR error logs. If everything is fine, verify the channel status. If all channels are in a good state, the QMGR has been recovered successfully.
Then verify connectivity with the applications that connect to this QMGR.
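The status check can be automated with a small filter over dspmq output, as sketched below. The helper name qmgr_is_running is hypothetical. It only parses text in the QMNAME(...) STATUS(...) form that dspmq prints and does not call MQ itself.

```shell
# Sketch: succeed only if dspmq-style output on stdin reports the QMGR as Running.
# qmgr_is_running is a hypothetical helper; it parses text and does not call MQ.
qmgr_is_running() (
    qmgr=$1
    # dspmq prints lines such as: QMNAME(PROD.QM1)   STATUS(Running)
    grep -F "QMNAME($qmgr)" | grep -q "STATUS(Running)"
)
```

For example: dspmq -m PROD.QM1 | qmgr_is_running PROD.QM1 && echo "PROD.QM1 is running". Channel status can then be listed by piping DISPLAY CHSTATUS(*) into runmqsc PROD.QM1.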
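Messages already in the queues are preserved if they are persistent. Units of work that were in flight when the failure happened cannot be recovered, because the original logs are gone.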
You can now delete the TEMP QMGR with dltmqm TEMP.