After an abnormal shutdown of Post or SharePlex and a subsequent restart, my Post is not processing. The number of messages in the Post queue and the backlog remain static, and "show post detail" reports the Post state as "State recovery".
Post may be in a state of recovery.
A SharePlex process can go into recovery as part of the checkpoint recovery mechanism, which ensures data integrity during a restart after an abnormal shutdown of a process or of SharePlex. Other solutions cover this topic in detail for all processes in general. This solution covers specifically how to identify whether Post is in recovery and how to deal with it.
After an abnormal shutdown of Post or of SharePlex on the target, Post may go into recovery, and the following symptoms show up:
1. The "qstatus" output shows that the number of messages and the backlog are either static (if Import is stopped) or keep growing.
2. The "status" output shows that Post is running.
3. Running ls -l *recover* in $SP_SYS_VARDIR/log shows that the recover log(s) keep growing.
4. When "show post detail" is issued (for the named queue, if applicable, or for Post), the Post state is "State recovery", and on each successive run of the command the offset and the number of messages read released are either increasing or static, as shown below (from two different cases):
A. The offset and the number of messages read released are static:
sp_ctrl (crmrepdb02:2103)> show post queue queuename detail
Host : hostname
Source : o.SID1 Queue : queuename
                           Operations
Target     Status          Posted     Since              Total      Backlog
---------- --------------- ---------- ------------------ ---------- ----------
o.SID2     Running         2          25-Sep-07 17:21:08 2210308    2141021
Last operation posted:
Redo log: 214408 Log offset: 1515433672
INSERT in owner.tablename at 09/25/07 13:22:41
Post state : State recovery
Activation Id : 62
Current transaction Id : 9.
ID of blocking transaction : 0
Number of open transactions : 26
Number of messages read released : 42143147
Operations posted : 2
Transactions posted : 0
Insert operations : 2
Update operations : 0
Delete operations : 0
.
.
sp_ctrl (crmrepdb02:2103)> /
Host : hostname
Source : o.SID1 Queue : queuename
                           Operations
Target     Status          Posted     Since              Total      Backlog
---------- --------------- ---------- ------------------ ---------- ----------
o.SID2     Running         2          25-Sep-07 17:21:08 2211332    2140978
Last operation posted:
Redo log: 214408 Log offset: 1515433672
INSERT in owner.tablename at 09/25/07 13:22:41
Post state : State recovery
Activation Id : 62
Current transaction Id : 9
ID of blocking transaction : 0
Number of open transactions : 26
Number of messages read released : 42143147
Operations posted : 2
Transactions posted : 0
Insert operations : 2
Update operations : 0
Delete operations : 0
.
.
B. The offset and the number of messages read released keep increasing with each run of the command:
sp_ctrl (bhicrepairdb:2200)> show post detail
Host : hostname
Source : o.ssordrpt Queue : queuename1
                           Operations
Target     Status          Posted     Since              Total      Backlog
---------- --------------- ---------- ------------------ ---------- ----------
o.crepdb   Running         10         03-May-05 16:39:37 8090846    8087962
Last operation posted:
Redo log: 37968 Log offset: 30059304
INTERNAL OPERATION
Post state : State recovery
Activation Id : 3
Current transaction Id : 359
ID of blocking transaction : 0
Number of open transactions : 2504
Number of messages read released : 185290576
Operations posted : 10
Transactions posted : 10
Insert operations : 0
Update operations : 0
Delete operations : 0
.
.
sp_ctrl (bhicrepairdb:2200)> /
Host : hostname
Source : o.ssordrpt Queue : queuename1
                           Operations
Target     Status          Posted     Since              Total      Backlog
---------- --------------- ---------- ------------------ ---------- ----------
o.crepdb   Running         13         03-May-05 16:39:37 8090187    8087387
Last operation posted:
Redo log: 37968 Log offset: 32430984
INTERNAL OPERATION
Post state : State recovery
Activation Id : 3
Current transaction Id : 67
ID of blocking transaction : 0
Number of open transactions : 2984
Number of messages read released : 185291235
Operations posted : 13
Transactions posted : 13
Insert operations : 0
Update operations : 0
Delete operations : 0
.
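The comparison between two successive outputs in cases A and B can also be done mechanically. The following is a minimal Python sketch, not a SharePlex tool: it assumes you have pasted two "show post detail" outputs into strings, and the names parse_snapshot and classify are hypothetical helpers introduced only for illustration. It reads only the three fields visible in the outputs above (Post state, Log offset, Number of messages read released).

```python
import re

def parse_snapshot(text):
    """Extract the fields of interest from one pasted "show post detail" output."""
    patterns = {
        "state": r"Post state\s*:\s*(.+)",
        "offset": r"Log offset:\s*(\d+)",
        "released": r"Number of messages read released\s*:\s*(\d+)",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"field not found in snapshot: {name}")
        value = match.group(1).strip()
        fields[name] = value if name == "state" else int(value)
    return fields

def classify(first, second):
    """Compare two snapshots taken some time apart (hypothetical helper)."""
    a, b = parse_snapshot(first), parse_snapshot(second)
    if "recovery" not in b["state"].lower():
        return "not in recovery"
    if b["offset"] > a["offset"] or b["released"] > a["released"]:
        return "recovery progressing"
    return "recovery static"

# Excerpts of the two outputs from case A above.
CASE_A_1 = """Redo log: 214408 Log offset: 1515433672
Post state : State recovery
Number of messages read released : 42143147"""
CASE_A_2 = CASE_A_1  # the second output shows the same offset and count

# Excerpts of the two outputs from case B above.
CASE_B_1 = """Redo log: 37968 Log offset: 30059304
Post state : State recovery
Number of messages read released : 185290576"""
CASE_B_2 = """Redo log: 37968 Log offset: 32430984
Post state : State recovery
Number of messages read released : 185291235"""

print(classify(CASE_A_1, CASE_A_2))  # recovery static
print(classify(CASE_B_1, CASE_B_2))  # recovery progressing
```

A single "recovery static" result is not evidence that recovery has failed; it only describes the two outputs compared, so repeat the comparison over a longer period.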
If any of the above symptoms exist, it is worth waiting for the recovery to finish so that Post can resume normal operations. Bouncing SharePlex or Post in a hurry in a bid to make it work will be futile. Only after the *recover* log(s) stop growing and the offset and/or the number of messages read released in "show post detail" remain static for a long time can one conclude that the recovery did not complete for some reason. At that point, the best course of action may be to contact Support so that other workarounds can be tried.
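To judge whether the *recover* log(s) are still growing without running ls -l repeatedly, the file sizes can be sampled twice and compared. The following is a minimal Python sketch under stated assumptions: it relies only on the $SP_SYS_VARDIR/log location and the *recover* file pattern mentioned above; the function names and the 60-second interval are illustrative choices, not SharePlex settings.

```python
import glob
import os
import time

def recover_log_sizes(log_dir):
    """Return {path: size in bytes} for every file matching *recover* in log_dir."""
    return {
        path: os.path.getsize(path)
        for path in glob.glob(os.path.join(log_dir, "*recover*"))
        if os.path.isfile(path)
    }

def growing_logs(before, after):
    """List the logs that are new or larger in the second sample."""
    return sorted(p for p, size in after.items() if size > before.get(p, -1))

def watch(log_dir, interval_seconds=60):
    """Sample the sizes twice, interval_seconds apart (interval is an assumption)."""
    before = recover_log_sizes(log_dir)
    time.sleep(interval_seconds)
    after = recover_log_sizes(log_dir)
    return growing_logs(before, after)

if __name__ == "__main__":
    vardir = os.environ.get("SP_SYS_VARDIR")
    if vardir:
        grown = watch(os.path.join(vardir, "log"))
        if grown:
            print("recover log(s) still growing:", grown)
        else:
            print("no growth seen in this interval")
```

An empty result for one interval does not prove that recovery has stalled. As noted above, the logs and the "show post detail" values should stay static for a long time before drawing that conclusion.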