On both source and target, SP_DEQ_TIMEOUT is set to 180 seconds (3 minutes).
On target, SP_OPO_POSTER_DELAY is set to 10 minutes.
The following errors are seen when SP_OPO_POSTER_DELAY is set to a value higher than SP_DEQ_TIMEOUT.
Here are the event logs and compare logs from source and target when the failure occurs:
Source:
sp_ctrl (alvsupl14:9239)> show log reverse
Info 2019-05-21 15:02:55.966880 27601 1783700144 Compare server exited normally, pid = 27810 [sp_desvr]
Error 2019-05-21 15:02:50.919308 27810 1117784384 [1] Compare server: Error 50001 sending version info, see desvr_1_o.ORA11GR2_p27810 [sp_desvr] [module deq]
Info 2019-05-21 14:58:47.086010 27810 3801786320 Compare server completed
Notice 2019-05-21 14:58:47.085746 27810 3801786320 Compare server: Opened Compare session log file /home/paul/p92139-11g/var/log/desvr_1_o.ORA11GR2_p27810 [sp_desvr] [module deq]
Notice 2019-05-21 14:58:47.047394 27810 3801786320 Compare server: Replicating according to target compatibility of "9.2.0" [sp_desvr] [module sys]
Info 2019-05-21 14:58:46.783721 27810 1783700144 Compare server launched, pid = 27810 [sp_desvr]
desvr log tail portion:
desvr 2019-05-21 15:02:50.906255 27810 1117784384 [1] Error: compare client got timed out error (failed to receive consistent view marker)
desvr 2019-05-21 15:02:50.908044 27810 1117784384 [1] DEBUG:schema=SHA92, table name=TABLE1, column_count=2 (de_table_list_table_info,L2333)
desvr 2019-05-21 15:02:50.908475 27810 1117784384 [1] DEBUG:colName=NUM_COL, column[0].number=1, type=2 (NUMBER), dsize=22, cform=0, csid=0
desvr 2019-05-21 15:02:50.908511 27810 1117784384 [1] DEBUG:colName=CHAR_COL, column[1].number=2, type=1 (CHAR), dsize=10, cform=1, csid=873
desvr 2019-05-21 15:02:50.911926 27810 1117784384 [1] Status: WaitMarker Elapsed time: 239.214 seconds
desvr 2019-05-21 15:02:50.917189 27810 1117784384 [1] Status : Error
desvr 2019-05-21 15:02:50.919630 27810 1117784384 [1] Error: compare client timed out.
desvr 2019-05-21 15:02:50.919708 27810 1117784384 [1] Compare SHA92.TABLE1 to SHA92.TABLE1 failed. See error message above.
desvr 2019-05-21 15:02:53.962549 27810 1117784384 [0] Exiting........
Target:
sp_ctrl (alvsupl17:9239)> show log reverse
Info 2019-05-21 14:54:38.893998 32309 2396256096 Compare client exited normally, pid = 32423 [sp_declt]
Error 2019-05-21 14:54:28.887986 32423 2331826016 Compare client: Error, sqlite record not found (sp_declt.cpp,L1240) [sp_declt] [module deq]
Error 2019-05-21 14:54:28.887626 32423 2331826016 Compare client: Error timed out occurred after 240 seconds (../src/deqtr/sp_declt.cpp,L909) [sp_declt] [module deq]
Info 2019-05-21 14:53:59.575468 32309 2396256096 Poster exited normally, pid = 32368 (posting from o.ORA11GR2, queue asdf, to o.ORA11GR2)
Notice 2019-05-21 14:53:58.930752 32368 2949576544 Poster: Highest committed SCN 0 (posting from o.ORA11GR2, queue asdf, to o.ORA11GR2) [module opo]
Notice 2019-05-21 14:53:58.830259 32368 2949576544 Poster: Shutting down by request (posting from o.ORA11GR2, queue asdf, to o.ORA11GR2) [module opo]
Notice 2019-05-21 14:53:58.828896 32446 1060800384 User command: paul stop post queue asdf (from alvsupl17.prod.quest.corp)
Notice 2019-05-21 14:50:28.820772 32423 2331826016 Compare client: Opened DataEquator client session log file /home/paul/p92139-11g/var/log/declt_1-1_o.ORA11GR2_a01174f_p32423.log [sp_declt] [module deq]
Info 2019-05-21 14:50:27.237942 32423 2396256096 Compare client launched, pid = 32423 [sp_declt]
declt log tail portion:
declt 2019-05-21 14:54:28.887534 32423 2331826016 Error timed out occurred after 240 seconds (sp_declt_wait_for_sqlite_record,L907)
declt 2019-05-21 14:54:28.887935 32423 2331826016 Status : Error
declt 2019-05-21 14:54:28.887976 32423 2331826016 Error, sqlite record not found (sp_declt.cpp,L1239)
declt 2019-05-21 14:54:28.888185 32423 2331826016 Exiting . .
Posting is delayed by the amount of time specified in the parameter SP_OPO_POSTER_DELAY, while the parameter SP_DEQ_TIMEOUT is set to a lower value. As a result, the compare/repair times out.
When a compare/repair is issued, it sends a marker message (a kind of token message) from source to target so that the compare/repair client launches on target. If the marker message does not reach the target within the time specified by SP_DEQ_TIMEOUT, the compare/repair times out. When SP_OPO_POSTER_DELAY is not configured, a compare/repair can still time out if a backlog in the Post queue keeps the marker from reaching the target for longer than SP_DEQ_TIMEOUT, but this only happens when the queues are heavily backlogged. However, if SP_OPO_POSTER_DELAY is configured and set to a value higher than SP_DEQ_TIMEOUT, every compare/repair times out once the SP_DEQ_TIMEOUT interval has elapsed.
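The timing relationship described above can be sketched as a simplified model. The function names are hypothetical and are not part of SharePlex. The sketch assumes SP_OPO_POSTER_DELAY is expressed in minutes and SP_DEQ_TIMEOUT in seconds (as in this example), and that the marker arrives after the poster delay plus whatever time it spends in the queues.

```python
def marker_arrival_seconds(poster_delay_minutes: float,
                           queue_latency_seconds: float = 0.0) -> float:
    """Approximate time for the compare marker to reach the target.

    Simplified model (an assumption, not SharePlex internals):
    the configured poster delay plus the time spent in the queues.
    """
    return poster_delay_minutes * 60 + queue_latency_seconds


def compare_times_out(deq_timeout_seconds: float,
                      poster_delay_minutes: float = 0.0,
                      queue_latency_seconds: float = 0.0) -> bool:
    """Return True if the marker would arrive after SP_DEQ_TIMEOUT expires."""
    arrival = marker_arrival_seconds(poster_delay_minutes, queue_latency_seconds)
    return arrival > deq_timeout_seconds


# Settings from this article: 180 s timeout, 10 min poster delay.
print(compare_times_out(180, poster_delay_minutes=10))    # True
# No poster delay and a lightly loaded queue (30 s is an illustrative value).
print(compare_times_out(180, queue_latency_seconds=30))   # False
```

In this model, a poster delay that by itself exceeds the timeout makes every compare/repair fail regardless of queue load, which matches the behavior seen in the logs.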
The first step is therefore to set SP_DEQ_TIMEOUT higher than SP_OPO_POSTER_DELAY. When doing so, allow for the time the marker needs to travel through the Post queues before it reaches the target, and set SP_DEQ_TIMEOUT comfortably higher than SP_OPO_POSTER_DELAY. The ideal setting may be a matter of trial and error: start with a reasonable value of SP_DEQ_TIMEOUT and increase it if a compare/repair still fails. SP_DEQ_TIMEOUT must be set on both source and target, preferably to the same value on each. In our example, we can set SP_DEQ_TIMEOUT to 1800 seconds (30 minutes), which is higher than the SP_OPO_POSTER_DELAY value of 10 minutes. If the compare/repair still fails, we can then raise SP_DEQ_TIMEOUT further.
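As a rough illustration of choosing a starting value, the sketch below derives a candidate SP_DEQ_TIMEOUT from the poster delay and an estimated queue transit time. The helper name, the transit estimate, and the safety factor are all hypothetical and are not a vendor recommendation. It assumes SP_OPO_POSTER_DELAY is given in minutes and SP_DEQ_TIMEOUT in seconds.

```python
import math


def suggest_deq_timeout(poster_delay_minutes: float,
                        expected_queue_latency_seconds: float,
                        safety_factor: float = 2.0) -> int:
    """Suggest a starting SP_DEQ_TIMEOUT value in seconds.

    Hypothetical heuristic: (poster delay + expected queue transit time)
    multiplied by a safety factor, rounded up to a whole second.
    """
    if safety_factor < 1.0:
        raise ValueError("safety_factor must be at least 1.0")
    base = poster_delay_minutes * 60 + expected_queue_latency_seconds
    return math.ceil(base * safety_factor)


# 10 min poster delay, an assumed 5 min of queue transit, factor 2.
print(suggest_deq_timeout(10, 300))  # 1800
```

With these illustrative inputs the result equals the 1800 seconds used in the example. If a compare/repair still times out, rerun with a larger transit estimate or safety factor, and apply the new value on both source and target.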