This problem is most often seen when several copy commands are issued one after another without waiting for the earlier copy to finish. For example, in the following sample output, the first copy succeeded while the remaining two failed:
sp_ctrl > copy status all
110 Copy wds77.ldd_table Done 01-OCT-09 13:44 01-OCT-09 14:9
111 Copy wds77.po_table Failed 01-OCT-09 14:03 01-OCT-09 14:3
112 Copy wds77.prd_table Failed 01-OCT-09 14:03 01-OCT-09 14:3
The copy server log gives the following hints about how the error occurred:
Thu Oct 01/14:03:37.950:: 003: Unlocking Table Group 0
Thu Oct 01/14:03:37.961:: All tables have been unlocked
Thu Oct 01/14:33:34.566:: ERROR: Timeout waiting for client to acknowledge SYNC marker at sync/svr/sync_svr_messager.cpp:276
Thu Oct 01/14:33:34.570:: Abort remaining threads
Thu Oct 01/14:33:34.570:: WARNING: Stopping export thread, id = 3 at sync/svr/sync_svr_messager.cpp:1191
Thu Oct 01/14:33:34.570:: Waiting for threads to finish
Thu Oct 01/14:33:34.586:: 003: WARNING: Thread asked to stop, stopping export at sync/svr/sync_export_thread.cpp:266
Thu Oct 01/14:33:34.587:: 003: WARNING: Sent a server control stop message (line=268) at sync/svr/sync_export_thread.cpp:679
Thu Oct 01/14:33:34.587:: 003: External process pid = 15458 sent SIGTERM
Thu Oct 01/14:33:34.616:: 003: External process pid = 15458 sent SIGKILL
Thu Oct 01/14:33:34.856:: 003: External process pid = 15458 exited due to signal SIGTERM
Thu Oct 01/14:33:34.870:: 003: Export process returned status: 0
Thu Oct 01/14:33:34.890:: Export thread 3 aborted
Thu Oct 01/14:33:35.156:: ERROR: Error processing SYNC at sync/svr/sync_server.cpp:151 at sync/svr/sync_server.cpp:159
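As a rough check on the server log above, the gap between the "All tables have been unlocked" message and the timeout error can be computed from the two timestamps. The sketch below is illustrative only: the helper names are hypothetical, the year 2009 is taken from the copy status output because the log itself omits it, and an English locale is assumed for the day and month abbreviations.

```python
from datetime import datetime

# Timestamps copied from the copy server log excerpt above.
UNLOCKED = "Thu Oct 01/14:03:37.961"   # "All tables have been unlocked"
TIMED_OUT = "Thu Oct 01/14:33:34.566"  # "Timeout waiting for client ..."


def parse_log_time(stamp: str, year: int = 2009) -> datetime:
    """Parse a 'Thu Oct 01/14:03:37.961' style stamp (hypothetical helper).

    The log does not record the year, so it is supplied separately.
    """
    return datetime.strptime(f"{year} {stamp}", "%Y %a %b %d/%H:%M:%S.%f")


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    return (parse_log_time(end) - parse_log_time(start)).total_seconds()


if __name__ == "__main__":
    print(round(elapsed_seconds(UNLOCKED, TIMED_OUT)))
```

The gap comes to roughly 1797 seconds, which is close to the SP_OSY_POST_TIMEOUT setting of 1800 seconds shown in the parameter listing in this article. This suggests, but does not prove, that the wait began shortly before the tables were unlocked.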
The corresponding copy client log also provides some information about the error:
Thu Oct 01/14:16:53.151:: 001: Process 13227 started successfully
Thu Oct 01/14:16:54.156:: Checking for marker
Thu Oct 01/14:16:54.156:
The server timed out waiting for the client to acknowledge receipt of the sync marker.
1. Issue list param all sync to view the settings of the copy/sync parameters, and look specifically for the setting of SP_OSY_POST_TIMEOUT. In the example below it is set to 1800 seconds:
sp_ctrl > list param all sync
.
.
SP_OSY_POST_TIMEOUT 1800 Seconds Live
.
2. The parameter value is in seconds. Increase it to a value comfortably larger than the approximate time required to export and import the first table. For example:
sp_ctrl > set param SP_OSY_POST_TIMEOUT 1800000
3. Issue the copy for the remaining tables. Although the parameter is listed as Live, the changed setting applies only to copy commands issued after the change.
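To pick a value for step 2 that is less arbitrary than a very large number, the timeout can be estimated from how long the earlier copies are expected to take. The following is a minimal sketch, not a Quest-provided tool: the function names and the safety factor are hypothetical, and summing the durations of all copies queued ahead is a conservative assumption (the article itself only refers to the first table).

```python
import math


def suggested_post_timeout(earlier_copy_secs, safety_factor=2.0):
    """Estimate an SP_OSY_POST_TIMEOUT value in seconds (hypothetical helper).

    earlier_copy_secs: estimated export + import time, in seconds, for each
    copy that is still expected to be running ahead of the new copy.
    safety_factor: assumed margin on top of the estimate; not from the article.
    """
    if safety_factor < 1.0:
        raise ValueError("safety_factor should be at least 1.0")
    if any(secs < 0 for secs in earlier_copy_secs):
        raise ValueError("durations must be non-negative")
    return math.ceil(sum(earlier_copy_secs) * safety_factor)


def set_param_command(timeout_secs):
    """Build the sp_ctrl command used in step 2 for the chosen value."""
    return f"set param SP_OSY_POST_TIMEOUT {timeout_secs}"


if __name__ == "__main__":
    # Example: the first table is expected to need about 2.5 hours in total.
    timeout = suggested_post_timeout([9000])
    print(set_param_command(timeout))
```

With an estimate of 9000 seconds for the first table and the assumed factor of 2, this yields a setting of 18000 seconds. The estimate itself has to come from experience with the table in question.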
The following paragraphs explain in detail the reason for the parameter change.
Although the exact mechanism by which copy works is beyond the scope of this solution and is proprietary to Quest, the following paragraph gives a high-level view of the process. When the copy command is issued from sp_ctrl on the source, a token message (also referred to as a marker) travels from the source SharePlex instance to the target instance. Once the Post process receives the marker, posting is suspended and synchronization of the target table starts. The client process on the target opens another connection to the server process on the source through a separate port, different from the port used for replication; this port is usually 2501 unless changed by the parameter SP_OSY_PORT. The steps needed for the sync then run in the background, for example, starting Oracle Export with CONSISTENT=Y.
Now, if another copy is issued from the source while the first copy is operating on a very large table, Post remains stopped until the first copy finishes syncing that table. In the meantime, the marker message for the second copy remains stuck in the SharePlex queues (most likely the Post queue, assuming the queues were not backlogged when the first copy was issued). The copy server waits for the marker to be acknowledged for the time set by the parameter SP_OSY_POST_TIMEOUT, which in this case was 1800 seconds. The first copy did not finish within 1800 seconds, so the remaining copies failed with a timeout.
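The interaction described above can be illustrated with a toy model. This is only a sketch of the reasoning in this article, not of the actual SharePlex implementation: the class and function names are hypothetical, the sync durations are made-up values, and it assumes that a copy which times out never occupies Post.

```python
from dataclasses import dataclass


@dataclass
class CopyRequest:
    table: str
    issued_at: float      # seconds after the first copy was issued
    sync_duration: float  # hypothetical export + import time in seconds


def simulate(copies, post_timeout):
    """Return {table: 'Done' | 'Failed'} for serially issued copies.

    A copy's marker waits until Post is free again; if that wait exceeds
    post_timeout, the copy server gives up and the copy fails.
    """
    results = {}
    post_free_at = 0.0  # time at which Post finishes the current sync
    for copy in sorted(copies, key=lambda c: c.issued_at):
        wait = max(0.0, post_free_at - copy.issued_at)
        if wait > post_timeout:
            results[copy.table] = "Failed"
        else:
            post_free_at = copy.issued_at + wait + copy.sync_duration
            results[copy.table] = "Done"
    return results


# Table names from the sample output; all timings are hypothetical.
COPIES = [
    CopyRequest("wds77.ldd_table", issued_at=0, sync_duration=3000),
    CopyRequest("wds77.po_table", issued_at=1140, sync_duration=600),
    CopyRequest("wds77.prd_table", issued_at=1140, sync_duration=600),
]

if __name__ == "__main__":
    print(simulate(COPIES, post_timeout=1800))
    print(simulate(COPIES, post_timeout=1800000))
```

In this model, with a timeout of 1800 seconds the second and third copies would each have to wait 1860 seconds for the first copy and therefore fail, matching the pattern in the sample output. With the timeout raised to 1800000 seconds all three complete in turn.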