This section describes common problems and solutions encountered when replication of Oracle DDL is active. For more information about SharePlex DDL support, see Configure DDL replication.
If the issue you are experiencing is not listed in this documentation, search the SharePlex Knowledge Base at:
The Knowledge Base provides filtering options and links to other resources that can help you use and troubleshoot SharePlex.
Only some Oracle DDL is enabled by default. Other DDL support must be explicitly enabled with parameter settings.
By default, the SP_OPO_STOP_ON_DDL_ERR parameter directs the Post process to stop if there is an error applying DDL, so that you can correct the problem and keep the databases synchronized. This parameter should be set to 1 (on). Messages such as the following notify you that DDL was skipped:
Skipping generic 9i DDL operation, schema (bob) could not be set.
FAILED DDL Replication for "create user bob."
Solution: Make certain that the DDL-related parameters are set appropriately. See Control Oracle DDL replication for more information about how to use these parameters to configure DDL replication.
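For example, assuming the standard sp_ctrl syntax for viewing and setting parameters (and noting that some Post parameters take effect only after Post is restarted), a session to confirm the DDL settings might look like the following sketch:

sp_ctrl> list param modified
sp_ctrl> set param SP_OPO_STOP_ON_DDL_ERR 1
sp_ctrl> stop post
sp_ctrl> start post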
SharePlex prints replicated DDL to the Event Log, but it truncates statements longer than 2,000 characters. Only the first 2,000 characters are recorded in the log.
Solution: None required.
This topic helps you solve replication problems that may be related to the SharePlex queues.
If the issue you are experiencing is not listed in this documentation, search the SharePlex Knowledge Base at:
The Knowledge Base provides filtering options and links to other resources that can help you use and troubleshoot SharePlex.
Data accumulating in the queues can cause SharePlex to run out of disk space. The following are some possible causes and solutions.
Problem | Description | Solution |
---|---|---|
Stopped replication processes | When replication processes stop, data accumulates in the queues. | If a process was stopped by a user, find out why and then start it as soon as possible so that SharePlex can process the accumulated data. If a process stopped because of an error, view the Event Log with the show log command to find out what happened, then resolve the problem so that processing can continue and the queue backlog can be reduced. |
Large operations | The post queue may become large when storing large transactions for which there has not yet been a COMMIT. To allow for rollbacks and data recovery, SharePlex retains data in the post queue until it receives the COMMIT. | View the post queue with the qstatus command (see the example after this table). If the value in the Backlog (messages) field remains constant or is shrinking, while the value in the Number of messages field is increasing, Post is waiting for a COMMIT before releasing the data. Use the show post detail command to verify that Post is processing transactions normally. If possible, set the COMMIT point for your application to 500 to generate smaller SQL statements for SharePlex to process. Also, consider creating a configuration that uses named post queues which isolate tables that are known for long transactions. For more information, see Configure named post queues. If the accumulation of messages in the post queue is threatening to cause disk space issues, you can stop Import until Post can clear out some of the operations from other transactions. For more information, see How to resolve disk space shortage. |
(Oracle) Capture is processing archive logs | If Capture is processing archive logs, the capture queue consumes disk space while the archived records are being processed. | If the location of Capture in the logs is far behind that of the database, the latency between the source and target data may be too large to be acceptable to the target users. Rather than add more disk space to the variable-data directory, it might be more practical to run ora_cleansp to clean up the replication environment and queues, resynchronize the data, and reactivate the configuration. If SharePlex continues to run out of disk space because Capture is processing the archive logs, the SharePlex Support team can help you tune the performance of Capture and adjust the redo configuration. |
Unplanned increase in source transaction activity | An unplanned increase in activity on the source system can cause data to accumulate in the post queue and approach the maximum amount of allocated disk space. | For more information, see How to resolve disk space shortage. |
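As a quick first check when a queue is growing, the commands referenced in the table can be run from sp_ctrl; a minimal sketch (output omitted, and the Backlog (messages) and Number of messages fields appear in the qstatus display as described above):

sp_ctrl> qstatus
sp_ctrl> show post detail
sp_ctrl> show log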
If there is a sequence of messages in the Event Log similar to the following, stop and start sp_cop.
1 08-06-12 13:20:17.089485 2384 1 sp_ordr(rim) (for o.user queue o.user) Error opening output queue rv=9 que_open(-,writeuser+ X,0x0a02d364+PR+o.user+sp_ordr+o.user)
Notice 08-06-12 13:20:17.089485 2384 1 sp_ordr (for o.user queue o.user) data route to a02d364.48.7e failed err=100
Error 08-06-12 13:20:17.089485 2384 1 sp_ordr (for o.user queue o.user) 11004 - sp_ordr failure writing to queue(s)
Notice 08-06-12 13:20:17.089485 2384 1 sp_ordr (for o.user queue o.user) Exit sp_ordr to retry rim-write.
Info 08-06-12 13:20:17.089485 2384 1 Process exited sp_ordr (for o.user queue o.user) [pid = 8183] - exit(1)
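A minimal sketch of restarting SharePlex in this situation on a Unix or Linux system, assuming sp_cop is started from the SharePlex product bin directory (/splex/proddir is a hypothetical path; substitute your own installation directory):

sp_ctrl> shutdown                 # stop sp_cop and its replication processes
$ cd /splex/proddir/bin           # hypothetical product directory
$ ./sp_cop &                      # restart sp_cop in the background
$ ./sp_ctrl status                # confirm that the processes are running again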
If the system that hosts SharePlex fails, one or more SharePlex queues can become corrupted, and there will be errors reading from and writing to them. In that case, the purge config and abort config commands cannot be used, because they rely on the queues to operate.
Solution: Contact SharePlex Support to resolve queue corruption issues.
If the size of the post queue seems too large relative to the number of messages it contains, that is not unusual. The SharePlex post queue actually consists of a number of sub-queues, each approximately corresponding to a user session on the source system. A sub-queue can have one or more files associated with it, each with a default size of 8 MB. If the entire 8 MB of size is not used, the file remains on the system even after the data has been posted and read-released.
Solution: The size for the post queue in a status display is the actual disk space used by all of the queue files. You can eliminate obsolete files that have been read-released by using the trim command in the SharePlex qview utility. The command preserves files containing data not yet posted and committed to the target database. See the SharePlex Reference Guide for more information about the qview utility.
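A hedged sketch of running the trim command, assuming qview is started from the SharePlex product bin directory and that trim is issued at the qview prompt (the exact prompt and any required queue arguments may differ; check the qview documentation in the SharePlex Reference Guide before running it):

$ cd /splex/proddir/bin           # hypothetical product directory
$ ./qview
qview> trim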
This section reviews the causes and solutions for common synchronization problems. If you try these solutions and are still having problems, contact Quest Support.
For more information, see Understand the concept of synchronization.
If the issue you are experiencing is not listed in this documentation, search the SharePlex Knowledge Base at:
The Knowledge Base provides filtering options and links to other resources that can help you use and troubleshoot SharePlex.
For all objects except those involved in transformation, SharePlex verifies that the source and target data in a given operation are synchronized before posting the replicated data to the target. SharePlex does not verify synchronization if transformation is being used because:
When SharePlex determines that source and target data are different, it generates error conditions but continues to post other data from the post queue. To direct Post to stop processing altogether when it detects an out-of-sync condition, change the SP_OPO_OUT_OF_SYNC_SUSPEND (Oracle) or SP_OPX_OUT_OF_SYNC_SUSPEND (Open Target) parameter. See the Post parameter documentation in the SharePlex Reference Guide.
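For example, on an Oracle target, the parameter could be enabled with sp_ctrl as in the following sketch (this assumes that a value of 1 turns the behavior on and that Post must be restarted for the change to take effect; an Open Target installation would use SP_OPX_OUT_OF_SYNC_SUSPEND instead):

sp_ctrl> set param SP_OPO_OUT_OF_SYNC_SUSPEND 1
sp_ctrl> stop post
sp_ctrl> start post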
When an out-of-sync condition occurs, the Post process logs a message in the Status Database and also to the Event Log. To view these files in sp_ctrl:
Use these commands frequently to monitor for out-of-sync errors.
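The command list itself is not shown above; assuming the standard sp_ctrl show commands, the Status Database and Event Log can be viewed as in the following sketch, and show sync, which summarizes out-of-sync conditions, is demonstrated in the example that follows:

sp_ctrl> show statusdb            # view the Status Database
sp_ctrl> show log                 # view the Event Log
sp_ctrl> show sync                # view out-of-sync status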
The following is an example of how SharePlex reports an out-of-sync condition.
sp_ctrl (irvspxu14:8567)> show sync
Out Of Sync Status Database irvspxu14
Count Details
----- --------------------
3     Table "SCOTT"."TG_TEST1" out of sync for queue irvspxu14 since 16-Jun-08 17:06:33
3     Table "SCOTT"."TG_TEST2" out of sync for queue irvspxu14 since 17-Jun-08 15:47:58
1     Table "SCOTT"."TG_TEST3" out of sync for queue irvspxu14 since 17-Jun-08 15:52:03
When data goes out of synchronization, SharePlex logs the failed SQL statements to the database_ID_errlog.sql file in the data sub-directory of the SharePlex variable-data directory.
Important: If you see an out-of-sync message in the Status Database and in the Event Log, but there is no record in the database_ID_errlog.sql file for the transaction, do not ignore those messages. They could be associated with a ROLLBACK. Regardless of whether or not a transaction is rolled back, SharePlex still compares the pre-images of the source and target rows. If they are different, that indicates that the data is out of synchronization. Only when a transaction is committed on the source but fails on the target does SharePlex log it to the database_ID_errlog.sql file, to give you a record of the statement that should have been applied as a tool for problem solving and for manually applying the statement if appropriate. Rolled back statements are canceled operations, and therefore not logged on the target.
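To review the failed SQL on the target system, you can open the error log file directly; a sketch, assuming the variable-data directory is identified by the SP_SYS_VARDIR environment variable (the database_ID portion of the file name varies by installation):

$ ls $SP_SYS_VARDIR/data/*_errlog.sql
$ more $SP_SYS_VARDIR/data/database_ID_errlog.sql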
Sometimes an out-of-sync message can be false, and the data is not out-of-sync. You can compare the source and target data by using the compare command, or you can perform the following comparison manually.
To compare with the compare command
See the compare commands in the SharePlex Reference Guide.
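A minimal sketch of comparing one of the tables reported in the earlier show sync example, assuming the basic compare syntax for an Oracle source table (see the SharePlex Reference Guide for the full syntax and options; compare status is assumed here as the command that reports the result):

sp_ctrl> compare o.SCOTT.TG_TEST1
sp_ctrl> compare status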
To verify that data is out-of-sync
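The verification steps are not shown above; as a hedged sketch, one manual approach is to run identical queries against the source and target tables and compare the results, for example a row count and a rough checksum over a couple of columns (col1 and col2 are placeholders for real column names; ORA_HASH is used only for illustration):

SQL> SELECT COUNT(*) FROM SCOTT.TG_TEST1;
SQL> SELECT SUM(ORA_HASH(col1 || '|' || col2)) FROM SCOTT.TG_TEST1;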
The following are common ways in which data goes out of synchronization. In most cases, SharePlex detects out-of-sync conditions and returns an error message, but there are some situations where an out-of-sync condition is hidden, and SharePlex will not return an error.
Out of sync condition | Description | Solution |
---|---|---|
Incorrect cleanup procedure | If one of the database_cleansp cleanup utilities was run on some, but not all, systems associated with an active configuration, SharePlex perceives an out-of-sync condition. | Determine whether or not the cleanup utility was run on all systems by viewing the Event Log. The log tells you if and when it was run on each system. It also tells you if and when the configuration was activated. You can compare the times for those events to determine what happened. If the cleanup was not completed on all replication systems for this configuration, run the cleanup utility on all systems. Because the cleanup removes replication queues and processes and deactivates the configuration, you must perform initial synchronization again. |
DDL changes | Some DDL-related causes of out-of-sync conditions are: | For a list of supported DDL, refer to the SharePlex Release Notes. To undo duplicate DDL changes made manually and also by SharePlex: |
DML changes made directly to the target | If applications or users make DML changes to the tables on the target, the results of those changes will cause hidden out-of-sync conditions until Post attempts to apply a replicated change to the affected rows. When the change is applied, SharePlex returns an out-of-sync error. | Prevent all DML access to the target tables that are in replication. You can use the compare and repair commands to compare tables for out-of-sync rows and then repair those rows. For more information, see the command documentation in the SharePlex Reference Guide. |
Triggers on target objects | Triggers must be disabled on target objects. The triggers fire on the source system, and SharePlex replicates their effects to the target. | If triggers have fired on the target system, the objects changed by the triggers are out of synchronization and must be resynchronized. For more information, see How to resynchronize source and target tables. To disable the effects of triggers on target Oracle objects after the data is resynchronized, you can do either of the following (a generic SQL example follows this table): |
Unnecessary constraints | The only constraints that are necessary on target tables in a one-way replication configuration are primary and unique key constraints. CHECK constraints are not necessary on the target because they are satisfied on the source. FOREIGN KEY and ON DELETE CASCADE constraints are also satisfied on the source, and SharePlex replicates the child operations to the correct tables on the target. For an Oracle target, you can leave ON DELETE CASCADE constraints enabled if you configure SharePlex to ignore them. | See Set up Oracle database objects for replication. |
Duplicate entries in the configuration file | Duplicate entries, where the source, target, and routing map are identical, cause double posting on the target. | Copy the configuration file to a new file. Find and remove any duplicates in the new file. Perform a resynchronization and reactivation. For more information, see How to resynchronize source and target tables. |
Lack of disk space | Data goes out of synchronization if user transactions continue when SharePlex does not have enough room to accommodate them in the queues. This can happen if: | For more information, see How to resolve disk space shortage. |
Changes to column conditions in horizontally partitioned replication | Out-of-sync conditions can result from the use of horizontally partitioned replication in the following cases: | Create column conditions so that partition shift does not occur. For more information, see Configure horizontally partitioned replication. |
Not reactivating after a configuration change | If a table was added to the configuration but the configuration was not reactivated, operations on that table are not being replicated. Note: For Oracle source tables, the auto-add feature is enabled by default, and any new table whose name satisfies a wildcard in the configuration file is automatically added to replication. For more information, see Control Oracle DDL replication. | Resynchronize the affected tables, then reactivate the configuration so that SharePlex can update its object cache. For more information, see How to resynchronize source and target tables. |
Queue corruption | If the SharePlex queues are corrupted, such as after a system failure, the data in them can be lost. This requires a resynchronization. | For more information, see How to resynchronize source and target tables. (Oracle) To avoid queue corruption during system failure, you can use the SP_QUE_SYNC parameter. See the Queue parameters documentation in the SharePlex Reference Guide. |
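For the Triggers on target objects row above, a generic approach (shown here as plain Oracle SQL rather than a SharePlex-specific procedure) is to disable the trigger on the target so that it cannot fire against replicated changes; the trigger name below is hypothetical:

SQL> ALTER TRIGGER scott.tg_test1_trg DISABLE;
SQL> SELECT trigger_name, status FROM dba_triggers WHERE owner = 'SCOTT';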
The following are common synchronization issues that relate specifically to replication between Oracle databases. In most cases, SharePlex detects out-of-sync conditions and returns an error message, but there are some situations where an out-of-sync condition is hidden, and SharePlex will not return an error.
Out of sync condition | Description | Solution |
---|---|---|
DDL changes | Some DDL-related causes of out-of-sync conditions are: | For a list of supported DDL, refer to the SharePlex Release Notes. To undo duplicate DDL changes made manually and also by SharePlex: |
Inadequate or nonexistent conflict resolution routines | Conflict resolution procedures are required in a peer-to-peer (active-active) configuration. SharePlex uses the conflict resolution procedures to determine which operation to post when the same data change is received from different systems. | Revise and test the conflict resolution procedures, then resynchronize the data. For more information, see How to resynchronize source and target tables. |
Log wrap | If the redo logs wrap before Capture can process the data it needs, the data can go out of synchronization if archived logging is not enabled or if the archive logs needed by Capture were removed. (Normally, Capture would access the archive logs and continue replicating.) | |
LONG columns | If a table has no primary or unique key, SharePlex builds a simulated key based on all of the columns except the LONG and LOB columns. If the LONG columns in the target rows are the only columns that contain unique values, multiple rows could meet the criteria for the simulated key. SharePlex could apply the UPDATE or DELETE to the wrong row without the error being detected, causing the table to go out of synchronization without an error message. | If you can create a key from columns that ensure uniqueness on the target table(s), you can avoid this kind of out-of-sync condition. After you create the key, resynchronize and reactivate those objects so that SharePlex can update its object cache. If adding a primary or unique key is not possible (such as when packaged applications are in use), uniqueness of rows on the target system cannot be ensured. |
Changes to keys | If tables use automatically generated numerical sequences as keys and a key value changes, duplicate key values can result on the target system. If the new value already exists as a key in another row on the target system, SharePlex returns a unique-key constraint violation and an out-of-sync error. This can happen when you update values using an x+n formula, where n is an incremental increase: one of the x+n values may equal an existing value. | Create the sequences so that the updates cannot result in a duplication of keys on the target system. |
DBMS_SCHEDULER procedures running on the source and/or target system | The effects of these procedures, such as objects being created, manipulated, and dropped outside replication, may not be visible to replication or may not be supported, resulting in data changes that cause out-of-sync conditions. | Exclude source and target objects from the jobs. |
Virtual private database | If replicating data that is configured as a virtual private database, the SharePlex database user may not have the access rights to capture the data. Any changes to that data will not be reflected on the target. | If you do not need that data replicated to the target, you can filter it out of the SharePlex configuration by means of partitioned replication. If you want the data to be replicated, assign the SharePlex user the correct access rights. |
PK/UK logging not enabled | Certain replication problems can be prevented by logging key values. SharePlex fetches key values based on the rowid. Any operation that changes the rowid, such as ALTER TABLE...MOVE, can cause the wrong key values to be used for subsequent DML operations. | It is recommended that both primary key and unique key supplemental logging be set, or that a supplemental log group on unique columns be defined, for every table in replication (see the example after this table). |
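For the PK/UK logging row above, supplemental logging is enabled with standard Oracle SQL; a sketch (the table-level log group name and columns are hypothetical examples):

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY, UNIQUE) COLUMNS;
SQL> ALTER TABLE scott.tg_test1 ADD SUPPLEMENTAL LOG GROUP tg_test1_grp (col1, col2) ALWAYS;
SQL> SELECT supplemental_log_data_pk, supplemental_log_data_ui FROM v$database;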
If data is out-of-synchronization, do the following before you resynchronize the data:
For more information, see How to resynchronize source and target tables.
If you are having trouble with any of the compare or repair commands, refer to the following for assistance.
If the issue you are experiencing is not listed in this documentation, search the SharePlex Knowledge Base at:
The Knowledge Base provides filtering options and links to other resources that can help you use and troubleshoot SharePlex.
Problem | Description | Solution |
---|---|---|
Oracle error 904 | Error 904 calling oexec in de_select_prepare_to_fetch. A comparison failed because the source and target tables being compared are structurally different. The compare and repair commands do not detect and repair out-of-sync conditions caused by unsynchronized DDL operations or tables that are not structurally identical. | Do a DESCRIBE on both tables (see the example after this table). It will probably show that the tables do not have an equal number of columns or that the data types are different. After you correct the DDL problem, you can run a repair to resynchronize the values in the rows. |
"Too many users" error | Can not add DataEquator queue reader que_TOOMANYUSERS: User table is full. The maximum number of processes reading from and writing to the SharePlex queues was exceeded. No more than 20 processes can read from and write to the post queue at the same time, including the replication processes and the compare and repair processes. An error is most likely to occur when a repair option is used, because a repair accesses the queue much longer than a comparison without a repair. | There is no workaround or way to adjust the limit, nor is there a way to determine how many compare processes can run concurrently without exceeding the limit. TIP: To compare multiple tables at the same time, without being restricted by process limitations, use the compare using command. To limit the tables being compared, create a new configuration containing only the ones that you want to compare, and use it for the comparison. (Do not activate that configuration.) All of the tables are compared with one compare using process. |
A client process fails to die | When an sp_desvr server process dies, the associated sp_declt client processes usually die. If a process does not die, you can kill it. | For more information, see Kill compare processes. |
A server process fails to die | When you kill an sp_declt client process, when it dies on its own, or when the sp_desvr server process has not communicated with the client, the sp_desvr server process usually exits after a certain amount of time, which is controlled by the SP_DEQ_TIMEOUT parameter. | For more information, see Kill compare processes. |
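For the Oracle error 904 row above, a quick structural check from SQL*Plus on both the source and the target might look like the following (the table name is an example; compare the column names, order, and data types returned on each system):

SQL> DESCRIBE SCOTT.TG_TEST1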
To kill a client process
If you need to kill an sp_declt client process and there are multiple compares running, you can determine which one to kill in one of the following ways:
By viewing the Event Log: The Event Log records the startup of each sp_declt client process and its PID. A subsequent entry in the log records the compare log file to which the process is writing. Within the compare log file's name in that entry is the PID of the server process. For example, in the following sample entry, the sp_declt process PID is 2450. The process writes to compare log ../o734v32a_declt-1228-01.log. The 1228 is the PID of the server process.
05/04/01 17:01 Process launched: sp_declt (for o.o734v32a-o.o734v32a-87056 queue all) [pid = 2450]
05/04/01 17:01 Notice: sp_declt(deq) (for o.o734v32a-o.o734v32a-87056 queue all) Opened DataEquator session log file /u10/julia30014/var7/log/o734v32a_declt-1228-01.log
You can search the log file names for the server process that died, and look for the client process associated with that log file to determine the correct PID to kill.
On Windows systems, the logs also record the startup of the associated sp_cop process.
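A sketch of locating and killing the client process on a Unix or Linux system, using the PID identified in the Event Log (2450 in the example above) and attempting a normal kill before a forced one:

$ ps -ef | grep sp_declt          # confirm the PID found in the Event Log
$ kill 2450                       # request normal termination
$ kill -9 2450                    # force termination only if the process does not exit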
To kill a server process
If you need to kill an sp_desvr server process when an sp_declt client process dies, look in the Event Log to find out which log the sp_declt client process was writing to. The Event Log records the startup of each client process and its PID. A subsequent entry in the log records the compare log file to which the process is writing. Within the compare log file's name in that entry is the PID of the server process. For example, in the following sample entry, the sp_declt process PID is 2450. The process writes to log ../o734v32a_declt-1228-01.log. The 1228 is the PID of the server process, and that is the process to kill.
05/04/01 17:01 Process launched: sp_declt (for o.o734v32a-o.o734v32a-87056 queue all) [pid = 2450]
05/04/01 17:01 Notice: sp_declt(deq) (for o.o734v32a-o.o734v32a-87056 queue all) Opened DataEquator session log file /u10/julia30014/var7/log/o734v32a_declt-1228-01.log
On Windows systems, the logs also record the startup of the associated sp_cop process.
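A sketch of finding the server PID from the compare log file names recorded in the Event Log and then killing that process, assuming the event log file is named event_log in the log sub-directory of the variable-data directory (adjust the path and PID for your installation):

$ grep "session log file" $SP_SYS_VARDIR/log/event_log | grep declt
$ ps -ef | grep sp_desvr          # confirm that the PID from the log file name (1228 above) is sp_desvr
$ kill 1228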