SharePlex 11.4 - Administration Guide

You can use the SharePlex queue contention reduction feature to ensure that shared memory is not swapped to disk when the post queue is becoming full. This feature is enabled by the SP_IMP_QUEUE_PAUSE parameter.

This parameter pauses the writing of data to the post queue when that queue contains the specified number of messages. Post stores queue messages in shared memory until it issues a checkpoint, after which it releases the data from memory.

If the post queue runs out of shared memory, the read and write functions will start incurring file IO to free up the memory buffers. By pausing the queue writing, this parameter helps Post maintain its performance by avoiding the need for disk storage and the resultant slowdown in IO.

Use the SP_IMP_QUEUE_RESUME parameter to set the number of messages at which Import resumes writing to the post queue. This parameter works in conjunction with SP_IMP_QUEUE_PAUSE. If the number of messages in the post queue is lower or equal to the value set with this parameter, Import resumes writing to the post queue.

To use this feature, both SP_IMP_QUEUE_PAUSE and SP_IMP_QUEUE_RESUME must be greater than zero, and SP_IMP_QUEUE_PAUSE must be greater than SP_IMP_QUEUE_RESUME.

Tune subqueue indexing

You can improve Post queue performance by enabling subqueue indexing to access the subqueue structures that represent a transaction session. A message "Subqueue index enabled queuename" is written to the Event Log for every Post queue for which this parameter is enabled.

To enable this feature, set the SP_QUE_USE_SUBQUE_INDEX parameter to 1. This parameter does not support VARRAYs. If you are replicating VARRAYs and this parameter is enabled, the parameter is ignored.

Tune hash-based horizontally partitioned replication

Tune the Post process > Tune hash-based horizontally partitioned replication

Hash-based horizontally partitioned replication uses a hash algorithm that is based on the rowid by default. You may be able to improve the processing of tables that use hash-based horizontally partitioned replication by switching the hash algorithm to one that is based on the block where the row resides.

Because changing the algorithm has the same effect as a routing change (the potential to switch partitions), you must reactivate the configuration file. The activation locks the tables that are affected by this change so that the hash change is applied when there are no open transactions. The locking eliminates the potential for out-of-sync conditions by preventing data that is processed under the new hashing algorithm from being posted before in-flight data that was processed under the old algorithm.

To switch to a block-based hash:

Set the SP_OCF_HASH_BY_BLOCK parameter to 1.
Reactivate the configuration file.

Recover replication after Oracle failover

Recover Replication after Oracle Failover

This chapter contains instructions for recovering Oracle replication along with the database and applications during a failover in a high-availability environment. To support these procedures, SharePlex must be properly configured to support high availability. See Configure Replication to Maintain High Availability.

Recover replication if the primary system fails

Recover replication if the secondary Oracle instance fails

Move replication during planned failover and failback

Resume replication after failure and recovery

Recover replication if the primary system fails

Recover replication after Oracle failover > Recover replication if the primary system fails

In an unplanned failure of the primary (source) machine, replicated data remaining in the SharePlex queues on that system will be unrecoverable as a result of buffering and the likelihood that the queues are corrupted. In a high-availability environment, you can move replication to the secondary (target) machine along with the database users to maintain data availability. When the primary system is restored, you can move users and replication back to that system with minimal downtime using a hot backup of the secondary instance.

For this procedure, you will activate a configuration file and then use the reconcile command to synchronize the results of the backup with ongoing replicated user changes after the copied instance is recovered.

Supported databases

Oracle database on Unix or Linux

Requirements

SharePlex must be configured correctly to support high availability. For more information, see Configure Replication to Maintain High Availability.
You must have a backup of the SharePlex replication environment.
You must know how to run SharePlex. For more information, see Run SharePlex.
You must understand the activate config, reconcile, and delete queue commands. See theSharePlex Reference Guide.

Procedure 1: Move replication to the secondary system

To move replication to the secondary system after an unplanned failure of the primary system:

Verify that Export is stopped on the secondary system.

sp_ctrl> stop export
Use the qstatus command to view the post queue, and keep issuing this command until the number of backlogged messages is 0. Note: Do not wait for the actual numberof messages to be 0. If the primary system failed before the commit for some transactions were received, messages for these partial transactions will remain in the queue until cleared later in this procedure.

sp_ctrl> qstatus
Run the script that grants INSERT, UPDATE and DELETE access to all users on the secondary system.
Run the script that enables triggers and constraints on the secondary system when users begin using this system.
Implement the failover procedure for relocating users to the secondary system, including starting the applications.
Move users to the secondary system and let them resume working, but do not start Export. Their transactions will now be accumulating in the export queue awaiting restoration of the source database.

Note: If started, Export will repeatedly try to connect to the primary system, wasting system resources.

Go to Procedure 2: Move replication to the restored primary system.

Procedure 2: Move replication to the restored primary system

This procedure moves users back to the primary machine after it is recovered from an unplanned failure. Follow each segment in the order presented.

Restore the replication environment on the primary system

To restore the replication environment on the primary system:

On the primary system, recover the SharePlex directories, system files and Oracle files from the backups and archives.
On the primary system, start sp_cop using the -s option to prevent the SharePlex processes (Capture, Read, Export, Import, Post) from starting.

$ /productdir/bin/sp_cop -s &
On the primary system, start sp_ctrl.
On the primary system, deactivate the configuration file. When you copied the archived SharePlex variable-data directory to the primary system, you copied the configuration that was active before the system failed. This causes the Capture process to set the transaction number to “1” when replication from the primary system resumes.

sp_ctrl> deactivate config filename

Purge the queues

To purge the queues:

Run sp_ctrl on the primary and secondary systems.
On the primary system, delete the capture queue.

sp_ctrl> delete capture queue for datasrc [on host]

Example: sp_ctrl> delete capture queue for o.oraA
On the primary system, delete the export queue.

sp_ctrl> delete export queue quename [on host]

Example: sp_ctrl> delete export queue sysA

On the secondary system, delete the post queue.

sp_ctrl> delete post queue quename for datasrc-datadst [cleartrans] [on host]

Example: sp_ctrl> delete post queue sysA for o.oraA-o.oraB

Note:

You are issuing the delete queue command on the primary system because you restored the old capture and export queues when you restored the archived SharePlex directories.
You are issuing the delete queue command on the secondary system because data remaining in the post queue cannot be posted. The primary system failed before Post received a COMMIT for remaining transactions. SharePlex will rebuild the queues when you reactivate the configuration and the two systems are reconciled.

Start replication from secondary to primary system

To start replication from secondary to primary system:

On the primary system, verify that all SharePlex processes are stopped.

sp_ctrl> status
On the primary system, stop Post.

sp_ctrl> stop post
On the secondary system, start Export. This establishes communication between the primary and secondary systems.:

sp_ctrl> start export

Synchronize the source and target data

To synchronize the source and target data:

On the secondary system, run an Oracle hot backup.
When the hot backup is finished, switch log files on the secondary system and make a note of the highest archive-log sequence number.
On the primary system, recover the primary database from the backup by using the UNTIL CANCEL option in the RECOVER clause, and cancel the recovery after the log with the number you recorded has been fully applied.
Open the database with the RESETLOGS option.

Note: This resets the sequences on the primary system to the top of the cache upon startup.
On the primary system, issue the reconcile command using the sequence number of the log that you noted previously. If you are using named post queues, issue the command for each one. If you do not know the queue names, issue the qstatus command first.

sp_ctrl> qstatus

sp_ctrl> reconcile queue queuename for datasource-datadest seq sequence_number

Example: reconcile queue SysB for o.oraA-o.oraA seq 1234
On the primary system, disable triggers on the tables, or run the sp_add_trigger.sql utility script so that the triggers ignore the SharePlex user.
On the primary system, disable check constraints and scheduled jobs that perform DML.
On the primary system, log onto SQL*Plus as the SharePlex Oracle user and run the cleanup.sql utility from the bin sub-directory of the SharePlex product directory. This truncates the SharePlex tables and updates the SHAREPLEX_ACTID table.
On the secondary system, truncate the SHAREPLEX_TRANS table. This table contains transaction information that the Post process was using before the primary system failed, and therefore that information is obsolete. Truncating the table restores transaction consistency between the two systems.

Activate replication on the primary system

To activate replication on the primary system:

On the primary system, activate the configuration file.

sp_ctrl> activate config filename
On the primary system, start Post.

sp_ctrl> start post

Restore the object cache

To restore the object cache:

On the primary system, view the status of the SharePlex processes.

sp_ctrl> status

Note: This command shows that Post stopped due to an error, which occurred because the object cache is missing.
On the primary system, issue the show log command with the filter option, and filter on the keyword "objcache."

sp_ctrl> show log filter=objcache
Look for a Post error message that refers to a file with a name similar to the following example:

0x0a0100c5+PP+sys4+sp_opst_mt+o.quest-o.ov-objcache_sp_opst_mt.18

You are looking for a string that contains the string “objcache_sp_opst_mt,” followed by a number. This is the object-cache file that the Post process needs. If you are using named post queues, there will be more than one error message, each referring to a different object-cache file but ending with the same number, such as the number .18 in the example.
Make a note of the full pathname of the Post object-cache file(s) named in the error message. The path will be the state directory in the SharePlex variable-data directory, for example:

splex_vardir/state/0x0a0100c5+PP+sys4+sp_opst_mt+o.quest-o.ov-objcache_sp_opst_mt.18
On the secondary system, shut down SharePlex.

sp_ctrl> shutdown
On the secondary system, change directories to the state sub-directory of the SharePlex variable-data directory and locate the Capture object-cache file. This file will have a name similar to the one in the following example.

o.quest-objcache_sp_ocap.18

Important! If there are more than one of these files, use the one with the most recent number at the end of it — this number should match the number at the end of the Post object-cache file, such as .18 in the example.
Copy the Capture object-cache file to the primary system and rename it to the full pathname of the Post object-cache file that you noted previously.
On the primary system, start Post.

sp_ctrl> start post

Switch users back to the primary system

To switch users back to the primary system:

On the secondary system, stop user access to the database.
On the primary system, view the post queue and continue to issue the command until the number of messages is 0.

sp_ctrl> qstatus
On the secondary system, shut down the Oracle instance.

svrmgr1> shutdown
On the secondary system, start the Oracle instance.

svrmgr1> startup

Note: This resets the sequence on the secondary system to the top of the cache to synchronize with the primary system.
On the primary system, enable database objects that were disabled.
On the secondary system, start SharePlex.
On the secondary system, stop Post.

sp_ctrl> stop post
On the secondary system, disable triggers on the tables, or run the sp_add_trigger.sql utility script so that the triggers ignore the SharePlex user.
On the secondary system, disable check constraints and scheduled jobs that perform DML.
On the secondary system, start Post.

sp_ctrl> start post
On the primary system, view the number of messages in the post queue, and keep checking until the number of messages is 0.

sp_ctrl> qstatus
When the number of messages is 0, switch users back to the primary system.
On the secondary system, stop Export to prevent accidental changes made on that system from being replicated to the primary system.

sp_ctrl> stop export

Note: The secondary system is now in a failover-ready state again, with:

no users
an active configuration
disabled or modified triggers, constraints, and scheduled jobs
a stopped Export process

Please select your product:

To serve you better, please complete the Purpose of your Chat:

Recommended Solutions for Your Problem

SharePlex 11.4 - Administration Guide

Tune queue performance

Reduce queue contention

Tune subqueue indexing

Tune hash-based horizontally partitioned replication

Recover replication after Oracle failover

Contents

Recover replication if the primary system fails

Supported databases

Requirements

Procedure 1: Move replication to the secondary system

Procedure 2: Move replication to the restored primary system

Restore the replication environment on the primary system

Purge the queues

Start replication from secondary to primary system

Synchronize the source and target data

Activate replication on the primary system

Restore the object cache

Switch users back to the primary system