SharePlex Connector for Hadoop 8.5.5 - SharePlex Connector for Hadoop Installation Guide

Use Cases

Set Up and Start Replication

Use case: Set Up and Start Replication

Note: Ensure you complete all Initial Setup instructions first.

1. Start SharePlex for Oracle and sp_ctrl

Ensure SharePlex for Oracle and sp_ctrl are running. The prompt should be sp_ctrl (host:port)>. Refer to the SharePlex for Oracle documentation for more information.

/u01/app/shareplex/prod/bin > ./sp_ctrl

sp_ctrl ()>

2. Define the Oracle tables to replicate/capture change data

Use the SharePlex for Oracle create config command to create the file ConfigFile. The file is opened in vi. Declare all the Oracle tables you want captured into Hadoop, one table per line.

sp_ctrl ()> create config ConfigFile

#######################

datasource: O.OracleSID

OracleSchema.OracleTable !jms[:TargetSchema.TargetTable] IPHostPostQueue[:PostQueueName]

#####################

Example line: soo70.G_AUTHORS !jms 10.20.26.28:q2

IPHostPostQueue is the name or IP address of the host on which the SharePlex post queue is running.

For more information on PostQueueName see Configure ActiveMQ to work with SharePlex.

TIP: To check the config file for errors, run the command sp_ctrl ()> verify config ConfigFile.
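The routing line format shown above can be illustrated with a small parser. This is only a sketch for reading the template, not SharePlex's own parser: the JmsRoute type, the parse_route function, and the regular expression are hypothetical, and the pattern assumes simple identifiers (letters, digits, underscores). Use verify config for real validation.

```python
import re
from typing import NamedTuple, Optional


class JmsRoute(NamedTuple):
    source_table: str            # OracleSchema.OracleTable
    target_table: Optional[str]  # optional TargetSchema.TargetTable
    host: str                    # IPHostPostQueue (name or IP address)
    post_queue: Optional[str]    # optional PostQueueName


# Hypothetical pattern for: Schema.Table !jms[:Schema.Table] Host[:Queue]
_LINE = re.compile(
    r"^(?P<src>\w+\.\w+)\s+"
    r"!jms(?::(?P<tgt>\w+\.\w+))?\s+"
    r"(?P<host>[\w.\-]+)(?::(?P<queue>\w+))?$"
)


def parse_route(line: str) -> JmsRoute:
    """Split one routing line into its parts, or raise ValueError."""
    m = _LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a valid !jms routing line: {line!r}")
    return JmsRoute(m["src"], m["tgt"], m["host"], m["queue"])


# The example line from this guide: no target table, post queue q2.
print(parse_route("soo70.G_AUTHORS !jms 10.20.26.28:q2"))
```

The two optional parts (the target table after !jms and the post queue name after the host) come back as None when they are omitted.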

3. Stop the post queue

SharePlex for Oracle uses the post queue to send messages to the JMS queue.

sp_ctrl ()> stop post

4. Run activate config

Use the SharePlex for Oracle activate config command to activate the file ConfigFile.

sp_ctrl ()> activate config ConfigFile

If you see the error "minimal supplemental logging should be enabled," see Prepare the Oracle Data Source.

5. SharePlex™ Connector for Hadoop® - Run conn_snapshot.sh and/or conn_cdc.sh

 

I) Run conn_snapshot.sh

To enable HDFS Near Real Time Replication or HBase Real Time Replication execute the SharePlex Connector for Hadoop conn_snapshot.sh script.

Specify an Oracle table to replicate. This makes a copy of that Oracle table.

If you declared more than one Oracle table in ConfigFile in step 2, run conn_snapshot.sh only for the tables that were added for replication.

Note: Take a snapshot of the tables that are added for replication before you start the post queue.

The conn_snapshot.sh script is fully customizable. For the full documentation, see conn_snapshot.sh.

conn_snapshot.sh -t Schema.Table -s ';'

Note: You will be prompted to enter the Oracle password. This is the password to the Oracle user name supplied during configuration. For more information, see Run conn_setup.sh.
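When several tables were added, the snapshot invocations can be generated rather than typed by hand. The sketch below only prints the command lines and does not run them, because the script prompts for the Oracle password. It uses just the -t and -s options shown in this guide; the snapshot_commands helper and the table soo70.G_BOOKS are hypothetical.

```python
import shlex
from typing import List


def snapshot_commands(tables: List[str], separator: str = ";") -> List[List[str]]:
    """Return one conn_snapshot.sh argument list per table."""
    return [["conn_snapshot.sh", "-t", table, "-s", separator] for table in tables]


# Tables added for replication (the second name is made up for illustration).
added = ["soo70.G_AUTHORS", "soo70.G_BOOKS"]

for argv in snapshot_commands(added):
    # shlex.join quotes the separator so the line is safe to paste into a shell.
    print(shlex.join(argv))
```

Review the printed commands and run them before you start the post queue.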

 

II) Run conn_cdc.sh

To start capturing change data on HDFS for an Oracle table, execute the SharePlex Connector for Hadoop conn_cdc.sh script.

If you declared more than one Oracle table in ConfigFile in step 2, run conn_cdc.sh only for the tables that were added for capturing change data.

Note: Run the conn_cdc.sh script before you start the post queue.

The conn_cdc.sh script is fully customizable. For the full documentation, see conn_cdc.sh.

The script captures change data on HDFS for insert, update, and delete operations.

conn_cdc.sh -t Schema.Table

6. SharePlex for Oracle - Start the post queue

Start the post queue so SharePlex for Oracle can send messages from the post queue to the JMS queue.

sp_ctrl ()> start post

7. Start SharePlex Connector for Hadoop

Return to SharePlex Connector for Hadoop and start it. For more information on the conn_ctrl.sh command, see conn_ctrl.sh.

conn_ctrl.sh start
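The ordering rule that runs through the steps above (take snapshots and run conn_cdc.sh only while the post queue is stopped) can be checked mechanically. The following is a minimal sketch, assuming a plan is written as a plain list of command strings; the check_order function is hypothetical and is not part of SharePlex or the Connector.

```python
from typing import List


def check_order(steps: List[str]) -> List[str]:
    """Return ordering problems; an empty list means the plan is consistent."""
    problems = []
    post_running = True  # assumption: the post queue is running before the plan starts
    for step in steps:
        if step == "stop post":
            post_running = False
        elif step == "start post":
            post_running = True
        elif step.startswith(("conn_snapshot.sh", "conn_cdc.sh")) and post_running:
            problems.append(f"{step!r} runs while the post queue is started")
    return problems


# Steps 3 to 7 of this use case, in the documented order.
plan = [
    "stop post",
    "activate config ConfigFile",
    "conn_snapshot.sh -t Schema.Table -s ';'",
    "conn_cdc.sh -t Schema.Table",
    "start post",
    "conn_ctrl.sh start",
]
print(check_order(plan))
```

A plan that starts the post queue before the snapshot would produce one reported problem per misplaced script.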

Replication Paused or Data Inconsistent (Out of Sync)

Use Case: Replication Paused or Data Inconsistent (Out of Sync)

SharePlex Connector for Hadoop compares the values stored in HBase or HDFS with the lookup values received from SharePlex. If they do not match, the data inconsistency is reported on the console and in shareplex-connector-alert.log.

REPLICATION PAUSED

SharePlex Connector for Hadoop pauses replication of a table in scenarios that may lead to data inconsistency, including:

  • For HDFS Near Real Time replication, the entire merging job on Hadoop fails. Replication of the table is paused because the data would be inconsistent from that point on.
  • The schema of the table is changed (ALTER).

Data Inconsistent (Out of Sync) - Take a fresh snapshot

Scenarios that may lead to data inconsistency include: receiving a delete for a row which does not exist, receiving an update on a deleted row, receiving an insert on an already inserted row.
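The three scenarios above can be reproduced with a toy in-memory table. This is an illustrative sketch only and does not show how SharePlex Connector for Hadoop detects inconsistency internally; the apply_change function, the integer row keys, and the sample rows are all hypothetical.

```python
from typing import Dict, Optional


def apply_change(rows: Dict[int, dict], op: str, key: int,
                 values: Optional[dict] = None) -> Optional[str]:
    """Apply one change to an in-memory copy of a table.

    Returns a description of the inconsistency, or None if the change
    applied cleanly.
    """
    exists = key in rows
    if op == "insert":
        if exists:
            return f"insert on already inserted row {key}"
        rows[key] = dict(values or {})
    elif op == "update":
        if not exists:
            return f"update on missing or deleted row {key}"
        rows[key].update(values or {})
    elif op == "delete":
        if not exists:
            return f"delete for row {key} which does not exist"
        del rows[key]
    else:
        raise ValueError(f"unknown operation: {op}")
    return None


table = {1: {"name": "Ann"}}
changes = [
    ("insert", 1, {"name": "Ann"}),  # row 1 is already present
    ("delete", 1, None),             # applies cleanly
    ("update", 1, {"name": "Bob"}),  # row 1 was just deleted
    ("delete", 1, None),             # row 1 no longer exists
]
for op, key, values in changes:
    print(op, key, "->", apply_change(table, op, key, values) or "ok")
```

Once any such mismatch appears, the copy can no longer be trusted, which is why a fresh snapshot is the suggested remedy.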

In these cases, SharePlex Connector for Hadoop suggests that you take a new snapshot of the table.

Follow these instructions.

1. SharePlex for Oracle - Stop the post queue

SharePlex for Oracle uses the post queue to send messages to the JMS queue. Stop the post queue before you take a snapshot.

TIP: Enter command /u01/app/shareplex/prod/bin > ./sp_ctrl to open the sp_ctrl ()> prompt.

sp_ctrl ()> stop post

For more information on PostQueueName see Configure ActiveMQ to work with SharePlex.

2. SharePlex Connector for Hadoop - Run conn_snapshot.sh

In SharePlex Connector for Hadoop, run the conn_snapshot.sh script for the Oracle table associated with the REPLICATION PAUSED or DATA_INCONSISTENT message. This makes a fresh copy of that Oracle table.

This script is fully documented in conn_snapshot.sh.

conn_snapshot.sh -t Schema.Table -s ';'

Note: You will be prompted to enter the Oracle password. This is the password to the Oracle user name supplied during configuration. For more information, see Run conn_setup.sh.

3. SharePlex for Oracle - Start the post queue

Return to SharePlex for Oracle. Start the post queue so SharePlex for Oracle can send messages from the post queue to the JMS queue.

sp_ctrl ()> start post

Edit the List of Tables being Replicated

Use Case: Edit the List of Tables To Be Replicated

SharePlex Connector for Hadoop replicates the tables listed in the file ConfigFile. Follow these steps to edit this list of tables.

1. SharePlex for Oracle - Copy the config file

Use the SharePlex for Oracle copy config command to make a copy of ConfigFile.

TIP: Enter command /u01/app/shareplex/prod/bin > ./sp_ctrl to open the sp_ctrl ()> prompt.

sp_ctrl ()> copy config ConfigFile to NewConfigFile

2. Edit the new config file

Use the edit config command to edit the file NewConfigFile. The file is opened in vi. List the Oracle table(s) you want captured into Hadoop, one table per line.

sp_ctrl ()> edit config NewConfigFile

#######################

datasource: O.OracleSID

OracleSchema.OracleTable !jms[:TargetSchema.TargetTable] IPHostPostQueue[:PostQueueName]

#####################

Example line: soo70.G_AUTHORS !jms 10.20.26.28:q2

IPHostPostQueue is the name or IP address of the host on which the SharePlex post queue is running.

For more information on PostQueueName see Configure ActiveMQ to work with SharePlex.

TIP: To check the config file for errors, run the command sp_ctrl ()> verify config NewConfigFile.

3. Stop the post queue

SharePlex for Oracle uses the post queue to send messages to the JMS queue. Stop the post queue.

sp_ctrl ()> stop post

4. Activate the new config file

Activate the new config file.

sp_ctrl ()> activate config NewConfigFile

5. SharePlex™ Connector for Hadoop® - Run conn_snapshot.sh and/or conn_cdc.sh

I) Run conn_snapshot.sh

To enable HDFS Near Real Time Replication or HBase Real Time Replication execute the SharePlex Connector for Hadoop conn_snapshot.sh script.

Specify an Oracle table to replicate. This makes a copy of that Oracle table.

If you declared more than one Oracle table in NewConfigFile in step 2, run conn_snapshot.sh only for the tables that were added for replication.

Note: Take a snapshot of the tables that are added for replication before you start the post queue.

The conn_snapshot.sh script is fully customizable. For the full documentation, see conn_snapshot.sh.

conn_snapshot.sh -t Schema.Table -s ';'

Note: You will be prompted to enter the Oracle password. This is the password to the Oracle user name supplied during configuration. For more information, see Run conn_setup.sh.
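To work out which tables were added when the list was edited, the old and new config contents can be compared. The sketch below assumes the config text is available as a string and that lines beginning with # are comments, as the template in step 2 suggests. The replicated_tables and newly_added helpers and the table soo70.G_BOOKS are hypothetical.

```python
from typing import List, Set


def replicated_tables(config_text: str) -> Set[str]:
    """Collect the source tables (first field) from routing lines."""
    tables = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks, comment lines, and the datasource declaration.
        if not line or line.startswith("#") or line.lower().startswith("datasource:"):
            continue
        tables.add(line.split()[0])
    return tables


def newly_added(old_config: str, new_config: str) -> List[str]:
    """Tables present in the new config but not in the old one."""
    return sorted(replicated_tables(new_config) - replicated_tables(old_config))


OLD = """\
datasource: O.OracleSID
soo70.G_AUTHORS !jms 10.20.26.28:q2
"""
NEW = OLD + "soo70.G_BOOKS !jms 10.20.26.28:q2\n"

# Only these tables need a snapshot before the post queue is started.
print(newly_added(OLD, NEW))
```

Tables that were already being replicated drop out of the result, so only the added ones remain as snapshot candidates.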

 

II) Run conn_cdc.sh

To start capturing change data on HDFS for an Oracle table, execute the SharePlex Connector for Hadoop conn_cdc.sh script.

If you declared more than one Oracle table in NewConfigFile in step 2, run conn_cdc.sh only for the tables that were added for capturing change data.

Note: Run the conn_cdc.sh script before you start the post queue.

The conn_cdc.sh script is fully customizable. For the full documentation, see conn_cdc.sh.

The script captures change data on HDFS for insert, update, and delete operations.

conn_cdc.sh -t Schema.Table

Note: You will be prompted to enter the Oracle password. This is the password to the Oracle user name supplied during configuration. For more information, see Run conn_setup.sh.

6. Start the post queue

Start the post queue so SharePlex for Oracle can send messages from the post queue to the JMS queue.

sp_ctrl ()> start post