
SharePlex Connector for Hadoop 8.5.5 - SharePlex Connector for Hadoop Installation Guide

conn_cdc.sh

Use this script to start capturing change data for insert, update, delete, and before-update operations to HDFS for a specified table.

Note: For the conn_cdc script to run, the change data capture feature must be enabled in the setup script. Change data capture for a table starts only when the user runs the conn_cdc script for that table.

Shell Script Usage

conn_cdc.sh -t <TABLE_OWNER.TABLE_NAME> [-i] [-u] [-d] [-b] [--help] [--version]

Options

Parameter

Description

-t <TABLE_OWNER.TABLE_NAME>

Name and owner of the table to import from Oracle. Required.

-i

If specified, enables change data capture for insert operations.

-u

If specified, enables change data capture for update operations.

-d

If specified, enables change data capture for delete operations.

-b

If specified, enables change data capture of before-update values.

--help

Show this help and exit.

--version

Show version information and exit.

Example

[user@host bin]$ ./conn_cdc.sh -t Schema.Table

Use Cases

 
If none of the parameters -i, -u, -d, or -b is provided, changes for all insert, update, and delete operations are recorded by default. Before-update values are not recorded.

conn_cdc.sh -t Schema.Table

Capture change data for insert operations only. The other operations are not recorded.

conn_cdc.sh -t Schema.Table -i

Capture change data for update and delete operations only. Insert and before-update operations are not recorded.

conn_cdc.sh -t Schema.Table -u -d

Capture change data for update and before-update operations. The other operations are not recorded.

conn_cdc.sh -t Schema.Table -u -b

Note: Run the conn_cdc script again for the same table to change which operations are captured, as shown below.
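For example, if a table was first registered with insert-only capture, running the script again with different options replaces that preference (both commands use only the options documented above):

[user@host bin]$ ./conn_cdc.sh -t Schema.Table -i
[user@host bin]$ ./conn_cdc.sh -t Schema.Table -u -d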

 

Accessing change data capture (CDC) data written to HDFS

Change data written to HDFS for a particular table can be accessed in either of the two ways described below.

  1. Hive external tables pointing to CDC data on HDFS.

    SharePlex Connector for Hadoop creates external tables in Hive for the CDC feature. Each external Hive table is created under a Hive database named after the Oracle table owner / schema and points to the data stored for that table in the CDC HDFS destination directory.

    SharePlex Connector for Hadoop maps Oracle data types to appropriate data types in the external Hive table, as shown below. This allows users to apply the built-in Hive UDFs available for those Hive data types (see the example after the note below).

    Oracle data type                   Hive data type
    CHAR                               STRING
    VARCHAR2                           STRING
    NCHAR                              STRING
    NVARCHAR2                          STRING
    NUMBER                             DOUBLE
    FLOAT                              DOUBLE
    DATE                               TIMESTAMP
    TIMESTAMP                          TIMESTAMP
    TIMESTAMP WITH TIME ZONE           STRING
    TIMESTAMP WITH LOCAL TIME ZONE     STRING
    INTERVAL YEAR TO MONTH             STRING
    INTERVAL DAY TO SECOND             STRING
    ROWID                              STRING

    Note: If the data type of an Oracle table column is modified, the Hive data type for that column is mapped to STRING.
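    For example, because DATE and TIMESTAMP columns map to the Hive TIMESTAMP type, built-in UDFs such as year() can be applied to them directly. A minimal sketch, assuming a hypothetical table OWNER.EMP with a DATE column HIRE_DATE replicated through CDC:

    [user@host bin]$ hive -e "SELECT year(hire_date), COUNT(*) FROM owner.emp GROUP BY year(hire_date)"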

  2. Custom Map-Reduce programs

    SharePlex Connector for Hadoop generates change data record classes (Java utility classes) per CDC table, which custom Map-Reduce programs can use to read the CDC data stored on HDFS. The change data record classes are generated under the "lib" directory and are also added to the lib/change-data-records.jar file.
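    A minimal sketch of launching such a program, assuming a hypothetical job jar cdc-report.jar with driver class com.example.CdcReport; -libjars is the standard Hadoop generic option for shipping extra jars to the cluster, and the input path is the CDC destination directory for the table:

    [user@host bin]$ export HADOOP_CLASSPATH=lib/change-data-records.jar
    [user@host bin]$ hadoop jar cdc-report.jar com.example.CdcReport -libjars lib/change-data-records.jar <cdc input path> <output path>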

conn_ctrl.sh

Use this script to start, restart, or shut down SharePlex Connector for Hadoop, and to reload its configuration.

Shell Script Usage

conn_ctrl.sh [start|restart|shutdown|reload-conf|--help|--version|--status]

Options

Parameter

Description

start

Start SharePlex Connector for Hadoop.

restart

Stop and start SharePlex Connector for Hadoop.

shutdown

Stop SharePlex Connector for Hadoop.

reload-conf

Reload some of the SharePlex Connector for Hadoop configurations without restarting it. These configurations include:

  • HDFS Near Real Time Replication changes and time interval.

--help

Show this help and exit.

--version

Show version information and exit.

--status

Show the status (started / shutdown) of SharePlex Connector for Hadoop.

Example

[user@host bin]$ ./conn_ctrl.sh
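For example, to check whether the connector is running and then reload its configuration without a restart, using only the actions documented above:

[user@host bin]$ ./conn_ctrl.sh --status
[user@host bin]$ ./conn_ctrl.sh reload-conf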

conn_monitor.sh

Use this script to show a list of the tables being replicated and to monitor the status of a table under replication.

Shell Script Usage

conn_monitor.sh [--summary|--status <TABLE_OWNER.TABLE_NAME> [--hdfs|--hbase|--cdc]|--help|--version]

Options

Parameter

Description

--summary

Show a summary of the tables under replication.

--status <TABLE_OWNER.TABLE_NAME> [--hdfs|--hbase|--cdc]

Show the status of the given table.

Include the --hdfs option (optional) to limit the display to HDFS Near Real Time Replication.

Include the --hbase option (optional) to limit the display to HBase Real Time Replication.

Include the --cdc option (optional) to limit the display to Change Data Capture (CDC) feature.

--help

Show this help and exit.

--version

Show version information and exit.
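Example

The following commands, using only the options documented above, show a summary of the replicated tables and then the CDC status of one table:

[user@host bin]$ ./conn_monitor.sh --summary
[user@host bin]$ ./conn_monitor.sh --status Schema.Table --cdc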

conn_cleanup.sh

Use this script to delete all data (HDFS, HBase, and Hive) for a table under replication by SharePlex Connector for Hadoop.

Shell Script Usage

conn_cleanup.sh -t <TABLE_OWNER.TABLE_NAME> [-h <HIVE_HOME_DIR>] [--help] [--version]

Options

Parameter

Description

-t <TABLE_OWNER.TABLE_NAME>

Name and owner of the table whose data to clean up. Required.

-h <HIVE_HOME_DIR>

Path to the Hive home directory.

If not specified, the value of the HIVE_HOME environment variable is used. If neither this option nor the HIVE_HOME environment variable is set, the path defaults to a location relative to HADOOP_HOME.

--help

Show this help and exit.

--version

Show version information and exit.

Example

[user@host bin]$ ./conn_cleanup.sh -t Schema.Table
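To clean up using a specific Hive installation rather than the HIVE_HOME environment variable, pass the -h option (the path shown is illustrative):

[user@host bin]$ ./conn_cleanup.sh -t Schema.Table -h /usr/lib/hive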
