SharePlex 12.1 - Administration Guide

Build a configuration file using a script

SharePlex provides the following scripts to automate the building of a configuration file to specify Oracle source objects.

  • config.sql: configure all tables and optionally all sequences in the database.
  • build_config.sql: configure multiple or all tables in a schema.

Supported databases

Oracle

Use config.sql

The config.sql script enables you to build a configuration that lists all of the tables, and optionally all of the sequences, in all of the schemas of a database. This script saves time when establishing a high-availability replication strategy or other scenario where you want the entire database to be replicated to an identical secondary database.

Conditions for using config.sql
  • Source and target table names must be the same.

  • The script does not configure objects in the SYS, SYSTEM, and SharePlex schemas. These schemas cannot be replicated since they are system and/or instance-specific.
  • The script does not support partitioned replication. You can use the copy config command to copy the configuration file that the script builds, then use the edit config command to add entries for tables that use partitioned replication. Activate the new configuration file, not the original one.
  • You can use the edit config command to make any other changes as needed after the configuration is built.

To run config.sql:

  1. Change directories to the config sub-directory of the SharePlex variable-data directory. The config.sql script puts configurations in the current working directory, and SharePlex configurations must reside in the config sub-directory.

    cd /vardir/config

  2. Log onto SQL*Plus as SYSTEM.
  3. Run config.sql using the full path from the util sub-directory of the SharePlex product directory.

    @ /proddir/util/config.sql

Refer to the following prompts and the values to enter:

  • Target machine: The name of the target machine, for example SystemB.
  • Source database SID: The ORACLE_SID of the source (primary) Oracle instance, for example oraA. Do not include the o. keyword. The ORACLE_SID is case-sensitive.
  • Target database SID: The ORACLE_SID of the target (destination) Oracle instance, for example oraB. Do not include the o. keyword. The ORACLE_SID is case-sensitive.
  • Replicate sequences: Enter y to replicate sequences or n not to replicate sequences.
  • SharePlex oracle username: The name of the SharePlex user in the source database. This entry prevents the SharePlex schema from being replicated, which would cause replication problems. If a valid name is not provided, the script fails.

Note: The name assigned by SharePlex to the configuration is config.file. If you run the script again to create another configuration file, it overwrites the first file. To preserve the original file, rename it before you create the second one.
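For illustration, a configuration file generated by config.sql might begin like the following sketch. The SIDs, host name, and object names are placeholders; the actual file lists every table (and, optionally, sequence) found in the database:

    Datasource:o.oraA

    scott.emp          scott.emp          SystemB@o.oraB
    scott.dept         scott.dept         SystemB@o.oraB
    hr.employees       hr.employees       SystemB@o.oraB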

Next steps:

  • If any tables or owners are case-sensitive, open the configuration file with the edit config command in sp_ctrl, then use the text editor to enclose case-sensitive table and owner names within double-quote marks, for example "scott"."emp". The script does not add the quote marks required by Oracle to enforce case-sensitivity.

    sp_ctrl> edit config filename

  • To ensure that the configuration is in the correct location, issue the list config command. If the name of the configuration is not shown, it was created in the wrong directory. Find the file and move it to the config sub-directory of the variable-data directory.

    sp_ctrl> list config

Use build_config.sql

The build_config.sql script enables you to build a configuration that contains multiple (or all) tables in a schema. It is an interactive script that prompts for each component of the configuration step by step. Instead of entering the information for each object and the routing individually, you can use a wildcard to select certain tables at once, or you can select all of the tables in the schema.

Conditions for using build_config.sql
  • Source and target table names must be the same.
  • The script does not support sequences. Before you activate the configuration that the script builds, you can use the edit config command in sp_ctrl to add entries for sequences.
  • The script does not support partitioned replication. You can use the copy config command to copy the configuration that the script builds, then use the edit config command to add entries for the tables that use partitioned replication. Activate the new configuration, not the original.
  • The script does not configure objects in the SYS, SYSTEM, and SharePlex schemas. These schemas cannot be replicated since they are system and/or instance-specific.
  • You can run build_config.sql for different schemas, then combine those configurations into one configuration by using a text editor (see the example after this list). Make certain to eliminate all but one Datasource:o.SID line, which is the first non-commented line of the file. Do not move the file out of the config sub-directory.
  • You can use the edit config command to make any other changes as needed after the configuration is built.
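For example, if build_config.sql was run once for the scott schema and once for the hr schema, the combined configuration (with only one Datasource line retained) might look like the following sketch; the SIDs, host name, and table names are placeholders:

    Datasource:o.oraA

    scott.emp          scott.emp          SystemB@o.oraB
    scott.dept         scott.dept         SystemB@o.oraB
    hr.employees       hr.employees       SystemB@o.oraB
    hr.jobs            hr.jobs            SystemB@o.oraB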

To run build_config.sql:

  1. Change directories to the config sub-directory of the SharePlex variable-data directory. The build_config.sql script puts configurations in the current working directory, and SharePlex configurations must reside in the config sub-directory.

    cd /vardir/config

  2. Log onto SQL*Plus as SYSTEM.
  3. Run build_config.sql using the full path from the util sub-directory of the SharePlex product directory.

    @ /proddir/util/build_config.sql

Refer to the following prompts and the values to enter:

  • Target machine: The name of the target machine, for example SystemB.
  • Source database SID: The ORACLE_SID of the source (primary) Oracle instance, for example oraA. Do not include the o. keyword. The ORACLE_SID is case-sensitive.
  • Target database SID: The ORACLE_SID of the target (destination) Oracle instance, for example oraB. Do not include the o. keyword. The ORACLE_SID is case-sensitive.
  • Owner of the source database tables: The owner of the source tables.
  • Owner of the target database tables: The owner of the target tables.
  • Table name to include (blank for all): Do one of the following:
    • Press Enter to accept the default, which selects all tables that belong to the source owner.
    • Enter a wildcard (%) character and a string to select certain tables, for example %e_salary%.
    • Enter an individual table name.
  • Name of the output file to create: A name for the configuration. The script gives the file a .lst suffix, for example Scott_config.lst.

Next steps:

  • If any tables or owners are case-sensitive, open the configuration with the edit config command in sp_ctrl, then use the text editor to enclose case-sensitive table and owner names within double-quote marks, for example "scott"."emp". The script does not add the quote marks required by Oracle to enforce case-sensitivity.

    sp_ctrl> edit config filename

  • To ensure that the configuration is in the correct location, issue the list config command. If the name of the configuration is not shown, it was created in the wrong directory. Find the file and move it to the config sub-directory of the variable-data directory.

    sp_ctrl> list config

Replicate data in AVRO format with schema registry and JDBC sink connector

SharePlex supports data replication in an AVRO format with schema registry and JDBC sink connector.

Supported source and target

Oracle to Kafka

Pre-requisites:

  • Enable supplemental logging for all columns (see the example after this list).

  • Tables in replication should have key(s) defined.

  • Set the SP_OCT_REDUCED_KEY parameter to 0 to ensure that the before-image of all columns, except for LONGs/LOBs, is available with each UPDATE or DELETE record.

  • For a basic implementation, both key and value schemas must be used to minimize or eliminate the need for transformation in the JDBC sink connector.

  • Each table should have a corresponding Kafka topic, with a separate JDBC sink connector configured for each table.

  • The table name should match the Kafka topic name. The naming convention for Kafka topics should follow the format schema.tablename.

  • SharePlex supports Kerberos (SASL_PLAINTEXT) and SASL PLAIN authentication for Oracle to Confluent Kafka replication with AVRO format.
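As a sketch of the supplemental logging and SP_OCT_REDUCED_KEY prerequisites above, supplemental logging can be enabled in the source Oracle database and the Capture parameter set in sp_ctrl roughly as follows; adjust the statements for your environment:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

sp_ctrl> set param SP_OCT_REDUCED_KEY 0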

Data replication procedure in AVRO format

  1. SharePlex captures data from the source Oracle database and replicates it to Kafka.

  2. A JDBC Sink Connector reads the data from the Kafka topic and replicates it to the target Oracle database.

Schema registry and format

  • Support for the JDBC Sink Connector with AVRO format requires using a schema registry.

  • Instead of a "before" and "after" value format:

    • The Key Schema contains an AVRO record with the table's key columns as fields.

    • The Value Schema includes the table's non-key columns as fields.

    • This format allows direct consumption with minimal or no transformation in the JDBC Sink Connector.

A Kafka property, schema.registry.url, with an empty default value is introduced. It must be set to the proper value using target x.kafka set kafka schema.registry.url = <url value>.

Using the JDBC Sink connector

Kafka topic configuration

  • Each table should have a separate Kafka topic, and each topic must have a corresponding JDBC Sink Connector.

  • The Kafka topic name should match the table name, following the naming convention schema.tablename.

Support for Delete and Update operations

Delete operations: To enable delete operations with the JDBC Sink Connector, the record key is required. It can be retrieved directly from the Kafka topic using the key schema, or extracted from the value schema using built-in transformations if the key schema is not used. The delete.enabled parameter must be enabled in the JDBC Sink Connector configuration.

Update operations: Since updates are not represented in "before" and "after" values, updates on key columns are handled with two events: first, a delete event with the old key values, followed by an insert event with the new key values.

LOB columns and field filtering

LOB columns: LOB columns cannot be part of keys in SharePlex. If a non-LOB column is updated, the data for LOB columns will not be available in the before or after image. In such cases, a placeholder value _splex_default will be used (excluding binary LONGs/LOBs mapped to AVRO bytes such as LONG RAW, BLOB). These columns can be conditionally filtered out using the ReplaceField transform with predicates.

Following is an example of transform and predicate configuration for JDBC Sink connector:

transforms.col_clob.type = org.apache.kafka.connect.transforms.ReplaceField$Value

transforms.col_clob.exclude = COL_CLOB

transforms.col_clob.predicate = col_clob

transforms.col_long.type = org.apache.kafka.connect.transforms.ReplaceField$Value

transforms.col_long.exclude = COL_LONG

transforms.col_long.predicate = col_long

predicates.col_clob.type = de.denisw.kafka.connect.jmespath.MatchesJMESPath$Value

predicates.col_clob.query = COL_CLOB == '_splex_default'

predicates.col_long.type = de.denisw.kafka.connect.jmespath.MatchesJMESPath$Value

predicates.col_long.query = COL_LONG == '_splex_default'

Filtering fields: Filtering specific fields is not supported by built-in transforms alone. The kafka-connect-jmespath plugin, an add-on, must be installed to enable this functionality.

Supported DDL operations

SharePlex supports DDL operations such as `ALTER TABLE ADD COLUMN` and `ALTER TABLE DROP COLUMN`. The schema evolves based on these operations. However, schema changes should not be replicated automatically to the target through the JDBC Sink Connector, because that could create incorrect datatypes. Therefore, the auto.create and auto.evolve settings in the JDBC Sink Connector should be disabled. Changes must be applied manually to the target table, and the connector should be restarted afterward.
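For example, the following settings in the JDBC Sink Connector configuration keep automatic schema creation and evolution disabled:

auto.create = false

auto.evolve = false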

Configuring Kafka properties

The Kafka property schema.registry.url (default value empty) must be set correctly using target x.kafka set kafka schema.registry.url = <url value>.
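For example, assuming a schema registry listening at http://localhost:8081 (the URL also used in the connector example later in this section), the property can be set in sp_ctrl as follows:

sp_ctrl> target x.kafka set kafka schema.registry.url=http://localhost:8081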

The AVRO property datetime_logical (default `yes`) determines whether datetime datatypes are mapped to the corresponding AVRO logical datatypes or to AVRO strings. If it is set to `no`, the respective date and time formats must be specified. These formats should be derived from the Oracle database NLS formats, such as NLS_DATE_FORMAT, NLS_TIMESTAMP_FORMAT, and NLS_TIMESTAMP_TZ_FORMAT.

The following keywords can be part of these formats and can be in uppercase or lowercase:

  • DD: Represents a two-digit date value (example: 25).

  • MM: Represents a two-digit month value (example: 12).

  • MON: Represents a three-letter month value (example: JAN).

  • YYYY: Represents a four-digit year value (example: 2024).

  • YY: Represents the last two digits of the year value (example: 24).

  • HH: Represents a two-digit hour value. If AM/PM is not present, it will be represented in a 24-hour format; otherwise, in a 12-hour format (example: 10).

  • MI: Represents a two-digit minute value.

  • SS: Represents a two-digit seconds value.

  • F: Represents a fractional value; the number of digits will depend on the source datatype precision (example: 123456).

  • AM/PM: Represents the meridian indicator (example: AM).

  • Z: Represents the time zone. It can be in the form of an offset value or a region name, depending on what is received from the source (example: +05:30 or ASIA/CALCUTTA).

Example of setting parameters for AVRO formatting

  • datetime_logical = no

  • date = dd-mon-yyyy

  • timestamp = dd-mon-yyyy hh:mi:ss.f am

  • timestamptz = dd-mon-yyyy hh:mi:ss.f am z

JDBC Sink connector configuration example

Once the Kafka topic and key-value AVRO schemas are created, configure the JDBC Sink Connector. Below is an example configuration file that can be uploaded to Confluent Kafka Control Center's Connect section:

File name: `connector_DEMO_BASIC_DT_config.properties`

Content:

name = DEMO_BASIC_DT

connector.class = io.confluent.connect.jdbc.JdbcSinkConnector

key.converter = io.confluent.connect.avro.AvroConverter

value.converter = io.confluent.connect.avro.AvroConverter

topics = <schema name>.DEMO_BASIC_DT

connection.url = jdbc:oracle:thin:@10.250.14.177:1521:p19c

connection.user = <user name>

connection.password = <password>

dialect.name = OracleDatabaseDialect

insert.mode = upsert

delete.enabled = true

table.name.format = ${topic}

pk.mode = record_key

pk.fields = COL_SMALLINT, COL_INT

quote.sql.identifiers = never

value.converter.schema.registry.url = http://localhost:8081

key.converter.schema.registry.url = http://localhost:8081

If column names are different on the Oracle source and the Oracle target, you need to add a rename transform for that particular column in Control Center.

Example: "transforms.col_char.renames": "COL_CHAR:col_char"

Limitations:
  • An UPDATE on a key column results in a DELETE operation followed by an INSERT. If the UPDATE does not include data for LOB columns, those columns will become empty.

  • The Confluent kafka-connect-jmespath predicate plugin, used for filtering fields, does not support the AVRO BYTES datatype. Therefore, binary LONGs/LOBs (such as LONG RAW and BLOB) mapped to AVRO BYTES cannot be filtered out. When such columns lack data, they are populated with the value _splex_default during updates, which must be filtered out using appropriate transform and predicate configurations.

  • Commit operations are not posted directly as commits. They are handled by the JDBC Sink Connector, which commits after a configurable limit is reached, according to its configuration.

  • TRUNCATE operations are not supported by the JDBC Sink Connector. The Poster will ignore any TRUNCATE transactions.

  • Datetime handling when the AVRO parameter datetime_logical is set to yes (datetime_logical = yes in the target command):

    • Timestamp precision is supported up to milliseconds.

    • Timestamps with timezones do not retain timezone information and must be mapped to TIMESTAMP.

    • DATE and TIMESTAMP can store values only after 01-Jan-1970. If earlier values are received, processing stops with an error.

  • If datetime_logical is set to `no` and the date format includes time, the JDBC Sink Connector will fail to insert a record into the Oracle target database when an Oracle DATE value includes a TIME component.

Configure Cloud storage targets for SharePlex

This section explains how to set up and tune SharePlex for the cloud storage targets, including prerequisites, supported storage providers, compression and encryption options, and the necessary configuration parameters.

Configure Parquet record format as target

Pre-requisite

To enable efficient processing of Parquet files by columnar query engines, the Update and Delete operations must include image data for all columns. Therefore, it is recommended to set the parameters SP_OCT_REDUCED_KEY and SP_OPX_REDUCED_KEY to 0. This ensures that SharePlex captures all data written to the Oracle redo log, excluding LONG and LOB data.

Additionally, Oracle supplemental logging for all columns must be enabled to ensure that complete column data is written to the redo log.
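As a sketch, the reduced-key parameters can be set in sp_ctrl and supplemental logging enabled in the source database as follows; adjust the statements for your environment:

sp_ctrl> set param SP_OCT_REDUCED_KEY 0

sp_ctrl> set param SP_OPX_REDUCED_KEY 0

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;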

Supported storage locations

Cloud storage

  • Azure Blob storage

  • AWS- S3

Supported compression

SNAPPY, UNCOMPRESSED, GZIP, BROTLI, ZSTD, LZ4, LZ4_HADOOP

Supported encryption

Algorithm AES_GCM

Note: The encryption key must be 16, 24, or 32 bytes long.

Supported Parquet column encoding

PLAIN, RLE, BIT_PACKED, DELTA_BINARY_PACKED, DELTA_LENGTH_BYTE_ARRAY, DELTA_BYTE_ARRAY, BYTE_STREAM_SPLIT

Note: Writing LOB data in Parquet format requires accumulating all the data for a particular LOB column in memory first, and then writing it all at once to the file. This causes a spike in memory consumption, especially for source tables containing large LOB data.

Parameters for Parquet format

Setting the format to Parquet:

To set the format record to Parquet, use the following command:

target x.file set format record=parquet

Running the above command enables the following parameters for the Parquet file:

Queue (default)

File Target Parameters:

location = opx

max_records = 50000

max_size = 50

max_time = 300

record_length = 132

Data Formatting Parameters:

date = yyyy-MM-ddTHH:mm:ss

decimal = .

enotation = 14

record = parquet

Parquet Formatting Parameters:

cloud_storage_provider = <cloud storage provider>

compression_type = SNAPPY

encoding_type = PLAIN

encryption = no

encryption_key = ******

json_metadata = yes

ora_date_precision = millisecond

tz_to_utc_convert = yes

Note: record_length is not used for Parquet files.
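For illustration, individual Parquet formatting parameters listed above can be changed with the same target command syntax that is used for cloud_storage_provider below; the values here are examples only:

target x.file set parquet compression_type=GZIP

target x.file set parquet encryption=yes

target x.file set parquet encryption_key=<32-byte key>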

Parameters for Azure Blob Storage:

To set Azure as the cloud storage provider, use the following command:

target x.file set parquet cloud_storage_provider=azure

Running this command enables the following section, in which you set the parameters required to establish a connection:

Azure Blob Storage Parameters:

azure_access_key = ******

azure_container_name = <Azure storage container name>

azure_storage_account = <Azure storage account name>

Note: The Azure access key is located under the Access keys section in Security + Networking in Azure Blob Storage.

Parameters for AWS S3 Bucket:

To set AWS as the cloud storage provider, use the following command:

target x.file set parquet cloud_storage_provider=aws

Running the above command enables the following section, in which you set the parameters required to establish a connection.

AWS S3 Bucket Parameters:

aws_s3_bucket_region = <AWS S3 bucket region>

aws_secret_key = ******

aws_access_key = ******

aws_s3_bucket_name = <AWS S3 bucket name>

 

Notes:

  • The AWS access key and secret key belong to an IAM user with access to the required S3 bucket.

  • SharePlex stores key values in an encrypted format using standard security algorithms.

Schema record template

required group field_id=-1 schema {

optional double field_id=-1 COL_INT;

optional double field_id=-1 COL_FLOAT;

optional binary field_id=-1 COL_CHAR (String);

optional binary field_id=-1 COL_VARCHAR (String);

optional int64 field_id=-1 COL_DATE (Timestamp(isAdjustedToUTC=false, timeUnit=milliseconds, is_from_converted_type=false, force_set_converted_type=false));

optional int64 field_id=-1 COL_TIMESTAMP (Timestamp(isAdjustedToUTC=false, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false));

optional binary field_id=-1 COL_CLOB (String);

optional binary field_id=-1 COL_NCLOB (String);

optional binary field_id=-1 COL_STRING (String);

optional binary field_id=-1 COL_BYTE_ARRAY_RAW;

optional binary field_id=-1 COL_BLOB;

optional binary field_id=-1 COL_LONG (String);

required binary field_id=-1 spx_op (String);

optional binary field_id=-1 spx_update_fields (String);

}

To set up replication from Oracle to a Parquet file, see the Set up Replication from Oracle to a Parquet file section in the SharePlex Installation and Setup Guide.

Configure HA failover with PostgreSQL 17 with logical replication

With the introduction of failover slots in PostgreSQL 17, logical replication can now be effectively utilized in high-availability environments. This feature ensures that logical replication continues seamlessly even if the publisher node fails and its physical standby is promoted as the new publisher.

When a primary server fails, the failover slot on the standby can take over, ensuring that logical replication continues without data loss. When the following parameters are set, the SharePlex Capture process proceeds seamlessly without manual intervention in the event of a failover, and the slot is managed by the cluster setup. To configure HA failover with SharePlex, set the SP_CAP_MAX_RETRY_COUNT_PG and SP_CAP_RETRY_INTERVAL_PG parameters.
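For example, the retry behavior used in the log example later in this section can be configured in sp_ctrl as follows:

sp_ctrl> set param SP_CAP_MAX_RETRY_COUNT_PG 20

sp_ctrl> set param SP_CAP_RETRY_INTERVAL_PG 5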

The following steps are useful for enabling failover in high-availability (HA) environments, both in AWS RDS and on-premises deployments:

  • Define the logical slot property: failover = true.

  • Add the streaming replication slot to the new parameter synchronized_standby_slots.

  • Set up a PostgreSQL cluster manager, such as Patroni, repmgr, or CrunchyData, to manage and orchestrate the PostgreSQL cluster. (Applicable only for on-premises environment)

Note: The HA failover capability is also compatible with AWS RDS PostgreSQL Multi-AZ DB deployments.
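As a minimal sketch for an on-premises PostgreSQL 17 publisher, the slot settings described above might look like the following. The slot names (splex_slot, standby1_slot) and the decoding plugin are placeholders, and in practice the logical slot used by SharePlex Capture is created as part of SharePlex configuration:

-- Create the logical replication slot with the PostgreSQL 17 failover option enabled
-- (fifth argument); the plugin name here is a placeholder.
SELECT pg_create_logical_replication_slot('splex_slot', 'pgoutput', false, false, true);

-- List the physical (streaming replication) slot that feeds the standby so that
-- logical changes are not sent ahead of what the standby has already received.
ALTER SYSTEM SET synchronized_standby_slots = 'standby1_slot';
SELECT pg_reload_conf();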

Auto-restart the PostgreSQL Capture

This feature ensures that the PostgreSQL capture process continues seamlessly in the event of disconnections from the database, by retrying based on parameter values without manual intervention.

SP_CAP_MAX_RETRY_COUNT_PG

This parameter controls the maximum number of times the PostgreSQL Capture process will attempt an automatic restart when it encounters a disconnection with the PostgreSQL database. The same parameter value will be used to retry the database recovery check during the CAPTURE automatic restart.

Default: 0

Range of valid values: any integer from 0 to 65535

Takes effect: Immediately

SP_CAP_RETRY_INTERVAL_PG

This parameter controls how often CAPTURE attempts an auto-restart.

NOTE: Set the retry count and interval values based on how long the PostgreSQL instance takes to become ready for use.

Default: 2

Range of valid values: any integer from 0 to 65535

Takes effect: Immediately

Capture and Event Log Example for Capture Auto-Restart

On the SharePlex source, when the SP_CAP_MAX_RETRY_COUNT_PG parameter is set to 20 and the SP_CAP_RETRY_INTERVAL_PG parameter is set to 5, users can observe the following log entries in the Capture and Event logs indicating the auto-restart of Capture.

Note: The values of the SP_CAP_MAX_RETRY_COUNT_PG and SP_CAP_RETRY_INTERVAL_PG parameters can be configured according to the specific needs of the customer's environment.
