Pre-requisite: When replicating data to the Kafka (Apache and Confluent) target, target SharePlex should be installed only on the Linux platform.
The SharePlex Post process can connect and write to a Kafka broker. The data can be written in JSON or XML output as a sequential series of operations as they occurred on the source, which can then be consumed by a Kafka consumer.
These instructions contain setup instructions that are specific to this target. Install SharePlex on the source and target according to the appropriate directions in this manual before performing these setup steps.
For the versions, data types and operations that are supported when using SharePlex to replicate to this target, see the SharePlex Release Notes.
You need to setup SharePlex and the database on the Oracle source system. For detailed setup steps, see Configure SharePlex on the source.
These instructions configure the SharePlex Post process to connect to Kafka. You must have a running Kafka broker.
To configure post to Kafka:
Issue the target command to configure posting to a Kafka broker and topic. The following are example commands.
sp_ctrl> target x.kafkaset kafka broker=host1:9092,host2:9092,host3:9092
sp_ctrl> target x.kafka set kafka topic=shareplex
See View and change Kafka settings for command explanations and options.
Note: Specify more than one broker so that SharePlex will attempt to connect to the other brokers in the list if any one of them is down.
SharePlex can output to either XML or JSON format as input to Kafka. XML is the default. To set the input format and specify format options, use one of the following target commands:
target x.kafka set format record=json
or:
target x.kafka set format record=xml
To view samples of these formats, see the format category of the target command documentation in the SharePlex Reference Guide.
Note: When replicating data from Oracle to Kafka in JSON format, SharePlex does not support the varray data type or the varray type inside the SDO_GEOMETRY data type.
To view current property settings for output to Kafka, use the following target command:
target x.kafka show
To change a property setting, use the following command.
target x.kafka [queue queuename] set kafka property=value
where:
Table 3: Kafka target properties
Property | Input Value | Default | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
broker=broker |
Required. The host and port number of the Kafka broker, or a comma delimited list of multiple brokers. This list is the bootstrap into the Kafka cluster. So long as Post can connect to one of these brokers, it will discover any other brokers in the cluster. |
localhost:9092 | ||||||||
client_id=ID |
Optional. A user-defined string that Post will send in each request to help trace calls. |
None | ||||||||
compression.code={none, gzip, snappy} | Optional. Controls whether data is compressed in Kafka. Options are none, gzip or snappy. | None | ||||||||
partition={number | rotate | rotate trans| messagekey} |
Required. One of the following:
|
0 | ||||||||
request.required.acks=value |
Optional. This is a Kafka client parameter. By default it is set to a value of -1, which means all. Consult the Kafka documentation about this subject, because all really means all in-sync replicas. This parameter can be used in conjunction with the min.insync.replicas broker parameter to tune behavior between availability and data consistency. Important: It is possible for data to be lost between a Kafka producer (SharePlex in this case) and a Kafka cluster, depending on these settings. |
-1 | ||||||||
topic=topic_name |
Required. The name of the target Kafka topic. This string may contain the special sequences %o or %t. The %o sequence is replaced by the owner name of the table that is being replicated. The %t sequence is replaced by the table name of the table that is being replicated. This feature may be used in conjunction with a Kafka server setting of auto.create.topics.enabled set to 'true'. Also view your server settings for default.replication.factor and num.partitions because these are used as defaults when topics are auto created.
|
shareplex |
* To avoid latency, if Post detects no more incoming messages, it sends the packet to Kafka immediately without waiting for the threshold to be satisfied.
If the Kafka process aborts suddenly, or if the machine that it is running on aborts, row changes may be written twice to the Kafka topic. The consumer must manage this by detecting and discarding duplicates.
Every record of every row-change operation in a transaction has the same transaction ID and is also marked with a sequence ID. These attributes are id and msgIdx, respectively, under the txn element in the XML output (see Set up replication from Oracle to Kafka).
The transaction ID is the SCN at the time the transaction was committed, and the sequence ID is the index of the row change in the transaction. These two values are guaranteed to be the same if they are re-written to the Kafka topic in a recovery situation.
If desired, you can configure Post to include additional metadata with every row-change record by using the following command:
target x.kafka [queue queuename] set metadata property[, property]
Table 4: Optional metadata properties
Property | Description |
---|---|
time | The time the operation was applied on the source. |
userid | The ID of the database user that performed the operation. |
trans | The ID of the transaction that included the operation. |
size | The number of operations in the transaction. |
Example:
target x.kafka set metadata time, userid, trans, size
To reset the metadata:
target x.kafka [queue queuename] reset metadata
To view the metadata:
target x.kafka [queue queuename] show metadata
SharePlex can post replicated Oracle data to a file formatted as SQL or XML. This data is written as a sequential series of operations as they occurred on the source, which can then be posted in sequential order to a target database or consumed by an external process or program.
These instructions contain setup instructions that are specific to this target. Install SharePlex on the source and target according to the appropriate directions in this manual before performing these setup steps.
For the versions, data types and operations that are supported when using SharePlex to replicate to this target, see the SharePlex Release Notes.
On the source, create a SharePlex configuration file that specifies capture and routing information. The structure that is required in a configuration file varies, depending on your replication strategy, but this shows you the required syntax for routing data to a SQL or XML file.
Datasource:o.SID | ||
src_owner.table | !file[:tgt_owner.table] | host |
where:
src_owner.table is the owner and name of the source table.
Note: For more information, see Configure SharePlex to Replicate Data in the SharePlex Administration Guide.
The following example replicates the parts table in schema PROD from Oracle instance ora112 to a file on target system sysprod.
Datasource:o.ora112
PROD.parts !file sysprod
By default, SharePlex formats data to a file in XML format, and there is no target setup required unless you want to change properties of the output file (see Set up Replication from Oracle to a SQL or XML File.) To output in SQL format, use the target command to specify the SQL output as follows.
To output data in SQL format:
Issue the following required target commands to output the records in SQL.
Note: Use all lower-case characters.
target x.file [queue queuename] set format record=sql
target x.file [queuequeuename] set sql legacy=yes
where: queue queuename constrains the action of the command to the SharePlex Post process that is associated with the specified queue.
See Set up Replication from Oracle to a SQL or XML File for descriptions of these settings and other optional properties that you can set.
To view samples of the SQL and XML formats, see the target command documentation in the SharePlex Reference Guide.
To view current property settings for output to a file, use the following command:
target x.file show
To change a setting, use the following target command.
target x.file [queue queuename] set [category] property=value
For more information, see the target command in the Target.
Post writes to a series of files. The active working file is prepended with the label of current_ and is stored in the opx/current subdirectory of the variable-data directory.
Output Format | Name of Current File |
---|---|
SQL |
current_legacy.sql |
XML |
current_prodsys.XML |
Important: Do not open or edit the current_ file.
Post uses the max_records, max_size and max_time parameters to determine the point at which to start a new active file. When this switch occurs, Post moves the processed data to a sequenced file in the opx subdirectory of the variable-data directory. The file names include the name of the post queue, the time and date, and an incrementing ID.
SQL files:
/installed/vardir> ls -1 opx
0000000000_20140305130858_legacy.sql
0000000001_20140305131130_legacy.sql
0000000002_20140305131212_legacy.sql
0000000003_20140305133835_legacy.sql
0000000004_20140305134028_legacy.sql
XML files:
/installed/vardir> ls -1 opx
0000000000_20140305130858_prodsys.XML
0000000001_20140305131130_prodsys.XML
0000000002_20140305131212_prodsys.XML
0000000003_20140305133835_prodsys.XML
0000000004_20140305134028_prodsys.XML
To force a file switch:
The current file cannot be viewed or consumed without stopping Post. To access the data in the current file, you can use the target command with the switch option to move the data to a sequenced file, from which it can then be consumed or viewed. After issuing this command, the switch occurs after Post processes a new record.
target x.file [queue queuename] switch
SharePlex supports databases installed as services of Amazon Web Services (AWS) and Microsoft Azure. To view the cloud databases that SharePlex supports, refer to the Supported Cloud Platforms section in the SharePlex Release Notes document for the respective databases.
There are some differences in the way that SharePlex installs in an IaaS cloud environment, a PaaS cloud environment, and a SaaS cloud environment. These differences are only in the installation and configuration of SharePlex. Once installed and configured, SharePlex operates in the cloud the same way that it operates in on-premise installations.
If your cloud database service is a true IaaS virtual computing environment, you can install and run a custom application environment, access the operating system, and manage access permissions and storage. In this environment, SharePlex is installed directly on the cloud server just as you would install it locally, without any special setup requirements.
In this environment, the following applies:
SharePlex can capture from an Oracle source database in an IaaS cloud.
SharePlex can Post to any supported target database in an IaaS cloud.
You can proceed to the standard installation instructions in this manual.
If your cloud database is installed in a true PaaS environment, you do not have access to the underlying operating system, and you must install SharePlex on a server that is external to the cloud deployment. You then configure SharePlex to interact with the target database through a remote connection.
SharePlex can capture data from supported sources and post it to databases in PaaS environment using remote capture and remote post capabilities.
You can install SharePlex for a PaaS source and target in one of the following ways:
With remote post, you can use your on-premise production source server to run all of the SharePlex replication components. In this setup, both source and target replication processes (and their queues) are installed on one server. The SharePlex Post process connects through a remote connection to the target cloud database.
For more information, see Post to PaaS cloud from the source system.
NOTE: In a high-volume transactional environment, the buildup of data in the post queues and the presence of multiple Post processes may generate unacceptable overhead for a production system. In that case, you should use an intermediary server.
You can use an on-premise intermediary server to run the Import and Post components (and the post queues). Post connects to the cloud target through a remote connection. This method removes most of the replication overhead from the source server. For more information, see Post to PaaS cloud from an intermediary server.
If your cloud database is hosted in a true SaaS environment, you do not have access to the underlying operating system, and you must install SharePlex on a server that is external to the cloud deployment. You then configure SharePlex to interact with the source and target databases through a remote connection.
You can install SharePlex for a SaaS target in one of the following ways:
With remote capture, you can utilize an on-premise or cloud VM intermediary server to install and configure SharePlex for running the Capture and Export processes. Capture establishes a remote connection to the source SaaS database, while Export communicates with Import and Post on the target system.
With remote Capture and remote Post, you can employ an on-premise or cloud VM intermediary server to install and configure SharePlex. Both Capture and Post processes will run on the same server. Capture establishes a remote connection to the source SaaS database, and Post establishes a remote connection to the target SaaS database.
Install SharePlex on Linux/Unix for PostgreSQL Database as a Service.
© 2024 Quest Software Inc. ALL RIGHTS RESERVED. Feedback Conditions d’utilisation Confidentialité Cookie Preference Center