SharePlex Connector for Hadoop 8.5.6 - Installation Guide

Set the Number of Operations in the Transaction

Set the total number of operations in the transaction

Execute the following SharePlex for Oracle command. SharePlex Connector for Hadoop uses the information from this command to ensure the integrity of the data when checking for duplicate and missing messages.

sp_ctrl ()> target x.jms set metadata size

TIP: You can confirm the setting by running the target x.jms show command, which shows size as a metadata parameter set on the queue. Refer to the SharePlex for Oracle documentation for more information.

Tune JMS

Tune JMS to improve performance of SharePlex Connector for Hadoop

JMS (ActiveMQ) parameters can be tuned to improve the overall performance of SharePlex Connector for Hadoop. Results vary with any further tuning applied and with the resources actually available.

The following parameters in “ACTIVEMQ_HOME/conf/activemq.xml” can be tuned for better performance:

  1. Increase “queue memory limit” and turn off “producer flow control”.

    Example: <policyEntry queue=">" producerFlowControl="false" memoryLimit="512mb">

    Note: The value of the “queue memory limit” parameter should be less than the “memory usage limit” parameter.

  2. Increase “memory usage limit” of the broker.

    Example: <memoryUsage limit="1 gb"/>

    Note: Take the available system memory into account when setting these parameters. The “memory usage limit” parameter should not exceed the maximum memory allowed for the JMS (ActiveMQ) process. You can change that maximum by altering the “-Xmx” parameter in the “activemq-admin” or “activemq” script under “ACTIVEMQ_HOME/bin” that starts ActiveMQ (look for ACTIVEMQ_OPTS or ACTIVEMQ_OPTS_MEMORY).

  3. Use “File based Cursor”. The File based Cursor can page messages to temporary files on disk when memory in the broker reaches its limit.

    Example: <pendingQueuePolicy> <fileQueueCursor/> </pendingQueuePolicy>

  4. If the default KahaDB is used as the persistent store, setting enableJournalDiskSyncs to “false” can reduce the time taken to send messages from SharePlex Poster to the ActiveMQ JMS queue.

    Example: <persistenceAdapter> <kahaDB directory="${activemq.data}/kahadb" enableJournalDiskSyncs="false"/> </persistenceAdapter>
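
The two notes in the list above imply an ordering: the queue memory limit should be less than the broker memory usage limit, which in turn should not exceed the JVM heap set through “-Xmx”. The following is a minimal sketch of a pre-edit sanity check for that ordering. The helper names to_mb and check_limits are hypothetical, only “mb”/“gb” style suffixes are handled, and the 2g heap value is an assumption for illustration, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: convert a size such as "512mb", "1 gb" or "2G" to megabytes.
to_mb() {
    # Strip whitespace and lower-case so "1 gb" and "1GB" parse alike.
    s=$(printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
    num=${s%%[!0-9]*}     # leading digits
    unit=${s#"$num"}      # remainder, e.g. "mb" or "g"
    if [ -z "$num" ]; then
        echo "unsupported size: $1" >&2
        return 1
    fi
    case "$unit" in
        g|gb) echo $((num * 1024)) ;;
        m|mb) echo "$num" ;;
        *)    echo "unsupported size: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: succeed only if queue limit < broker limit <= JVM heap.
check_limits() {
    queue_mb=$(to_mb "$1") || return 1
    broker_mb=$(to_mb "$2") || return 1
    heap_mb=$(to_mb "$3") || return 1
    if [ "$queue_mb" -ge "$broker_mb" ]; then
        echo "queue memoryLimit ($1) must be less than memoryUsage limit ($2)"
        return 1
    fi
    if [ "$broker_mb" -gt "$heap_mb" ]; then
        echo "memoryUsage limit ($2) must not exceed the JVM heap ($3)"
        return 1
    fi
    echo "ok"
}

# Values from the examples above, plus an assumed -Xmx of 2g.
check_limits 512mb "1 gb" 2g
```

The check is deliberately independent of activemq.xml itself: it takes the three values as arguments, so the same call can be repeated with candidate values before the configuration file is changed.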

Replicate Tables in Hive

Replicate tables in Hive over HBase

If you intend to replicate tables in Hive over HBase, complete the following additional setup steps.

IMPORTANT NOTE: When using the connector on IBM BI Hadoop and CDH 5 distributions, the jar files specified in this scenario must be appended to HIVE_AUX_JARS_PATH in the HIVE_CONF_DIR/hive-env.sh file.

Ensure the ZooKeeper quorum and client port are configured in $HIVE_HOME/conf/hive-site.xml.

<property>

    <name>hbase.zookeeper.quorum</name>

    <value> ---- PLEASE SPECIFY ---- </value>

    <description>A comma separated list (with no spaces) of the IP addresses of all ZooKeeper servers in the cluster.</description>

</property>

<property>

    <name>hbase.zookeeper.property.clientPort</name>

    <value> ---- PLEASE SPECIFY ---- </value>

    <description>The Zookeeper client port. Default clientPort is 2181.</description>

</property>
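
Before starting Hive, it can help to confirm that both property names are present in the file. The following is a minimal sketch; check_hive_site is a hypothetical helper name. It only looks for the literal <name> elements, so it does not verify that the values have been filled in, and it would miss names written with extra whitespace inside the tags.

```shell
#!/bin/sh
# Hypothetical helper: report whether each required property name
# appears in the given hive-site.xml file.
check_hive_site() {
    site_file=$1
    missing=0
    for prop in hbase.zookeeper.quorum hbase.zookeeper.property.clientPort; do
        # -F treats the pattern as a fixed string, so the dots are literal.
        if grep -qF "<name>${prop}</name>" "$site_file"; then
            echo "found: $prop"
        else
            echo "missing: $prop"
            missing=1
        fi
    done
    # Exit status 0 only when both properties were found.
    return "$missing"
}

# Example call (path taken from the step above):
#   check_hive_site "$HIVE_HOME/conf/hive-site.xml"
```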

Pass --auxpath when starting Hive:

hive --auxpath <HIVE_HOME>/lib/hive-hbase-handler-<version>.jar,<HBASE_HOME>/hbase.jar,<HBASE_HOME>/lib/zookeeper.jar,<HBASE_HOME>/lib/guava-<version>.jar
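
Because the jar list is long and must stay comma separated without spaces, it may be easier to assemble it from variables. The following is a minimal sketch; join_with_commas and build_auxpath are hypothetical helper names, and the two version arguments stand in for the <version> placeholders above, which depend on your installation.

```shell
#!/bin/sh
# Hypothetical helper: join its arguments with commas and no spaces,
# the same form used in the --auxpath command above.
join_with_commas() {
    result=""
    for jar in "$@"; do
        if [ -z "$result" ]; then
            result=$jar
        else
            result="$result,$jar"
        fi
    done
    printf '%s\n' "$result"
}

# Hypothetical helper: build the jar list from the Hive and HBase home
# directories and the two installation-specific version strings.
build_auxpath() {
    hive_home=$1
    hbase_home=$2
    handler_version=$3
    guava_version=$4
    join_with_commas \
        "$hive_home/lib/hive-hbase-handler-$handler_version.jar" \
        "$hbase_home/hbase.jar" \
        "$hbase_home/lib/zookeeper.jar" \
        "$hbase_home/lib/guava-$guava_version.jar"
}

# Example call (HANDLER_VERSION and GUAVA_VERSION are assumed variables):
#   hive --auxpath "$(build_auxpath "$HIVE_HOME" "$HBASE_HOME" "$HANDLER_VERSION" "$GUAVA_VERSION")"
```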

When using CDH 4 or CDH 5 with Kerberos authentication

Export the HIVE_OPTS environment variable with Kerberos parameters as shown below to replicate Hive over HBase.

export HIVE_OPTS="-hiveconf hbase.security.authentication=kerberos -hiveconf hbase.rpc.engine=org.apache.hadoop.hbase.ipc.SecureRpcEngine -hiveconf hbase.master.kerberos.principal=hbase/_HOST@KerberosRealm -hiveconf hbase.regionserver.kerberos.principal=hbase/_HOST@KerberosRealm -hiveconf hbase.zookeeper.quorum=zookeeperQuorum"
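
The export above uses the same realm twice and is easy to mistype on one line. The following is a minimal sketch that produces the same string from two inputs; build_hive_opts is a hypothetical helper name, and the realm and quorum arguments are values you supply for your own cluster.

```shell
#!/bin/sh
# Hypothetical helper: assemble the HIVE_OPTS value shown above from a
# Kerberos realm and a ZooKeeper quorum string.
build_hive_opts() {
    realm=$1
    quorum=$2
    # printf repeats the '%s' format for each argument, concatenating them.
    printf '%s' \
        "-hiveconf hbase.security.authentication=kerberos" \
        " -hiveconf hbase.rpc.engine=org.apache.hadoop.hbase.ipc.SecureRpcEngine" \
        " -hiveconf hbase.master.kerberos.principal=hbase/_HOST@$realm" \
        " -hiveconf hbase.regionserver.kerberos.principal=hbase/_HOST@$realm" \
        " -hiveconf hbase.zookeeper.quorum=$quorum"
}

# Example call (KERBEROS_REALM and ZK_QUORUM are assumed variables):
#   HIVE_OPTS=$(build_hive_opts "$KERBEROS_REALM" "$ZK_QUORUM")
#   export HIVE_OPTS
```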

TIP: The Kerberos utility kinit allows you to identify yourself to the Kerberos server. Run “kinit” once at the start of each new session.
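
One way to avoid running kinit more often than needed is to test the credential cache first. The following is a minimal sketch, assuming the MIT Kerberos client tools, where “klist -s” runs silently and exits with a non-zero status if the cache cannot be read or the ticket has expired. The helper name ensure_ticket and the principal argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: request a ticket only when no valid one is cached.
ensure_ticket() {
    principal=$1
    # "klist -s" prints nothing; its exit status reports cache validity.
    if klist -s; then
        echo "valid ticket already cached"
    else
        kinit "$principal"
    fi
}

# Example call (the principal is a placeholder for your own):
#   ensure_ticket "user@YOUR.REALM"
```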
