
SharePlex Connector for Hadoop 1.0 - Installation and Setup Guide

Additional - To replicate tables in Hive over HBase

If you intend to replicate tables in Hive over HBase then complete the following additional setup steps.

Ensure that the ZooKeeper quorum and client port are configured in $HIVE_HOME/conf/hive-site.xml.

<property>

    <name>hbase.zookeeper.quorum</name>

    <value> ---- PLEASE SPECIFY ---- </value>

    <description>A comma-separated list (with no spaces) of the IP addresses of all ZooKeeper servers in the cluster.</description>

</property>

<property>

    <name>hbase.zookeeper.property.clientPort</name>

    <value> ---- PLEASE SPECIFY ---- </value>

    <description>The ZooKeeper client port. The default clientPort is 2181.</description>

</property>
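
For example, a completed pair of entries might look like the following sketch. The host addresses are placeholders only; substitute the addresses of your own ZooKeeper servers.

<property>

    <name>hbase.zookeeper.quorum</name>

    <value>192.168.1.10,192.168.1.11,192.168.1.12</value>

</property>

<property>

    <name>hbase.zookeeper.property.clientPort</name>

    <value>2181</value>

</property>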

Use --auxpath when starting Hive

hive --auxpath <HIVE_HOME>/lib/hive-hbase-handler-<version>.jar:<HBASE_HOME>/hbase.jar:<HBASE_HOME>/lib/zookeeper.jar:<HBASE_HOME>/lib/guava-<version>.jar
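
On a typical installation the resolved command might look like the following example. The paths and jar versions are illustrative only; use the jar files actually present under your $HIVE_HOME/lib and $HBASE_HOME/lib directories.

hive --auxpath /usr/lib/hive/lib/hive-hbase-handler-0.10.0.jar:/usr/lib/hbase/hbase.jar:/usr/lib/hbase/lib/zookeeper.jar:/usr/lib/hbase/lib/guava-11.0.2.jar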

When using CDH4.2 with Kerberos authentication

Export the HIVE_OPTS environment variable with Kerberos parameters as shown below to replicate Hive over HBase.

export HIVE_OPTS="-hiveconf hbase.security.authentication=kerberos -hiveconf hbase.rpc.engine=org.apache.hadoop.hbase.ipc.SecureRpcEngine -hiveconf hbase.master.kerberos.principal=hbase/_HOST@KerberosRealm -hiveconf hbase.regionserver.kerberos.principal=hbase/_HOST@KerberosRealm -hiveconf hbase.zookeeper.quorum=zookeeperQuorum"

Tip: The Kerberos utility kinit allows you to identify yourself to the Kerberos server. Invoke kinit once when you start a new session.
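
For example, assuming a principal named user@EXAMPLE.COM (a placeholder for your own principal), you can obtain a ticket interactively or from a keytab file:

kinit user@EXAMPLE.COM

kinit -kt /path/to/user.keytab user@EXAMPLE.COM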

SharePlex Connector for Hadoop

Unpack the Archive

SharePlex Connector for Hadoop is distributed in the archive shareplex-hadoop-connector-version-date-clouderaVersion.tar.gz, where version identifies the SharePlex Connector for Hadoop release.

Extract the archive whose clouderaVersion matches your Hadoop installation, on a machine where the Hadoop libraries and configuration are present. Use the command:

$ tar -xzf shareplex-hadoop-connector-version-date-clouderaVersion.tar.gz

The archive contains the following files:

File

Description

install.sh

A shell script that installs SharePlex Connector for Hadoop and other programs in the archive.

shareplex-hadoop-connector-version-date.tar

SharePlex Connector for Hadoop archive.

db-derby-10.9.1.0-bin.tar.gz

Apache Derby installable.

SharePlex Connector for Hadoop uses the Apache Derby network server and creates a database for storing metadata and status information.

sqoop-quest-1.4.3.bin__hadoop-version.tar.gz

Apache Sqoop installable.

Sqoop is a tool designed to transfer bulk data between Apache Hadoop and structured data stores.

quest-oraoop-1.6.0-date-version.tar.gz

Quest Data Connector for Oracle and Hadoop archive.

Quest Data Connector for Oracle and Hadoop is an optional plugin to Sqoop. It facilitates the movement of data between Oracle and Hadoop.

Run install.sh

This shell script installs programs in the SharePlex Connector for Hadoop archive.

Shell Script Usage

[user@host bin]$ ./install.sh -d <INSTALL_DIR> [-h <HADOOP_HOME_DIR>] [-c <HADOOP_CONF_DIR>] [-b <HBASE_HOME_DIR>] [-v <HIVE_HOME_DIR>] [-p <DERBY_PORT_NUMBER>] [--help] [--version]

Options

Parameter

Description

-d <INSTALL_DIR>

SharePlex Connector for Hadoop will be installed in the shareplex_hadoop_connector directory inside this directory.

-h <HADOOP_HOME_DIR>

The path to the Hadoop home directory.

NOTE: This option overrides HADOOP_HOME in the environment. If this option is not set and the HADOOP_HOME environment variable is also not set, this parameter is set to /usr/lib/hadoop as default.

-c <HADOOP_CONF_DIR>

The path to the Hadoop conf directory.

NOTE: This option overrides HADOOP_CONF_DIR in the environment. If this option is not set and the HADOOP_CONF_DIR environment variable is also not set, this parameter defaults to:

  • HADOOP_HOME/conf (for CDH3)
  • HADOOP_HOME/etc/hadoop (for CDH4)

-b <HBASE_HOME_DIR>

The path to HBase home directory.

NOTE: This option overrides HBASE_HOME in the environment. If this option is not set and the HBASE_HOME environment variable is also not set, this parameter is set relative to HADOOP_HOME.

-v <HIVE_HOME_DIR>

The path to Hive home directory.

NOTE: This option overrides HIVE_HOME in the environment.

-p <DERBY_PORT_NUMBER>

The port number for the Apache Derby connection.

NOTE: If not specified, this parameter is set to 1527.

--help

Show this help and exit.

--version

Show version information and exit.
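
For example, the following command installs SharePlex Connector for Hadoop under /opt. All paths shown are placeholders; adjust them to match your cluster.

[user@host bin]$ ./install.sh -d /opt -h /usr/lib/hadoop -c /etc/hadoop/conf -b /usr/lib/hbase -v /usr/lib/hive -p 1527

This creates the /opt/shareplex_hadoop_connector directory and, when the installation completes, starts the Apache Derby network server on port 1527.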

While install.sh is executing

Please accept the Apache license to install Quest Data Connector for Oracle and Hadoop.

When install.sh has finished executing

install.sh starts the Apache Derby network server.
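
To confirm that the Derby network server is listening, you can ping it with Derby's NetworkServerControl utility from the extracted Derby directory. The command below assumes the default port of 1527; substitute your installation path and port.

<INSTALL_DIR>/shareplex_hadoop_connector/db-derby-<version>-bin/bin/NetworkServerControl ping -p 1527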

About the new shareplex_hadoop_connector directory

Files and Directories

Description

bin

SharePlex Connector for Hadoop shell scripts as documented in this guide.

In addition, shareplex_hadoop_env.sh is used by SharePlex Connector for Hadoop. You can set the environment variables by executing source shareplex_hadoop_env.sh (see the example after this table).

conf

SharePlex Connector for Hadoop configuration files.

db-derby-version-bin

Apache Derby application.

lib

SharePlex Connector for Hadoop required dependencies.

logs

SharePlex Connector for Hadoop log files.

shareplex_hadoop_connector.jar

The Java archive file containing SharePlex Connector for Hadoop application code.

oraoop-version

Quest Data Connector for Oracle and Hadoop application.

sqoop-quest-version.bin

Apache Sqoop application.
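
Before running the connector shell scripts, you can set up the required environment in your current shell by sourcing the environment script. For example (the installation path is a placeholder):

cd <INSTALL_DIR>/shareplex_hadoop_connector

source bin/shareplex_hadoop_env.sh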