If you intend to replicate tables in Hive over HBase, complete the following additional setup steps.
Ensure that the ZooKeeper quorum and client port are configured in $HIVE_HOME/conf/hive-site.xml:
<property>
<name>hbase.zookeeper.quorum</name>
<value> ---- PLEASE SPECIFY ---- </value>
<description>A comma separated list (with no spaces) of the IP addresses of all ZooKeeper servers in the cluster.</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value> ---- PLEASE SPECIFY ---- </value>
<description>The Zookeeper client port. Default clientPort is 2181.</description>
</property>
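A quick way to confirm that both properties are present is to search the file for their names. The sketch below assumes a POSIX shell with grep; the function name check_hive_site is hypothetical, and the check only looks for the property names, not for valid values.

```shell
#!/bin/sh
# Hypothetical helper: report whether the two ZooKeeper properties
# appear in a given hive-site.xml. It only checks for the <name>
# elements; it does not validate the <value> contents.
check_hive_site() {
    site_file="$1"
    missing=0
    for prop in hbase.zookeeper.quorum hbase.zookeeper.property.clientPort; do
        # -F treats the pattern as a fixed string, so dots match literally.
        if grep -qF "<name>${prop}</name>" "$site_file"; then
            echo "found: ${prop}"
        else
            echo "missing: ${prop}"
            missing=1
        fi
    done
    # Exit status 0 when both properties were found, 1 otherwise.
    return "$missing"
}

# Example usage (assumes HIVE_HOME is set in the environment):
# check_hive_site "$HIVE_HOME/conf/hive-site.xml"
```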
Use the --auxpath option when starting Hive:
hive --auxpath <HIVE_HOME>/lib/hive-hbase-handler-<version>.jar:<HBASE_HOME>/hbase.jar:<HBASE_HOME>/lib/zookeeper.jar:<HBASE_HOME>/lib/guava-<version>.jar
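Because two of the jar names carry a version number, it can be convenient to assemble the --auxpath value with a small script. The following is a minimal sketch, assuming a POSIX shell and the directory layout shown in the command above; the helper names first_match and build_auxpath are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first existing file among the arguments.
first_match() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: assemble the colon-separated --auxpath value from
# the four jars named above. Versioned jars are located with a glob; if
# several versions match, the first match is used.
build_auxpath() {
    hive_home="$1"
    hbase_home="$2"
    handler=$(first_match "$hive_home"/lib/hive-hbase-handler-*.jar) || return 1
    guava=$(first_match "$hbase_home"/lib/guava-*.jar) || return 1
    echo "${handler}:${hbase_home}/hbase.jar:${hbase_home}/lib/zookeeper.jar:${guava}"
}

# Example usage (assumes HIVE_HOME and HBASE_HOME are set):
# hive --auxpath "$(build_auxpath "$HIVE_HOME" "$HBASE_HOME")"
```

The function fails with a non-zero status when a versioned jar cannot be found, so a missing handler or guava jar is noticed before Hive starts.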
When using CDH4.2 with Kerberos authentication
Export the HIVE_OPTS environment variable with Kerberos parameters as shown below to replicate Hive over HBase.
export HIVE_OPTS="-hiveconf hbase.security.authentication=kerberos -hiveconf hbase.rpc.engine=org.apache.hadoop.hbase.ipc.SecureRpcEngine -hiveconf hbase.master.kerberos.principal=hbase/_HOST@KerberosRealm -hiveconf hbase.regionserver.kerberos.principal=hbase/_HOST@KerberosRealm -hiveconf hbase.zookeeper.quorum=zookeeperQuorum"
Tip: The Kerberos utility kinit identifies you to the Kerberos server. Invoke kinit once at the start of each new session.
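The tip above can be sketched as a small shell helper that runs kinit only when the session has no valid ticket. This is a sketch under assumptions: it relies on the klist -s option (silent mode, exit status 0 when a valid credentials cache exists) as provided by MIT Kerberos, and the function name ensure_ticket and the principal shown are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: obtain a Kerberos ticket only when the current
# session does not already hold a valid one.
ensure_ticket() {
    principal="$1"
    # klist -s prints nothing and exits 0 if a valid ticket cache exists.
    if klist -s; then
        echo "ticket already present"
    else
        # kinit prompts for the password of the given principal.
        kinit "$principal"
    fi
}

# Example usage (the principal name is a placeholder):
# ensure_ticket "user@EXAMPLE.COM"
```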
SharePlex Connector for Hadoop is distributed in the archive: shareplex-hadoop-connector-version-date-clouderaVersion.tar.gz where version identifies the SharePlex Connector for Hadoop release.
Extract the archive with clouderaVersion appropriate to your Hadoop installation on a machine where Hadoop libraries and configurations are present. Use the command:
$ tar -xzf shareplex-hadoop-connector-version-date-clouderaVersion.tar.gz
The archive contains the following files:
File | Description |
---|---|
install.sh | A shell script that installs SharePlex Connector for Hadoop and other programs in the archive. |
shareplex-hadoop-connector-version-date.tar | SharePlex Connector for Hadoop archive. |
db-derby-10.9.1.0-bin.tar.gz | Apache Derby installable. SharePlex Connector for Hadoop uses the Apache Derby network server and creates a database for storing metadata and status information. |
sqoop-quest-1.4.3.bin__hadoop-version.tar.gz | Apache Sqoop installable. Sqoop is a tool designed to transfer bulk data between Apache Hadoop and structured data stores. |
quest-oraoop-1.6.0-date-version.tar.gz | Quest Data Connector for Oracle and Hadoop archive. Quest Data Connector for Oracle and Hadoop is an optional plugin to Sqoop. It facilitates the movement of data between Oracle and Hadoop. |
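Before extracting, the archive listing can be checked for an expected file such as install.sh. The sketch below assumes a POSIX shell with tar and grep; the function name archive_has_file is hypothetical, and because the internal directory layout of the archive is not stated here, the match is made on the file name at the end of each listed path.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a gzip-compressed tar archive lists a
# file with the given name, either at the top level or in a subdirectory.
archive_has_file() {
    archive="$1"
    name="$2"
    # tar -tzf lists the archive contents without extracting them.
    tar -tzf "$archive" | grep -q -e "^${name}\$" -e "/${name}\$"
}

# Example usage (substitute the actual archive file name):
# archive_has_file shareplex-hadoop-connector-version-date-clouderaVersion.tar.gz install.sh \
#     && echo "install.sh is present"
```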
The install.sh shell script installs the programs in the SharePlex Connector for Hadoop archive.
Shell Script Usage
[user@host bin]$ ./install.sh -d <INSTALL_DIR> [-h <HADOOP_HOME_DIR>] [-c <HADOOP_CONF_DIR>] [-b <HBASE_HOME_DIR>] [-v <HIVE_HOME_DIR>] [-p <DERBY_PORT_NUMBER>] [--help] [--version]
Options
Parameter | Description |
---|---|
-d <INSTALL_DIR> | SharePlex Connector for Hadoop will be installed in the shareplex_hadoop_connector directory inside this directory. |
-h <HADOOP_HOME_DIR> | The path to the Hadoop home directory. NOTE: This option overrides HADOOP_HOME in the environment. If this option is not set and the HADOOP_HOME environment variable is also not set, this parameter is set to /usr/lib/hadoop as default. |
-c <HADOOP_CONF_DIR> | The path to the Hadoop conf directory. NOTE: This option overrides HADOOP_CONF_DIR in the environment. If this option is not set and the HADOOP_CONF_DIR environment variable is also not set, this parameter is set to |
-b <HBASE_HOME_DIR> | The path to the HBase home directory. NOTE: This option overrides HBASE_HOME in the environment. If this option is not set and the HBASE_HOME environment variable is also not set, this parameter is set relative to HADOOP_HOME. |
-v <HIVE_HOME_DIR> | The path to the Hive home directory. NOTE: This option overrides HIVE_HOME in the environment. |
-p <DERBY_PORT_NUMBER> | The port number for the Apache Derby connection. NOTE: If not specified, this parameter is set to 1527. |
--help | Show this help and exit. |
--version | Show version information and exit. |
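Several options above follow the same precedence: the command-line option wins, then the environment variable, then a built-in default. The following sketch illustrates that order only; it is not taken from install.sh, and the function name resolve_setting is hypothetical.

```shell
#!/bin/sh
# Illustration only: resolve a setting from (1) a command-line value,
# (2) an environment value, (3) a built-in default, in that order.
resolve_setting() {
    option_value="$1"
    env_value="$2"
    default_value="$3"
    if [ -n "$option_value" ]; then
        echo "$option_value"
    elif [ -n "$env_value" ]; then
        echo "$env_value"
    else
        echo "$default_value"
    fi
}

# Example: -h not given, so HADOOP_HOME is used if set,
# otherwise the documented default /usr/lib/hadoop applies.
# resolve_setting "" "${HADOOP_HOME:-}" "/usr/lib/hadoop"
```

For example, running install.sh with -h /opt/hadoop while HADOOP_HOME points elsewhere corresponds to the first branch, so /opt/hadoop is used.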
While install.sh is executing
Accept the Apache license to install Quest Data Connector for Oracle and Hadoop.
When install.sh has finished executing
install.sh starts the Apache Derby network server.