Installation prerequisites
A Hadoop cluster that is already installed and configured (either stand-alone or fully distributed will do).
Software download
Download Hive from the official website:
https://hive.apache.org/index.html
Hive installation
Configure environment variables
Upload the downloaded Hive package to the machine and extract it to the desired path.
Edit /etc/profile to configure the Hive environment variables:
export HIVE_HOME=/.../apache-hive-2.1.0-bin
export PATH=$PATH:$HIVE_HOME/bin
Make the configuration take effect: source /etc/profile
Modify hive-env.sh
Because Hive runs on Hadoop, you need to specify the Hadoop installation path in the hive-env.sh file, and also set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk-1.8.0_101  ## Java path
export HADOOP_HOME=/usr/hadoop-2.6.4  ## Hadoop installation path
export HIVE_HOME=/usr/hadoop-2.6.4/thirdparty/apache-hive-2.1.0-bin  ## Hive installation path
export HIVE_CONF_DIR=$HIVE_HOME/conf  ## Hive configuration file path
Configure Hive
Enter Hive's conf directory and create the configuration files from their templates:
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
Modify hive-site.xml
Find the properties with the following <name> entries and change their <value> as shown:
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive-${user.name}</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hive/resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp/${user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/tmp/${user.name}/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
Configure the Hive metastore
By default, Hive stores its metadata in an embedded Derby database, but production environments generally use MySQL to store the Hive metadata.
Place mysql-connector-java-5.1.40-bin.jar under $HIVE_HOME/lib.
Add the MySQL connection configuration to hive-site.xml (note that, since this is an XML file, the & characters in the JDBC URL must be escaped as &amp;):
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
</property>
Create HDFS directories for Hive
Before creating tables in Hive, create the /tmp and /user/hive/warehouse directories in HDFS (the latter is the default value of the hive.metastore.warehouse.dir property in hive-site.xml) and grant them group write permission, using the following commands:
start-dfs.sh
hdfs dfs -mkdir /tmp
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /user/hive/warehouse
Create a hive user in MySQL
$ mysql -u root -p
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
Run Hive
Before running the hive command, make sure HDFS has been started; you can start it with start-dfs.sh.
Starting with Hive 2.1, you also need to run the schematool command to initialize the metastore:
schematool -dbType mysql -initSchema
To enter the Hive CLI (the Hive command-line interface), type hive in the terminal.
To test, use show tables; to list all tables.
Note
If MySQL is configured as the metadata store, start the MySQL database service before starting Hive.