Hive 2.1.1 installation
Foreword
Before installing Hive, install Hadoop and make sure that HDFS and YARN have started successfully. This tutorial installs Hive 2.1.1 with Hadoop 2.7.3.
By default, Hive stores its metadata in the embedded Derby database, which allows only one session at a time and is suitable only for simple testing. Supporting multiple users and multiple sessions requires a standalone metastore database; this tutorial uses MySQL.
1. Download Hive
Open the Hive download page, click a mirror address, and download the hive-2.1.1 release.
Or download directly using the wget command:
wget http://mirror.bit.edu.cn/apache/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
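Mirrors tend to carry only recent releases, so the mirror URL above may stop resolving over time. The sketch below builds a download URL from a version number. It assumes that the Apache archive host keeps old releases under /dist/hive/hive-<version>/; the function name hive_tarball_url is hypothetical and not part of any Hive tooling.

```shell
# Build the download URL for a Hive binary release.
# Assumption: archive.apache.org keeps old releases under /dist/hive/hive-<version>/.
hive_tarball_url() {
  version="$1"
  printf 'https://archive.apache.org/dist/hive/hive-%s/apache-hive-%s-bin.tar.gz\n' \
    "$version" "$version"
}

# Print the URL for the version used in this tutorial.
hive_tarball_url 2.1.1
```

To download with it, pass the result to wget: wget "$(hive_tarball_url 2.1.1)"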
2. Install mysql
2.1 Check if mysql is installed
Use the command yum list installed | grep mysql to detect whether mysql is installed. As shown in the figure below, mysql has been installed.
If mysql is installed and it can be used normally, there is no need to install it again. If it cannot be used normally, uninstall it and install it again.
Uninstall command: yum -y remove mysql-libs.x86_64
After uninstalling, we can use the rpm -qa | grep mysql command to check whether mysql has been uninstalled successfully.
2.2 Install mysql through yum
- View available mysql versions on yum
To install the mysql database through yum, first enter the yum list | grep mysql command to view the mysql versions available from yum:
- Install mysql using yum
Enter the yum install -y mysql-server mysql mysql-devel command to install the mysql, mysql-server, and mysql-devel packages.
Notes:
mysql-devel provides the libraries and header files used for development
mysql is the mysql client
mysql-server is the mysql database server
Run the installation command as shown below:
The interface after installation is complete:
- View installed mysql version information
Use the following command to view the installed mysql version information: rpm -qi mysql-server
2.3 Initialization of mysql database
Start the mysql service through the command service mysqld start, as shown in the following figure:
Startup process description:
When the mysql service starts for the first time, the server performs its initial configuration and prints a large amount of output while it initializes the database. Later restarts print far less. For example, run the restart command service mysqld restart:
2.4 Set the mysql service to start automatically at boot
Use the chkconfig --list | grep mysqld command to check whether the mysql service starts automatically at boot.
The output shows that the mysqld service does not start automatically at boot. Use the chkconfig mysqld on command to enable it, so that you don't have to start it manually every time:
2.5 Set the default root account password
Note that this root account is not the root account in Linux.
Run the following command to set the password of the root account to 123456:
mysqladmin -u root password '123456'
2.6 Log in to the mysql database
Use the mysql -u root -p command to log in to the mysql database, as shown in the following figure:
2.7 Create the Hive metastore database
Create a database named hive to store the metadata of hive, as shown in the following figure:
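A minimal sketch of this step from the shell rather than the interactive mysql prompt. The helper name metastore_create_sql is hypothetical; it only prints the SQL statement, so it can be reviewed before being sent to the server. IF NOT EXISTS is added here so that a repeated run does not fail.

```shell
# Print the SQL statement that creates the metastore database.
# $1: database name (this tutorial uses "hive")
metastore_create_sql() {
  printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$1"
}

# Show the statement for the database name used in this tutorial.
metastore_create_sql hive
```

To apply the statement, pass it to the client: mysql -u root -p -e "$(metastore_create_sql hive)"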
2.8 User authorization
To allow the root user to access the mysql database from any external IP, use the following command:
grant all privileges on *.* to 'root'@'%' identified by '123456';
To make the authorization take effect immediately, use the following command:
flush privileges;
The command execution is shown in the following figure:
2.9 Connect to mysql from a Windows environment
Use the mysql client tool Navicat for MySQL on Windows to connect to mysql:
3. Hive installation
3.1 Copy the downloaded hive package to the Linux environment
Copy the hive package to a server in the hadoop cluster. The specific directory can be set according to the situation. The big data software in this tutorial is installed in the /opt directory. As shown below:
3.2 Unzip the hive package
Unzip the hive package with the following command:
tar -zxvf apache-hive-2.1.1-bin.tar.gz
3.3 Configure Hive environment variables
In the /etc/profile file, configure the environment variables, replacing /path/to/hive with the actual Hive installation path:
export HIVE_HOME=/path/to/hive
export PATH=$PATH:$HIVE_HOME/bin
The configured profile file is shown in the following figure:
Make environment variables take effect:
source /etc/profile
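Editing the profile by hand works, but repeated edits can leave duplicate export lines. The sketch below appends the two exports only if HIVE_HOME is not already set in the file. The function name add_hive_env and the path /opt/apache-hive-2.1.1-bin are assumptions for illustration; the demo writes to a scratch file so that nothing on the system is changed.

```shell
# Append HIVE_HOME and PATH exports to a profile file, at most once.
# $1: profile file (this tutorial uses /etc/profile)
# $2: Hive installation directory (hypothetical example below)
add_hive_env() {
  profile="$1"
  hive_home="$2"
  # Skip if an earlier run already added the export.
  if grep -q '^export HIVE_HOME=' "$profile" 2>/dev/null; then
    echo "HIVE_HOME already set in $profile"
    return 0
  fi
  {
    echo "export HIVE_HOME=$hive_home"
    # Single quotes keep $PATH and $HIVE_HOME unexpanded in the file.
    echo 'export PATH=$PATH:$HIVE_HOME/bin'
  } >> "$profile"
}

# Demo against a scratch file instead of /etc/profile.
demo_profile="$(mktemp)"
add_hive_env "$demo_profile" /opt/apache-hive-2.1.1-bin
cat "$demo_profile"
rm -f "$demo_profile"
```

On a real system the call would be add_hive_env /etc/profile followed by the actual installation path, run as root, and then source /etc/profile.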
3.4 Hive configuration file
Enter the conf directory under the extracted Hive directory and run the following command to copy the configuration file template:
[root@bigdata02 conf]# cp hive-env.sh.template hive-env.sh
- Modify the hive-env.sh file
Add the following two lines to the hive-env.sh file, replacing the placeholder paths with the actual installation paths:
export HADOOP_HOME=/path/to/hadoop
export HIVE_CONF_DIR=/path/to/hive/conf
- Create hive-site.xml file
The contents of the file are as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://bigdata02:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
</configuration>
javax.jdo.option.ConnectionURL // database URL
javax.jdo.option.ConnectionDriverName // JDBC driver class name
javax.jdo.option.ConnectionUserName // database user name
javax.jdo.option.ConnectionPassword // database password
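A property that is missing or misspelled in hive-site.xml usually shows up only later, when Hive tries to reach the metastore. The sketch below is a simple text check, not an XML parser: it reports which of the four connection properties listed above do not appear in a given file. The function name check_hive_site is hypothetical.

```shell
# Report which of the four metastore connection properties are missing
# from a hive-site.xml file. Returns non-zero if any is absent.
# $1: path to hive-site.xml
check_hive_site() {
  site_file="$1"
  missing=0
  for name in ConnectionURL ConnectionDriverName ConnectionUserName ConnectionPassword; do
    # Plain text match on the <name> element; no XML validation is done.
    if ! grep -q "<name>javax.jdo.option.$name</name>" "$site_file"; then
      echo "missing: javax.jdo.option.$name"
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be check_hive_site "$HIVE_HOME/conf/hive-site.xml", which prints nothing and returns 0 when all four properties are present.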
- Create tmp directory
Create a tmp directory under $HIVE_HOME, for example with mkdir $HIVE_HOME/tmp.
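- Create directories on HDFS
Use the following HDFS commands to create the corresponding directories:
# Hive's default temporary file directory (-p avoids an error if it already exists)
hdfs dfs -mkdir -p /tmp
# Hive's default warehouse directory
hdfs dfs -mkdir -p /user/hive/warehouse
# Grant group write permission on the tmp directory
hadoop fs -chmod g+w /tmp
# Grant group write permission on the warehouse directory
hadoop fs -chmod g+w /user/hive/warehouse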
- Upload the mysql driver package
Hive's metastore is configured to use mysql, so Hive needs the mysql JDBC driver package to connect to it. Upload the driver package to the $HIVE_HOME/lib directory.
JDBC driver package download path
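If the driver jar has already been copied to the server, placing it in Hive's lib directory can be scripted. The sketch below copies a jar and lists the result to confirm that it arrived. The function name install_jdbc_driver is hypothetical, and the jar file name depends on the driver version you downloaded.

```shell
# Copy a mysql JDBC driver jar into Hive's lib directory and confirm it is there.
# $1: path to the driver jar (file name depends on the downloaded version)
# $2: Hive installation directory
install_jdbc_driver() {
  jar="$1"
  hive_home="$2"
  if [ ! -f "$jar" ]; then
    echo "driver jar not found: $jar" >&2
    return 1
  fi
  # List the copied file so a failed copy is visible immediately.
  cp "$jar" "$hive_home/lib/" && ls "$hive_home/lib/$(basename "$jar")"
}
```

A typical call would pass the path of the downloaded jar as the first argument and "$HIVE_HOME" as the second.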
3.5 Initialize the Hive metastore database
Use the schematool command in the bin directory of the Hive installation directory to initialize the metastore database:
schematool -dbType mysql -initSchema
If you see the following prompt, the initialization is complete:
3.6 Check whether Hive is successfully installed
Enter hive directly on the command line:
Note: HDFS and YARN must be running at this point.