Hadoop-based data warehouse Hive installation
1. Install Hive
1.1 Download the Hive binary package
Apache official: https://www.apache.org/dyn/closer.cgi/hive/
Mirror of Tsinghua University: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/
In Ubuntu, use the wget command to download:
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
If the download fails or is too slow (for example, because of a poor network connection), download the archive on another machine and upload it with Xshell instead.
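For a slow connection, it may be enough to let wget resume and retry instead of starting over. The sketch below is only an illustration: the helper names hive_url and download_hive are hypothetical, and the URL layout is assumed to match the mirror path used above.

```shell
# Hypothetical helper: build the mirror URL for a given Hive version.
# Assumes the Tsinghua mirror keeps the same directory layout as the URL above.
hive_url() {
  printf 'https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-%s/apache-hive-%s-bin.tar.gz\n' "$1" "$1"
}

# Hypothetical helper: download the archive.
# -c resumes a partially downloaded file; --tries retries on transient failures.
download_hive() {
  wget -c --tries=5 "$(hive_url "$1")"
}

# Usage (not executed here): download_hive 3.1.3
hive_url 3.1.3
```

If the mirror stays unreachable, uploading the file with Xshell remains the fallback.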
1.2 Unzip and rename
sudo tar -zxvf ./apache-hive-3.1.3-bin.tar.gz -C /usr/local # extract into /usr/local
cd /usr/local
sudo mv apache-hive-3.1.3-bin hive # rename the directory to hive
1.3 Modify file permissions
sudo chown -R hadoop:hadoop hive
Note: hadoop:hadoop above is the user name and the group name, in user:group order. If you log in to the Linux system as user_name, replace hadoop with user_name.
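To avoid typing the user name by hand, the owner can be derived from the current login. This is a minimal sketch, assuming Hive was extracted to /usr/local/hive; owner_spec is a hypothetical helper, and the command is only printed so it can be reviewed before running it.

```shell
# Hypothetical helper: print "user:group" for the current login.
owner_spec() {
  printf '%s:%s\n' "$(id -un)" "$(id -gn)"
}

# Print the chown command instead of running it, so it can be checked first.
echo "sudo chown -R $(owner_spec) /usr/local/hive"
```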
1.4 Configure environment variables
For ease of use, we add the hive command to the PATH environment variable. Open the .bashrc file with the vim editor (sudo is not needed for a file in your own home directory):
vi ~/.bashrc
Add the following:
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_HOME=/usr/local/hadoop
HADOOP_HOME must be set to the installation path of Hadoop on your system, for example the /usr/local/hadoop directory.
After saving and exiting, run the following command to make the configuration take effect immediately:
source ~/.bashrc
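A quick way to confirm that the new settings took effect is to check whether the Hive bin directory is actually on PATH. The sketch below assumes the paths used in this article; path_contains is a hypothetical helper, not part of Hive.

```shell
# Hypothetical helper: succeed if the given directory is an entry of PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fall back to the path used in this article if HIVE_HOME is not set.
HIVE_HOME="${HIVE_HOME:-/usr/local/hive}"

if path_contains "$HIVE_HOME/bin"; then
  echo "$HIVE_HOME/bin is on PATH"
else
  echo "$HIVE_HOME/bin is missing from PATH; check ~/.bashrc"
fi
```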
1.5 Configure hive-site.xml
The hive-site.xml file lives in the /usr/local/hive/conf directory. Before editing it, execute the following commands:
cd /usr/local/hive/conf
sudo mv hive-default.xml.template hive-default.xml
The above command renames hive-default.xml.template to hive-default.xml.
Then, use the vim editor to create a new configuration file, hive-site.xml. The commands are as follows:
cd /usr/local/hive/conf
sudo vi hive-site.xml
Add the following configuration information to hive-site.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
</configuration>
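Typos in hive-site.xml tend to surface only later as connection errors, so it can help to read the values back after saving. The following is a rough sketch, not a real XML parser: hive_conf_value is a hypothetical helper that assumes each name element is followed by its value element, as in the file above.

```shell
# Hypothetical helper: print the <value> that follows a given <name>.
# $1 = path to the XML file, $2 = property name.
hive_conf_value() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Only read the file if it exists at the location used in this article.
conf="${HIVE_HOME:-/usr/local/hive}/conf/hive-site.xml"
if [ -f "$conf" ]; then
  hive_conf_value "$conf" javax.jdo.option.ConnectionURL
  hive_conf_value "$conf" javax.jdo.option.ConnectionUserName
fi
```

The printed URL and user name should match the database and account created in MySQL later in this article.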
2. Install and configure MySQL
Here we use a MySQL database to store Hive's metadata instead of the Derby database that ships with Hive.
For installing MySQL on Ubuntu, see: Ubuntu installation of MySQL and common operations
2.1 Download mysql jdbc package
Download address: https://dev.mysql.com/downloads/connector/j/
Upload the downloaded archive to the Ubuntu machine with Xshell.
2.2 Unzip and copy
tar -zxvf mysql-connector-j-8.0.31.tar.gz
Copy mysql-connector-j-8.0.31.jar to the /usr/local/hive/lib directory:
cd ~/下载 # 下载 is the Downloads directory on a Chinese-locale system
cd mysql-connector-j-8.0.31
sudo cp mysql-connector-j-8.0.31.jar /usr/local/hive/lib
2.3 Start and log in to the mysql shell
service mysql start # start the MySQL service
mysql -u root -p # log in to the MySQL shell
2.4 Create a new hive database
create database hive;
This hive database corresponds to the hive in localhost:3306/hive in hive-site.xml and is used to store Hive metadata.
2.5 Configure mysql to allow hive access
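grant all on *.* to hive@localhost identified by 'hive'; # grant all privileges on all tables of all databases to the hive user; the trailing 'hive' is the connection password configured in hive-site.xml
flush privileges; # reload the MySQL privilege tables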
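On newer MySQL versions (such as 8.0), the grant statement above reports an error, because grant no longer accepts an identified by clause. Reference blog: grant all on *.* to hive@localhost identified by 'hive'; ERROR 1064 (42000): You have an error in yo
Use the following statements instead: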
create user 'hive'@'localhost' identified by 'hive';
grant all on *.* to 'hive'@'localhost';
flush privileges;
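Granting on *.* gives the hive account access to every database. A narrower variant is sketched below, assuming MySQL 5.7 or later (for CREATE USER IF NOT EXISTS) and that Hive only needs the hive database. The helper hive_grant_sql is hypothetical; it only prints the statements so they can be reviewed before being piped into mysql.

```shell
# Hypothetical helper: print SQL that creates a user and limits it to one database.
# $1 = user name, $2 = password, $3 = database name.
hive_grant_sql() {
  printf "CREATE USER IF NOT EXISTS '%s'@'localhost' IDENTIFIED BY '%s';\n" "$1" "$2"
  printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'localhost';\n" "$3" "$1"
  printf "FLUSH PRIVILEGES;\n"
}

# Review the generated statements first.
hive_grant_sql hive hive hive

# To apply them (not executed here): hive_grant_sql hive hive hive | mysql -u root -p
```

The user name and password must still match the ConnectionUserName and ConnectionPassword values in hive-site.xml.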
2.6 Start Hadoop
Before starting hive, please start the hadoop cluster:
cd /usr/local/hadoop
./sbin/start-all.sh
jps # list the Java processes (6 is normal)
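Counting processes by eye is easy to get wrong, so the jps output can also be checked by name. The sketch below assumes a single-node (pseudo-distributed) setup, where start-all.sh is expected to launch the five daemons listed in the loop (jps itself is the sixth process); missing_daemons is a hypothetical helper.

```shell
# Hypothetical helper: read jps output on stdin and print each expected daemon
# that is not running. The daemon list assumes a pseudo-distributed cluster.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; compare the name column exactly.
    printf '%s\n' "$jps_output" \
      | awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' \
      || echo "$daemon"
  done
}

# Only run the check when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

No output means all five daemons were found; any printed name is a daemon that did not start.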
2.7 Start Hive
cd /usr/local/hive
./bin/hive
If Hive fails to start, try initializing the metastore schema:
./bin/schematool -dbType mysql -initSchema
If this also fails, see Bug1 below.
2.8 Exit Hive
exit;
3. Bug1 (solved)
Reference blog: Hive initialization error Exception in thread “main” java.lang.NoSuchMethodError: com.google.common.base.
Cause: the guava.jar versions shipped with Hadoop and Hive are inconsistent.
Solution:
(1) Delete guava.jar in hive:
cd /usr/local/hive/lib
sudo rm guava-19.0.jar
(2) Copy guava.jar in hadoop to hive:
cd /usr/local/hadoop/share/hadoop/common/lib # go to Hadoop's lib directory
cp guava-27.0-jre.jar /usr/local/hive/lib # copy the jar into Hive's lib directory
(3) Initialize hive:
cd /usr/local/hive
./bin/schematool -dbType mysql -initSchema
(4) Start hive again:
cd /usr/local/hive
./bin/hive
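The jar names in the steps above (guava-19.0.jar and guava-27.0-jre.jar) depend on the Hive and Hadoop releases installed, so it may be worth listing what is actually present before deleting or copying anything. This is a minimal sketch assuming the installation paths used in this article; guava_jars is a hypothetical helper.

```shell
# Hypothetical helper: list the guava jars found directly in a lib directory.
# Prints nothing if the directory is missing or holds no guava jar.
guava_jars() {
  find "$1" -maxdepth 1 -name 'guava-*.jar' -exec basename {} \; 2>/dev/null | sort
}

echo "Hive:   $(guava_jars /usr/local/hive/lib)"
echo "Hadoop: $(guava_jars /usr/local/hadoop/share/hadoop/common/lib)"
```

If the two lines show different versions, the mismatch described above is the likely cause of the error.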
4. Bug2 (to be resolved)
When starting Hadoop or Hive, the following warning is printed:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
5. Bug3 (to be resolved)
When starting Hive, the following warning is printed repeatedly:
WARN DataNucleus.MetaData: Metadata has jdbc-type of null yet this is not valid. Ignored
(the same line appears 12 times in total)