Hadoop-based data warehouse Hive installation

1. Install Hive

1.1 Download the Hive source program

Apache official: https://www.apache.org/dyn/closer.cgi/hive/

Mirror of Tsinghua University: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/

In Ubuntu, use the wget command to download:

wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz

The download may fail because of slow network speed; in that case, simply transfer the archive to the server with Xshell instead.
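Alternatively, if the archive was downloaded on a local machine, it can be copied to the server with scp; the user name and host below are placeholders and should be replaced with your own:

scp ./apache-hive-3.1.3-bin.tar.gz hadoop@your-server:/home/hadoop/   # hypothetical user/host, adjust to your environment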

1.2 Unzip and rename

sudo tar -zxvf ./apache-hive-3.1.3-bin.tar.gz -C /usr/local   # extract into /usr/local
cd /usr/local
sudo mv apache-hive-3.1.3-bin hive                            # rename the directory to hive

1.3 Modify file permissions

sudo chown -R hadoop:hadoop hive

Note: in hadoop:hadoop above, the first hadoop is the user name and the second is the user group. If you are currently logged in to the Linux system with the user name user_name, replace hadoop with user_name.
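If you are unsure of your user name, or want to verify that the ownership change took effect, a quick optional check is:

whoami                    # prints the current login user name
ls -ld /usr/local/hive    # the listed owner and group should now match that user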

1.4 Configure environment variables

For convenience, we add the hive command to the PATH environment variable. Use the vim editor to open the ~/.bashrc file; the command is as follows:

sudo vi ~/.bashrc

Add the following:

export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_HOME=/usr/local/hadoop

HADOOP_HOME must be set to the Hadoop installation path on your system; in this example Hadoop is installed in the /usr/local/hadoop directory.

After saving and exiting, run the following command to make the configuration take effect immediately:

source ~/.bashrc
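To confirm that the variables took effect in the current shell, an optional sanity check is:

echo $HIVE_HOME      # should print /usr/local/hive
echo $HADOOP_HOME    # should print /usr/local/hadoop
which hive           # should resolve to /usr/local/hive/bin/hive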

1.5 Configure hive-site.xml

To set up hive-site.xml under /usr/local/hive/conf, first execute the following commands:

cd /usr/local/hive/conf
sudo mv hive-default.xml.template hive-default.xml

The above command renames hive-default.xml.template to hive-default.xml.

Then use the vim editor to create a new configuration file hive-site.xml. The commands are as follows:

cd /usr/local/hive/conf
sudo vi hive-site.xml

Add the following configuration information to hive-site.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
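Note: com.mysql.jdbc.Driver is the driver class name of the older Connector/J 5.x series. Since a Connector/J 8.x jar is installed later in this guide, the configuration above still works but the driver prints a deprecation notice; if you prefer, the property can instead point at the newer class name, for example:

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    <description>Driver class name for MySQL Connector/J 8.x</description>
  </property>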

2. Install and configure MySQL

Here we use a MySQL database to store Hive's metadata, instead of the embedded Derby database that ships with Hive.

Reference for installing MySQL on Ubuntu: Ubuntu installation of MySQL and common operations

2.1 Download the MySQL JDBC driver package

Download address: https://dev.mysql.com/downloads/connector/j/

Upload the downloaded archive to the server with Xshell:

2.2 Unzip and copy

tar -zxvf mysql-connector-j-8.0.31.tar.gz

Copy mysql-connector-j-8.0.31.jar to the /usr/local/hive/lib directory:

cd 下载    # the Downloads directory (下载 is "Downloads" on a Chinese-locale system)
cd mysql-connector-j-8.0.31
sudo cp mysql-connector-j-8.0.31.jar /usr/local/hive/lib
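To confirm that the jar is now in place, you can list Hive's lib directory:

ls /usr/local/hive/lib | grep mysql-connector   # should print mysql-connector-j-8.0.31.jar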

2.3 Start and log in to the mysql shell

service mysql start   # start the MySQL service
mysql -u root -p      # log in to the MySQL shell

2.4 Create a new hive database

create database hive;

This hive database corresponds to localhost:3306/hive in hive-site.xml and is used to store Hive's metadata.
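Still inside the MySQL shell, the new database can be verified with:

show databases;   -- the list should now include a database named hive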

2.5 Configure mysql to allow hive access

grant all on *.* to hive@localhost identified by 'hive'; # grant all privileges on all databases and tables to the hive user; the trailing 'hive' is the connection password configured in hive-site.xml
flush privileges; # reload the MySQL privilege tables

On MySQL 8.0 this reports an error (ERROR 1064 (42000): You have an error in your SQL syntax), because GRANT ... IDENTIFIED BY can no longer create a user implicitly.

Use the following statements instead:

create user 'hive'@'localhost' identified by 'hive';
grant all on *.* to 'hive'@'localhost';
flush privileges;
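The new account and its privileges can be checked, still in the MySQL shell, with:

show grants for 'hive'@'localhost';   -- should list ALL PRIVILEGES ON *.* for the hive user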

2.6 Start Hadoop

Before starting Hive, please start the Hadoop cluster first:

cd /usr/local/hadoop
./sbin/start-all.sh
jps   # check the running processes (6 processes is normal)
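For reference, on a pseudo-distributed installation started with start-all.sh, jps usually shows the following six processes (the PIDs below are made-up examples and will differ on your machine):

12321 NameNode
12498 DataNode
12703 SecondaryNameNode
12950 ResourceManager
13102 NodeManager
13367 Jps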

2.7 Start Hive

cd /usr/local/hive
./bin/hive

If that fails, try the following:

./bin/schematool -dbType mysql -initSchema

It still fails! [See Bug 1 below.]

2.8 Exit Hive

exit;

3. Bug 1 (solved)

Reference blog: Hive initialization error Exception in thread “main” java.lang.NoSuchMethodError: com.google.common.base.

Cause of the error: the guava.jar versions bundled with Hadoop and Hive are inconsistent.

Solution:

(1) Delete the guava jar in Hive:

cd /usr/local/hive/lib
sudo rm guava-19.0.jar 

(2) Copy the guava jar from Hadoop to Hive:

cd /usr/local/hadoop/share/hadoop/common/lib   # go to Hadoop's common lib directory
cp guava-27.0-jre.jar /usr/local/hive/lib      # copy it into Hive's lib directory

(3) Initialize the Hive metastore (note the change back to the Hive directory, since the previous step left the shell in Hadoop's lib directory):

cd /usr/local/hive
./bin/schematool -dbType mysql -initSchema
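If the initialization succeeds, schematool creates the metastore tables inside the hive database in MySQL; this can be verified from the MySQL shell, for example:

use hive;
show tables;   -- should now list metastore tables such as DBS, TBLS and COLUMNS_V2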


(4) Start Hive again:

cd /usr/local/hive
./bin/hive
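Once the hive> prompt appears, a small smoke test such as the following confirms that the metastore connection works (test_tbl is just an example name):

show databases;                                -- should at least list the default database
create table test_tbl (id int, name string);   -- create a simple table in the default database
show tables;                                   -- should now include test_tbl
drop table test_tbl;                           -- clean up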

4. Bug 2 (to be resolved)

When starting Hadoop or Hive, the following messages are reported:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

5. Bug 3 (to be resolved)

When starting Hive, the following warnings are reported:

WARN DataNucleus.MetaData: Metadata has jdbc-type of null yet this is not valid. Ignored
(the line above is repeated about a dozen times)

Origin: blog.csdn.net/m0_70885101/article/details/127441100