Hadoop 3.x: MySQL and Hive installation under CentOS 7.x (pitfall notes)

Foreword

Because the materials handed out by the teacher of a certain Hadoop course were incomplete, it was impossible to finish the MySQL and Hive installation under CentOS 7 smoothly, so I had to work through the pitfalls myself. I am posting this article as a record, and fellow travelers are welcome to fill in the pits together.
MySQL and Hive can be installed on any machine in the Hadoop cluster, not necessarily on the master host; here I chose to install them on the second slave machine. One thing worth mentioning: during installation you need to keep track of where each package is uploaded and extracted, because only you know those paths, so write them down.
PS: After completing a tedious series of steps, remember to take a snapshot of the virtual machine. On the day I wrote this article the car flipped over: the computer crashed and restarted, and the cluster collapsed. Fortunately there were snapshots, so everything could be recovered in time.

1. MySQL installation

1. Trying to install MySQL with yum

When you run yum install mysql mysql-server mysql-devel, CentOS 7, for historical reasons, turns the MySQL installation into a MariaDB installation, so yum cannot be used to install MySQL this way. Some netizens suggested uninstalling MariaDB first, but even after I removed it completely, running the command above still installed MariaDB.

2. Installing from mysql-5.7.32-linux-glibc2.12-x86_64.tar.gz

For details, see reference [1]: CentOS 7 tar package installation of MySQL 5.7.
You should be able to proceed smoothly following that article. Pay special attention to one thing: when configuring MySQL in its step 8, the /etc/my.cnf file may not exist. Don't panic: create it yourself and fill in the content given in the article before continuing.
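If you do have to create /etc/my.cnf by hand, a minimal sketch looks like the following. The paths below are assumptions for a typical tar-package install under /usr/local/mysql; the values in the linked article take precedence over this example:

```ini
[mysqld]
# Assumed install and data directories for a tar-package install
basedir=/usr/local/mysql
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
port=3306

[client]
socket=/tmp/mysql.sock
```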

3. Enter the MySQL client and grant remote access

This step configures MySQL on this machine to accept connections from elsewhere, so that you can log in not only on the virtual machine itself but also remotely from a MySQL client on your local machine.

mysql -u root -p 

The space between -u and root is optional; note that if you supply the password directly after -p there must be no space (e.g. -p123456), while a bare -p prompts for it.
Then enter the following command:

grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;
flush privileges;

Here the user root with password 123456 is just an example; change it according to your own situation.

4. Remote login demonstration from the local host (optional)

Just make sure the MySQL service on the virtual machine is already running!
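For reference, the remote login from the local machine looks roughly like this; slave002 is an assumed hostname, so substitute your VM's hostname or IP:

```shell
# Connect from the local machine to MySQL on the VM (default port 3306).
# Requires the GRANT ... TO 'root'@'%' step above and an open firewall port.
mysql -h slave002 -P 3306 -u root -p
```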

2. Hive installation

1. Installing apache-hive-3.1.2-bin.tar.gz

Upload the installation package from the local machine, then extract it to your chosen directory with tar -xvf apache-hive-3.1.2-bin.tar.gz -C <target directory>.

Daily pitfall:

Hive versions must match the Hadoop version. The Hadoop I installed is 3.1.3, so I chose Hive 3.1.2. When first writing this blog I used Hive 2.3.7, but found that HQL statements then reported errors because the versions did not match; after switching to Hive 3.1.2 the configuration finally succeeded.

2. Configure hive-env.sh

Enter the directory where Hive is installed, then run cd conf.
You will see a bunch of .template files; these are Hive's configuration-file templates. Their contents are very complete, but we don't need that many settings, so we create our own configuration files.

cp hive-env.sh.template hive-env.sh

This copies out a hive-env.sh file; open it with vi hive-env.sh and change it as follows:

HADOOP_HOME=/install/hadoop-3.1.3
export HIVE_CONF_DIR=/install/apache-hive-3.1.2-bin/conf

Uncomment these two lines in the file, then fill in the path of your own Hadoop installation and the path of Hive's conf directory; adjust them to your own situation. Note that there must be no spaces on either side of the equals sign. To get a full path, cd into the directory, run pwd, and copy the output into hive-env.sh.

3. Generate hive-site.xml

hive-site.xml does not exist initially; it is derived from hive-default.xml.template and can be regarded as a subset of it. Since we do not need most of the configuration in hive-default.xml.template, we simply create it ourselves with vi hive-site.xml.
The content of hive-site.xml is as follows; readers can copy and paste it directly:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>Fill in the MySQL user name here, e.g. root</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>Fill in the MySQL password here, e.g. 123456</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>Fill in the connection URL here, e.g. jdbc:mysql://slave002:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
</property>
<property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
</property>
<property>
   <name>datanucleus.autoStartMechanism</name>
   <value>SchemaTable</value>
</property>
<property>
    <name>hive.server2.thrift.bind.host</name>
    <value>Fill in the host name here, e.g. slave002</value>
    <description>Bind host on which to run the HiveServer2 Thrift service.</description>
</property>
</configuration>

There are four places readers need to change themselves: the MySQL user name, the password, the connection URL, and the host name. Taking my own setup as an example:

  • user name: root
  • password: 123456
  • URL: jdbc:mysql://slave002:3306/hive?createDatabaseIfNotExist=true&useSSL=false, where slave002 is the host name and hive is the name of the metastore database, which will be created the first time Hive runs
  • host name: slave002

4. Add the MySQL connector driver package to Hive's lib directory

Hive uses MySQL for its metadata storage, so it must be able to connect to the MySQL database; therefore a MySQL connector (JDBC driver) jar must be added to Hive before Hive can be started.
Enter the lib directory and you will see many jar packages.
Next we upload the MySQL driver package into this directory; I use mysql-connector-java-5.1.46.jar.

Daily pitfall:

I previously used SecureCRT's SecureFX feature to upload from the local machine to the virtual machine, but this time it failed for unknown reasons, so I switched to the following method: upload with the rz -E command; if that command is missing, install it in one step with yum -y install lrzsz.
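Put together, the upload-and-place step looks roughly like this; /install/apache-hive-3.1.2-bin is an assumed install path, so substitute your own:

```shell
# Install lrzsz if rz is missing, then upload the driver jar interactively
yum -y install lrzsz
rz -E
# Copy the uploaded jar into Hive's lib directory (path is an example)
cp mysql-connector-java-5.1.46.jar /install/apache-hive-3.1.2-bin/lib/
```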

5. Configure Hive environment variables

vi /etc/profile

Open the environment-variable configuration file and paste the following into it:

export HIVE_HOME=/install/apache-hive-3.1.2-bin
export PATH=$HIVE_HOME/bin:$PATH

Change HIVE_HOME to your own path!
Save and exit, then run source /etc/profile to make the configuration take effect.

6. Starting Hive

Enter the Hive directory, then execute bin/hive to start Hive. The startup process may be a bit slow, and before starting Hive you must first bring up the Hadoop cluster, namely HDFS and YARN.
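As a reminder, bringing the cluster up first looks roughly like this, assuming the Hadoop sbin scripts are on your PATH (run them on the appropriate nodes):

```shell
# Start HDFS and YARN before launching Hive
start-dfs.sh
start-yarn.sh
# Then, from the Hive install directory:
bin/hive
```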

Daily pitfall:

The following errors may be reported during execution:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1380)

at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

The NoSuchMethodError is reported because Hadoop and Hive ship inconsistent versions of guava.jar. The two copies are located in directories like the following two:

  • /usr/local/hive/lib/
  • /usr/local/hadoop/share/hadoop/common/lib/

These are only sample paths; look in your own Hive lib directory and in Hadoop's share/hadoop/common/lib. The jar whose name starts with guava is the file in question.
Solution: delete the lower version, and copy the higher version into the directory that held the lower one.
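Using my assumed install paths, the fix looks roughly like this; the paths and guava version numbers are illustrative, so check the actual file names in your own directories:

```shell
# Example: Hadoop's guava is newer than Hive's, so replace Hive's copy.
# Paths and version numbers are illustrative - check your own directories.
rm /install/apache-hive-3.1.2-bin/lib/guava-19.0.jar
cp /install/hadoop-3.1.3/share/hadoop/common/lib/guava-27.0-jre.jar \
   /install/apache-hive-3.1.2-bin/lib/
```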
If you see the Hive command prompt, congratulations, you have completed the Hive installation!

References

[1] CentOS 7 tar package installation of MySQL 5.7
[2] hive-site.xml configuration
[3] Heima Programmer Hive installation video
[4] Hive startup error
[5] Version compatibility between Hive and Hadoop

Origin blog.csdn.net/weixin_43594279/article/details/116331186