Twenty-nine: Installing Hive on Linux and Switching the Metastore Database

This article covers how to install and deploy Hive on Linux, and how to install MySQL, the database used here to back the Hive Metastore. Follow the column "Break the Cocoon and Become a Butterfly - Big Data" for more related content~


Table of Contents

One, install Hive

1.1 Download the installation package

1.2 Hive installation and deployment

1.2.1 Upload and unzip

1.2.2 Modify the configuration file

1.2.3 Hadoop cluster configuration

1.2.4 Add Hive to environment variables

1.3 Testing

Two, Hive instance test

2.1 Start Hive

2.2 Create a table

2.3 Load local data

Three, install MySQL

Four, Configure Hive Metastore to Use MySQL

4.1 Copy driver

4.2 Modify the configuration file

4.3 Test


 

One, install Hive

1.1 Download the installation package

First, we need to download the installation package. It is available from the Apache Hive archive (archive.apache.org/dist/hive/); here we use version 1.2.1 (apache-hive-1.2.1-bin.tar.gz).

1.2 Hive installation and deployment

1.2.1 Upload and unzip

1. Upload the downloaded installation package to the master node

2. Unzip to the /opt/modules/ directory

tar -zxvf ./apache-hive-1.2.1-bin.tar.gz -C ../modules/

3. Rename

To simplify the paths used later (the configuration below assumes /opt/modules/hive), rename the unpacked directory:

mv apache-hive-1.2.1-bin hive

1.2.2 Modify the configuration file

1. Copy the hive-env.sh.template file

In Hive's conf directory, copy the hive-env.sh.template file to hive-env.sh:

cp hive-env.sh.template hive-env.sh

2. Modify the hive-env.sh file

In hive-env.sh, set HADOOP_HOME to the Hadoop installation path and HIVE_CONF_DIR to the path of Hive's conf directory:

HADOOP_HOME=/opt/modules/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/modules/hive/conf
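If you prefer to script this step, the two settings can be appended from the shell. This is a small sketch that assumes you are inside Hive's conf directory and uses the install paths from this article:

```shell
# Append the Hadoop home and Hive conf paths to hive-env.sh.
# The paths match the install layout used in this article;
# adjust them if your directories differ.
cat >> hive-env.sh <<'EOF'
HADOOP_HOME=/opt/modules/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/modules/hive/conf
EOF

# Confirm both settings landed in the file.
grep HADOOP_HOME hive-env.sh
```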

1.2.3 Hadoop cluster configuration

1. Start HDFS and YARN. Note that YARN must be started on the machine where the ResourceManager runs; in this cluster, that is slave01.

start-dfs.sh
start-yarn.sh

2. Create the /tmp and /user/hive/warehouse directories on HDFS and make them group-writable:

hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -mkdir /tmp

hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /user/hive/warehouse

1.2.4 Add Hive to environment variables

vim /etc/profile

##HIVE_HOME
export HIVE_HOME=/opt/modules/hive
export PATH=$PATH:$HIVE_HOME/bin

source /etc/profile

1.3 Testing

Type hive to enter the CLI, where you can run a few simple test statements; the syntax is essentially the same as SQL.

Two, Hive instance test

Now that Hive is installed, let's walk through a small example as a quick test.

2.1 Start Hive

Start Hive and create a new database (xzw).

hive

create database xzw;

2.2 Create a table

Here we create a people table to store personal information. The table has three fields: id, name, and sex, and fields in the data are separated by commas.

create table people(id int, name string, sex string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

2.3 Load local data

1. First, construct some data locally. Note that because the table was created with a comma as the field delimiter, each line of the data file must separate fields with commas; when the data is loaded, Hive splits each line on commas to populate the fields.

2. Load local data to Hive.

load data local inpath '/root/files/p.txt' into table people;
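For illustration, the steps above might look like the following, with made-up sample rows. The article loads the file from /root/files/p.txt; here it is written to a local files/ directory, so adjust the path to match your setup:

```shell
# Create a sample comma-delimited data file for the people table
# (id, name, sex). The rows here are made-up illustration data.
mkdir -p files
cat > files/p.txt <<'EOF'
1,zhangsan,male
2,lisi,female
3,wangwu,male
EOF

# Sanity check: every line must have exactly 3 comma-separated fields,
# matching the table's delimiter and column count.
awk -F',' 'NF != 3 { bad = 1 } END { exit bad }' files/p.txt && echo "format OK"
```

If the check fails, fix the offending lines before running the load statement, since malformed rows will load as NULL columns rather than raise an error.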

Three, install MySQL

At one point, I happened to open the Hive command line in a second window, and the session failed to start with an error.

This is because the Metastore is stored by default in the embedded Derby database, which allows only one active connection at a time, so MySQL is recommended for storing the Metastore instead. For the MySQL installation itself, refer to section 5 ("Installing MySQL") of "CDH6.3.1 Deploying Big Data Clusters"; it is not repeated here.

Four, Configure Hive Metastore to Use MySQL

4.1 Copy driver

Copy the MySQL JDBC driver jar (mysql-connector-java) into Hive's lib directory:

cp mysql-connector-java-*.jar /opt/modules/hive/lib/

4.2 Modify the configuration file

1. Create a new hive-site.xml file in Hive's conf directory

touch hive-site.xml

2. Add the following content in hive-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<property>
	  <name>javax.jdo.option.ConnectionURL</name>
	  <value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
	  <description>JDBC connect string for a JDBC metastore</description>
	</property>

	<property>
	  <name>javax.jdo.option.ConnectionDriverName</name>
	  <value>com.mysql.jdbc.Driver</value>
	  <description>Driver class name for a JDBC metastore</description>
	</property>

	<property>
	  <name>javax.jdo.option.ConnectionUserName</name>
	  <value>root</value>
	  <description>username to use against metastore database</description>
	</property>

	<property>
	  <name>javax.jdo.option.ConnectionPassword</name>
	  <value>p@ssw0rd</value>
	  <description>password to use against metastore database</description>
	</property>
</configuration>

4.3 Test

Start the Hadoop cluster and open several windows on the master node. Create a database in one window; if the new database is visible from the other windows, concurrent sessions are sharing the MySQL-backed Metastore and the configuration is working.

 

That's all for this article. If you hit any problems along the way, feel free to leave a comment and let me know what issues you ran into~


Origin blog.csdn.net/gdkyxy2013/article/details/110945163