Hive environment setup and problems encountered


Hive installation

  1. Press Alt+P to open the sftp window and upload apache-hive-2.3.4-bin.tar.gz to hadoop02.
    Enter the apps directory on the hadoop02 virtual machine, where you can see the Hive installation package you just uploaded. Extract it with tar -zxvf apache-hive-2.3.4-bin.tar.gz and enter the bin directory, as sketched below.
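    A minimal sketch of the commands for this step, assuming the package was uploaded to ~/apps (adjust the path to wherever you uploaded it):

    cd ~/apps                                   # directory the package was uploaded to (assumption)
    tar -zxvf apache-hive-2.3.4-bin.tar.gz      # extract the Hive installation package
    cd apache-hive-2.3.4-bin/bin                # enter the bin directory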
  2. Run the hive script in the bin directory:
    [hadoopUser@hadoop02 bin]$ ./hive
    (This starts Hive with the embedded Derby metastore. We want to use MySQL instead, so this step is only for understanding; do not run it unless you want Derby.)
  3. Modify the configuration files.
    Return to the conf directory under apache-hive-2.3.4-bin and create a hive-site.xml file: touch hive-site.xml
  4. Edit the newly created hive-site.xml with vim, add the following content, then save and exit.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <!-- If MySQL and Hive are on the same server node, change hadoop01 to localhost -->
    <!-- mysql://<hostname of the server where MySQL is installed>:3306/ -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop01:3306/hivedb?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>

    <!-- MySQL username -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hadoopUser</value>
        <description>username to use against metastore database</description>
    </property>

    <!-- MySQL password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
        <description>password to use against metastore database</description>
    </property>

    <!-- Optional: specifies the HDFS directory where the Hive warehouse data is stored
         (the directory must be created in advance) -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/home/hadoopUser/apps/hive/warehouse</value>
        <description>hive default warehouse, you can change it</description>
    </property>
</configuration>

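Note: the metastore initialization below will only succeed if the account configured in hive-site.xml can reach MySQL remotely (see problem 3 in the last section). A sketch of the MySQL-side preparation, run on hadoop01, assuming MySQL 5.x, a root login, and the username/password above (adjust to your own):

mysql -u root -p -e "GRANT ALL PRIVILEGES ON hivedb.* TO 'hadoopUser'@'%' IDENTIFIED BY '123456'; FLUSH PRIVILEGES;"
# hivedb itself is created automatically because of createDatabaseIfNotExist=true in the JDBC URL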
5. Upload the MySQL driver package (mysql-connector-java-5.1.40-bin.jar) via sftp into the lib folder under the apache-hive-2.3.4-bin directory.
6. Configure the environment variables.
Edit ~/.bashrc with vim and add the following:
export HIVE_HOME=/home/hadoopUser/apps/apache-hive-2.3.4-bin
export PATH=$PATH:$HIVE_HOME/bin
Reload the file for the changes to take effect: source ~/.bashrc

7. Verify the Hive installation.
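One way to verify, assuming the PATH change above has been sourced (a quick check, not the only one):

hive --version    # should print the Hive 2.3.4 version banner
which hive        # should resolve to $HIVE_HOME/bin/hive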
8. Initialize the metadata database.
Hive versions 2.x and later require manually initializing the metadata database:
schematool -dbType mysql -initSchema
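If you want to double-check afterwards, schematool can also report the schema state with the same connection settings:

schematool -dbType mysql -info    # prints the metastore connection URL and schema version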
9. Start the Hive client:
[hadoopUser@hadoop02 bin]$ hive --service cli
or
[hadoopUser@hadoop02 bin]$ hive

10. Check the databases: success!
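A quick smoke test, either inside the CLI or non-interactively with hive -e (a sketch; any HiveQL statement would do):

hive -e "show databases;"    # the built-in default database should be listed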

Two ways to connect to Hive

  1. CLI
    Enter the bin directory under hive and input the command hive. If it starts successfully, the Hive prompt appears and you can perform Hive operations there.
  2. HiveServer2 service and Beeline client
    hiveserver2 is Hive's server and beeline is its client; the relationship can be understood as that between MySQL and Navicat.

① Modify the Hadoop cluster configuration. Go into /apps/hadoop-2.7.6/etc/hadoop, where you will see many configuration files; here we add configuration to hdfs-site.xml and core-site.xml.

② Edit the hdfs-site.xml configuration file with vim (every node in the cluster must be configured); this enables WebHDFS.

<property>
	<name>dfs.webhdfs.enabled</name>
	<value>true</value>
</property>

③ Edit the core-site.xml configuration file with vim (every node in the cluster must be configured); this sets up the Hadoop proxy user.

<property>
	<name>hadoop.proxyuser.XXX.hosts</name>
	<value>*</value>
</property>

<property>
	<name>hadoop.proxyuser.XXX.groups</name>
	<value>*</value>
</property>
Here XXX is your username; mine is hadoopUser.
Setting hadoop.proxyuser.XXX.hosts to * means that the Hadoop proxy user can access the HDFS cluster from any node;
hadoop.proxyuser.XXX.groups indicates which groups the proxy user belongs to.

After the configuration is complete, restart the HDFS cluster for it to take effect (stop/start-dfs.sh and stop/start-yarn.sh), as sketched below.
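A sketch of the restart, assuming Hadoop's sbin directory is on the PATH:

stop-yarn.sh && stop-dfs.sh      # stop YARN first, then HDFS
start-dfs.sh && start-yarn.sh    # bring HDFS back up, then YARN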

④ After the restart is complete, start the hiveserver2 service.
nohup (no hang up) commands have the general form:
nohup command &
If you want a process to keep running after you log out of the account, run it with nohup.

Enter the command: nohup hiveserver2 1>/home/hadoopUser/apps/hive/hiveserver.log 2>/home/hadoopUser/apps/hive/hiveserver.err &
The path after 1> is where hiveserver.log, the standard output log, is stored,
and the path after 2> is where hiveserver.err, the error log, is stored.
If you do not specify output paths, the log is generated in the current directory by default, under the default name nohup.out.

If you do not need the logs, you can use the following commands instead (nohup may be omitted if the process need not survive logout):
nohup hiveserver2 1>/dev/null 2>/dev/null &
or
nohup hiveserver2 >/dev/null 2>&1 &
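To check that the service actually came up (it can take a few seconds to start), a sketch:

jps                            # a RunJar process should appear for HiveServer2
netstat -nltp | grep 10000     # HiveServer2 listens on port 10000 by default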
⑤ Then start the beeline client to connect to the hiveserver2 service:
beeline -u jdbc:hive2://hadoop02:10000 -n hadoopUser
-u: specifies the JDBC connection URL, with the name of the virtual machine where you installed Hive
-n: the username you used when building the cluster
However, I got the error Unauthorized connection for super-user here; see problem 6 below.
If the connection succeeds, you can perform Hive operations through beeline.
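Once connected you can run any HiveQL statement; beeline can also execute one non-interactively with -e, e.g. with the same host and user as above:

beeline -u jdbc:hive2://hadoop02:10000 -n hadoopUser -e "show databases;"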

Problems encountered during setup

  1. Metadata database initialization problem: the error said line 2 of the hive-site.xml file had a problem, and opening the configuration file revealed a typo (a silly mistake).
  2. Continuing the initialization, there was still a problem: my MySQL is installed on hadoop01, but I had written hadoop02 in the connection URL in hive-site.xml, so MySQL reported an error.
  3. After that fix there was still a problem, because the account I use to connect to MySQL remotely is not root; after changing to the correct account and password, the metadata database initialized successfully.
  4. Beeline client connection problem: in Hadoop's core-site.xml, XXX must be changed to your own username; mine is hadoopUser.
    After editing, restart the cluster; use jps to check whether a RunJar process is still running, and kill it before restarting, as sketched below.
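    A sketch of clearing the old process (the pid is whatever jps reports on your machine):

    jps               # find the pid of the RunJar process
    kill -9 <pid>     # kill it, then restart the cluster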
    6. After restarting the cluster, running beeline again produced another error: hadoopUser had not been given super-user privileges?
    Opening the sudoers configuration file, the line granting hadoopUser root privileges was displayed in a color inconsistent with the line above it; the main reason is probably that the username contains uppercase letters, which may be why the authorization did not succeed.
    No solution was found; when creating a Linux username in the future, do not use uppercase letters!
    So when you change configuration files, be careful and cautious to avoid problems.


Origin blog.csdn.net/MicoOu/article/details/103293904