Setting Up a Hive Cluster on CentOS 7

1. Download Hive

This walkthrough uses apache-hive-2.1.1. Download Hive to the target machine and extract it as needed. Download page: http://archive.apache.org/dist/hive/
Path after extraction:

[root@hadoop-master apache-hive-2.1.1]# pwd
/usr/local/hive/apache-hive-2.1.1
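The download-and-extract step can be sketched as below. The version and install prefix come from this article; the commands that actually touch the network and filesystem are commented out so the sketch has no side effects, and the final rename is an assumption based on the extracted path shown above (the archive unpacks as apache-hive-2.1.1-bin):

```shell
# Version and install prefix assumed from this article; adjust as needed.
HIVE_VERSION=2.1.1
HIVE_TARBALL="apache-hive-${HIVE_VERSION}-bin.tar.gz"
HIVE_URL="http://archive.apache.org/dist/hive/hive-${HIVE_VERSION}/${HIVE_TARBALL}"

# Uncomment on the real machine:
# wget "${HIVE_URL}" -P /tmp
# mkdir -p /usr/local/hive
# tar -xzf "/tmp/${HIVE_TARBALL}" -C /usr/local/hive
# mv "/usr/local/hive/apache-hive-${HIVE_VERSION}-bin" \
#    "/usr/local/hive/apache-hive-${HIVE_VERSION}"
echo "${HIVE_URL}"
```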

2. Install the MySQL Database

Hive can be set up in three ways:

Embedded Derby mode: with Derby storage, running hive creates a derby file and a metastore_db directory in the current working directory. The drawback is that only one Hive client at a time can use the database in a given directory.
Local mode: this requires a MySQL server running on the local machine, configured as shown below (for both MySQL-backed modes, the MySQL driver jar must be copied into $HIVE_HOME/lib).
Remote (multi-user) mode: this requires a MySQL server running on a remote host, and the metastore service must be started on the Hive server.

The three modes ultimately differ only in where the metadata is stored; this article uses the remote (multi-user) mode.
For installing MySQL, see "Installing mysql-5.7.24 on CentOS 7".
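Once MySQL is running, the metastore database can be pre-created. This is a hedged sketch: the database name hive matches the JDBC URL used later in hive-site.xml (which sets createDatabaseIfNotExist=true, so this step is optional), the utf8 charset is an assumption, and the mysql invocation is commented out because it needs a live server and credentials:

```shell
# Optional: pre-create the metastore DB so the charset is explicit.
# Database name matches the JDBC URL in hive-site.xml; charset assumed.
SQL="CREATE DATABASE IF NOT EXISTS hive DEFAULT CHARACTER SET utf8;"
# mysql -uroot -p -e "$SQL"
echo "$SQL"
```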

3. Edit the Configuration File

First go to the directory below and edit hive-site.xml, creating it if it does not exist:

[root@hadoop-master conf]# vi /usr/local/hive/apache-hive-2.1.1/conf/hive-site.xml 

Contents of hive-site.xml:

<?xml version="1.0" encoding="utf-8"?>

<configuration>

 <!-- Location of the warehouse files on HDFS -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/usr/local/hive/warehouse</value>
  </property>

  <!-- JDBC connection URL for the metastore database -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop-master:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <!-- JDBC driver class; the MySQL driver is used here -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <!-- Database username; change to match your own database -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <!-- Database password; change to match your own database -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>

 <property>
   <name>hive.metastore.schema.verification</name>
   <value>false</value>
    <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic
          schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
          proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
    </description>
 </property>

 <property>
     <name>hive.querylog.location</name>
     <value>/usr/local/hive/tmp</value>
 </property>

 <property>
     <name>hive.exec.local.scratchdir</name>
     <value>/usr/local/hive/tmp</value>
 </property>

 <property>
     <name>hive.downloaded.resources.dir</name>
     <value>/usr/local/hive/tmp</value>
 </property>

 <property>
     <name>datanucleus.schema.autoCreateAll</name>
     <value>true</value>
 </property>
</configuration>
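The configuration above points several properties at paths that must exist. A small sketch creating them follows: the hdfs commands are commented out because they need a running cluster, and HIVE_TMP defaults to a /tmp path here only so the sketch can be dry-run; on the cluster node it should be the /usr/local/hive/tmp used in hive-site.xml:

```shell
# Local dir backing hive.querylog.location, hive.exec.local.scratchdir,
# and hive.downloaded.resources.dir. Override HIVE_TMP on the real node:
#   HIVE_TMP=/usr/local/hive/tmp
HIVE_TMP="${HIVE_TMP:-/tmp/hive-scratch}"
mkdir -p "${HIVE_TMP}"

# hive.metastore.warehouse.dir lives on HDFS, not the local filesystem:
# hdfs dfs -mkdir -p /usr/local/hive/warehouse
# hdfs dfs -chmod g+w /usr/local/hive/warehouse
echo "local scratch: ${HIVE_TMP}"
```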

4. Add the MySQL Driver to Hive

Place mysql-connector-java-5.1.30.jar in the following directory:

[root@hadoop-master lib]# pwd
/usr/local/hive/apache-hive-2.1.1/lib

and grant it permissions with chmod 777 mysql-connector-java-5.1.30.jar
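The copy can be scripted as below. The /tmp source path is a hypothetical placeholder for wherever you downloaded the connector, and chmod 644 is used in the sketch since Hive only needs read access (the chmod 777 above also works but grants more than necessary):

```shell
HIVE_HOME=/usr/local/hive/apache-hive-2.1.1
JAR=mysql-connector-java-5.1.30.jar

# Uncomment on the real machine; /tmp/${JAR} is a hypothetical source path.
# cp "/tmp/${JAR}" "${HIVE_HOME}/lib/"
# chmod 644 "${HIVE_HOME}/lib/${JAR}"   # read access is sufficient
echo "${HIVE_HOME}/lib/${JAR}"
```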

5. Add the hive Command to the Environment Variables

[root@hadoop-master bin]# vi /etc/profile

Add HIVE_HOME to the environment variables:

# Java environment variables
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_261
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin

# Hadoop environment variables
export HADOOP_HOME=/usr/local/hadoop/apps/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Hive environment variables
export HIVE_HOME=/usr/local/hive/apache-hive-2.1.1
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/sbin

After adding these, reload the profile:

[root@hadoop-master bin]# source /etc/profile
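After sourcing the profile, a quick sanity check helps before moving on. This is a minimal sketch: check_home is a hypothetical helper, and the directory value passed to it is the one used in this article; on the real machine you would pass the live "$HIVE_HOME" instead of a literal:

```shell
# Report whether a *_HOME-style variable is set; non-zero exit when empty.
check_home() {
  name="$1"; value="$2"
  if [ -z "${value}" ]; then
    echo "${name} is not set" >&2
    return 1
  fi
  echo "${name}=${value}"
}

# Value taken from this article; on the real machine use "${HIVE_HOME}":
check_home HIVE_HOME "/usr/local/hive/apache-hive-2.1.1"
# command -v hive   # should resolve to ${HIVE_HOME}/bin/hive once PATH is set
```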

6. Start Hive

If startup fails with an error like:

MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./sch

follow the steps described at https://blog.csdn.net/beidiqiuren/article/details/53056270
and initialize the metastore by running the following command from Hive's bin directory:

[root@hadoop-master bin]# ./schematool -initSchema -dbType mysql
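The initialize-and-verify flow can be sketched as below. The commands touching MySQL and the metastore are commented out since they need live services; schematool's -info option and the metastore's VERSION table are standard to Hive 2.x, while the root/root credentials are the ones configured in hive-site.xml above:

```shell
HIVE_HOME=/usr/local/hive/apache-hive-2.1.1

# Initialize the metastore schema (requires MySQL to be reachable):
# "${HIVE_HOME}/bin/schematool" -initSchema -dbType mysql

# Verify: schematool can report the schema it created, and the hive
# database in MySQL should now contain the metastore tables:
# "${HIVE_HOME}/bin/schematool" -info -dbType mysql
# mysql -uroot -proot hive -e "SELECT SCHEMA_VERSION FROM VERSION;"
echo "schematool -initSchema -dbType mysql"
```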

Finally, start Hive:

[root@hadoop-master bin]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/jdk/jdk1.8.0_261/bin:/usr/local/hadoop/apps/hadoop-2.7.3/bin:/usr/local/hadoop/apps/hadoop-2.7.3/sbin::/root/bin:/usr/local/jdk/jdk1.8.0_261/bin:/usr/local/hadoop/apps/hadoop-2.7.3/bin:/usr/local/hadoop/apps/hadoop-2.7.3/sbin::/usr/local/hive/apache-hive-2.1.1/bin:/usr/local/hive/apache-hive-2.1.1/sbin:)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/apache-hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/apps/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/hive/apache-hive-2.1.1/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 

The other machines in the cluster only need the Hive environment variables configured.


Reprinted from blog.csdn.net/u011047968/article/details/108679838