Hive on Spark Installation Guide

Hive installation:

  • Download and extract. First check the Spark version declared in the Hive source's pom.xml on GitHub to decide whether the Hive release matches your Spark.
The version correspondence is as follows:
spark    hive
2.3.0    3.0/3.1
2.0.0    2.3.0/2.3.1/2.3.2/2.3.3/2.3.4/2.3.5/2.3.6
1.6.0    2.2.0
If the two share the same version line (e.g. both are 1.6.x, 2.3.x, or 2.0.x), the corresponding Hive release can be used.
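As a sanity check, the table above can be turned into a small shell helper (illustrative only; `spark_to_hive` is a name made up here, not part of any tool):

```shell
# Map a Spark version line to the Hive releases built against it,
# per the compatibility table above. Unknown versions return nothing.
spark_to_hive() {
  case "$1" in
    2.3.*) echo "3.0 3.1" ;;
    2.0.*) echo "2.3.0 2.3.1 2.3.2 2.3.3 2.3.4 2.3.5 2.3.6" ;;
    1.6.*) echo "2.2.0" ;;
    *)     return 1 ;;
  esac
}

spark_to_hive 2.3.0   # → 3.0 3.1
```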
  • Download
My Spark is 2.3.0, so Hive 3.0/3.1 works:
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
sudo tar xzvf apache-hive-3.1.2-bin.tar.gz
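To check which Spark version a Hive release was built against, you can grep the `spark.version` property out of its pom.xml. Below, the pom content is stubbed with a here-document so the snippet is self-contained; in practice you would run the same `sed` against the real pom.xml from the Hive source tree (the value shown matches Hive 3.1.x, which declares Spark 2.3.0):

```shell
# Stand-in for the real pom.xml; Hive 3.1.x declares <spark.version>2.3.0</spark.version>.
cat > /tmp/hive-pom-sample.xml <<'EOF'
<properties>
  <spark.version>2.3.0</spark.version>
  <scala.binary.version>2.11</scala.binary.version>
</properties>
EOF

# Extract the version string between the spark.version tags.
sed -n 's/.*<spark.version>\(.*\)<\/spark.version>.*/\1/p' /tmp/hive-pom-sample.xml   # → 2.3.0
```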

Configuration files

Hadoop-related Hive configuration

  • vim hive-env.sh
HADOOP_HOME=/usr/hadoop273
export HIVE_CONF_DIR=/usr/hive/conf
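It is worth confirming the two paths before moving on; a quick check (the paths are the ones used above, adjust to your own layout):

```shell
# Paths used earlier in this guide; adjust to your own layout.
export HADOOP_HOME=/usr/hadoop273
export HIVE_CONF_DIR=/usr/hive/conf

# Sanity-check before launching Hive: both directories should exist.
for d in "$HADOOP_HOME" "$HIVE_CONF_DIR"; do
  [ -d "$d" ] || echo "warning: $d does not exist" >&2
done
echo "HADOOP_HOME=$HADOOP_HOME"
```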

Hive metastore configuration:

  • vim ${HIVE_HOME}/conf/hive-site.xml
<configuration>
  <!-- Hive Execution Parameters -->

    <!-- Configure MySQL as the metastore database -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>MySQL_userName</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>MySQL_passWord</value>
    </property>
   <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://MySQL_HOSTNAME:3306/localhive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>

    <property>
       <!-- Skip schema verification; otherwise Hive fails with: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient -->
       <name>hive.metastore.schema.verification</name>
       <value>false</value>
   </property>

    <property>
      <name>hive.exec.script.wrapper</name>
      <value/>
      <description/>
    </property>

  </configuration>
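One setting the file above does not show: for Hive to actually submit queries to Spark ("Hive on Spark") rather than the default MapReduce engine, the execution engine must be switched in hive-site.xml. A minimal fragment (the master URL is a placeholder for your own cluster):

```xml
<!-- Run Hive queries on Spark instead of the default MapReduce engine. -->
<property>
    <name>hive.execution.engine</name>
    <value>spark</value>
</property>
<!-- spark.* properties set here are passed through to Spark;
     replace master-host with your Spark master. -->
<property>
    <name>spark.master</name>
    <value>spark://master-host:7077</value>
</property>
```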

Copy the MySQL driver onto the Spark and Hive classpaths; remove Hive-related dependencies from Spark

  • Copy the MySQL JDBC driver into the Spark and Hive classpaths
cp  ~/.m2/repository/mysql/mysql-connector-java/5.1.47/mysql-connector-java-5.1.47.jar  ${HIVE_HOME}/lib
cp  ~/.m2/repository/mysql/mysql-connector-java/5.1.47/mysql-connector-java-5.1.47.jar  ${SPARK_HOME}/jars
Hive also needs a few Scala/Spark jars on its classpath (use the versions matching your Spark):
cp  scala-library-2.11.8.jar  ${HIVE_HOME}/lib/
cp  spark-core_2.11-2.0.2.jar  ${HIVE_HOME}/lib/
cp  spark-network-common_2.11-2.0.2.jar  ${HIVE_HOME}/lib/
  • Move Hive-related jars out of the Spark classpath (move them back when they are needed again)
mv ${SPARK_HOME}/jars/spark-hive*   ${SPARK_HOME}
mv ${SPARK_HOME}/jars/hive*   ${SPARK_HOME}
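The move-aside step can be rehearsed safely in a scratch directory before touching the real installation; this sketch uses dummy jar files under a temporary directory (the jar names are made up for illustration):

```shell
# Rehearse the jar shuffle with dummy files in a scratch directory.
SPARK_HOME=$(mktemp -d)
mkdir -p "$SPARK_HOME/jars"
touch "$SPARK_HOME/jars/spark-hive_2.11-2.3.0.jar" \
      "$SPARK_HOME/jars/hive-exec-1.2.1.spark2.jar" \
      "$SPARK_HOME/jars/spark-core_2.11-2.3.0.jar"

# Move Hive-related jars out of the classpath (same globs as above).
mv "$SPARK_HOME"/jars/spark-hive* "$SPARK_HOME"/
mv "$SPARK_HOME"/jars/hive*       "$SPARK_HOME"/

ls "$SPARK_HOME/jars"   # only the spark-core jar remains
```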
  • Initialize the metastore schema
${HIVE_HOME}/bin/schematool -dbType mysql -initSchema

Start HDFS, Spark, and Hive

  • Start the HDFS cluster
start-dfs.sh
  • Start the Spark cluster
${SPARK_HOME}/sbin/start-all.sh
  • Log in to the Hive CLI
${HIVE_HOME}/bin/hive

SQL operations

  • Create a Hive database
CREATE DATABASE IF NOT EXISTS demo COMMENT 'practice demo' LOCATION '/jerry/hive/dw/geo'
 WITH DBPROPERTIES('purpose'='stage');

use demo;
  • Create a Hive table
create table course
 (`Cno` string comment 'course number',
  `Cname` string comment 'course name',
  `Tno` string comment 'teacher number'
 ) row format delimited fields terminated by ',' lines terminated by '\n';
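Data files loaded into a table like this must match the declared row format: one record per line, fields separated by the declared delimiter. A quick way to sanity-check a comma-delimited data file before loading (the file name and contents are illustrative):

```shell
# Build a sample data file matching the table's row format:
# fields terminated by ',', lines terminated by '\n'.
cat > /tmp/course.csv <<'EOF'
3-105,Introduction to Computing,835
3-245,Digital Circuits,856
EOF

# Every line should have exactly 3 comma-separated fields.
awk -F',' 'NF != 3 { bad++ } END { print (bad ? "malformed" : "ok") }' /tmp/course.csv   # → ok
```

In Hive such a file would then be loaded with `LOAD DATA LOCAL INPATH '/tmp/course.csv' INTO TABLE course;`.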

  • Insert data
insert into course values ('3-105','Introduction to Computing','835'), ('3-245','Digital Circuits','856');


Reposted from blog.csdn.net/dymkkj/article/details/103235730