Installing and Configuring Hive 2.1.1

At work we actually use version 0.13.0, but I saw 2.1.1 on the official site and wanted to try it out first; I haven't yet looked into what exactly differs between the two versions.
First, the ways Hive can be installed:
1. Embedded mode: metadata is stored in a Derby database (this is the default), but it has the drawback that only one Hive client can connect at a time.
2. Local mode: metadata is stored in a local MySQL database.
3. Remote mode: MySQL and Hive run independently of each other.
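
The three modes come down to a few hive-site.xml properties. As a rough sketch (values illustrative, not taken from this install): embedded mode keeps the default Derby connection URL, while remote mode points clients at a standalone metastore service:

```xml
<!-- embedded mode (default): Derby metastore, one client at a time -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>

<!-- remote mode: clients connect to a separate metastore service -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master:9083</value>
</property>
```

Local mode looks like remote mode on the JDBC side (a MySQL ConnectionURL) but without a separate metastore service.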

In a company the third mode is the usual choice: first install Hive in remote mode and start the metastore service (its interface address is thrift://localhost:9083),

then hand each department its own Hive client, each configured with just that metastore address.

You also need to configure HADOOP_HOME; I usually put it straight into ~/.bashrc. Below is the full configuration from my machine; adjust the paths for your own install directories.

vi ~/.bashrc

ELASTICSEARCH_HOME=/home/qun/soft/elasticsearch-2.3.4
OPENRESTY=/home/qun/nginx/openresty
FLUME_HOME=/home/qun/apache-flume-1.6.0-bin
export FLUME_CONF_DIR=/home/qun/apache-flume-1.6.0-bin/conf
JAVA_HOME=/home/qun/soft/jdk1.8.0_91
SOLR_HOME=/home/qun/solr/solr-6.0.0
SCALA_HOME=/home/qun/scala-2.11.8
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
SPARK_HOME=/home/qun/spark
#HADOOP_HOME=/home/qun/hadoop-2.6.0
HADOOP_HOME=/home/qun/soft/hadoop-2.8.0
MAVEN_HOME=/home/qun/apache-maven-3.3.9
STORM_HOME=/home/qun/apache-storm-0.9.3
HIVE_HOME=/home/qun/soft/apache-hive-2.1.1-bin
KAFKA_HOME=/home/qun/kafka_2.11-0.10.0.0
ZOOKEEPER_HOME=/home/qun/zookeeper-3.4.6
export JAVA_HOME PATH CLASSPATH
export PATH=$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH:$SQOOP_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$PRESTO_HOME/bin:$HBASE_HOME/bin:$SCALA_HOME/bin:$STORM_HOME/bin:$MAVEN_HOME/bin:$KAFKA_HOME/bin:$ZOOKEEPER_HOME/bin:$SOLR_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$SOLR_HOME/bin:$FLUME_HOME/bin:$OPENRESTY/nginx/sbin:$ELASTICSEARCH_HOME/bin
export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib

After editing, remember to reload the configuration: source ~/.bashrc

First pick a machine, master, to install Hive on and run the metastore (only one metastore service is needed; many clients can connect to it).
Download the Hive tarball:

wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
tar -zxvf apache-hive-2.1.1-bin.tar.gz
cd apache-hive-2.1.1-bin/conf
mv hive-default.xml.template hive-site.xml
vi hive-site.xml
<property>
  <name>system:java.io.tmpdir</name>
  <value>/home/qun/soft/apache-hive-2.1.1-bin/tmpdir</value>
  <description/>
</property>
<property>
  <name>system:user.name</name>
  <value>hive</value>
  <description/>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>presto</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://master:3306/hive3?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>

Copy the MySQL JDBC driver jar into $HIVE_HOME/lib:

cp mysql-connector-java-5.1.32.jar /home/qun/soft/apache-hive-2.1.1-bin/lib/
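
Before initializing the schema, the MySQL server also needs the database and account that hive-site.xml above refers to (hive3 / presto / 123456). Here is a sketch of the bootstrap SQL, kept in a shell variable so you can inspect it before running anything (CREATE USER IF NOT EXISTS requires MySQL 5.7+; on older servers grant privileges with an IDENTIFIED BY clause instead):

```shell
# SQL matching the javax.jdo.option.* values configured above; the database
# itself would also be auto-created thanks to createDatabaseIfNotExist=true.
BOOTSTRAP_SQL="CREATE DATABASE IF NOT EXISTS hive3;
CREATE USER IF NOT EXISTS 'presto'@'%' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON hive3.* TO 'presto'@'%';
FLUSH PRIVILEGES;"
echo "$BOOTSTRAP_SQL"
```

Feed it to the server as an admin user, e.g. `echo "$BOOTSTRAP_SQL" | mysql -u root -p`; after the -initSchema step below, `bin/schematool -dbType mysql -info` should report the schema version.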

Initialize the metastore schema in MySQL:

bin/schematool -dbType mysql -initSchema

Test Hive (here the CLI queries metadata directly from the MySQL database):

bin/hive
hive> show databases;
OK
default
Time taken: 1.399 seconds, Fetched: 1 row(s)
hive> create database test;
OK
Time taken: 0.429 seconds
hive> use test;
OK
Time taken: 0.037 seconds
hive> create table t(line string);
OK
Time taken: 0.48 seconds

[qun@master tmpdir]$ hadoop dfs -ls /user/hive/warehouse/test.db
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 1 items
drwxr-xr-x   - qun supergroup          0 2017-06-17 21:36 /user/hive/warehouse/test.db/t

hive> load data local inpath '/home/qun/soft/apache-hive-2.1.1-bin/conf/hive-site.xml' into table t;
Loading data to table test.t
OK
Time taken: 0.628 seconds

Start the metastore service; it listens on port 9083 by default:

./hive --service metastore &
[qun@master bin]$ netstat -anpl|grep 9083
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:9083                0.0.0.0:*                   LISTEN      8540/java

Configuring a client

Pick another machine, slave2, for the Hive client: copy the earlier apache-hive-2.1.1-bin.tar.gz to slave2, unpack it, and configure hive.metastore.uris
so the client reaches the metadata through thrift://master:9083:

vi hive-site.xml
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://master:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <property>
    <name>system:java.io.tmpdir</name>
    <value>/home/qun/soft/apache-hive-2.1.1-bin/tmpdir</value>
    <description/>
  </property>
  <property>
    <name>system:user.name</name>
    <value>hive</value>
    <description/>
  </property>

Test the client (it fetches metadata through the metastore service running on master):

hive> show tables;
OK
Time taken: 1.351 seconds
hive> use test;
OK
Time taken: 0.055 seconds
hive> show tables;
OK
t
Time taken: 0.048 seconds, Fetched: 1 row(s)
hive> select coutn(*) from t;
FAILED: SemanticException [Error 10011]: Invalid function coutn
hive> select count(*) from t;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = qun_20170617222212_d724486b-70a1-4937-b7eb-e72183721c3f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1497708218417_0002, Tracking URL = http://master:8888/proxy/application_1497708218417_0002/
Kill Command = /home/qun/soft/hadoop-2.8.0/bin/hadoop job  -kill job_1497708218417_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-06-17 22:22:33,632 Stage-1 map = 0%,  reduce = 0%
2017-06-17 22:22:45,680 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
2017-06-17 22:22:56,668 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.55 sec
MapReduce Total cumulative CPU time: 4 seconds 550 msec
Ended Job = job_1497708218417_0002
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.55 sec   HDFS Read: 466382 HDFS Write: 105 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 550 msec
OK
10718
Time taken: 45.346 seconds, Fetched: 1 row(s)

Normally the metastore and MySQL each run on a dedicated node, and every other client only needs hive.metastore.uris configured to be able to run Hive SQL.

If you need to connect to Hive over JDBC, you must also start HiveServer or HiveServer2; I'll cover configuring and using HiveServer and HiveServer2 in a follow-up post.
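
As a quick preview, HiveServer2 listens on port 10000 by default, and JDBC clients connect with a URL of the following shape (the host and database names here are just the examples used in this post):

```shell
# Assemble the JDBC URL a client would use; master/default are examples.
HS2_HOST=master
HS2_PORT=10000
JDBC_URL="jdbc:hive2://${HS2_HOST}:${HS2_PORT}/default"
echo "$JDBC_URL"
```

Once `hive --service hiveserver2 &` is running on the server, the bundled beeline client can connect with `beeline -u "$JDBC_URL" -n qun`.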

Reprinted from blog.csdn.net/woloqun/article/details/80639440