This guide series is built on a real cluster, not a pseudo-distributed one, running on three Tencent Cloud servers.
Integrating Hive with HBase
Version compatibility:
Add the following property to hive-site.xml. If ZooKeeper uses the default port 2181, the port can be omitted:
<property>
<name>hbase.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
Copy the required jar files
Copy the following jars from hbase/lib to hive/lib:
- hbase-protocol-xxx.jar
- hbase-server-xxx.jar
- hbase-client-xxx.jar
- hbase-common-xxx.jar
- If matching test jars exist, copy them as well
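The copy step above can be scripted. The following is only a minimal sketch: the function name copy_hbase_jars is hypothetical, and the HBASE_HOME and HIVE_HOME variables in the usage comment are assumptions about where the two products are installed. The glob patterns also pick up test jars whose names start with the same prefixes.

```shell
# Copy the HBase jars listed above into Hive's lib directory.
# Usage: copy_hbase_jars <hbase_lib_dir> <hive_lib_dir>
copy_hbase_jars() {
    src="$1"
    dst="$2"
    for prefix in hbase-protocol hbase-server hbase-client hbase-common; do
        for jar in "$src"/"$prefix"-*.jar; do
            # Skip the unexpanded pattern when no jar matches this prefix.
            [ -e "$jar" ] || continue
            cp "$jar" "$dst"/
        done
    done
}

# Example call (assumes HBASE_HOME and HIVE_HOME are set on this node):
# copy_hbase_jars "$HBASE_HOME/lib" "$HIVE_HOME/lib"
```

Restart the Hive CLI afterwards so that it loads the newly copied jars.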
Create a table in HBase and add test data
create 'student',{NAME=>'info'},{NAME=>'data'}
put 'student','rk0001','info:age','15'
put 'student','rk0001','info:name','zhangsan'
put 'student','rk0002','info:age','15'
put 'student','rk0002','info:name','lisi'
put 'student','rk0003','info:age','18'
put 'student','rk0003','info:name','wanger'
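Because HBase does not fix the columns within a column family, new columns can be added freely afterwards, here in the data family: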
put 'student','rk0001','data:date','11'
put 'student','rk0002','data:date','22'
put 'student','rk0003','data:date','33'
scan 'student'
- Create the mapping in Hive
> CREATE EXTERNAL TABLE hive_student (key string, name string, age string, data string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age,data:date")
> TBLPROPERTIES ("hbase.table.name" = "student");
- Query the table from Hive using HQL (HBase itself does not support it)
select * from hive_student;
select * from hive_student where age >= 16;
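The same queries can also be run from the operating system shell without opening the interactive Hive CLI. This is a small sketch, assuming the hive command is on the PATH of the node you are logged in to. The wrapper name hive_query is hypothetical, while -e and -S are standard Hive CLI options.

```shell
# Run one HiveQL statement without opening the interactive CLI.
# Usage: hive_query "<HiveQL statement>"
hive_query() {
    # -S (silent) hides progress logs; -e executes the quoted statement.
    hive -S -e "$1"
}

# Example calls (uncomment on a node where the hive CLI is available):
# hive_query "select * from hive_student;"
# hive_query "select * from hive_student where age >= 16;"
```

Wrapping the call in a function keeps the quoting in one place, which helps when the statements are later driven from a cron job or another script.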