MongoDB (Part 3): Reading MongoDB Data from Hive

Note: the data in the users collection is the same as the users collection used in MongoDB (Part 2).

1 Prepare the jar files

mongo-hadoop-core-2.0.2.jar;
mongo-hadoop-hive-2.0.2.jar;
mongo-java-driver-3.9.1.jar;
mongodb-driver-3.9.1.jar;

2 Upload the jar files to HDFS

hdfs dfs -mkdir /user/mongo
hdfs dfs -put mongo*.jar /user/mongo
hdfs dfs -ls /user/mongo

3 Read and write data between Hive and MongoDB
3.1 Start Hive

beeline -u 'jdbc:hive2://localhost:10000' '' ''

3.2 Add the jar files from HDFS to Hive

add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongo-hadoop-core-2.0.2.jar;
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongo-hadoop-hive-2.0.2.jar;
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongodb-driver-3.9.1.jar;
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongo-java-driver-3.9.1.jar;
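The four add jar statements differ only in the jar file name, so they are easy to generate when the NameNode address or directory changes. The following is a minimal sketch, not part of the original article: the helper name add_jar_statements is hypothetical, and the NameNode URI and directory are simply the values used in this post.

```python
# Hypothetical helper: builds Hive "add jar" statements for jars stored on HDFS.
# NAMENODE and JAR_DIR are the values used in this post; change them for your cluster.
NAMENODE = "hdfs://sandbox-hdp.hortonworks.com:8020"
JAR_DIR = "/user/mongo"
JARS = [
    "mongo-hadoop-core-2.0.2.jar",
    "mongo-hadoop-hive-2.0.2.jar",
    "mongodb-driver-3.9.1.jar",
    "mongo-java-driver-3.9.1.jar",
]


def add_jar_statements(namenode, jar_dir, jars):
    """Return one 'add jar <uri>;' statement per jar file name."""
    base = namenode.rstrip("/") + "/" + jar_dir.strip("/")
    return [f"add jar {base}/{name};" for name in jars]


if __name__ == "__main__":
    # Print the statements so they can be pasted into beeline.
    for stmt in add_jar_statements(NAMENODE, JAR_DIR, JARS):
        print(stmt)
```

The printed lines can be pasted into the beeline session as they are.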

3.3 Create the external table users and read the users collection data from MongoDB

drop table if exists demo.users;
create external table demo.users(object_id STRING,
 user_id STRING,
 locale STRING,
 birthyear INT,
 gender STRING,
 joined_at STRING,
 location STRING,
 time_zone STRING
)
 stored by 'com.mongodb.hadoop.hive.MongoStorageHandler'
 with serdeproperties('mongo.columns.mapping'='{"object_id":"_id","user_id":"user_id","locale":"locale","birthyear":"birthyear","gender":"gender","joined_at":"joinedAt","location":"location","time_zone":"timezone"}')
 tblproperties('mongo.uri'='mongodb://192.168.30.1:27017/events_db.users');

192.168.30.1:27017 is the IP address and port of the virtual machine that runs my MongoDB instance. Remember to change it to your own.
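The value of mongo.columns.mapping is a JSON object that maps each Hive column name (key) to a MongoDB field name (value), so a misplaced brace or a key that does not match a Hive column is an easy mistake to make. The sketch below is one way to check a mapping before pasting it into the DDL. It assumes the MongoDB fields are named joinedAt and timezone, as the mapping values in this post suggest; check_mapping is a hypothetical helper, not part of mongo-hadoop.

```python
import json

# Hive column names of demo.users, in the order used by the create table statement.
HIVE_COLUMNS = ["object_id", "user_id", "locale", "birthyear",
                "gender", "joined_at", "location", "time_zone"]

# Mapping from Hive column name to MongoDB field name.
# Assumption: the MongoDB documents use the field names joinedAt and timezone.
MAPPING = ('{"object_id":"_id","user_id":"user_id","locale":"locale",'
           '"birthyear":"birthyear","gender":"gender","joined_at":"joinedAt",'
           '"location":"location","time_zone":"timezone"}')


def check_mapping(mapping_text, hive_columns):
    """Parse the mapping and report mismatches against the Hive columns.

    Returns (unmapped, unknown): Hive columns with no mapping entry, and
    mapping keys that are not Hive columns. Malformed JSON raises
    json.JSONDecodeError.
    """
    mapping = json.loads(mapping_text)
    if not isinstance(mapping, dict):
        raise ValueError("the mapping must be a JSON object")
    unmapped = [c for c in hive_columns if c not in mapping]
    unknown = [k for k in mapping if k not in hive_columns]
    return unmapped, unknown


if __name__ == "__main__":
    print(check_mapping(MAPPING, HIVE_COLUMNS))
```

Two empty lists mean that every Hive column has an entry and that no key is misspelled.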

3.4 After the statement succeeds, check the users table in Hive

use demo;
show tables;

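Note: because this is an external table, the data still lives in MongoDB. The users collection must exist in the events_db database in MongoDB; otherwise queries return no data.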
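The database and collection that Hive reads come from the last part of mongo.uri, written as database.collection. As a rough sketch, the URI can be split to confirm which host, database, and collection the table points at. split_mongo_uri is a hypothetical helper that only handles the simple single-host form used in this post, without credentials or options.

```python
from urllib.parse import urlparse


def split_mongo_uri(uri):
    """Split 'mongodb://host:port/database.collection' into its parts.

    Only covers the simple single-host form used in this post.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "mongodb":
        raise ValueError("expected a mongodb:// URI")
    namespace = parsed.path.lstrip("/")
    database, sep, collection = namespace.partition(".")
    if not sep or not database or not collection:
        raise ValueError("expected the path to be database.collection")
    return parsed.hostname, parsed.port, database, collection


if __name__ == "__main__":
    print(split_mongo_uri("mongodb://192.168.30.1:27017/events_db.users"))
```

If the collection part does not name an existing collection, the Hive table is still created but queries come back empty, which matches the note above.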

Reposted from blog.csdn.net/weixin_45568892/article/details/105293405