MongoDB (3): Reading MongoDB Data from Hive
Note: the users data is the same as the users collection in MongoDB (2).
1 Prepare the jar files
mongo-hadoop-core-2.0.2.jar;
mongo-hadoop-hive-2.0.2.jar;
mongo-java-driver-3.9.1.jar;
mongodb-driver-3.9.1.jar;
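Before uploading, it can help to confirm that all four jar files are actually present. The following is a minimal sketch, assuming the jars sit in the current directory; the function name count_missing_jars is hypothetical and not part of any tool.

```shell
# Count how many of the four required jars are missing from a directory.
# Hypothetical helper: prints the number of missing jars on stdout
# and the name of each missing file on stderr.
count_missing_jars() {
  dir=$1
  missing=0
  for jar in mongo-hadoop-core-2.0.2.jar mongo-hadoop-hive-2.0.2.jar \
             mongo-java-driver-3.9.1.jar mongodb-driver-3.9.1.jar; do
    if [ ! -f "$dir/$jar" ]; then
      echo "missing: $dir/$jar" >&2
      missing=$((missing + 1))
    fi
  done
  echo "$missing"
}

# Check the current directory (assumed location of the downloaded jars).
count_missing_jars .
```

A result of 0 means every jar listed above was found.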
2 Upload the jar files to HDFS
hdfs dfs -mkdir /user/mongo
hdfs dfs -put mongo*.jar /user/mongo
hdfs dfs -ls /user/mongo
3 Read and write data between Hive and MongoDB
3.1 Start a Hive session
beeline -u 'jdbc:hive2://localhost:10000' '' ''
3.2 Add the jar files from HDFS to the Hive session
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongo-hadoop-core-2.0.2.jar;
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongo-hadoop-hive-2.0.2.jar;
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongodb-driver-3.9.1.jar;
add jar hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo/mongo-java-driver-3.9.1.jar;
3.3 Create the external table users and read the users data from MongoDB
drop table if exists demo.users;
create external table demo.users(object_id STRING,
user_id STRING,
locale STRING,
birthyear INT,
gender STRING,
joined_at STRING,
location STRING,
time_zone STRING
)
stored by 'com.mongodb.hadoop.hive.MongoStorageHandler'
with serdeproperties('mongo.columns.mapping'='{"object_id":"_id","user_id":"user_id","locale":"locale","birthyear":"birthyear","gender":"gender","joined_at":"joinedAt","location":"location","time_zone":"timezone"}')
tblproperties('mongo.uri'='mongodb://192.168.30.1:27017/events_db.users');
192.168.30.1:27017 is the IP address and port of the virtual machine running my MongoDB instance; remember to replace it with your own.
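The value of mongo.columns.mapping is a JSON object whose keys are Hive column names and whose values are MongoDB field names. Writing it by hand is error-prone (a misplaced brace or a key that does not match a Hive column is easy to miss), so one option is to build the string from a list of pairs. This is only a sketch: the pairs variable and its "hive_column:mongo_field" format are illustrative, and the MongoDB field names joinedAt and timezone are assumed from the statement above.

```shell
# Build the mongo.columns.mapping JSON from "hive_column:mongo_field" pairs.
# Assumption: the MongoDB fields are named as in the create table statement above.
pairs='object_id:_id user_id:user_id locale:locale birthyear:birthyear gender:gender joined_at:joinedAt location:location time_zone:timezone'

mapping=''
for pair in $pairs; do
  hive_col=${pair%%:*}      # text before the first colon
  mongo_field=${pair#*:}    # text after the first colon
  # Append "hive_col":"mongo_field", adding a comma between entries.
  mapping="${mapping:+$mapping,}\"$hive_col\":\"$mongo_field\""
done
mapping="{$mapping}"

echo "$mapping"
```

The printed string can be pasted between the single quotes after 'mongo.columns.mapping'=.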
3.4 After the statement succeeds, check the users table in Hive
use demo;
show tables;
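Listing the tables only confirms that the table definition exists. To confirm that rows are really read from MongoDB, a query has to run against the table. Because add jar only applies to the current session, a fresh session needs the jars again. The sketch below writes both the add jar statements and a sample query to one file and runs it with beeline -f. The file name check_users.sql and the sample query are hypothetical; the HDFS address and JDBC URL are copied from the steps above.

```shell
# Assumptions: same NameNode address, HDFS directory, and JDBC URL as above.
HDFS_DIR='hdfs://sandbox-hdp.hortonworks.com:8020/user/mongo'
SQL_FILE='check_users.sql'   # hypothetical file name

# Write the session setup (add jar) and a small sample query to one file.
{
  for jar in mongo-hadoop-core-2.0.2.jar mongo-hadoop-hive-2.0.2.jar \
             mongodb-driver-3.9.1.jar mongo-java-driver-3.9.1.jar; do
    echo "add jar ${HDFS_DIR}/${jar};"
  done
  echo 'select user_id, locale, birthyear, gender from demo.users limit 5;'
} > "$SQL_FILE"

# Run the file only if beeline is installed on this machine.
if command -v beeline >/dev/null 2>&1; then
  beeline -u 'jdbc:hive2://localhost:10000' -f "$SQL_FILE"
else
  echo "beeline not found; run the statements in $SQL_FILE from a Hive session"
fi
```

If the session from step 3.1 is still open, the select statement alone can be typed there instead.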
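Note: because this is an external table, the data remains stored in MongoDB. The users collection must exist in the events_db database in MongoDB; otherwise queries return no data.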
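One way to check the MongoDB side is to count the documents in the collection directly. This is a rough sketch, assuming the legacy mongo shell is installed (newer installations ship mongosh instead) and using the host, port, and database from the mongo.uri above.

```shell
# Assumptions: host, port, and database are copied from the mongo.uri above;
# the legacy `mongo` shell is available on this machine.
MONGO_HOST='192.168.30.1'
MONGO_PORT='27017'
MONGO_DB='events_db'
COUNT_JS='print(db.users.count())'

if command -v mongo >/dev/null 2>&1; then
  # Print the number of documents in events_db.users.
  mongo --quiet --host "$MONGO_HOST" --port "$MONGO_PORT" "$MONGO_DB" --eval "$COUNT_JS"
else
  echo "mongo shell not found; run in a MongoDB shell on $MONGO_DB: $COUNT_JS"
fi
```

A count of 0 means the collection is empty or does not exist, in which case the Hive table will also return no rows.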