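Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. If you repost it, please credit the source: https://blog.csdn.net/l1028386804/article/details/88622207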
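Here is an example that uses a SerDe to extract a few fields from JSON data, which we assume comes from some messaging system.
The SerDe does not need to parse every field in the JSON; only the fields declared below are parsed, and each of them becomes a column of the table.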
create external table message(
id bigint,
create_at string,
text string,
user_info map<string, string>
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
location '/data/message';
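To make the "only the declared fields become columns" idea concrete, here is a minimal Python sketch of that projection. It is not the JsonSerDe implementation: the function name `project`, the extra `source` key, and the rule that a missing column comes back as `None` (standing in for Hive's NULL) are assumptions made for illustration.

```python
import json

# Column names mirror the message table above; everything else is illustrative.
COLUMNS = ("id", "create_at", "text", "user_info")

def project(line, columns=COLUMNS):
    """Parse one JSON line and keep only the declared columns.

    Keys that are not declared are dropped; declared columns that are
    missing from the record are returned as None.
    """
    record = json.loads(line)
    return {name: record.get(name) for name in columns}

# "source" is a hypothetical extra field that the table does not declare.
row = project(
    '{"id": 1, "create_at": "20190317", "text": "Hello", '
    '"source": "web", "user_info": {"id": "1", "name": "binghe01"}}'
)
print(row)
```

The undeclared `source` key never reaches the result, which is the behavior the table definition relies on.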
Next, we define the data. Each record is a complete JSON object on a single line:
{"id":1, "create_at":"20190317", "text":"你好", "user_info":{"id":"1","name":"binghe01"}}
{"id":2, "create_at":"20190317", "text":"Hello", "user_info":{"id":"2","name":"binghe02"}}
We save the data above as json.txt and upload it to the /usr/local/src/ directory on the server.
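If you would rather generate the file than type it by hand, the following is a minimal sketch, assuming Python 3 is available. The helper name `write_json_lines` is hypothetical, and the file is written to the current directory rather than /usr/local/src/.

```python
import json

# The two sample records from this article.
records = [
    {"id": 1, "create_at": "20190317", "text": "你好",
     "user_info": {"id": "1", "name": "binghe01"}},
    {"id": 2, "create_at": "20190317", "text": "Hello",
     "user_info": {"id": "2", "name": "binghe02"}},
]

def write_json_lines(path, rows):
    """Write one JSON object per line, encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(row, ensure_ascii=False) + "\n")

write_json_lines("json.txt", records)
```

Writing each object on its own line keeps every record self-contained, which is the layout used by the sample data above.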
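Then we run the following in the Hive CLI. The add jar statement puts the hive-hcatalog-core jar, which contains JsonSerDe, on the session classpath; adjust the path to match your Hive installation. If the create table statement above fails because the SerDe class cannot be loaded, try running add jar before it.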
hive> load data local inpath '/usr/local/src/json.txt' into table message;
hive> add jar /usr/local/hive-2.3.4/lib/hive-hcatalog-core-2.3.4.jar;
hive> select * from message;
OK
1 20190317 你好 {"id":"1","name":"binghe01"}
2 20190317 Hello {"id":"2","name":"binghe02"}
hive> select user_info["id"], user_info["name"] from message;
OK
1 binghe01
2 binghe02
Time taken: 0.415 seconds, Fetched: 2 row(s)
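The last query reads individual keys out of the user_info map. As a rough illustration of that lookup outside Hive, here is a small Python sketch. The function name `map_lookup` and the `email` key are hypothetical; returning `None` for an absent key mirrors Hive returning NULL when a map has no such key.

```python
import json

def map_lookup(line, key):
    """Return user_info[key] for one JSON line, or None if the key is absent."""
    user_info = json.loads(line).get("user_info") or {}
    return user_info.get(key)

lines = [
    '{"id":1, "create_at":"20190317", "text":"你好", "user_info":{"id":"1","name":"binghe01"}}',
    '{"id":2, "create_at":"20190317", "text":"Hello", "user_info":{"id":"2","name":"binghe02"}}',
]

for line in lines:
    print(map_lookup(line, "id"), map_lookup(line, "name"))
# prints:
# 1 binghe01
# 2 binghe02
```

Looking up a key that is not in the map, for example `map_lookup(lines[0], "email")`, returns `None` instead of raising an error.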