Hive: how do you specify delimiters in CREATE TABLE when map, array, and struct columns coexist?

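      Hive has three complex types: map, array, and struct. When one table contains all three, how should the delimiters be specified in the CREATE TABLE statement?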
      

I. The answer first

      Here is the statement:

create table test(
	name string,
	friends array<string>,
	children map<string, int>,
	address struct<street:string, city:string>
)
row format delimited
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';

      What each clause means:

row format delimited fields terminated by ','   /* column (field) delimiter */
collection items terminated by '_'         /* delimiter between the elements of a MAP, STRUCT, or ARRAY */
map keys terminated by ':'    /* delimiter between a MAP key and its value */
lines terminated by '\n';       /* row (line) delimiter */

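      Only two of these need further explanation:
      ①. collection items terminated by '_': in Hive, the elements of a map, an array, and a struct are all delimited by the single collection items terminated by clause, so the three types have to share one delimiter.
      ②. lines terminated by '\n': this clause can be omitted, because the row delimiter already defaults to \n, and \n is the only value Hive accepts here.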
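
      To make the nesting of the three delimiters concrete, here is a minimal Python sketch of how one text row could be split. It only mimics the behaviour described above and is not Hive's actual SerDe code; the function name parse_row and the sample row are made up for illustration.

```python
# Illustrative sketch only: mimics how the delimiters nest, not Hive's real SerDe.
FIELD_DELIM = ","       # fields terminated by ','
COLLECTION_DELIM = "_"  # collection items terminated by '_'
MAP_KEY_DELIM = ":"     # map keys terminated by ':'


def parse_row(line):
    """Split one text row into (name, friends, children, address)."""
    name, friends_raw, children_raw, address_raw = line.rstrip("\n").split(FIELD_DELIM)

    # array<string>: items are separated by the collection delimiter.
    friends = friends_raw.split(COLLECTION_DELIM)

    # map<string, int>: entries are separated by the collection delimiter,
    # then each key and value by the map-key delimiter.
    children = {}
    for entry in children_raw.split(COLLECTION_DELIM):
        key, value = entry.split(MAP_KEY_DELIM)
        children[key] = int(value)

    # struct<street, city>: members use the same collection delimiter,
    # in the order in which they were declared.
    street, city = address_raw.split(COLLECTION_DELIM)
    address = {"street": street, "city": city}

    return name, friends, children, address


# Hypothetical sample row, chosen only to show the splitting.
print(parse_row("a,b_c,k1:1_k2:2,s_t"))
```

      Because the array, the map entries, and the struct members are all split on the same character, that character must not appear inside any of the values themselves.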

II. An example

      Suppose the following data needs to be loaded into a Hive table:

{
	"name": "张三",
	"friends": ["李四" , "王五"] , // list: Array
	"children": { // key-value pairs: Map
		"小李四": 18 ,
		"小王五": 19
	},
	"address": { // structure: Struct
		"street": "大兴" ,
		"city": "北京"
	}
}
  1. First, flatten it into a single row:
张三,李四_王五,小李四:18_小王五:19,大兴_北京

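      Note the delimiters: ',' separates the columns, '_' separates the array items, the map entries, and the struct members, and ':' separates each map key from its value.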
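
      If there are many records, flattening them by hand gets tedious. The sketch below shows one possible way to build the row in Python; the helper name to_delimited_row is hypothetical, and it assumes the values never contain any of the three delimiter characters.

```python
FIELD_DELIM = ","       # fields terminated by ','
COLLECTION_DELIM = "_"  # collection items terminated by '_'
MAP_KEY_DELIM = ":"     # map keys terminated by ':'

record = {
    "name": "张三",
    "friends": ["李四", "王五"],
    "children": {"小李四": 18, "小王五": 19},
    "address": {"street": "大兴", "city": "北京"},
}


def to_delimited_row(rec):
    """Flatten one record into the text row expected by the table definition."""
    friends = COLLECTION_DELIM.join(rec["friends"])
    children = COLLECTION_DELIM.join(
        f"{key}{MAP_KEY_DELIM}{value}" for key, value in rec["children"].items()
    )
    # Struct members are written in declaration order: street, then city.
    address = COLLECTION_DELIM.join([rec["address"]["street"], rec["address"]["city"]])
    return FIELD_DELIM.join([rec["name"], friends, children, address])


line = to_delimited_row(record)
print(line)  # 张三,李四_王五,小李四:18_小王五:19,大兴_北京
```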

  2. Create the table:
create table test(
	name string,
	friends array<string>,
	children map<string, int>,
	address struct<street:string, city:string>
)
row format delimited
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';
  3. Save the row to a file (with vim, for example), then load the file into Hive:
load data local inpath
"/home/software/data/test.txt" into table test;
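  4. Query the data
    Accessing the array, map, and struct columns:
select 	friends[1],        /* array access: the index is 0-based, so this is the second friend */
	children['小李四'], /* map access: look up a value by its key */
	address.city         /* struct access: member name after a dot */
	from test;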
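
      The three access forms can be mirrored in Python to show what the query is expected to return for the example row. This is only a sketch of the semantics (0-based array index, NULL for a missing map key or an out-of-range index), with the parsed row written out by hand; array_get and map_get are hypothetical helpers, not Hive functions.

```python
# The example row as it would look after parsing (written by hand here).
friends = ["李四", "王五"]
children = {"小李四": 18, "小王五": 19}
address = {"street": "大兴", "city": "北京"}


def array_get(items, index):
    """Like friends[index]: 0-based, None (NULL) when the index is out of range."""
    return items[index] if 0 <= index < len(items) else None


def map_get(mapping, key):
    """Like children[key]: None (NULL) when the key is absent."""
    return mapping.get(key)


print(array_get(friends, 1))          # 王五
print(map_get(children, "小李四"))     # 18
print(map_get(children, "xiaosong"))  # None: this key is not in the data
print(address["city"])                # 北京
```

      The third lookup is why the key passed to the map must match one that really exists in the loaded row; a key such as 'xiaosong' simply yields NULL.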

      
      


Reposted from blog.csdn.net/weixin_42845682/article/details/104919328