Hive: front-end and back-end data transmission case practice

A brief overview of data transmission between the front end and the back end

[Figure: front-end and back-end data transmission]

Data structure mapping

(1) Suppose a table contains the following row. We use JSON to represent its data structure; this is the shape in which the row is accessed in Hive:

{
    "name": "songsong",
    "friends": ["bingbing", "lili"],       // ARRAY (list)
    "children": {                          // MAP (key-value pairs)
        "xiao song": 18,
        "xiaoxiao song": 19
    },
    "address": {                           // STRUCT
        "street": "hui long guan",
        "city": "beijing"
    }
}

(2) Based on the data structure above, we create the corresponding table in Hive and import the data.
First, create a local test file personInfo.txt in the directory /opt/module/hive/datas:
[atguigu@hadoop102 datas]$ vim personInfo.txt
songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing
yangyang,caicai_susu,xiao yang:18_xiaoxiao yang:19,chao yang_beijing
Note: The elements inside a MAP, STRUCT or ARRAY can all be separated by the same character; here "_" is used.
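
To make the mapping between the nested record and the flat text line concrete, here is a minimal Python sketch (plain Python, not Hive code) that flattens one record into the line format shown above. The function name to_line, the constant names and the record dict are illustrative assumptions; only the delimiters come from personInfo.txt.

```python
# Delimiters used in personInfo.txt (the constant names are hypothetical).
FIELD_SEP = ","   # between columns
ITEM_SEP = "_"    # between elements of an ARRAY, MAP or STRUCT
KEY_SEP = ":"     # between a MAP key and its value


def to_line(record):
    """Flatten one nested record into a delimited text line."""
    friends = ITEM_SEP.join(record["friends"])
    children = ITEM_SEP.join(
        f"{key}{KEY_SEP}{value}" for key, value in record["children"].items()
    )
    # A STRUCT is written as its values only, in declaration order.
    address = ITEM_SEP.join(record["address"].values())
    return FIELD_SEP.join([record["name"], friends, children, address])


record = {
    "name": "songsong",
    "friends": ["bingbing", "lili"],
    "children": {"xiao song": 18, "xiaoxiao song": 19},
    "address": {"street": "hui long guan", "city": "beijing"},
}
line = to_line(record)
print(line)
```

Because one character separates the elements of all three collection types, no value may itself contain "_", "," or ":"; this sketch does no escaping.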

Test Case

(1) Create test table personInfo on Hive

hive (default)> create table personInfo (
    name string,
    friends array<string>,
    children map<string, int>,
    address struct<street:string, city:string>
)
row format delimited
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';

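Meaning of each clause:
row format delimited: the fields in the data file are separated by delimiters, as specified by the clauses that follow.
fields terminated by ',': columns are separated by ','.
collection items terminated by '_': the elements of a collection type (ARRAY, MAP, STRUCT) are separated by '_'.
map keys terminated by ':': within a MAP, each key is separated from its value by ':'.
lines terminated by '\n': rows are separated by '\n'.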
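
As a rough illustration of how these delimiters split one line into typed columns, the following Python sketch parses a line from personInfo.txt by hand. It only emulates the idea; it is not how Hive's own parser is implemented, and the function name parse_line is a hypothetical helper.

```python
def parse_line(line):
    """Split one text line according to the delimiters in the table definition."""
    # lines terminated by '\n', fields terminated by ','
    name, friends, children, address = line.rstrip("\n").split(",")
    # collection items terminated by '_' (STRUCT values follow declaration order)
    street, city = address.split("_")
    return {
        "name": name,                                   # string
        "friends": friends.split("_"),                  # array<string>
        "children": {                                   # map<string, int>
            key: int(value)                             # map keys terminated by ':'
            for key, value in (item.split(":") for item in children.split("_"))
        },
        "address": {"street": street, "city": city},    # struct<street, city>
    }


row = parse_line(
    "songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing\n"
)
print(row)
```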

(2) Upload the data file to the table's directory in HDFS. Hive stores table names in lowercase, so the default directory of the table is personinfo.
[atguigu@hadoop102 ~]$ hadoop fs -put /opt/module/hive/datas/personInfo.txt /user/hive/warehouse/personinfo

(3) Access the data in the three collection columns. The query below shows the access syntax for ARRAY (a 0-based index), MAP (a key) and STRUCT (a dot followed by the field name).

hive (default)>
select
friends[1],
children['xiao song'],
address.city
from personInfo
where name="songsong";
Result:
_c0     _c1     city
lili    18      beijing
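
For readers more familiar with general-purpose languages, the three access forms correspond roughly to list indexing, dictionary lookup and attribute-style field access. A small Python sketch of the same lookup follows; the row dict is a hand-written stand-in for the songsong row, not something returned by Hive.

```python
# Hand-written stand-in for the songsong row (illustrative only).
row = {
    "name": "songsong",
    "friends": ["bingbing", "lili"],
    "children": {"xiao song": 18, "xiaoxiao song": 19},
    "address": {"street": "hui long guan", "city": "beijing"},
}

result = (
    row["friends"][1],             # friends[1]: second element, index is 0-based
    row["children"]["xiao song"],  # children['xiao song']: lookup by map key
    row["address"]["city"],        # address.city: field of the struct
)
print(result)  # ('lili', 18, 'beijing')
```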

Origin blog.csdn.net/weixin_45427648/article/details/131840027