wild_table

1 Problem

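  • Space: storing data in a wide table inevitably runs into the one-to-many problem. When one parent record has N child records, the parent's columns are repeated on each of the N rows, so that data takes roughly N times the storage space.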
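
To make the duplication concrete, here is a minimal sketch of the flat layout. The table name `flat_teacher_student` and its column types are assumptions for illustration only and do not appear in the original post; the rows reuse the sample data from the test below.

```sql
-- Hypothetical flat wide table: one row per (teacher, student) pair.
create table if not exists flat_teacher_student
(teacher_name varchar(10),
 major varchar(10),
 grade int,
 student_name varchar(10),
 age int,
 sex varchar(1)
);

-- Teacher t1 has two students, so 't1' and '语文' are stored twice.
insert into flat_teacher_student values
('t1','语文',1,'xinzi',14,'M'),
('t1','语文',3,'lisi',14,'M'),
('t2','maths',2,'zhangs',14,'F');
```

With N students per teacher, every teacher-level column is written N times, which is the overhead the JSON layout in the next section tries to avoid.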

2 Solution

  • Store the "many" side in a single column, using a format such as JSON

    Test:

    -- Create the table
    drop table if exists jsontest;
    
    CREATE TABLE IF NOT EXISTS jsonTest
    (teacher_name varchar(10),
    major varchar(10),
    students_info string comment "student info"
    )
    comment "student course info"
    row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
    LOCATION
      'hdfs://nameservice1/user/hive/warehouse/bigdata.db/jsontest';
    
    insert into jsontest values
    ('t1','语文','{"grade":1,"info":{"name":"xinzi","age":14,"sex":"M"}}|{"grade":3,"info":{"name":"lisi","age":14,"sex":"M"}}'),
    ('t2','maths','{"grade":2,"info":{"name":"zhangs","age":14,"sex":"F"}}')
    ;
    
    -- Inspect the existing data
    select row_number() over() , a.* from jsontest a;
    
    
    
    -- Create a view that flattens the JSON back into one row per student
    drop view if exists v_jsontest;
    create view if not exists v_jsontest
    as
    select teacher_name,major,a.grade,b.name,b.age,b.gender from jsontest lateral view explode(split(students_info,'\\|')) st as stin
    lateral view json_tuple(stin,'grade','info') a as grade,info
    lateral view json_tuple(a.info,'name','age','sex') b as name,age,gender;
    

– Original table data

    hive> select * from jsontest;
    OK
    t1 语文 {"grade":1,"info":{"name":"xinzi","age":14,"sex":"M"}}|{"grade":3,"info":{"name":"lisi","age":14,"sex":"M"}}
    t2 maths {"grade":2,"info":{"name":"zhangs","age":14,"sex":"F"}}
    Time taken: 0.054 seconds, Fetched: 2 row(s)

– Result when querying through the view

    hive> select * from v_jsontest;
    ……….
    OK
    t1 语文 1 xinzi 14 M
    t1 语文 3 lisi 14 M
    t2 maths 2 zhangs 14 F
    Time taken: 12.933 seconds, Fetched: 3 row(s)
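
When only one or two fields are needed, the same rows can probably be read without the view. The query below is a sketch of that alternative, not the approach used in the post: it assumes the `jsontest` table defined above and uses Hive's built-in `get_json_object` with JSONPath-style expressions in place of the chained `json_tuple` calls.

```sql
-- Alternative sketch: pull individual fields with get_json_object.
-- Each '|'-separated JSON document becomes one row via explode(split(...)).
select teacher_name,
       major,
       get_json_object(stin, '$.grade')     as grade,
       get_json_object(stin, '$.info.name') as name
from jsontest
lateral view explode(split(students_info, '\\|')) st as stin;
```

Both this query and the view split on the `|` character, so they rely on `|` never appearing inside a JSON value. `get_json_object` returns strings, so numeric fields such as `grade` may need an explicit cast.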

Author: halberd.lee

Created: 2020-04-09 Thu 12:50

Reposted from www.cnblogs.com/halberd-lee/p/12666170.html