[Personal Notes] Hive Complex Types

Hive has a rich set of data types. These notes skip the simple ones and cover how to understand and use the commonly used complex types.
According to the official documentation, Hive has 4 complex types:

Complex Types:
1. arrays: ARRAY<data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
2. maps: MAP<primitive_type, data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
3. structs: STRUCT<col_name : data_type [COMMENT col_comment], …>
4. union: UNIONTYPE<data_type, data_type, …> (Note: Only available starting with Hive 0.7.0.)
In daily work, the first three are the ones we use most. For each of them we only need to know three things: how to define it, how to read a value from it, and how to construct it.

ARRAY
definition: array<data_type>, for example array<string>
value: arr[0] (indexes start at 0)
construction: array(val1, val2, val3, ...), split(), collect_set()
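As a minimal sketch of the three construction routes above, the queries below build arrays from literals and read elements by index. The literal values and the inline subquery are made up for illustration, and a SELECT without FROM assumes Hive 0.13 or later.

```sql
-- array() builds an array from its arguments; split() turns a delimited
-- string into array<string>; indexes are zero-based.
-- Here first_elem is 'a', second_part is 'y' and arr_len is 3.
SELECT
    array('a', 'b', 'c')[0]    AS first_elem,
    split('x,y,z', ',')[1]     AS second_part,
    size(array('a', 'b', 'c')) AS arr_len;

-- collect_set() is an aggregate: it gathers the distinct values of a column
-- into one array. The order of the elements in the result is not guaranteed.
SELECT collect_set(item) AS items
FROM (
    SELECT 'a' AS item
    UNION ALL SELECT 'b'
    UNION ALL SELECT 'a'
) t;
```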

MAP
definition: map<string, string>
value: map[key]
construction: map(key1, value1, key2, value2, ...)
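A small sketch of building a map and reading from it follows. The keys and values are hypothetical sample data, and the alias m exists only for this example.

```sql
-- map() takes alternating keys and values. Looking up a key that is not
-- present returns NULL rather than raising an error.
SELECT
    m['os']       AS os_value,
    m['missing']  AS missing_value,
    map_keys(m)   AS all_keys,
    map_values(m) AS all_values,
    size(m)       AS entry_count
FROM (
    SELECT map('os', 'android', 'ch', 'web') AS m
) t;
```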

STRUCT
definition: struct<a:string,b:string>
value: struct.a
construction: named_struct(name1, val1, name2, val2, name3, val3, ...)
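The sketch below constructs a struct matching the definition above and reads its fields with dot notation. The values 'x' and 'y' are placeholders. It also shows struct(), a related built-in that is not listed in the notes above and that names the fields col1, col2, and so on.

```sql
-- named_struct() takes alternating field names and values.
-- struct() takes values only and assigns the field names col1, col2, ...
SELECT
    s1.a    AS a_value,
    s1.b    AS b_value,
    s2.col1 AS first_unnamed
FROM (
    SELECT
        named_struct('a', 'x', 'b', 'y') AS s1,
        struct('x', 'y')                 AS s2
) t;
```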

There is a small difference between map and struct in use:
  • In a map, all keys share one type and all values share one type, both fixed by the definition.
  • In a struct, each field can have its own type.
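To make the difference concrete, here is a hedged sketch that puts the two side by side. The map holds only string values, while the struct mixes a BIGINT field and a STRING field. All names and values are illustrative.

```sql
-- m is map<string,string>: every value has the same type.
-- s is struct<error_code:bigint,msg:string>: each field has its own type.
SELECT
    m['os']      AS os_value,
    m['vc']      AS vc_value,
    s.error_code AS error_code,
    s.msg        AS msg
FROM (
    SELECT
        map('os', 'android', 'vc', '2.1') AS m,
        named_struct('error_code', CAST(500 AS BIGINT), 'msg', 'timeout') AS s
) t;
```

A map also accepts any key at run time, whereas the field names of a struct are fixed by its definition.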

Here is the CREATE TABLE statement for a simple scenario:

-- Log table whose columns are structs and arrays of structs.
-- Each row is a JSON object parsed by JsonSerDe; the table is partitioned by date (dt).
CREATE EXTERNAL TABLE ods_log_inc
(
    `common`   STRUCT<ar:STRING, ba:STRING, ch:STRING, is_new:STRING, md:STRING, mid:STRING, os:STRING, uid:STRING, vc:STRING>,
    `page`     STRUCT<during_time:STRING, item:STRING, item_type:STRING, last_page_id:STRING, page_id:STRING, source_type:STRING>,
    `actions`  ARRAY<STRUCT<action_id:STRING, item:STRING, item_type:STRING, ts:BIGINT>>,
    `displays` ARRAY<STRUCT<display_type:STRING, item:STRING, item_type:STRING, `order`:STRING, pos_id:STRING>>,
    `start`    STRUCT<entry:STRING, loading_time:BIGINT, open_ad_id:BIGINT, open_ad_ms:BIGINT, open_ad_skip_ms:BIGINT>,
    `err`      STRUCT<error_code:BIGINT, msg:STRING>,
    `ts`       BIGINT
)
    PARTITIONED BY (`dt` STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.JsonSerDe'
    LOCATION '/warehouse/gmall/ods/ods_log_inc/';
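Once the table exists, the nested columns can be read with the access syntax described earlier. The queries below are a sketch only: the partition value '2022-12-12' is a made-up example, and the aliases tmp and action are chosen for illustration.

```sql
-- Read struct fields with dot notation and array elements by index.
SELECT
    common.mid           AS mid,
    page.page_id         AS page_id,
    actions[0].action_id AS first_action_id,
    size(displays)       AS display_count
FROM ods_log_inc
WHERE dt = '2022-12-12';

-- Flatten the actions array into one row per action.
-- Rows whose actions array is NULL or empty are dropped;
-- LATERAL VIEW OUTER would keep them with NULL action fields.
SELECT
    common.mid       AS mid,
    action.action_id AS action_id,
    action.ts        AS action_ts
FROM ods_log_inc
LATERAL VIEW explode(actions) tmp AS action
WHERE dt = '2022-12-12';
```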

Origin blog.csdn.net/m0_49303490/article/details/128276240