Hive - Hive Data Types

1. Basic data types

| Hive data type | Java data type | Length | Example |
| --- | --- | --- | --- |
| TINYINT | byte | 1-byte signed integer | 20 |
| SMALLINT | short | 2-byte signed integer | 20 |
| INT | int | 4-byte signed integer | 20 |
| BIGINT | long | 8-byte signed integer | 20 |
| BOOLEAN | boolean | Boolean, true or false | TRUE  FALSE |
| FLOAT | float | Single-precision floating-point number | 3.14159 |
| DOUBLE | double | Double-precision floating-point number | 3.14159 |
| STRING | String | Character sequence. A character set can be specified. Literals may use single or double quotes. | 'now is the time' "for all good men" |
| TIMESTAMP | | Time type | |
| BINARY | | Byte array | |

        Hive's STRING type corresponds to the VARCHAR type in a relational database: it is a variable-length string. Unlike VARCHAR, however, you cannot declare the maximum number of characters it may hold. In theory, it can store up to 2 GB of characters.
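As a minimal sketch, the primitive types listed above could appear in a table definition like the one below. The table name type_demo and all column names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table with one column per primitive type.
CREATE TABLE IF NOT EXISTS type_demo (
    tiny_col    TINYINT,    -- 1-byte signed integer
    small_col   SMALLINT,   -- 2-byte signed integer
    int_col     INT,        -- 4-byte signed integer
    big_col     BIGINT,     -- 8-byte signed integer
    bool_col    BOOLEAN,    -- TRUE or FALSE
    float_col   FLOAT,      -- single-precision floating point
    double_col  DOUBLE,     -- double-precision floating point
    str_col     STRING,     -- variable-length string, no declared maximum
    ts_col      TIMESTAMP,  -- time type
    bin_col     BINARY      -- byte array
);
```

Note that str_col takes no length argument, which is the difference from VARCHAR described above.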

2. Collection data types

| Data type | Description | Syntax example |
| --- | --- | --- |
| STRUCT | Similar to a struct in C; the elements are accessed with "dot" notation. For example, if the data type of a column is STRUCT{first STRING, last STRING}, the first element can be referenced as field.first. | struct() |
| MAP | A MAP is a collection of key-value pairs whose values are accessed with array notation. For example, if the data type of a column is MAP and its key -> value pairs are 'first' -> 'John' and 'last' -> 'Doe', the last name can be obtained with field['last']. | map() |
| ARRAY | An array is a collection of variables that share the same name and type. These variables are called the elements of the array; each element has an index, and indexes start at zero. For example, if the array value is ['John', 'Doe'], the second element can be referenced as arrayname[1]. | array() |

        Hive has three complex data types: ARRAY, MAP, and STRUCT. ARRAY and MAP are similar to Array and Map in Java, and STRUCT is similar to a struct in C: it encapsulates a collection of named fields. Complex data types can be nested to any depth.
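The three access notations can be tried without any table by building the values with Hive's constructor functions. This is only a sketch: the aliases names, person, and full_name are made up for illustration, and it assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
SELECT
    t.names[1]          AS second_name,  -- ARRAY: zero-based index
    t.person['last']    AS last_name,    -- MAP: lookup by key
    t.full_name.col1    AS first_name    -- STRUCT: dot notation
FROM (
    SELECT
        array('John', 'Doe')                 AS names,
        map('first', 'John', 'last', 'Doe')  AS person,
        -- struct() names its fields col1, col2, ... by default
        struct('John', 'Doe')                AS full_name
) t;
```

The query should return Doe, Doe, and John. Because struct() does not take field names, its fields are addressed as col1 and col2; named_struct() can be used when custom field names are needed.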

A practical example follows.

1) Suppose a table contains the following row, shown here in JSON to illustrate its data structure. The comments give the Hive type used to access each field.

{
    "name": "songsong",
    "friends": ["bingbing", "lili"],     // list: ARRAY
    "children": {                         // key-value pairs: MAP
        "xiao song": 18,
        "xiaoxiao song": 19
    },
    "address": {                          // structure: STRUCT
        "street": "hui long guan",
        "city": "beijing"
    }
}

2) Based on the data structure above, we create a corresponding table in Hive and load data into it.

    Create a local test file, test.txt:

songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing

yangyang,caicai_susu,xiao yang:18_xiaoxiao yang:19,chao yang_beijing

    Note: the elements within a MAP, STRUCT, or ARRAY can all be separated by the same character, here "_".

3) Create the test table test in Hive

create table test(
    name string,
    friends array<string>,
    children map<string, int>,
    address struct<street:string, city:string>
)
row format delimited 
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';

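Clause explanations:

row format delimited fields terminated by ','  -- column delimiter

collection items terminated by '_'   -- delimiter between the elements of a MAP, STRUCT, or ARRAY (data split character)

map keys terminated by ':' -- delimiter between the key and the value in a MAP

lines terminated by '\n'; -- row delimiter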

4) Load the text data into the test table (alternatively, put the file directly into the table's directory)

hive > load data local inpath '/opt/module/datas/test.txt' into table test;

5) Access the data in the three collection columns. The query below shows the access syntax for ARRAY, MAP, and STRUCT, in that order.

hive > select friends[1],children['xiao song'],address.city from test
where name="songsong";
OK
_c0     _c1     city
lili    18      beijing
Time taken: 0.076 seconds, Fetched: 1 row(s)
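Beyond single-element access, Hive's built-in collection functions can inspect a whole collection column. The following is a sketch against the test table created above; the column aliases are made up for illustration.

```sql
SELECT
    name,
    size(friends)         AS friend_count,  -- number of elements in the ARRAY
    map_keys(children)    AS child_names,   -- all keys of the MAP, as an ARRAY
    map_values(children)  AS child_ages,    -- all values of the MAP, as an ARRAY
    address.street        AS street         -- STRUCT field via dot notation
FROM test;
```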

3. Type conversion

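        Hive's primitive data types can be converted implicitly, much like type conversion in Java. For example, if an expression expects INT, a TINYINT is automatically converted to INT. Hive does not convert in the opposite direction: if an expression expects TINYINT, an INT is not automatically converted to TINYINT and an error is returned, unless CAST is used.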

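1. The implicit type conversion rules are as follows

  (1) Any integer type can be implicitly converted to a wider type. For example, TINYINT can be converted to INT, and INT can be converted to BIGINT.

  (2) All integer types, FLOAT, and STRING can be implicitly converted to DOUBLE.

  (3) TINYINT, SMALLINT, and INT can all be converted to FLOAT.

  (4) BOOLEAN cannot be implicitly converted to any other type.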

2. You can use CAST to convert data types explicitly

       For example, CAST('1' AS INT) converts the string '1' to the integer 1. If the cast fails, as in CAST('X' AS INT), the expression returns NULL.
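The conversion behavior described in this section can be checked with a single query. This is a sketch: the column aliases are made up for illustration, and it assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
SELECT
    CAST('1' AS INT)                AS ok_cast,       -- string '1' becomes integer 1
    CAST('X' AS INT)                AS failed_cast,   -- NULL, since 'X' is not a number
    CAST(20 AS TINYINT)             AS narrowed,      -- narrowing INT to TINYINT needs an explicit CAST
    CAST(1 AS TINYINT) + 1          AS widened,       -- TINYINT operand is implicitly widened to INT
    '2' + CAST(1.5 AS DOUBLE)       AS string_to_dbl; -- STRING operand is implicitly converted to DOUBLE
```

Only the third column narrows a type, which is why it is the one written with an explicit CAST on a value that would otherwise be INT.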

 

 

 

 

 


Origin blog.csdn.net/qq_41544550/article/details/92128076