Hive Basics (1): Data Types

1. Hive basic data types

| Hive data type | Java data type | Length / description | Example |
|----------------|----------------|----------------------|---------|
| TINYINT | byte | 1-byte signed integer | 20 |
| SMALLINT | short | 2-byte signed integer | 20 |
| INT | int | 4-byte signed integer | 20 |
| BIGINT | long | 8-byte signed integer | 20 |
| BOOLEAN | boolean | Boolean, true or false | TRUE, FALSE |
| FLOAT | float | Single-precision floating-point number | 3.14159 |
| DOUBLE | double | Double-precision floating-point number | 3.14159 |
| STRING | String | Character sequence. A character set can be specified. Single or double quotes may be used. | 'now is the time', "for all good men" |
| TIMESTAMP | | Time type | |
| BINARY | | Byte array | |

Note: Hive's STRING type corresponds to the VARCHAR type in a relational database. It is a variable-length string, but you cannot declare the maximum number of characters it may hold. In theory it can store up to 2 GB of characters.
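The byte lengths in the table determine the value range of each integer type. The following is a minimal Python sketch of that arithmetic, assuming the usual two's-complement representation of signed integers; the names `BYTE_LENGTHS`, `signed_range`, and `RANGES` are made up for this illustration and are not part of Hive.

```python
# Byte lengths taken from the table of Hive integer types above.
BYTE_LENGTHS = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) of a two's-complement signed integer with num_bytes bytes."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Hypothetical helper table: type name -> (smallest value, largest value).
RANGES = {name: signed_range(size) for name, size in BYTE_LENGTHS.items()}

for name, (low, high) in RANGES.items():
    print(f"{name}: {low} .. {high}")
```

For example, a 1-byte TINYINT spans -128 to 127, so the example value 20 fits in every integer type in the table.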

2. Collection data types

| Data type | Description | Syntax example |
|-----------|-------------|----------------|
| STRUCT | Similar to a struct in C: element contents are accessed with "dot" notation. For example, if a column has the type STRUCT{first STRING, last STRING}, the first element can be referenced as column.first. | struct(), e.g. struct<street:string, city:string> |
| MAP | A MAP is a set of key-value pairs whose values are accessed with array notation. For example, if a column is a MAP containing the key->value pairs 'first'->'John' and 'last'->'Doe', the last name can be obtained with column['last']. | map(), e.g. map<string, int> |
| ARRAY | An array is a collection of variables with the same name and type. These variables are called the elements of the array, and each element has an index, starting from zero. For example, if the array value is ['John', 'Doe'], the second element can be referenced as arrayname[1]. | array(), e.g. array<string> |

Hive has three complex data types: ARRAY, MAP, and STRUCT. ARRAY and MAP are similar to Array and Map in Java, while STRUCT is similar to a struct in C: it encapsulates a set of named fields. Complex data types can be nested to any depth.

3. Worked example

1) Suppose a table contains the following row. We use JSON to represent its data structure; this is also the form in which the data is accessed in Hive.

{
    "name": "songsong",
    "friends": ["bingbing" , "lili"] , // list: ARRAY
    "children": {                      // key-value pairs: MAP
        "xiao song": 18 ,
        "xiaoxiao song": 19
    },
    "address": {                       // structure: STRUCT
        "street": "hui long guan" ,
        "city": "beijing" 
    }
}

2) Based on the data structure above, we create a corresponding table in Hive and import the data

Create a local test file test.txt with the following contents:

songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing
yangyang,caicai_susu,xiao yang:18_xiaoxiao yang:19,chao yang_beijing

3) Create the test table test in Hive

create table test(
name string,
friends array<string>,
children map<string, int>,
address struct<street:string, city:string>
)
row format delimited fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';


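Explanation of the clauses:
row format delimited fields terminated by ','  -- column delimiter
collection items terminated by '_'      -- delimiter between the elements of a MAP, STRUCT, or ARRAY (data separator)
map keys terminated by ':'              -- delimiter between the key and the value in a MAP
lines terminated by '\n';                   -- line delimiter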
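To make the role of each delimiter concrete, here is a rough Python sketch that splits one line of test.txt the way the clauses above describe. It only imitates the splitting logic for this particular four-column table and is not how Hive itself is implemented; `parse_row` and the constant names are hypothetical.

```python
FIELD_SEP = ","  # fields terminated by ','
ITEM_SEP = "_"   # collection items terminated by '_'
KEY_SEP = ":"    # map keys terminated by ':'


def parse_row(line):
    """Split one text line into the name, friends, children, and address columns."""
    name, friends, children, address = line.rstrip("\n").split(FIELD_SEP)

    # ARRAY<string>: elements are separated by the collection item delimiter.
    friends_array = friends.split(ITEM_SEP)

    # MAP<string, int>: entries use the item delimiter, key and value the key delimiter.
    children_map = {}
    for entry in children.split(ITEM_SEP):
        key, value = entry.split(KEY_SEP)
        children_map[key] = int(value)

    # STRUCT<street, city>: field values are separated by the item delimiter, in order.
    street, city = address.split(ITEM_SEP)

    return {
        "name": name,
        "friends": friends_array,
        "children": children_map,
        "address": {"street": street, "city": city},
    }


row = parse_row("songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing")
print(row["friends"][1], row["children"]["xiao song"], row["address"]["city"])
```

Note that the same '_' character separates array elements, map entries, and struct fields, which is why a single "collection items" clause covers all three types.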

4) Load the text data into the test table

hive (default)> load data local inpath '/opt/module/datas/test.txt' into table test;

5) Access the data in the three collection columns. The query below shows the access syntax for ARRAY, MAP, and STRUCT in turn

hive (default)> select friends[1],children['xiao song'],address.city from test
where name="songsong";
OK
_c0     _c1     city
lili    18      beijing
Time taken: 0.076 seconds, Fetched: 1 row(s)

4. Data type conversion

Hive's atomic data types can be converted implicitly, much like type conversion in Java. For example, if an expression expects INT, a TINYINT value is automatically converted to INT. Hive does not convert in the opposite direction, however: if an expression expects TINYINT, an INT value is not automatically converted to TINYINT and an error is returned, unless CAST is used.

(1) The implicit type conversion rules are as follows

1) Any integer type can be implicitly converted to a wider type; for example, TINYINT can be converted to INT, and INT can be converted to BIGINT.

2) All integer types, FLOAT, and STRING can be implicitly converted to DOUBLE.

3) TINYINT, SMALLINT, and INT can all be converted to FLOAT.

4) BOOLEAN cannot be converted to any other type.
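The four rules can be written down as a small lookup function. The Python sketch below encodes only the rules exactly as listed above, plus the trivial case that a type converts to itself; Hive's real conversion matrix may permit conversions that this list does not mention. The function name `can_convert_implicitly` is made up for this illustration.

```python
# Integer types ordered from narrowest to widest.
INTEGER_TYPES = ["TINYINT", "SMALLINT", "INT", "BIGINT"]


def can_convert_implicitly(source, target):
    """Return True if the rules listed in the text allow source -> target."""
    if source == target:
        return True  # assumption: no conversion is needed for identical types
    if source == "BOOLEAN":
        return False  # rule 4: BOOLEAN converts to nothing else
    if source in INTEGER_TYPES and target in INTEGER_TYPES:
        # rule 1: an integer type converts only to a wider integer type
        return INTEGER_TYPES.index(source) < INTEGER_TYPES.index(target)
    if target == "DOUBLE":
        # rule 2: all integer types, FLOAT, and STRING convert to DOUBLE
        return source in INTEGER_TYPES or source in ("FLOAT", "STRING")
    if target == "FLOAT":
        # rule 3: TINYINT, SMALLINT, and INT convert to FLOAT
        return source in ("TINYINT", "SMALLINT", "INT")
    return False


print(can_convert_implicitly("TINYINT", "INT"))
print(can_convert_implicitly("INT", "TINYINT"))
print(can_convert_implicitly("STRING", "DOUBLE"))
```

The second call corresponds to the reverse direction described earlier: narrowing INT to TINYINT is not implicit and requires an explicit CAST.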

(2) The CAST operation can be used to convert data types explicitly

For example, CAST('1' AS INT) converts the string '1' to the integer 1. If the cast fails, as with CAST('X' AS INT), the expression returns NULL.

0: jdbc:hive2://hadoop102:10000> select '1'+2, cast('1'as int) + 2;
+------+------+--+
| _c0  | _c1  |
+------+------+--+
| 3.0  | 3    |
+------+------+--+
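The two result columns differ because '1'+2 relies on the implicit STRING to DOUBLE conversion, while the second expression casts to INT first. The Python sketch below mimics just these cases, with `None` standing in for NULL; the helper names are hypothetical, and Python's `int()` and `float()` do not accept exactly the same strings as Hive, so it should be read as an analogy only.

```python
def cast_to_int(value):
    """Imitate CAST(value AS INT) for the cases in the text; None plays the role of NULL."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None  # a failed cast yields NULL instead of raising an error


def add_string_and_int(text, number):
    """Imitate '1' + 2: the string operand is treated as a DOUBLE before adding."""
    return float(text) + number


print(add_string_and_int("1", 2))  # analogous to the _c0 column
print(cast_to_int("1") + 2)        # analogous to the _c1 column
print(cast_to_int("X"))            # analogous to CAST('X' AS INT)
```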

Origin www.cnblogs.com/simon-1024/p/11785985.html