Hive's built-in data types fall into two categories: basic data types and complex data types. The basic data types are TINYINT, SMALLINT, INT, BIGINT, BOOLEAN, FLOAT, DOUBLE, STRING, BINARY, TIMESTAMP, DECIMAL, CHAR, VARCHAR, and DATE. The following table lists the storage size and value range of these basic types, and the Hive version in which each later addition was first supported.
Data type | Size and range | Supported since |
TINYINT | 1 byte, -128 ~ 127 | |
SMALLINT | 2 bytes, -32,768 ~ 32,767 | |
INT | 4 bytes, -2,147,483,648 ~ 2,147,483,647 | |
BIGINT | 8 bytes, -9,223,372,036,854,775,808 ~ 9,223,372,036,854,775,807 | |
BOOLEAN | | |
FLOAT | 4 bytes, single precision | |
DOUBLE | 8 bytes, double precision | |
STRING | | |
BINARY | | Hive 0.8.0 |
TIMESTAMP | | Hive 0.8.0 |
DECIMAL | | Hive 0.11.0 |
CHAR | | Hive 0.13.0 |
VARCHAR | | Hive 0.12.0 |
DATE | | Hive 0.12.0 |
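To make the table concrete, here is a minimal sketch of a table that declares one column of each basic type. The table name `type_demo` and all column names are hypothetical, and the sketch assumes Hive 0.13.0 or later, since CHAR and the `DECIMAL(precision, scale)` form are not available in earlier releases.

```sql
-- Hypothetical table for illustration only; names are not from the original text.
CREATE TABLE type_demo (
  tiny_col    TINYINT,         -- 1-byte integer
  small_col   SMALLINT,        -- 2-byte integer
  int_col     INT,             -- 4-byte integer
  big_col     BIGINT,          -- 8-byte integer
  flag_col    BOOLEAN,
  float_col   FLOAT,           -- 4-byte single precision
  double_col  DOUBLE,          -- 8-byte double precision
  text_col    STRING,
  raw_col     BINARY,          -- Hive 0.8.0+
  ts_col      TIMESTAMP,       -- Hive 0.8.0+
  amount_col  DECIMAL(10, 2),  -- DECIMAL from Hive 0.11.0; precision/scale assumes 0.13.0+
  code_col    CHAR(2),         -- Hive 0.13.0+
  label_col   VARCHAR(50),     -- Hive 0.12.0+
  day_col     DATE             -- Hive 0.12.0+
);
```

On an older Hive release, the columns whose types postdate that release would have to be dropped or declared as STRING instead.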
Complex types include ARRAY, MAP, STRUCT, and UNION. They are built from the basic types.
ARRAY: An ARRAY is an ordered series of elements of the same data type, accessed by subscript. For example, if fruits is an ARRAY holding ['apple', 'orange', 'mango'], then fruits[1] returns orange, because ARRAY subscripts start at 0.
MAP: A MAP holds key-value pairs whose values are accessed by key. For example, if userlist is a MAP in which each key is a username and each value is that user's password, then userlist['username'] returns the password for that user.
STRUCT: A STRUCT can contain elements of different data types. These elements are accessed with dot syntax. For example, if user is a STRUCT, then the user's address can be obtained through user.address.
UNION: The UNIONTYPE type, which holds a value of exactly one of its declared types. It is supported since Hive 0.7.0.
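The access rules described above can be tried without creating a table. The following is a minimal sketch that builds one value of each kind with Hive's `array()`, `map()`, and `named_struct()` constructor functions. The sample values, the alias `user_info` (used instead of `user`, which is a reserved word in newer Hive releases), and the subquery alias `t` are assumptions for illustration. A subquery without a FROM clause assumes Hive 0.13.0 or later.

```sql
SELECT
  fruits[1]          AS second_fruit,    -- subscripts start at 0, so this is 'orange'
  userlist['alice']  AS alice_password,  -- MAP lookup by key
  user_info.address  AS user_address     -- STRUCT field via dot syntax
FROM (
  SELECT
    array('apple', 'orange', 'mango')                     AS fruits,
    map('alice', 'secret')                                AS userlist,
    named_struct('name', 'alice', 'address', '1 Main St') AS user_info
) t;
```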
A table with complex types can be created as follows:
    CREATE TABLE employees (
      name         STRING,
      salary       FLOAT,
      subordinates ARRAY<STRING>,
      deductions   MAP<STRING, FLOAT>,
      address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
    ) PARTITIONED BY (country STRING, state STRING);
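Once such a table holds data, its complex columns can be read with the same subscript, key, and dot syntax. The query below is a sketch against the `employees` table defined above. The map key `'Federal Taxes'` and the partition values `'US'` and `'CA'` are hypothetical and depend on what was actually loaded.

```sql
SELECT
  name,
  subordinates[0]             AS first_subordinate,  -- first ARRAY element
  deductions['Federal Taxes'] AS federal_tax,        -- hypothetical MAP key
  address.city                AS city                -- STRUCT field
FROM employees
WHERE country = 'US'   -- hypothetical partition value
  AND state = 'CA';    -- hypothetical partition value
```

Filtering on the partition columns `country` and `state` lets Hive read only the matching partitions rather than the whole table. Note that `state` in the WHERE clause refers to the partition column, while the state stored inside the struct is reached as `address.state`.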