I. Hive data types
1. Basic data types
As the table above shows, Hive has no native date type; dates are represented as strings, and common date-format conversions are carried out with built-in or user-defined functions.
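Such string-based date handling can be sketched with Hive's built-in date functions (the table name `some_table` below is a placeholder):

```sql
-- unix_timestamp parses a string into epoch seconds,
-- from_unixtime formats epoch seconds back into a string,
-- and to_date extracts the date part of a timestamp string.
SELECT unix_timestamp('2013-01-05', 'yyyy-MM-dd') AS epoch_seconds,
       from_unixtime(unix_timestamp(), 'yyyy-MM-dd') AS today_str,
       to_date('2013-01-05 10:23:00') AS date_part
FROM some_table
LIMIT 1;
```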
Hive is written in Java, and its basic data types correspond one-to-one with Java's primitive types, with the exception of STRING. The signed integer types TINYINT, SMALLINT, INT, and BIGINT correspond to Java's byte, short, int, and long, occupying 1, 2, 4, and 8 bytes respectively. Hive's floating-point types FLOAT and DOUBLE correspond to Java's float and double, and Hive's BOOLEAN corresponds to Java's boolean.
Hive's STRING type is the equivalent of a database VARCHAR: a variable-length string, except that no maximum character count can be declared. In theory it can hold up to 2 GB of characters.
Hive supports conversions among its basic types: a narrower (low-byte) type can be converted implicitly to a wider (high-byte) one. For example, TINYINT, SMALLINT, and INT can be converted to FLOAT, and all integer types, FLOAT, and STRING can be converted to DOUBLE. These rules can be understood in terms of Java's widening conversions, since Hive is written in Java. Conversion from a wider type down to a narrower one is also supported, but it requires Hive's CAST function.
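A minimal sketch of both directions (the table name `some_table` is a placeholder):

```sql
-- STRING -> DOUBLE is a widening conversion and would also happen
-- implicitly; CAST just makes it explicit. DOUBLE -> INT is a narrowing
-- conversion and must always be written with CAST (the fractional part
-- is truncated).
SELECT CAST('12.5' AS DOUBLE) AS str_to_double,
       CAST(3.99 AS INT)      AS double_to_int
FROM some_table
LIMIT 1;
```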
2. Complex Data Types
Complex data types include arrays (ARRAY), maps (MAP), and structures (STRUCT, which can be understood as an object), as shown in the following table:
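As a sketch, a hypothetical table using all three complex types, together with the access syntax for each:

```sql
CREATE TABLE employee (
  name    STRING,
  hobbies ARRAY<STRING>,                   -- ordered list, indexed from 0
  scores  MAP<STRING, INT>,                -- key/value pairs
  addr    STRUCT<city:STRING, zip:STRING>  -- fixed named fields
);

-- Arrays are indexed with [n], maps with [key], structs with dot notation.
SELECT hobbies[0], scores['math'], addr.city FROM employee;
```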
3. Text character encoding
Text files are no doubt the format users are most familiar with: fields separated by commas or tabs. Hive supports these formats whenever the user needs them. However, both have a common drawback: the user must take extra care that field values themselves do not contain the delimiter character. For this reason, Hive's defaults are several control characters that rarely appear in field values: ^A (\001) separates fields, ^B (\002) separates collection items, and ^C (\003) separates map keys from values. The user can replace these default delimiters as needed.
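Spelling those defaults out explicitly is equivalent to omitting the delimiter clauses entirely (the table `raw_log` is a hypothetical example):

```sql
CREATE TABLE raw_log (
  id   INT,
  tags ARRAY<STRING>,
  kv   MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'            -- ^A separates fields
  COLLECTION ITEMS TERMINATED BY '\002'  -- ^B separates array/map entries
  MAP KEYS TERMINATED BY '\003'          -- ^C separates a map key from its value
LINES TERMINATED BY '\n';
```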
II. Database operations
1. Create a database
CREATE DATABASE database_name;
2. View databases
-- View all databases
SHOW DATABASES;
-- Fuzzy matching with LIKE, e.g. databases whose names begin with "hive"
SHOW DATABASES LIKE 'hive*';
-- View a detailed description of a database
DESC DATABASE hive_01;
3. Use Database
USE database_name;
4. Delete database
-- Ordinary delete: all tables in the database must be dropped first
DROP DATABASE database_name;
-- Force delete: drops the database together with all of its tables
DROP DATABASE database_name CASCADE;
III. Data table operations
1. Create a data table
-- Create an internal (managed) table
CREATE TABLE inner_table (
  id      INT,
  name    STRING,
  hobby   ARRAY<STRING>,
  address MAP<STRING, STRING>
)
ROW FORMAT DELIMITED                  -- fixed clause
  FIELDS TERMINATED BY ','            -- field delimiter
  COLLECTION ITEMS TERMINATED BY '-'  -- array/collection item delimiter
  MAP KEYS TERMINATED BY ':';         -- map key delimiter
Note:
The delimiters in the CREATE TABLE statement above should be rewritten to match the separators actually used in your own data files.
Loading data
-- The path after INPATH is a local Linux file path (hence LOCAL);
-- OVERWRITE empties all existing data in the table before loading
LOAD DATA LOCAL INPATH '/hivetest/person.txt' OVERWRITE INTO TABLE person;
-- Create an external table: requires the EXTERNAL keyword
CREATE EXTERNAL TABLE outer_table (
  id      INT,
  name    STRING,
  hobby   ARRAY<STRING>,
  address MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '-'
  MAP KEYS TERMINATED BY ':'
LOCATION '/outter/data';  -- must be an existing path on HDFS
2. The difference between the inner and outer tables
Although internal and external tables differ by only one keyword, their natures are completely different. An internal table's data belongs to Hive: once the table is dropped, its data is gone and cannot be recovered. When an external table is dropped, the files under its HDFS location are left in place; if you later create a table with the same column types pointing at that same path, you do not need to import the data again — the new table automatically picks up the existing files. If the files at the location do not match the new table's column types, queries against the table will fail with errors.
If data is placed directly into an external table's HDFS location without going through Hive, the Hive metastore holds no metadata for it; running MSCK REPAIR TABLE table_name; writes the partition information found on HDFS into the metastore.
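A typical sequence, with a hypothetical partitioned external table `person_ext`:

```sql
-- Files copied straight into a partition directory (e.g. with
-- `hdfs dfs -put`) are invisible to Hive until the metastore is told
-- about the new partitions:
MSCK REPAIR TABLE person_ext;

-- The recovered partitions are now listed:
SHOW PARTITIONS person_ext;
```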
3. Review the table
-- View the entire contents of a table
SELECT * FROM table_name;
-- View the table structure
DESC FORMATTED table_name;
-- List the tables in the current database
SHOW TABLES;
-- Save the result of a query into a new table
CREATE TABLE table_name AS SELECT ...;
-- Create a table with the same structure (no data is copied)
CREATE TABLE new_table LIKE table_to_copy;
4. Delete table
DROP TABLE table_name;
5. Modify a table
-- 1: rename a table
ALTER TABLE source_table_name RENAME TO new_table_name;
-- 2: modify a column's name and type
ALTER TABLE table_name CHANGE old_column_name new_column_name new_type;
-- 3: append new columns (several can be added at once)
ALTER TABLE table_name ADD COLUMNS (
  column_name_1 type_1,
  column_name_2 type_2
);
-- 4: columns cannot be dropped directly in Hive; REPLACE COLUMNS rewrites
--    the full column list, which achieves the effect of a delete
ALTER TABLE person_info REPLACE COLUMNS (
  id      STRING,
  name    STRING,
  hobby   ARRAY<STRING>,
  address MAP<STRING, STRING>
);
-- 5: change the table's storage format (e.g. rcfile -> orcfile)
ALTER TABLE t1 SET FILEFORMAT SEQUENCEFILE;
-- 6: view the CREATE TABLE statement for a table
SHOW CREATE TABLE table_name;
-- 7: set a table comment
ALTER TABLE person_info SET TBLPROPERTIES ("comment" = "person detail");
-- 8: modify a table delimiter
--    (note: Hive historically spells this property 'colelction.delim')
ALTER TABLE person_info SET SERDEPROPERTIES ('colelction.delim' = '~');
-- 9: set the table's SerDe (serialization) class
CREATE TABLE t1 (id INT, name STRING, age INT);
ALTER TABLE t1 SET SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "id=(.*), name=(.*), age=(.*)");
Note: when modifying a column's type, the new type must be compatible with the old one, as follows: