Hive DDL (Data Definition Language) is the set of statements that define databases, tables, and indexes. This post summarizes the basic usage.
Managed Tables vs External Tables
A managed table is also known as an internal table; tables created in Hive are managed tables by default.
The data of both managed and external tables is stored in HDFS, and both are Hive tables.
Differences
For a managed table, Hive moves the data into the path specified by the data warehouse, a location somewhere in HDFS.
For an external table, Hive does not move the data; it only records where the data is in the metadata.
The biggest difference: dropping a managed table deletes both the data and the metadata; dropping an external table deletes only the metadata, and the data is kept.
Because of this, managed tables are a poor fit for shared data and are prone to data-safety problems.
In practice, external tables are generally used.
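The drop behavior described above can be sketched as follows. This is a minimal illustration, not taken from the original post: the table names and the path /data/shared/logs are hypothetical.

```sql
-- Hypothetical names and path, for illustration only.
CREATE EXTERNAL TABLE IF NOT EXISTS logs_ext (
  id  INT,
  msg STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/shared/logs';

-- Dropping the external table removes only its metadata;
-- the files under /data/shared/logs stay where they are.
DROP TABLE logs_ext;

CREATE TABLE IF NOT EXISTS logs_managed (id INT, msg STRING);

-- Dropping the managed table removes its metadata and its data.
DROP TABLE logs_managed;
```

To check which kind a table is, DESCRIBE FORMATTED table_name reports a Table Type of MANAGED_TABLE or EXTERNAL_TABLE.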
Database
Create Database
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name [COMMENT database_comment] [LOCATION hdfs_path] [WITH DBPROPERTIES (property_name=property_value, ...)];
Examples
hive> create database hive1101 location '/usr/hive_test';
OK
Time taken: 0.12 seconds
Note that this location is not Hive's default HDFS path, which shows that a non-default location can be specified.
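The other optional clauses of CREATE DATABASE can be combined in the same statement. A small sketch, assuming a hypothetical database name, path, and property:

```sql
-- demo_db, /tmp/demo_db and the 'creator' property are made up for illustration.
CREATE DATABASE IF NOT EXISTS demo_db
COMMENT 'scratch database'
LOCATION '/tmp/demo_db'
WITH DBPROPERTIES ('creator' = 'example');

-- Shows the comment, location, and properties recorded for the database.
DESCRIBE DATABASE EXTENDED demo_db;
```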
Drop Database
By default (RESTRICT) the database must be empty; with CASCADE, the tables it contains are dropped as well.
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
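As a short sketch of the two modes, using the hive1101 database from the earlier example:

```sql
-- RESTRICT is the default: this fails if hive1101 still contains tables.
DROP DATABASE IF EXISTS hive1101;

-- CASCADE drops the tables in the database first, then the database itself.
DROP DATABASE IF EXISTS hive1101 CASCADE;
```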
Alter Database
Change the properties of the database
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...); -- (Note: SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive 2.2.1, 2.4.0 and later)
Examples
hive> alter database hive1101 set dbproperties ('edit_by'='wjd');
OK
Time taken: 0.118 seconds
Note: the location could not be changed in my environment.
SET LOCATION is documented for Hive 2.2.1, 2.4.0 and later, so it may work there; I was on 2.3.6 and did not test it.
Use Database
Switch to the target database
USE database_name;
USE DEFAULT;
Show Database
Displays all database names
show databases;
Table
Create Table
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [column_constraint_specification] [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
  ]
  [LOCATION hdfs_path]
There are many parameters; for the details, see the official documentation listed in the references below.
Parameter Description
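temporary : creates a temporary table that is visible only in the current session and is dropped when the session ends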
external : creates an external table; the path of the actual data must also be given, using location
like : copies the structure of an existing table, but not its data
row format : specifies the format of each line; if the data does not match the format it can still be loaded into the table, but it will not be parsed correctly
// delimited fields terminated by '\t': fields are separated by \t
// delimited fields terminated by ',': fields are separated by commas; note that this is only right for comma-separated files such as CSV, and gives wrong results on data written with a different separator
// delimited introduces the separator clauses; terminated by names the separator character
ROW FORMAT DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char] [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char] | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
stored as : the file format of the table's data
// For a plain text file, use stored as textfile; if the data is compressed, stored as sequencefile can be used
// There are also ORC, JSON, and other formats; see the official documentation
partitioned by : creates a partitioned table; this is very important and is covered in detail later
Examples
hive> create table student (id int, name string) row format delimited fields terminated by '\t';
Creates a table.

hive> create table if not exists student1 like student;
Creates a table with the same structure as an existing table.

hive> create table if not exists mytable (sid int, sname string)
    > row format delimited fields terminated by '\005'
    > stored as textfile;
Creates an internal table.

hive> create external table if not exists pageview(
    > pageid int,
    > page_url string comment 'the page url'
    > )
    > row format delimited fields terminated by ','
    > location 'hdfs://192.168.220.144:9000/user/hive/warehouse';
Creates an external table.

hive> create table student_p(id int,name string,sexex string,age int,dept string)
    > partitioned by(part string)
    > row format delimited fields terminated by ','
    > stored as textfile;
Creates a partitioned table.
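The examples above only set the field delimiter. As a sketch of the remaining DELIMITED clauses from the row_format syntax, the hypothetical table below (its name and columns are not from the original post) also sets delimiters for array items, map keys, and lines:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE IF NOT EXISTS employee_demo (
  name   STRING,
  skills ARRAY<STRING>,
  scores MAP<STRING, INT>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'            -- between columns
  COLLECTION ITEMS TERMINATED BY ','   -- between array items and between map entries
  MAP KEYS TERMINATED BY ':'           -- between a map key and its value
  LINES TERMINATED BY '\n'             -- between rows
STORED AS TEXTFILE;
```

With these settings, an input line would have the shape name, a tab, sql,java, a tab, math:90,art:85.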
Test row format
Load the following data into student; the fields are separated by \t
1 a
2 b
3 c
4 d,
Clearly, the last line is not separated by \t
hive> load data local inpath '/usr/lib/hive2.3.6/1.txt' into table student;
Loading data to table hive1101.student
OK
Time taken: 0.868 seconds
hive> select * from student;
OK
1 a
2 b
3 c
NULL NULL
Time taken: 0.17 seconds, Fetched: 4 row(s)
You can see that the last line was not parsed correctly: both of its columns are NULL.
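Because mismatched lines load without an error, one simple way to spot them afterwards is to look for rows whose fields came back NULL. A minimal sketch against the student table from this test, assuming id is never legitimately NULL in the source data:

```sql
-- Rows that could not be parsed with the declared row format show up with NULL fields.
SELECT * FROM student WHERE id IS NULL;

-- Count them to get a quick measure of how much of the file did not match.
SELECT count(*) FROM student WHERE id IS NULL;
```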
Drop Table
DROP TABLE [IF EXISTS] table_name [PURGE]; -- (Note: PURGE available in Hive 0.14.0 and later)
Truncate Table
Removes all rows from a table or from a partition
TRUNCATE TABLE table_name [PARTITION partition_spec];
partition_spec:
  : (partition_column = partition_col_value, partition_column = partition_col_value, ...)
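A short sketch using the managed tables created earlier; the partition value 'p1' is hypothetical:

```sql
-- Removes every row from student but keeps the table definition.
TRUNCATE TABLE student;

-- Removes only the rows of one partition of student_p ('p1' is a made-up value).
TRUNCATE TABLE student_p PARTITION (part = 'p1');
```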
Alter Table
Changes the attributes of a table
Rename Table
ALTER TABLE table_name RENAME TO new_table_name;
Alter Table Properties
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
table_properties:
  : (property_name = property_value, property_name = property_value, ... )
Alter Table Comment
ALTER TABLE table_name SET TBLPROPERTIES ('comment' = new_comment);
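The rename, property, and comment statements can be tried together. This sketch reuses student1 from the earlier examples; the new name and the property values are hypothetical:

```sql
-- Rename the table (student_copy is a made-up name).
ALTER TABLE student1 RENAME TO student_copy;

-- Set an arbitrary property and the table comment.
ALTER TABLE student_copy SET TBLPROPERTIES ('edit_by' = 'example');
ALTER TABLE student_copy SET TBLPROPERTIES ('comment' = 'same structure as student');

-- Verify what was stored.
SHOW TBLPROPERTIES student_copy;
```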
Add SerDe Properties
ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];
ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties;
serde_properties:
  : (property_name = property_value, property_name = property_value, ... )
Alter Column
Change Column Name/Type/Position/Comment
Modifies a column's name, type, position, or comment
ALTER TABLE table_name [PARTITION partition_spec] CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT];
Examples
CREATE TABLE test_change (a int, b int, c int);
-- First change column a's name to a1.
ALTER TABLE test_change CHANGE a a1 INT;
-- Next change column a1's name to a2, its data type to string, and put it after column b.
ALTER TABLE test_change CHANGE a1 a2 STRING AFTER b;
-- The new table's structure is: b int, a2 string, c int.
-- Then change column c's name to c1, and put it as the first column.
ALTER TABLE test_change CHANGE c c1 INT FIRST;
-- The new table's structure is: c1 int, b int, a2 string.
-- Add a comment to column a2, keeping its name and type.
ALTER TABLE test_change CHANGE a2 a2 STRING COMMENT 'this is column a2';
Add/Replace Columns
Adds columns to a table or replaces its column list
ALTER TABLE table_name
  [PARTITION partition_spec] -- (Note: Hive 0.14.0 and later)
  ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)
  [CASCADE|RESTRICT] -- (Note: Hive 1.1.0 and later)
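A small sketch of both forms on a hypothetical table (test_columns and its columns are made up for illustration):

```sql
CREATE TABLE test_columns (a INT, b STRING);

-- ADD COLUMNS appends new columns after the existing ones.
ALTER TABLE test_columns ADD COLUMNS (c INT COMMENT 'added later');

-- REPLACE COLUMNS swaps in a whole new column list; here it leaves out c again.
ALTER TABLE test_columns REPLACE COLUMNS (a INT, b STRING);

-- Check the resulting structure.
DESCRIBE test_columns;
```

Both statements change the table's metadata; the data files already stored for the table are not rewritten.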
Index
Create Index
CREATE INDEX index_name
  ON TABLE base_table_name (col_name, ...)
  AS index_type
  [WITH DEFERRED REBUILD]
  [IDXPROPERTIES (property_name=property_value, ...)]
  [IN TABLE index_table_name]
  [
     [ ROW FORMAT ...] STORED AS ...
     | STORED BY ...
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (...)]
  [COMMENT "index comment"];
Drop Index
DROP INDEX [IF EXISTS] index_name ON table_name;
Alter Index
ALTER INDEX index_name ON table_name [PARTITION partition_spec] REBUILD;
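The three index statements fit together roughly as below. This is a sketch only, assuming a Hive 2.x installation such as the 2.3.6 used in this post (indexing was removed in Hive 3.0); the index name is hypothetical.

```sql
-- Create a compact index on student.id without building it yet.
CREATE INDEX student_id_idx ON TABLE student (id)
AS 'COMPACT' WITH DEFERRED REBUILD;

-- Build (or later refresh) the index data.
ALTER INDEX student_id_idx ON student REBUILD;

-- List the indexes defined on the table.
SHOW FORMATTED INDEX ON student;

-- Remove the index when it is no longer needed.
DROP INDEX IF EXISTS student_id_idx ON student;
```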
Show
Show Databases
SHOW (DATABASES|SCHEMAS) [LIKE 'identifier_with_wildcards'];
Show Tables
SHOW TABLES [IN database_name] ['identifier_with_wildcards'];
Show Table Properties
SHOW TBLPROPERTIES tblname;
SHOW TBLPROPERTIES tblname("foo");
Show Create Table
SHOW CREATE TABLE ([db_name.]table_name|view_name);
Show Indexes
SHOW [FORMATTED] (INDEX|INDEXES) ON table_with_index [(FROM|IN) db_name];
Show Columns
SHOW COLUMNS (FROM|IN) table_name [(FROM|IN) db_name];
Examples
-- SHOW COLUMNS
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE foo(col1 INT, col2 INT, col3 INT, cola INT, colb INT, colc INT, a INT, b INT, c INT);

-- SHOW COLUMNS basic syntax
SHOW COLUMNS FROM foo; -- show all column in foo
SHOW COLUMNS FROM foo "*"; -- show all column in foo
SHOW COLUMNS IN foo "col*"; -- show columns in foo starting with "col" OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS FROM foo '*c'; -- show columns in foo ending with "c" OUTPUT c,colc
SHOW COLUMNS FROM foo LIKE "col1|cola"; -- show columns in foo either col1 or cola OUTPUT col1,cola
SHOW COLUMNS FROM foo FROM test_db LIKE 'col*'; -- show columns in foo starting with "col" OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS IN foo IN test_db LIKE 'col*'; -- show columns in foo starting with "col" (FROM/IN same) OUTPUT col1,col2,col3,cola,colb,colc

-- Non existing column pattern resulting in no match
SHOW COLUMNS IN foo "nomatch*";
SHOW COLUMNS IN foo "col+"; -- + wildcard not supported
SHOW COLUMNS IN foo "nomatch";
There are many more statements; see the official documentation.
References:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL official website
https://ask.hellobi.com/blog/wujiadong/9483
https://blog.csdn.net/xiaozelulu/article/details/81585867