Practice first: let's get started directly.
1. Hive table and data preparation
Create the table and insert the initial data.
hive> use test;
hive> create table kwang_test (id int, name string);
hive> insert into kwang_test values(1,'kwang');
hive> insert into kwang_test values(2,'rzheng');
hive> select * from kwang_test;
OK
1 kwang
2 rzheng
2. insert into operation
insert into syntax:
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...]
Insert a row into kwang_test with insert into. The statement and the query result are as follows.
hive> insert into table kwang_test values(3,'kk');
hive> select * from kwang_test;
OK
1 kwang
2 rzheng
3 kk
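The syntax above also allows several value rows in one statement. Here is a minimal sketch of that form, assuming a Hive version that supports INSERT ... VALUES (0.14 or later). The table kwang_test_multi is a hypothetical scratch table, used so that the walkthrough table kwang_test is left unchanged.

```sql
-- Hypothetical scratch table with the same columns as kwang_test.
create table kwang_test_multi (id int, name string);

-- One statement, two value rows; both rows are appended to the table.
insert into table kwang_test_multi values (1,'kwang'), (2,'rzheng');

-- Running the same kind of statement again appends more rows;
-- the rows inserted earlier are kept.
insert into table kwang_test_multi values (3,'kk');

select * from kwang_test_multi;
```

After these statements the query should return all three rows, since insert into never removes existing data.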
3. insert overwrite operation
insert overwrite syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
The standard insert overwrite syntax inserts data through a select statement; for convenience, the values are inserted directly here. Insert a row into kwang_test with insert overwrite. The statement and the query result are as follows.
hive> insert overwrite table kwang_test values(4,'zz');
hive> select * from kwang_test;
OK
4 zz
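For comparison, the select-based form from the syntax line can be sketched as follows. This is an illustration only: kwang_test_src is a hypothetical staging table that does not appear in the walkthrough, and the rows in it are made up for the example.

```sql
-- Hypothetical staging table holding the rows that should replace kwang_test.
create table kwang_test_src (id int, name string);
insert into table kwang_test_src values (5,'aa'), (6,'bb');

-- Standard form: the new contents of kwang_test come from a select statement.
-- All rows previously in kwang_test are discarded.
insert overwrite table kwang_test select id, name from kwang_test_src;

select * from kwang_test;
```

After the overwrite, the query should return only the rows copied from the staging table.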
4. The similarities and differences between the two
From the results above, we can see the basic similarities and differences between insert into and insert overwrite. Both insert data into a Hive table, but insert into appends the new data to the end of the table, while insert overwrite replaces the existing data: it deletes the data in the table and then performs the write. Note that if the Hive table is partitioned, insert overwrite rewrites only the data of the current partition; the data in other partitions is not rewritten.
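The partition behavior described above can be sketched as follows. The table kwang_part_test, the partition column dt, and the date values are all hypothetical, chosen only for this illustration; the overwrite uses the select form and copies whatever rows kwang_test currently holds.

```sql
-- Hypothetical partitioned table, partitioned by a date string.
create table kwang_part_test (id int, name string) partitioned by (dt string);

-- Load one row into each of two partitions.
insert into table kwang_part_test partition (dt='2020-01-01') values (1,'kwang');
insert into table kwang_part_test partition (dt='2020-01-02') values (2,'rzheng');

-- Overwrite only the dt='2020-01-01' partition with the rows from kwang_test.
insert overwrite table kwang_part_test partition (dt='2020-01-01')
select id, name from kwang_test;

-- The dt='2020-01-02' partition should still contain (2,'rzheng').
select * from kwang_part_test;
```

Only the partition named in the partition clause is rewritten, so the row in dt='2020-01-02' should survive the overwrite.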
[References]
[1]. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML