Thirty-Two, Hive's DML Data Operations

The last article covered Hive's DDL operations, which are very similar to their counterparts in ordinary databases. Where there is DDL, there must be DML. This article introduces Hive's DML data operations, which fall into two parts: data import and data export. Let's take a look together. Follow the column "Break the Cocoon and Become a Butterfly - Big Data" for more related content~


Table of Contents

One, Data Import

1.1 Use the load command to load data

1.1.1 Syntax

1.1.2 Example

1.2 Use the insert command to load data

1.2.1 Syntax

1.2.2 Example

1.3 Load data with create+select

1.4 Specify the data storage path with location when creating the table

1.5 Use the import command to import data into Hive

Two, Data Export

2.1 Use the insert command to export data

2.2 Use HDFS commands to get data

2.3 Use the hive command to export data

2.4 Use export to export data

2.5 Export data using Sqoop

Three, Clear Data


 

One, Data Import

1.1 Use the load command to load data

1.1.1 Syntax

load data [local] inpath 'path_name' [overwrite] into table table_name [partition (partcol1=val1,…)];

Here, local means the data is loaded from the local file system into the Hive table; without local, the data is loaded from HDFS instead. path_name is the path where the data is stored. overwrite replaces the existing data in the table; otherwise the data is appended. table_name is the name of the Hive table, and partition loads the data into the specified partition.

1.1.2 Example

1. Construct the test data set. We still use the file from before, with a comma as the separator, as shown below:
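
The original screenshot is not reproduced here. A p.txt in that layout might look like the following (the values are hypothetical, matching the id/name/sex columns used below):

1,xzw,male
2,lzq,female
3,yxy,male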

2. Create the Hive table

create table people(id string, name string, sex string) row format delimited fields terminated by ',';

3. Load a local file into the Hive table

load data local inpath '/root/files/p.txt' into table people;

4. Load a file on HDFS into the Hive table

load data inpath '/xzw/files/p.txt' into table people;

5. Load data, overwriting the existing data in the Hive table

load data inpath '/xzw/files/p.txt' overwrite into table people;
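
The partition clause from the syntax in 1.1.1 can be combined with load as well. A minimal sketch, assuming the partitioned emp table created in section 1.2.2 below and a hypothetical local file:

load data local inpath '/root/files/emp.txt' into table emp partition(rq='202012');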

1.2 Use the insert command to load data

1.2.1 Syntax

insert [into|overwrite] table table_name values(...);

insert into appends data to a table or partition, leaving the original data untouched. insert overwrite replaces the existing data in the table or partition. Also note that insert does not support inserting only a subset of the columns.

1.2.2 Example

1. Create a partitioned employee table, partitioned by date (rq).

create table emp(id int, name string) partitioned by (rq string) row format delimited fields terminated by ',';

2. Insert data

insert into table emp partition(rq='202012') values(1,'xzw'),(2,'lzq');

3. Use insert+select to insert data

insert overwrite table emp partition(rq='202011') select id,name from emp where rq = '202012';

At this point you may notice what looks like duplicated data even though overwrite was used. This is because emp is a partitioned table: only the specified partition (rq='202011') is overwritten, while the rows in rq='202012' remain, as shown below.
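
To verify this, query both partitions and confirm that only rq='202011' was rewritten; a quick check:

select * from emp where rq in ('202011','202012');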

4. Multi-table, multi-partition insert: a single from emp clause feeds several insert targets in one pass

from emp
insert overwrite table emp partition(rq='202011')
select id, name where rq='202012'
insert overwrite table emp partition(rq='202010')
select id, name where rq='202012';

1.3 Load data with create+select

create table emp_tmp as select * from emp;
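
create+select can also filter or reshape the data on the way in, since the new table is populated from the query result rather than created empty. A sketch with a hypothetical table name:

create table emp_2012 as select id, name from emp where rq = '202012';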

1.4 Specify the data storage path with location when creating the table

create external table e_people(id string, name string, sex string) row format delimited fields terminated by ',' location '/xzw/files';
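
Since the table points at '/xzw/files', any comma-delimited file placed in that directory becomes queryable right away. A minimal sketch using the standard HDFS put command (the local path is hypothetical), followed by a query in Hive:

hdfs dfs -put /root/files/p.txt /xzw/files

select * from e_people;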

1.5 Use the import command to import data into Hive

The import command works on data that was exported with the export command. The syntax is as follows:

import table table_name [partition(...)] from 'export_data_path';
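
A minimal sketch pairing import with the export from section 2.4 below; the target table name is hypothetical:

import table people_bak from '/xzw/files/people';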

Two, Data Export

2.1 Use the insert command to export data

insert overwrite local directory '/root/files/emp_data' row format delimited fields terminated by ',' select * from emp;

If local is removed from the command above, the data is exported to HDFS instead, as shown below.
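
For example, the HDFS variant (the output path is hypothetical):

insert overwrite directory '/xzw/files/emp_data' row format delimited fields terminated by ',' select * from emp;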

2.2 Use HDFS commands to get data

Find the storage path of the corresponding data on HDFS and use the HDFS get command to copy the required data to the local file system.
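
A minimal sketch, assuming the people table lives under Hive's default warehouse path /user/hive/warehouse (adjust to your actual location):

hdfs dfs -get /user/hive/warehouse/people/p.txt /root/files/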

2.3 Use the hive command to export data

Use hive -e or hive -f to export data. For details, see my other post: "Hive calls sql files through -f and transfers parameters".
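
A minimal sketch of the -e form, redirecting the query result to a local file (the paths are hypothetical):

hive -e 'select * from people;' > /root/files/people_data.txt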

2.4 Use export to export data

export table people to '/xzw/files/people';

2.5 Export data using Sqoop

Exporting data with Sqoop will be covered in a later article, so I won't go into it here.

Three, Clear Data

Use the following command to clear the data in a Hive table. Note that truncate only works on internal (managed) tables; it cannot clear the data of an external table.

truncate table table_name;
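
For an external table such as the e_people table from section 1.4, a common workaround (assuming you really do want to delete the underlying data) is to remove the files directly on HDFS:

hdfs dfs -rm -r /xzw/files/*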

 

OK~ That wraps up Hive's DML data operations. What problems did you run into along the way? Feel free to leave a message and share them~

Origin blog.csdn.net/gdkyxy2013/article/details/111152504