[Big Data] Insert multiple pieces of data into the Hive table

In Hive, we can use the INSERT INTO statement to insert data into a table. When we need to insert multiple rows at once, there are several ways to do it. This article introduces three of them and provides a code example for each.

1. Use a single INSERT INTO statement to insert multiple pieces of data

The easiest way is to use a single INSERT INTO statement with a list of value tuples, which inserts several rows into the table at one time. The general form is:

INSERT INTO table_name
VALUES (value1, value2, ...),
       (value1, value2, ...),
       ...;

For example, suppose we have a table named employees with columns for the employee's name and age. We can use the following statement to insert several rows into the table:

INSERT INTO employees
VALUES ('John', 30),
       ('Alice', 25),
       ('Bob', 35);
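
To run the example above end to end, the table has to exist first. The following is a minimal sketch, assuming a two-column schema (name as STRING, age as INT); the article only says the table holds names and ages, so the column types are an assumption. INSERT ... VALUES is available from Hive 0.14 onward.

```sql
-- Assumed schema: the column types are not given in the article.
CREATE TABLE IF NOT EXISTS employees (
  name STRING,
  age  INT
);

-- One statement, three rows.
INSERT INTO TABLE employees
VALUES ('John', 30),
       ('Alice', 25),
       ('Bob', 35);

-- Check what was written.
SELECT name, age FROM employees;
```

Each tuple must supply a value for every column, in the order the columns were declared.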

2. Use the INSERT INTO SELECT statement to insert multiple pieces of data

Another way is to use the INSERT INTO SELECT statement, which selects records from another table or from a query result and inserts them into the target table. The general form is:

INSERT INTO table_name
SELECT column1, column2, ...
FROM source_table
WHERE condition;

Suppose we have a temporary table named employees_temp with columns for the employee's name and age. We can insert the data from the employees_temp table into the employees table with the following statement:

INSERT INTO employees
SELECT name, age
FROM employees_temp;
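
The WHERE clause from the general form can be used to copy only part of the source table. The following is a sketch under assumptions: the employees_temp schema and the age threshold are hypothetical and chosen only for illustration.

```sql
-- Assumed staging table with the same two columns as employees.
CREATE TABLE IF NOT EXISTS employees_temp (
  name STRING,
  age  INT
);

-- Append only the staging rows that pass the filter
-- (the threshold 18 is an arbitrary illustrative value).
INSERT INTO TABLE employees
SELECT name, age
FROM employees_temp
WHERE age >= 18;
```

Hive matches the selected columns to the target table's columns by position, not by name, so the SELECT list should follow the column order of the target table.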

3. Use the LOAD DATA statement to insert multiple pieces of data

If our data is already stored in a file, we can use the LOAD DATA statement to load the contents of the file into the Hive table. The general form is:

LOAD DATA [LOCAL] INPATH 'file_path'
[OVERWRITE] INTO TABLE table_name;

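Here, the LOCAL keyword means that the data is loaded from the local file system, file_path is the path of the file, and the OVERWRITE keyword means that the existing data in the target table is overwritten. When LOCAL is omitted, the path refers to HDFS and the file is moved into the table's storage location; when OVERWRITE is omitted, the file is added to the data already in the table.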

For example, assuming our data file is named data.txt and is stored in the HDFS directory /user/hive/data/, we can use the following statement to load the data in the file into the employees table:

LOAD DATA INPATH '/user/hive/data/data.txt'
OVERWRITE INTO TABLE employees;
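
LOAD DATA only copies or moves the file; it does not parse or validate its contents, so the table's storage format has to match the file's layout. The following is a minimal sketch, assuming a comma-separated text file at a hypothetical local path and a hypothetical table named employees_csv.

```sql
-- Assumed layout: a text-backed table whose fields are comma-separated,
-- so a line such as  John,30  maps to (name, age).
CREATE TABLE IF NOT EXISTS employees_csv (
  name STRING,
  age  INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- LOCAL reads from the local file system; the path is hypothetical.
-- Without OVERWRITE, the file is added to the existing data.
LOAD DATA LOCAL INPATH '/tmp/data.txt'
INTO TABLE employees_csv;
```

If the delimiter declared on the table does not match the file, the load itself still succeeds, but queries typically return NULL for the fields that cannot be parsed.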

4. Summary

This article described three ways to insert multiple rows into a Hive table: using a single INSERT INTO statement, using an INSERT INTO SELECT statement, and using a LOAD DATA statement. Depending on the requirements and on where the data comes from, we can choose whichever way fits best. Hope the content of this article is helpful to you!

Note: Hive is a data warehouse tool built on top of Hadoop, and all data is stored in Hadoop's distributed file system. Before executing INSERT INTO statements or LOAD DATA statements, make sure your data is prepared and accessible through Hadoop's file system.

