Hive data loading methods (load, insert; ordinary table, partition table)

Preface

This article introduces the ways to load data into Hive tables (load, insert, and create table ... as select) and how to export query results.


Method 1: load data

Basic syntax:
load data [local] inpath '/opt/module/datas/student.txt' [overwrite] into table student [partition (partcol1=val1, ...)]

Parameter description:
1 load data: loads data into a table
2 local: loads the file from the local file system into the Hive table (the file is copied); without local, the file is loaded from HDFS (the file is moved into the table's directory)
3 inpath: the path of the data to load, which can be:
Relative path, for example: project/data1
Absolute path, for example: /user/hive/project/data1
Full URI including the scheme, for example: hdfs://namenode:9000/user/hive/project/data1
4 overwrite: overwrites the existing data in the table; without it, the data is appended. With overwrite, the content of the target table (or partition) is deleted first, and then the content of the file/directory that inpath points to is added to the table/partition.
5 into table: indicates which table to load into
6 student: the name of the target table
7 partition: loads the data into the specified partition

-- Load a local file
load data local inpath '/home/hadoop/load1.txt' into table tb_load1;

-- Load an HDFS file
load data inpath '/hive/test/load2.txt' into table tb_load1;

-- Load data into a partition
load data inpath '/hive/test/load_part_male.txt' into table tb_load2 
partition (sex='male');

-- Use overwrite: replaces the existing data
load data local inpath '/home/hadoop/load3.txt' overwrite into table tb_load1;
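
The load statements above assume that the target tables already exist and that their storage format matches the files, because load data only copies or moves files and does not transform or validate them. A minimal sketch of table definitions that would fit these examples follows; the column names (id, name) and the tab delimiter are assumptions, not taken from the original files.

```sql
-- Hypothetical schemas: the columns and the '\t' delimiter are assumptions
create table if not exists tb_load1 (
  id   int,
  name string
)
row format delimited fields terminated by '\t';

-- The partition column (sex) is declared separately from the data columns
create table if not exists tb_load2 (
  id   int,
  name string
)
partitioned by (sex string)
row format delimited fields terminated by '\t';

-- Quick checks after loading
select * from tb_load1 limit 10;
show partitions tb_load2;
```

If the delimiter in the file does not match the table definition, the load still succeeds, but the columns are typically read back as NULL.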

Method 2: insert

1. Ordinary table

-- Overwrite
insert overwrite table tb_insert1 select id,name from tb_select1;
-- Append
insert into table tb_insert1 select id,name from tb_select1;

2. Partition table

-- Insert into a static partition
insert overwrite table tb_insert_part partition(sex = 'male')
select id,name from tb_select1 where sex='male';

-- Dynamic partition insert (non-strict mode must be set first)
set hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table tb_dy_part partition(sex) 
select id,name,sex from tb_select1;
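
In a dynamic partition insert, Hive takes the partition value from the last column(s) of the select list, so sex has to come last. The sketch below shows a table definition the insert above could target and a way to check the result; the tb_dy_part schema is an assumption.

```sql
-- Hypothetical schema for the dynamic partition target table
create table if not exists tb_dy_part (
  id   int,
  name string
)
partitioned by (sex string);

-- Dynamic partitioning itself must be enabled (it is on by default in recent Hive versions)
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column (sex) is the last column of the select list
insert overwrite table tb_dy_part partition(sex)
select id, name, sex from tb_select1;

-- One partition should appear per distinct value of sex
show partitions tb_dy_part;
select sex, count(*) as cnt from tb_dy_part group by sex;
```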

Method 3: create table ... as select

Note: this method creates the table and loads the query result in a single statement. If the source table has partition fields, they are kept in the new table only as ordinary fields.

create table tb_create_mode as 
select id,name from tb_select1;
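
To illustrate the note about partition fields, the following sketch creates a table from the partitioned table tb_insert_part used earlier. The new table name tb_create_from_part is hypothetical.

```sql
-- tb_insert_part is partitioned by sex; the new table is an ordinary table
create table tb_create_from_part as
select id, name, sex from tb_insert_part;

-- sex is listed as an ordinary column, not as a partition column
describe formatted tb_create_from_part;
```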

Data output

(1) Export to local

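-- Caution: overwrite replaces everything already in the target directory
insert overwrite local directory '/home/hadoop/'
select id,name from tb_select1;

Example (without local, the result is written to an HDFS directory; here the fields are separated by commas):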

INSERT overwrite directory "/user/yuanpengfei/ypf/lifeng/vehPOI" ROW format delimited fields terminated BY ","
select substr( md5(concat('mb',field_2,'xx')),9,6), field_3, field_4, field_5, field_6, field_7
from default.longchuan_od_temp;
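
The local keyword and the row format clause can be combined. Without a row format clause, the exported fields are separated by the default ^A (\001) character, which is awkward to read in a text editor. A minimal sketch, assuming a dedicated export directory (the path is hypothetical):

```sql
-- Hypothetical target directory; anything already in it is replaced
insert overwrite local directory '/home/hadoop/export/tb_select1'
row format delimited fields terminated by ','
select id, name from tb_select1;
```

Writing to a dedicated directory rather than a home directory avoids deleting unrelated files when overwrite clears the target.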


Origin blog.csdn.net/weixin_42326851/article/details/132214145