Getting started with Hive: partition tables

1. Concept

Each partition of a Hive partition table corresponds to a separate directory on the HDFS file system, and all data files of that partition live under that directory. Partitioning in Hive is therefore a way of splitting data by directory: a large data set is divided into smaller data sets according to business needs. When a query names the partitions it needs in its WHERE clause, Hive reads only those directories instead of scanning the whole table, which makes the query much more efficient.
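
To make the directory mapping concrete, here is a small Python sketch. It is only a model of the idea, not Hive code: the helper names `partition_dir` and `prune` are made up for illustration, and the warehouse root is assumed to be `/user/hive/warehouse`, the path used later in this article.

```python
import posixpath

# Assumed warehouse root (the path that appears later in this article).
WAREHOUSE = "/user/hive/warehouse"

def partition_dir(table, spec):
    """Return the directory that would hold one partition's data files."""
    # Each partition column becomes one "column=value" directory level.
    segments = [f"{col}={val}" for col, val in spec.items()]
    return posixpath.join(WAREHOUSE, table, *segments)

def prune(partitions, wanted):
    """Keep only the partitions that match every column in `wanted`."""
    return [p for p in partitions
            if all(p.get(col) == val for col, val in wanted.items())]

# Three partitions exist, but the "query" asks for only one month.
all_partitions = [{"month": "202001"}, {"month": "202002"}, {"month": "202003"}]
selected = prune(all_partitions, {"month": "202001"})

for spec in selected:
    print(partition_dir("dept_partition", spec))
```

Only one of the three directories is selected, which is the reason a WHERE clause on the partition column avoids a full scan.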

2. Create a partition table

create table dept_partition(
id int, name string
)
partitioned by (month string)
row format delimited fields terminated by '\t'
stored as textfile;

3. Load data into the partition table

The file srcdata.txt is located under /home/hive. Load it into a partition:

load data local inpath '/home/hive/srcdata.txt' into table default.dept_partition partition(month='202001');

4. Partition data query

  1. Single partition query:
 select * from dept_partition where month = '202001';

  2. Multi-partition union query:
 select * from dept_partition where month = '202001'
 union
 select * from dept_partition where month = '202002';
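
One detail worth knowing about the query above: in current Hive versions a plain `union` behaves as `union distinct`, so duplicate rows are removed, while `union all` keeps them. The Python sketch below only models that difference with made-up rows (the tuples are hypothetical, not the contents of srcdata.txt). Because `select *` includes the partition column, rows from different months never collide, but identical rows inside one partition do.

```python
# Hypothetical rows as (id, name, month); the month value comes from the partition.
rows_202001 = [(1, "a", "202001"), (1, "a", "202001")]  # same row loaded twice
rows_202002 = [(1, "a", "202002")]

# Model of "union all": every row is kept.
union_all = rows_202001 + rows_202002

# Model of "union" (distinct): exact duplicate rows collapse into one.
union_distinct = sorted(set(rows_202001 + rows_202002))

print(len(union_all))
print(len(union_distinct))
```

If duplicates must be preserved, `union all` is the form to use in the real query.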

5. Add partition

  • Create a single partition:
alter table dept_partition add partition(month='202008');
  • Create multiple partitions:
alter table dept_partition add partition(month='202005') partition(month='202006');

6. Delete the partition

  • Delete a single partition
alter table dept_partition drop partition (month='202008');

  • Delete multiple partitions
alter table dept_partition drop partition(month='202005'),partition(month='202006');

Note: when deleting multiple partitions, the partition clauses must be separated by commas; when adding multiple partitions, they are separated by spaces only, without commas.
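
The separator difference is easy to get wrong when statements are generated by a script. The following Python sketch shows one way to build both statements; the function names are hypothetical, and the code does no quoting or escaping of values, so it is an illustration rather than something to use on untrusted input.

```python
def partition_clause(spec):
    """Render one clause, e.g. partition(month='202005')."""
    body = ", ".join(f"{col}='{val}'" for col, val in spec.items())
    return f"partition({body})"

def add_partitions_sql(table, specs):
    # ADD: clauses are separated by spaces, no commas.
    clauses = " ".join(partition_clause(s) for s in specs)
    return f"alter table {table} add {clauses};"

def drop_partitions_sql(table, specs):
    # DROP: clauses are separated by commas.
    clauses = ", ".join(partition_clause(s) for s in specs)
    return f"alter table {table} drop {clauses};"

months = [{"month": "202005"}, {"month": "202006"}]
print(add_partitions_sql("dept_partition", months))
print(drop_partitions_sql("dept_partition", months))
```

The two printed statements match the add and delete examples shown above, apart from a space after the comma.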

7. List the partitions of the partition table

show partitions dept_partition;

8. View the partition table structure

desc formatted dept_partition;

9. Secondary partition table

  • Create a secondary partition table:
create table dept_partition2(
id int, name string
)
partitioned by (month string, day string)
row format delimited fields terminated by '\t'
stored as textfile;

  • Load data into a partition:
load data local inpath '/home/hive/srcdata.txt' into table dept_partition2 partition(month='202011', day='01');

  • Query partition data:
select * from dept_partition2 where month='202011' and day='01';

10. Upload data directly to the partition directory: three ways to associate the data with the partition table

  1. Fix after uploading data

First create a partition table:

create table dept_partition2(
id int, name string
)
partitioned by (month string, day string)
row format delimited fields terminated by '\t'
stored as textfile;

Add partition:

alter table dept_partition2 add partition(month='202001',day='01');

Upload data:

hadoop fs -put srcdata.txt /user/hive/warehouse/dept_partition2/month=202001/day=01

The data can then be queried directly.

  2. Add partition after uploading data

Alternatively, create the partition directory at the corresponding location (as in step 1), upload the data, and add the partition last:

dfs -mkdir -p /user/hive/warehouse/dept_partition2/month=202002/day=02;
hadoop fs -put srcdata.txt /user/hive/warehouse/dept_partition2/month=202002/day=02/
alter table dept_partition2 add partition(month='202002',day='02');

  3. Create the partition folder, then load data into the partition

alter table dept_partition2 add partition(month='202003',day='03');
load data local inpath '/home/hive/srcdata.txt' into table dept_partition2 partition(month='202003',day='03');
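
All three approaches come down to the same requirement: the directory on HDFS and the partition recorded in Hive's metadata must both exist. The Python sketch below models that check on a hard-coded list of paths. The function name and the `registered` list are assumptions for illustration only; in Hive itself, `msck repair table` performs the real version of this reconciliation.

```python
TABLE_ROOT = "/user/hive/warehouse/dept_partition2"

def parse_partition_path(path, table_root=TABLE_ROOT):
    """Turn <root>/month=202002/day=02 into {'month': '202002', 'day': '02'}."""
    if not path.startswith(table_root):
        raise ValueError(f"{path} is not under {table_root}")
    relative = path[len(table_root):].strip("/")
    spec = {}
    for segment in relative.split("/"):
        col, _, val = segment.partition("=")
        spec[col] = val
    return spec

# Directories assumed to exist on HDFS after the uploads above.
on_hdfs = [
    TABLE_ROOT + "/month=202001/day=01",
    TABLE_ROOT + "/month=202002/day=02/",
]

# Partitions assumed to be known to Hive so far (hypothetical state).
registered = [{"month": "202001", "day": "01"}]

# Directories with data that still need an "add partition".
missing = [spec for spec in map(parse_partition_path, on_hdfs)
           if spec not in registered]
print(missing)
```

In this model the second directory has data but no registered partition, which is the situation in approach 2 before the final `alter table ... add partition` is run.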

Origin blog.csdn.net/thetimelyrain/article/details/104135494