Hadoop ecosystem: Hive (4) - Hive partitions: dynamic partitioning and static partitioning

Thanks to:

http://bbs.elecfans.com/jishu_1600211_1_1.html

https://www.deeplearn.me/1536.html

Dynamic partitioning:

When data is inserted into a partitioned table, the values of the partition columns (all of them or some of them) are not specified. Which partition each row goes into is determined by the values in the data itself.

Static partition:

When data is inserted into a partitioned table, the values of all partition columns are specified manually, and those values determine which partition the data is inserted into.

Dynamic partitioning relevant attributes:

hive.exec.dynamic.partition = true: whether to allow dynamic partitioning

hive.exec.dynamic.partition.mode = strict: the partitioning mode

  strict: at least one partition column must be static

  nonstrict: all partition columns may be dynamic

hive.exec.max.dynamic.partitions = 1000: maximum total number of dynamic partitions that may be created

hive.exec.max.dynamic.partitions.pernode = 100: maximum number of dynamic partitions that each mapper / reducer node may create
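As a minimal sketch, these properties can be set per session from the Hive CLI or Beeline. The values below simply repeat the ones listed above; they are not tuning recommendations.

```sql
-- Session-level settings; the values mirror the ones listed above.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;

-- Issuing `set` with only the property name prints its current value.
set hive.exec.dynamic.partition.mode;
```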

Dynamic partitioning operations

## Create a temporary table

create table if not exists tmp(uid int,commentid bigint,recommentid bigint,year int,month int,day int)
row format delimited fields terminated by '\t';

## Load data into the temporary table

load data local inpath '/root/Desktop/comm' into table tmp;

1. Strict mode

## Create a dynamic partition table

create table if not exists dyp1(uid int,commentid bigint,recommentid bigint)
partitioned by(year int,month int,day int)
row format delimited fields terminated by '\t';

## Insert data into the dynamic partition table (strict mode)

insert into table dyp1 partition(year=2016,month,day)
select uid,commentid,recommentid,month,day from tmp;
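In the insert above, year is static while month and day are dynamic. Hive matches dynamic partition columns by position, so they must be the last columns of the select list, in the same order as in the partition clause. For contrast, a fully static insert into the same table might look like the sketch below. The date 2016-01-01 and the where filter are illustrative assumptions, not part of the original example.

```sql
-- Static partition: every partition column is given a literal value,
-- so the select list contains only the non-partition columns.
-- The values 2016 / 1 / 1 are placeholders chosen for illustration.
insert into table dyp1 partition(year=2016,month=1,day=1)
select uid,commentid,recommentid from tmp
where year=2016 and month=1 and day=1;
```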

2. Non-strict mode

## Set dynamic partitioning to non-strict mode

set hive.exec.dynamic.partition.mode=nonstrict;

## Create a dynamic partition table

create table if not exists dyp2(uid int,commentid bigint,recommentid bigint)
partitioned by(year int,month int,day int)
row format delimited fields terminated by '\t';

## Load data with dynamic partitioning in non-strict mode

insert into table dyp2 partition(year,month,day)
select uid,commentid,recommentid,year,month,day from tmp;
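One way to check the result of the insert is to list the partitions Hive created and read one of them back. This is only a sketch; the partition values in the second query are placeholders and depend on what the loaded data actually contains.

```sql
-- List the partitions that the dynamic insert created.
show partitions dyp2;

-- Read back a single partition; 2016 / 1 / 1 are placeholder values.
select uid,commentid,recommentid
from dyp2
where year=2016 and month=1 and day=1
limit 10;
```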
 

Partitioning notes

(1) Avoid dynamic partitioning where possible. With dynamic partitioning, reducers are allocated for each partition, so when the number of partitions is large the number of reducers grows with it, which can be a disaster for the servers.

(2) The difference between static and dynamic partitions: a static partition is created whether or not there is any data for it, whereas a dynamic partition is created only when the result set contains rows for it.

(3) The strict mode of Hive dynamic partitioning versus the strict mode Hive provides through hive.mapred.mode:

Hive provides a strict mode to prevent users from accidentally submitting harmful HQL.

hive.mapred.mode: nonstrict or strict

If the mode is strict, the following three kinds of queries are blocked:

(1) Queries on a partitioned table whose WHERE clause does not filter on a partition column.

(2) Joins that produce a Cartesian product, that is, join queries with no ON condition or WHERE condition.

(3) ORDER BY queries that have no LIMIT clause.
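The three cases can be sketched against the tmp and dyp2 tables created earlier. The queries themselves are illustrative assumptions, and the exact behavior and error messages depend on the Hive version.

```sql
set hive.mapred.mode=strict;

-- (1) Rejected: dyp2 is partitioned, but no partition column is filtered.
select * from dyp2 where uid = 1;
-- Accepted: the where clause restricts a partition column.
select * from dyp2 where year = 2016 and uid = 1;

-- (2) Rejected: join with no on condition (Cartesian product).
select a.uid, b.uid from tmp a join tmp b;
-- Accepted: the join has an equality condition.
select a.uid, b.commentid from tmp a join tmp b on a.uid = b.uid;

-- (3) Rejected: order by without limit.
select uid from tmp order by uid;
-- Accepted: limit bounds the sorted result.
select uid from tmp order by uid limit 100;
```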


All transactions are committed automatically

Only the ORC format is supported

Bucketed tables must be used

Hive parameters must be configured to support transactions


Origin www.cnblogs.com/Jing-Wang/p/10992797.html