Hive dynamic partitions and the LOAD / INSERT syntax

Hive LOAD usage:
load data local inpath '/home/data/stg_activity_antirush_apply.txt' 
overwrite into table stg_activity_antirush_apply;  
The LOCAL keyword: without LOCAL the path refers to HDFS and the data is loaded from there; with LOCAL the data is loaded from the local filesystem.
The OVERWRITE keyword makes this an overwrite load: any data already in the table is replaced.
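
For comparison, a minimal sketch of both forms; the HDFS path below is an assumption, and the table is the same staging table as in the example above:

-- Without LOCAL the path refers to HDFS and the file is moved into the table's directory;
-- without OVERWRITE the rows are appended to whatever the table already holds.
load data inpath '/user/hive/stage/stg_activity_antirush_apply.txt'
into table stg_activity_antirush_apply;

-- With LOCAL the path refers to the local filesystem (the file is copied),
-- and OVERWRITE replaces the table's existing data.
load data local inpath '/home/data/stg_activity_antirush_apply.txt'
overwrite into table stg_activity_antirush_apply;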
Dynamic partition settings

These settings can be enabled per session with SET statements, or changed permanently in the configuration file (hive-site.xml), as shown in the property block below:
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions.pernode=1000;
SET hive.exec.max.dynamic.partitions=1000;
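
A quick way to confirm a setting took effect in the current session is to run SET with only the key name, which prints its current value:

SET hive.exec.dynamic.partition.mode;
-- prints: hive.exec.dynamic.partition.mode=nonstrict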


  <property>
    <name>hive.exec.dynamic.partition</name>
    <value>true</value>
    <description>Whether or not to allow dynamic partitions in DML/DDL.</description>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
    <description>
      In strict mode, the user must specify at least one static partition
      in case the user accidentally overwrites all partitions.
      In nonstrict mode all partitions are allowed to be dynamic.
    </description>
  </property>
  <property>
    <name>hive.exec.max.dynamic.partitions</name>
    <value>100000</value>
    <description>Maximum number of dynamic partitions allowed to be created in total.</description>
  </property>
  <property>
    <name>hive.exec.max.dynamic.partitions.pernode</name>
    <value>10000</value>
    <description>Maximum number of dynamic partitions allowed to be created in each mapper/reducer node.</description>
  </property>
  <property>
    <name>hive.exec.max.created.files</name>
    <value>100000</value>
    <description>Maximum number of HDFS files created by all mappers/reducers in a MapReduce job.</description>
  </property>
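
The INSERT examples below assume dw_activity_antirush_apply is partitioned on DateID. A hypothetical DDL for it (the data columns are invented for illustration; only the table name and the DateID partition key come from the examples):

-- Hypothetical schema; note that the partition column DateID is declared in
-- PARTITIONED BY, not in the regular column list.
create table if not exists dw_activity_antirush_apply (
    apply_id   string,
    user_id    string,
    apply_time string
)
partitioned by (DateID string)
stored as orc;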
  
  
  
insert overwrite table dw_activity_antirush_apply PARTITION(DateID = ${etlDate}) -- static partition (value supplied manually)
select * from tablename;

insert into table dw_activity_antirush_apply PARTITION(DateID = ${etlDate}) -- static partition (value supplied manually)
select * from tablename;


insert overwrite table dw_activity_antirush_apply PARTITION(DateID) -- dynamic partition
select *, column_name as dateid from tablename; -- the partition value is taken from the last column of the SELECT

INSERT OVERWRITE replaces any data that already exists in the table or partition.
INSERT INTO appends data to the table or partition; the existing data is not deleted.
INSERT can write into a table or a partition; if the table is partitioned, the INSERT must specify which partition to write to.
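
Tying this back to strict mode: with a hypothetical table partitioned on two levels, the higher-level partition can be given statically while the lower one stays dynamic, which satisfies strict mode's requirement of at least one static partition. The table and column names below are assumptions for illustration only:

-- Mixed static/dynamic insert: Country is fixed, DateID is filled per row from
-- the last column of the SELECT (dynamic partition columns are matched by
-- position, not by name).
insert overwrite table dw_activity_antirush_apply_bycountry
partition (Country = 'CN', DateID)
select
    apply_id,
    user_id,
    apply_time,
    to_date(apply_time) as DateID
from stg_activity_antirush_apply;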


Reprinted from blog.csdn.net/AntKengElephant/article/details/83184347