A Summary of Hive Data Import Methods

Reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual

LOAD DATA syntax:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

Hive 3.0 and later also accept an optional input format and SerDe:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)] [INPUTFORMAT 'inputformat' SERDE 'serde']

Creating a regular (non-partitioned) table:

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name (
  c1 string,
  c2 string)
row format delimited
fields terminated by '\u0001'
STORED AS textfile;

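Notes:

Specifying EXTERNAL creates an external table.

Hive's default field delimiter is '\001' (Ctrl-A).

If LOCATION is not specified, the table is stored under the warehouse directory, which defaults to '/user/hive/warehouse'.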
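
To make the notes above concrete, here is a minimal sketch that fills in the template with real names. The database demo_db, the table names, and the path /data/demo/user_log_ext are hypothetical and chosen only for illustration.

```sql
-- Hypothetical database used for the example.
CREATE DATABASE IF NOT EXISTS demo_db;

-- Managed table: no LOCATION, so the data lives under the warehouse
-- directory, and DROP TABLE removes the data as well as the metadata.
CREATE TABLE IF NOT EXISTS demo_db.user_log (
  c1 string,
  c2 string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE;

-- External table: Hive tracks only the metadata, so DROP TABLE
-- leaves the files under the (hypothetical) LOCATION in place.
CREATE EXTERNAL TABLE IF NOT EXISTS demo_db.user_log_ext (
  c1 string,
  c2 string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE
LOCATION '/data/demo/user_log_ext';

-- Prints the table type and the resolved storage location.
DESCRIBE FORMATTED demo_db.user_log_ext;
```

Running DESCRIBE FORMATTED on both tables is a quick way to check which one is managed and where each one actually stores its files.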

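Importing data:

load data [local] inpath 'local_path_or_hdfs_path' overwrite into table db_name.table_name;
With LOCAL, the path refers to the local filesystem and the file is copied into the table directory. Without LOCAL, the path refers to HDFS and the file is moved into the table directory, so it disappears from its original path; this is equivalent to an mv. OVERWRITE replaces any data already in the table; omit it to append.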
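
The difference between a local load and an HDFS load can be sketched as follows. This assumes a hypothetical table demo_db.user_log (columns c1 and c2, '\001'-delimited text) already exists; the file paths are made up for the example.

```sql
-- LOCAL: the file is read from the local filesystem and copied into
-- the table directory; /tmp/user_log.txt (hypothetical) stays in place.
-- OVERWRITE discards whatever the table held before.
LOAD DATA LOCAL INPATH '/tmp/user_log.txt'
OVERWRITE INTO TABLE demo_db.user_log;

-- No LOCAL: the path is on HDFS and the file is moved, not copied,
-- so /staging/user_log.txt (hypothetical) is gone from its source
-- path afterwards. Without OVERWRITE the file is added to the
-- existing data.
LOAD DATA INPATH '/staging/user_log.txt'
INTO TABLE demo_db.user_log;
```

If the HDFS source must be kept, one option is to copy it to a staging path first and load the copy.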

Creating a partitioned table:

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name (
  c1 string,
  c2 string)
partitioned by (c3 string)
row format delimited
fields terminated by '\u0001'
STORED AS textfile;

Importing data:

load data [local] inpath 'local_path_or_hdfs_path' overwrite into table db_name.table_name partition (c3='c3_value');
Resulting HDFS directory layout:
hdfs://nameservice1/user/hive/warehouse/db_name.db/table_name/c3=c3_value/
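
A worked example of loading into a single partition, as a sketch only. The table demo_db.user_log_part, the file path, and the partition value '20200601' are hypothetical, and the database demo_db is assumed to exist.

```sql
-- Partitioned table; c3 is the partition column, not a data column.
CREATE TABLE IF NOT EXISTS demo_db.user_log_part (
  c1 string,
  c2 string)
PARTITIONED BY (c3 string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE;

-- The partition value is given explicitly; Hive creates the
-- c3=20200601 subdirectory under the table directory if needed.
LOAD DATA LOCAL INPATH '/tmp/user_log_20200601.txt'
OVERWRITE INTO TABLE demo_db.user_log_part PARTITION (c3='20200601');

-- Lists the partitions of the table in the form c3=<value>.
SHOW PARTITIONS demo_db.user_log_part;

-- Filtering on the partition column reads only that subdirectory.
SELECT c1, c2
FROM demo_db.user_log_part
WHERE c3 = '20200601';
```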

Creating a table with LIKE (copies the table structure only):

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];
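
For illustration, a short sketch of LIKE in use. It assumes a hypothetical source table demo_db.user_log already exists; the new table name is also made up.

```sql
-- Copies column definitions and storage format from the source table,
-- but none of its rows.
CREATE TABLE IF NOT EXISTS demo_db.user_log_copy
  LIKE demo_db.user_log;

-- Compare the two definitions.
DESCRIBE FORMATTED demo_db.user_log_copy;

-- A freshly created copy contains no data, so this counts zero rows.
SELECT COUNT(*) FROM demo_db.user_log_copy;
```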

Creating a table with AS (copies both structure and data):

create table table_name
as 
select * from example_table_name;
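
A slightly fuller sketch of CREATE TABLE ... AS SELECT follows. It assumes a hypothetical partitioned table demo_db.user_log_part (columns c1, c2, partition column c3) already exists; the target name and the choice of ORC are for illustration only.

```sql
-- The new table takes its column names and types from the SELECT list
-- and is populated with the query result in the same statement.
-- A storage format can be chosen independently of the source table.
CREATE TABLE demo_db.user_log_ctas
STORED AS ORC
AS
SELECT c1, c2
FROM demo_db.user_log_part
WHERE c3 = '20200601';
```

Listing columns explicitly instead of using select * keeps the new table's schema stable if the source table later gains columns.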

Importing from a query (INSERT INTO appends to the table; INSERT OVERWRITE replaces its existing data):

insert into table table_name select col_name... from example_table;
insert overwrite table table_name select col_name... from example_table;
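
The two statements can be sketched with concrete names as below. The tables demo_db.user_log, demo_db.user_log_copy (both with columns c1, c2) and demo_db.user_log_part (same columns, partitioned by c3) are hypothetical and assumed to exist already.

```sql
-- Appends the query result to whatever the target already holds.
INSERT INTO TABLE demo_db.user_log_copy
SELECT c1, c2 FROM demo_db.user_log;

-- Replaces the target's existing data with the query result.
INSERT OVERWRITE TABLE demo_db.user_log_copy
SELECT c1, c2 FROM demo_db.user_log;

-- For a partitioned target, name the partition explicitly (a static
-- partition); only that partition's data is replaced.
INSERT OVERWRITE TABLE demo_db.user_log_part PARTITION (c3='20200602')
SELECT c1, c2 FROM demo_db.user_log;
```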

Dynamic-partition import will be covered in a later update.

Origin blog.csdn.net/qq_24256877/article/details/106495353