Import and export of Hive data

1. Data import in Hive

A local file on Linux and its data format:
[Image: sample data file with comma-separated fields]
Create a table in Hive:

create table t_user(
  id int,
  name string
)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;

Common storage formats for STORED AS (an example follows the list):

1. TextFile: stores data as plain text files; this is Hive's default storage format.
2. SequenceFile: stores data as binary key-value pairs; suitable for data compression and efficient reading.
3. ORC (Optimized Row Columnar): a high-performance columnar storage format for Hive; it organizes data by column and provides a higher compression ratio and better query performance.
4. Parquet: a columnar storage format and a common choice for Hive; it supports optimizations such as high compression and predicate pushdown, and suits large-scale data analysis.
5. Avro: a cross-language data serialization system; Hive can also store table data in the Avro format.
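
As a minimal sketch (the table name t_user_orc is made up for illustration), the same table can be declared with ORC storage, and the existing textfile data can be copied into it:

-- a sketch: same columns as t_user, stored as ORC instead of textfile
create table t_user_orc(
  id int,
  name string
)
stored as orc;

-- copy the existing textfile data into the ORC table
insert into table t_user_orc select id, name from t_user;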

Load local data

load data local inpath '/home/hivedata/user.txt' into table t_user;
-- adding OVERWRITE before INTO replaces the existing data instead of appending
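
The overwrite form described in the comment looks like this:

load data local inpath '/home/hivedata/user.txt' overwrite into table t_user;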

Load data from HDFS

Note: the data file must first be uploaded from the local filesystem to HDFS.
[Image: uploading the file to HDFS]
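
A sketch of the upload step, using the paths from this article (create the target directory first if it does not exist):

hdfs dfs -mkdir -p /yan/hivedata
hdfs dfs -put /home/hivedata/user.txt /yan/hivedata/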

-- append to the existing data
load data inpath '/yan/hivedata/user.txt' into table t_user;
-- overwrite the existing data
load data inpath '/yan/hivedata/user.txt' overwrite into table t_user;

Insert data from another table into the target table

create table u1(
  id int,
  name string
);
insert into table u1 select id, name from t_user;

-- multi-table insert: put FROM first so t_user is scanned only once
-- (u2 and u3 must already exist with compatible schemas)
from t_user
insert into table u2 select *
insert into table u3 select id, name;

Clone a table

-- copy the table structure together with the data
create table u4 as select * from t_user;
-- copy only the table structure: use LIKE with the table name, no SELECT needed
create table u5 like t_user;

The difference between local data import and HDFS data import:

Local: the data file is copied into the table's directory on HDFS.
HDFS: the data file is moved (cut) into the table's directory on HDFS.
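
One way to observe this from the Hive CLI (a sketch; /user/hive/warehouse is the default warehouse path, and the zoo database is assumed from the export example later in this article):

-- the loaded file now sits in the table directory
dfs -ls /user/hive/warehouse/zoo.db/t_user/;
-- after a load from HDFS, the source file is gone from its original location
dfs -ls /yan/hivedata/;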

2. Data export in Hive

Export to a directory on the local file system

-- OVERWRITE is required here
insert overwrite local directory '/home/hivedata/out/out1' select * from t_user;
-- on the local Linux filesystem, the last path component out1 is also a directory


Export to an HDFS directory

-- same as the local export, but without LOCAL
insert overwrite directory '/yan/hivedata/out/out1' select * from t_user;

In the exported data file, the fields are no longer separated by the original comma: the export uses Hive's default field delimiter, the non-printing \001 (^A) character, which some viewers render as an odd symbol such as a small box or bracket.
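
To keep a readable delimiter, the export can specify its own row format (a sketch reusing the comma delimiter from the original table; supported in Hive 0.11 and later):

insert overwrite local directory '/home/hivedata/out/out1'
row format delimited fields terminated by ','
select * from t_user;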
Write query results from Hive directly to a local Linux file:

hive -e 'select * from zoo.t_user' >> /home/hivedata/out/out2/02
# 02 is an empty file created beforehand
# the field delimiter in the exported file defaults to \t
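
The same idea works with a SQL script via hive -f (a sketch; the script path and output file name are made up for illustration):

hive -f /home/hivedata/export.sql > /home/hivedata/out/out2/03
# > creates or truncates the target file, while >> appends to it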


Origin: blog.csdn.net/qq_43759478/article/details/131562651