Procedure
hive> select count(*) from test;
2018-05-25 11:08:40,651 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 61.19 sec
MapReduce Total cumulative CPU time: 1 minutes 1 seconds 190 msec
Ended Job = job_1515037630689_0063
MapReduce Jobs Launched:
Stage-Stage-1: Map: 9 Reduce: 1 Cumulative CPU: 61.19 sec HDFS Read: 820348819 HDFS Write: 107 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 1 seconds 190 msec
OK
7273391
Time taken: 462.62 seconds, Fetched: 1 row(s)
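The count(*) above launches a full MapReduce job (9 mappers, roughly 460 seconds). When table statistics are kept up to date, Hive can answer such queries from the metastore instead; a sketch (settings shown are standard Hive, but whether this shortcut applies depends on the Hive version and file format):

```sql
-- Collect table-level statistics (numRows, totalSize, ...) into the metastore.
ANALYZE TABLE test COMPUTE STATISTICS;

-- With this setting enabled, simple aggregates like count(*) can be
-- answered from the stored statistics, skipping the MapReduce job.
SET hive.compute.query.using.stats=true;
SELECT count(*) FROM test;
```

Note that statistics go stale after loads done outside Hive's knowledge, so this is a convenience for quick checks rather than a guaranteed exact count.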
hive> load data inpath '/data/test/' into table test;
Loading data to table test
OK
Time taken: 7.003 seconds
hive> select count(*) from test;
MapReduce Total cumulative CPU time: 56 seconds 140 msec
Ended Job = job_1515037630689_0064
MapReduce Jobs Launched:
Stage-Stage-1: Map: 9 Reduce: 1 Cumulative CPU: 56.14 sec HDFS Read: 820348824 HDFS Write: 107 SUCCESS
Total MapReduce CPU Time Spent: 56 seconds 140 msec
OK
7273391
Time taken: 416.049 seconds, Fetched: 1 row(s)
Conclusion: loading the data again had no effect on the table's contents.
hive> load data inpath '/data/test/' overwrite into table test;
Loading data to table test
OK
Time taken: 6.97 seconds
hive> dfs -ls /data/test/;
hive>
After adding the OVERWRITE keyword, the original files under /data/test/ are gone (the dfs -ls output above is empty).
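Two points of Hive behavior help explain this transcript (hedged, since the table's storage location is not shown in the log): LOAD DATA INPATH moves files on HDFS rather than copying them, and OVERWRITE clears the target directory before the move. If the table's Location happened to be /data/test/ itself, the non-overwrite load would be a no-op and the overwrite load would simply delete everything, which matches the counts above. The location can be checked with:

```sql
-- Inspect the table's storage metadata; the "Location:" field shows
-- which HDFS directory a LOAD DATA statement will write into.
DESCRIBE FORMATTED test;
```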
hive> select count(*) from test;
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
2018-05-25 14:21:37,032 Stage-1 map = 0%, reduce = 0%
2018-05-25 14:22:13,490 Stage-1 map = 0%, reduce = 100%, Cumulative CPU 1.79 sec
MapReduce Total cumulative CPU time: 1 seconds 790 msec
Ended Job = job_1515037630689_0065
MapReduce Jobs Launched:
Stage-Stage-1: Reduce: 1 Cumulative CPU: 1.79 sec HDFS Read: 3984 HDFS Write: 101 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 790 msec
OK
0
Time taken: 129.468 seconds, Fetched: 1 row(s)
The row count has been reset to zero.
Summary: when OVERWRITE is specified, any data already present in the target directory is deleted first, so take special care when loading data.
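One simple precaution before an overwriting load is to copy the source files aside first, since LOAD DATA INPATH moves them and OVERWRITE discards the previous table contents. A sketch from the hive CLI (the backup path is hypothetical):

```sql
-- Keep a copy of the source files before the destructive load
-- (/data/test_backup/ is an assumed, not pre-existing, path).
dfs -cp /data/test/ /data/test_backup/;

load data inpath '/data/test/' overwrite into table test;
```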