One: Hive CREATE TABLE statements
Hive external table CREATE TABLE statement:
create external table if not exists externaltable_test
(
aa string,
bb string
) partitioned by (date string)
row format delimited
fields terminated by '|'
location '/hive/table/externaltable_test/';
Hive internal (managed) table CREATE TABLE statement:
create table if not exists innertable_test
(
aa string,
bb string
) partitioned by (date string)
row format delimited
fields terminated by '|'
location '/hive/table/innertable_test/';
Two: Hive data loading speed
The test data is about 500 MB, a copy of the data from the
Normal loading with load data:
load data inpath '/hive/table/table_test/table_test/date=20190921' into table externaltable_test partition(date=20190921); -- time taken: 5.288 seconds
load data inpath '/hive/table/table_test/table_test/date=20190923' into table innertable_test partition(date=20190923); -- time taken: 5.454 seconds
Someone may flame me at this point: loading into an internal table has to cut (move) the data, while loading into an external table should only change metadata.
So how can the external and internal tables load data at almost the same speed?
But that is what the measurements show!
Three: the secret behind slow external table loading
1. The role of location
location specifies the path where the external table's data is stored, but when data is loaded, it is still cut (moved) into that location.
The difference from an internal table is that dropping an external table does not delete its data.
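To check which location and table type Hive has recorded for each table, the table metadata can be inspected. This is a minimal sketch using the two test tables defined above; the exact output layout may vary between Hive versions.

```sql
-- Print the full metadata for each test table.
-- In the output, look for the "Location:" row (where the data lives)
-- and the "Table Type:" row (EXTERNAL_TABLE vs. MANAGED_TABLE).
DESCRIBE FORMATTED externaltable_test;
DESCRIBE FORMATTED innertable_test;
```

Comparing the two "Location:" values against the paths in the create statements is a quick way to confirm where a later load will place the files.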
2. Conclusions
If an external table is created with a location, then loading data still cuts (moves) the data into that location, which makes loading slow.
3. Fix and retest
Change the external table's CREATE TABLE statement to:
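One way to observe the move is to list the source and target directories around the load. The sketch below assumes the statements are run from the Hive CLI, which passes dfs commands through to HDFS, and that the partition directory under the table location follows the default date=... naming; both are assumptions rather than something stated in the test above.

```sql
-- Before the load: the source directory should contain the data files.
dfs -ls /hive/table/table_test/table_test/date=20190921;

load data inpath '/hive/table/table_test/table_test/date=20190921' into table externaltable_test partition(date=20190921);

-- After the load: the files should no longer be in the source directory...
dfs -ls /hive/table/table_test/table_test/date=20190921;
-- ...and should appear under the table's location instead
-- (partition directory name assumed to be date=20190921).
dfs -ls /hive/table/externaltable_test/date=20190921;
```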
create external table if not exists externaltable_test
(
aa string,
bb string
) partitioned by (date string)
row format delimited
fields terminated by '|';
Then test loading data:
load data inpath '/hive/table/table_test/table_test/date=20190921' into table externaltable_test partition(date=20190921); -- time taken: 0.683 seconds
So, whether or not you specify a location when creating an external table, it is better to copy the data to the intended location yourself.
Four: a way to speed up loading into internal tables
ALTER TABLE innertable_test ADD PARTITION(date=20190922) LOCATION '/hive/table/table_test/table_test/date=20190922'; -- time taken: 0.684 seconds
This only adds a link to the table, similar to how an external table works. When the table is dropped the data is not deleted, so you have to manage it yourself. This approach can be used with both internal and external tables.
It does make an internal table no longer quite an internal table, and an external table no longer quite an external table. Personally, I feel it fits external tables a little better.
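To confirm that a partition added this way points at the original directory rather than at a copy under the table's own path, the partition metadata can be checked. A minimal sketch, reusing the partition from the ALTER TABLE statement above; on Hive versions that treat date as a reserved keyword, the column name may need backtick quoting.

```sql
-- List the partitions registered for the internal table.
SHOW PARTITIONS innertable_test;

-- Show the metadata of the newly added partition; the "Location:" row
-- should be the path given in the ADD PARTITION ... LOCATION clause.
DESCRIBE FORMATTED innertable_test PARTITION (date=20190922);
```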
ALTER TABLE externaltable_test ADD PARTITION(date=20190920) LOCATION '/hive/table/table_test/dt=20190920'; -- time taken: 6.829 seconds
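This is a test of loading data into the external table that has a location. It took almost as long as the load approach, yet the data was not cut (moved) over. The exact cause is still being investigated.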