How Hive's CREATE TABLE statement affects data loading speed

The CREATE TABLE statement can affect how fast Hive loads data? Are you kidding?
Dear reader, don't leave just yet! Let me show you the test data.

One: typical Hive CREATE TABLE statements

 CREATE TABLE statement for a Hive external table:

create external table if not exists externaltable_test
(
        aa string,
        bb string
) partitioned by (date string)
row format delimited
fields terminated by '|'
location '/hive/table/externaltable_test/';

CREATE TABLE statement for a Hive internal (managed) table:

create table if not exists innertable_test
(
        aa string,
        bb string
) partitioned by (date string)
row format delimited
fields terminated by '|'
location '/hive/table/innertable_test/';

Two: Hive data loading speed

    Test data: about 500 MB (a copy of existing data).

Loading with a plain LOAD DATA statement:

    load data inpath '/hive/table/table_test/table_test/date=20190921' into table externaltable_test partition(date=20190921);    -- time taken: 5.288 s

    load data inpath '/hive/table/table_test/table_test/date=20190923' into table innertable_test partition(date=20190923);    -- time taken: 5.454 s

 

   At this point someone may want to flame me: an internal table has to move (cut) the data, while an external table only changes metadata, so how can the two load data at almost the same speed?

   But that is what the test shows!

Three: the secret behind slow loading into external tables

1. The role of LOCATION

      LOCATION specifies the path where an external table's data is stored, but loading data still moves (cuts) the data into that location.

      The only difference from an internal table is that dropping an external table does not delete its data.
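
One way to check where each table keeps its data, and whether Hive treats it as external or managed, is DESCRIBE FORMATTED. A minimal sketch using the two tables defined earlier; the exact output layout depends on the Hive version.

```sql
-- In the output, look for the "Location:" row (the table's storage path)
-- and the "Table Type:" row (EXTERNAL_TABLE or MANAGED_TABLE).
DESCRIBE FORMATTED externaltable_test;
DESCRIBE FORMATTED innertable_test;
```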

2. Conclusion

      If an external table is created with a LOCATION, loading data still moves (cuts) the data into that location, which makes loading slow.
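
Whether a load really moved the files can be checked from the Hive CLI with dfs commands (these are CLI commands rather than HiveQL queries). A minimal sketch, assuming the partition is stored in the default date=20190921 subdirectory under the table's LOCATION:

```sql
-- Source path used in the LOAD DATA statement;
-- after the load it should no longer list the data files.
dfs -ls /hive/table/table_test/table_test/date=20190921;

-- Partition directory under the external table's LOCATION
-- (assumed default layout); the files should now appear here.
dfs -ls /hive/table/externaltable_test/date=20190921;
```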

3. Fix and retest

      Change the external table's CREATE TABLE statement to:

     create external table if not exists externaltable_test
     (
             aa string,
             bb string
     ) partitioned by (date string)
     row format delimited 
     fields terminated by '|';

 

     Then test loading data:

     load data inpath '/hive/table/table_test/table_test/date=20190921' into table externaltable_test partition(date=20190921);    -- time taken: 0.683 s

     So when creating an external table, it is better not to specify a LOCATION; deliberately copying the data to the desired place yourself works better.
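
For reference, when LOCATION is omitted Hive chooses the table directory itself, normally under its warehouse directory. A small sketch for checking where that is; the property name below is the classic one, and newer Hive releases may use a separate setting for external tables, so treat it as an assumption for your version.

```sql
-- Print the configured warehouse directory
-- (property name is an assumption; it may differ by Hive version).
SET hive.metastore.warehouse.dir;

-- Show the full definition, including the LOCATION Hive assigned to the table.
SHOW CREATE TABLE externaltable_test;
```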

Four: a way to speed up loading into internal tables

    ALTER TABLE innertable_test ADD PARTITION(date=20190922) LOCATION '/hive/table/table_test/table_test/date=20190922';    -- time taken: 0.684 s

 

    This only adds to the internal table a link similar to the one an external table uses. When the table is dropped, the data is not removed, so you have to manage it yourself. The method works for both internal and external tables.

    The drawback is that it leaves an internal table not quite internal and an external table not quite external. Personally, I feel it is better suited to external tables.
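
To confirm that a partition added this way points at the original path instead of a directory under the table, its metadata can be inspected. A minimal sketch using the partition added above; depending on the Hive version, the column name date may need to be written in backquotes because it can be treated as a reserved word.

```sql
-- List the partitions Hive knows about for the table.
SHOW PARTITIONS innertable_test;

-- The "Location:" row shows the path this partition points to.
DESCRIBE FORMATTED innertable_test PARTITION (date=20190922);
```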

 

    ALTER TABLE externaltable_test ADD PARTITION(date=20190920) LOCATION '/hive/table/table_test/dt=20190920';    -- time taken: 6.829 s

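    This tests adding a partition to an external table that has a LOCATION. It took almost as long as the LOAD approach, even though the data was not moved. The exact cause is still being investigated.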

Origin www.cnblogs.com/wuxiaolong4/p/11665948.html