Hive: the difference between internal and external tables, and between partitioned and bucketed tables

The difference between internal and external tables:

  • Dropping an internal (managed) table deletes both the table metadata and the data
  • Dropping an external table deletes only the metadata; the data is left in place
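
The two drop behaviors can be sketched with a toy model. This is only an illustration and not Hive's API: the `metastore` dictionary, the `create_table` and `drop_table` helpers, and the table names are all hypothetical, and a temporary local directory stands in for HDFS.

```python
import shutil
import tempfile
from pathlib import Path

# Toy stand-in for the Hive metastore: table name -> (kind, data directory).
# All names here are hypothetical; this is not Hive's API.
metastore = {}

def create_table(name, kind, location):
    """Register a table; kind is 'managed' (internal) or 'external'."""
    metastore[name] = (kind, Path(location))

def drop_table(name):
    """Always remove the metadata; remove the data only for managed tables."""
    kind, location = metastore.pop(name)
    if kind == "managed":
        shutil.rmtree(location)

root = Path(tempfile.mkdtemp())  # stands in for an HDFS root
for name, kind in [("logs_managed", "managed"), ("logs_external", "external")]:
    data_dir = root / name
    data_dir.mkdir()
    (data_dir / "part-00000").write_text("1,alice\n")
    create_table(name, kind, data_dir)

drop_table("logs_managed")
drop_table("logs_external")

# After both drops the metadata is gone, but only the external data survives.
managed_data_left = (root / "logs_managed").exists()
external_data_left = (root / "logs_external").exists()
print(managed_data_left, external_data_left)
```

In this sketch both entries disappear from `metastore`, while only the directory of the external table is still on disk afterwards.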

Choosing between internal and external tables:

  • In most cases the difference between the two is not obvious. If all data processing is performed in Hive, an internal table is preferred; if Hive
    and other tools need to process the same data set, an external table is more suitable.

  • A common pattern is to use an external table to access the initial data stored on HDFS, then use Hive to transform that data and store the result in an internal table
  • External tables also fit the case where one data set has several different schemas (that is, several ways of describing its organization and structure)
  • The comparison above shows that Hive only provides a new abstraction over the data stored on HDFS; it does not manage that data itself.
    Therefore, whether you create an internal table or an external table, you can add or delete data files directly in the table's storage directory.
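
The idea of several schemas over one data set, applied only when the data is read, can be illustrated with a small sketch. It is a toy model under stated assumptions: the sample rows, the column names, and the `read_with_schema` helper are hypothetical, and a string stands in for the files in a table's storage directory.

```python
import csv
import io

# Raw file contents as they might sit in a table's storage directory.
raw = "1,alice,2021-02-19\n2,bob,2021-02-20\n"

def read_with_schema(text, columns):
    """Apply a schema at read time: name the leading fields, ignore the rest."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        rows.append(dict(zip(columns, fields)))
    return rows

# Two hypothetical external tables declared over the same files.
users_view = read_with_schema(raw, ["id", "name"])
events_view = read_with_schema(raw, ["id", "name", "event_date"])

# Data added to the directory later is simply picked up on the next read.
raw += "3,carol,2021-02-21\n"
users_after_add = read_with_schema(raw, ["id", "name"])

print(users_view)
print(events_view)
print(len(users_after_add))
```

The underlying data never changes between the two views; only the schema laid over it does, which is why adding or removing files in the directory is reflected by the next query.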

The difference between partitioned tables and bucketed tables:

  • A Hive table can be partitioned on certain fields, which gives finer-grained data management and can make some queries faster. Tables and partitions can in turn be divided into
    buckets. The principle of bucketing is similar to that of HashPartitioner in MapReduce programming.
  • Partitioning and bucketing both refine data management, but partitions are added and distinguished manually. Because Hive is schema-on-read, the
    data added to a partition is not validated against the schema. The data in a bucketed table is split into multiple files by hashing the bucket field, so the accuracy of the data is much higher.
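
The two placement rules can be sketched side by side. This is a simplified illustration, not Hive's implementation: the sample rows, the `country` partition column, the `user_id` bucket column, and the bucket count are hypothetical, and the sketch assumes an integer key hashes to itself, whereas Hive's actual hash depends on the column type and the Hive version.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count

rows = [
    {"user_id": 7, "country": "US"},
    {"user_id": 12, "country": "CN"},
    {"user_id": 15, "country": "US"},
    {"user_id": 20, "country": "CN"},
]

def partition_dir(row):
    """Partitioning: the partition column's value picks a subdirectory."""
    return f"country={row['country']}"

def bucket_id(row, num_buckets=NUM_BUCKETS):
    """Bucketing: hash the bucket column, then take it modulo the bucket count.

    Assumption: an integer key hashes to itself; real Hive's hash differs
    by column type and version.
    """
    return (row["user_id"] & 0x7FFFFFFF) % num_buckets

# Group the rows by (partition directory, bucket number).
layout = defaultdict(list)
for row in rows:
    layout[(partition_dir(row), bucket_id(row))].append(row["user_id"])

for (part, bucket), ids in sorted(layout.items()):
    print(f"{part}/bucket_{bucket}: {ids}")
```

The partition is chosen by the value itself, so whoever loads the data decides where it goes; the bucket is computed from the hash, so a row's file follows mechanically from its bucket field.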

Origin blog.csdn.net/weixin_44703894/article/details/113863076