Internal and external tables in Hive

1. The difference between internal and external tables

  1. Internal (managed) table: Loading data moves it into Hive's own warehouse directory on HDFS. When the table is dropped, both the metadata and the data files are deleted.
  2. External table: Data is not moved into Hive's warehouse directory; it stays at its original HDFS location. When the table is dropped, only the table definition (metadata) is deleted and the data files remain. External tables are therefore safer, because dropping a table by mistake does not delete the data files.
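
The difference can be sketched in HiveQL. This is a minimal illustration rather than a definitive setup: the table names (`demo_managed`, `demo_external`), the columns, and the HDFS paths are all hypothetical.

```sql
-- Managed (internal) table: Hive stores the data under its warehouse directory.
CREATE TABLE IF NOT EXISTS demo_managed (
  id   BIGINT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- LOAD DATA INPATH moves the HDFS file into the table's directory.
-- '/tmp/demo/input.tsv' is an assumed path.
LOAD DATA INPATH '/tmp/demo/input.tsv' INTO TABLE demo_managed;

-- External table: Hive only records where the data lives.
-- '/data/demo/external' is an assumed path that already holds the files.
CREATE EXTERNAL TABLE IF NOT EXISTS demo_external (
  id   BIGINT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/demo/external';

-- Dropping the managed table removes the metadata and the data files.
DROP TABLE IF EXISTS demo_managed;

-- Dropping the external table removes only the metadata;
-- the files under '/data/demo/external' are left in place.
DROP TABLE IF EXISTS demo_external;
```

Running `DESCRIBE FORMATTED` on a table shows its `Table Type` (`MANAGED_TABLE` or `EXTERNAL_TABLE`), which is a quick way to check which kind a table is before dropping it.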

2. Usage scenarios

  1. Internal table: Intermediate tables and result tables used in statistical analysis are a good fit for internal tables, because this data does not need to be shared. In many cases only the last 3 days of data need to be kept in a partitioned result table. With an external table, dropping a partition would not delete its data files, so the expired data would have to be cleaned up separately.
  2. External table: Nginx logs and event-tracking logs collected every day. This log data is gathered in real time by a collection program, and once deleted by mistake it is very troublesome to restore. External tables also make it easier to share data.
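
The two scenarios above might look like the following in HiveQL. This is a sketch under stated assumptions: the table names (`result_daily`, `access_log_raw`), the columns, the `dt` partition key, the dates, and the HDFS paths are hypothetical.

```sql
-- Scenario 1: partitioned result table as a managed table.
CREATE TABLE IF NOT EXISTS result_daily (
  metric STRING,
  value  BIGINT
)
PARTITIONED BY (dt STRING);

-- Retention: dropping an expired partition of a managed table
-- also deletes that partition's data files.
ALTER TABLE result_daily DROP IF EXISTS PARTITION (dt = '2024-01-01');

-- Scenario 2: raw logs written by a collector, exposed as an external table.
-- '/data/logs/access' is an assumed directory owned by the collection program.
CREATE EXTERNAL TABLE IF NOT EXISTS access_log_raw (
  line STRING
)
PARTITIONED BY (dt STRING)
LOCATION '/data/logs/access';

-- Register one day's directory as a partition (assumed layout).
ALTER TABLE access_log_raw ADD IF NOT EXISTS PARTITION (dt = '2024-01-01')
LOCATION '/data/logs/access/2024-01-01';

-- Dropping the partition removes only the metadata entry;
-- the log files under that directory remain on HDFS.
ALTER TABLE access_log_raw DROP IF EXISTS PARTITION (dt = '2024-01-01');
```

Because the external table only points at the log directory, other jobs or tools can read the same files without going through Hive.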


Origin blog.csdn.net/Cxf2018/article/details/109285388