[123] 05_Hive external partition table score4 operation

【123】

Walk through the operations on the external partitioned table score4, and explain why msck repair table score4 needs to be executed at the end.

【answer】

- Enter the Hive environment

[hadoop@node03 ~]$ hive

 

- Switch to the db_hive database (it was created in advance with: create database db_hive;)

hive (default)> use db_hive;

 

- Create an external partition table score4

hive (db_hive)> create external table score4(s_id string, c_id string, s_score int) partitioned by (month string) row format delimited fields terminated by '\t';

 

- Check whether the table is an external table (look for Table Type: EXTERNAL_TABLE in the output)

hive (db_hive)> desc formatted db_hive.score4;

- Upload the score.csv file to the /app/hivedatas/ directory on node03

- Create the directory and give it to the hadoop user

# mkdir -p /app/hivedatas/

# chown hadoop:hadoop /app/hivedatas

 

- Upload the file with an upload tool. After uploading, check whether the file is owned by the hadoop user; if it is not, change the ownership in the same way as above
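If no score.csv is at hand, a small sample file can be generated instead of uploaded. The following is a minimal sketch: the three rows are made up, and DATA_DIR is an assumption that defaults to a scratch directory so the commands can be tried anywhere; set DATA_DIR=/app/hivedatas on node03 to match the steps above.

```shell
# Assumption: DATA_DIR is a hypothetical variable; point it at /app/hivedatas on node03.
DATA_DIR="${DATA_DIR:-/tmp/hivedatas}"
mkdir -p "$DATA_DIR"

# Columns are s_id, c_id, s_score, separated by a tab to match
# "fields terminated by '\t'" in the table definition.
# printf reuses the format string for every group of three arguments.
printf '%s\t%s\t%s\n' \
  01 01 80 \
  01 02 90 \
  02 01 70 > "$DATA_DIR/score.csv"

# Quick check: count the rows that were written.
wc -l < "$DATA_DIR/score.csv"
```

The file extension is .csv, but the separator has to be a tab, otherwise all three columns end up as NULL or in the first column when the table is queried.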

 

- Load the local file into the score4 table in the db_hive database

hive (db_hive)> load data local inpath '/app/hivedatas/score.csv' into table db_hive.score4 partition(month=202012);

 

- View the table data (the contents are not important; you can make up a few rows yourself)

hive (db_hive)> SELECT * FROM score4;

 

- Check whether the partition exists. When data is loaded this way, Hive registers the partition in the metadata, so it shows up directly

hive (db_hive)> show partitions score4;

 

- Dropping an external table does not actually delete its files in HDFS. You can try dropping the table and creating it again, then use the msck repair table score4 command to synchronize the data and the metadata.

- Execute msck repair table score4

hive (db_hive)> msck repair table score4;
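The drop-and-recreate experiment described above can be sketched as a script for the hive CLI. This is an illustration under assumptions, not a definitive procedure: the HQL_FILE path is hypothetical, and the sketch assumes the table was created without a LOCATION clause, so the recreated table points at the same default directory as the old one.

```shell
# Assumption: the script path is hypothetical; any writable location works.
HQL_FILE="${HQL_FILE:-/tmp/recreate_score4.hql}"

# The quoted 'EOF' keeps '\t' literal so that Hive, not the shell, interprets it.
cat > "$HQL_FILE" <<'EOF'
use db_hive;
-- Dropping an external table removes only its metadata; the HDFS files stay.
drop table if exists score4;
-- Recreate the table with the same definition as before.
create external table score4(s_id string, c_id string, s_score int)
partitioned by (month string)
row format delimited fields terminated by '\t';
-- The new table has no partitions registered yet.
show partitions score4;
-- Scan the table directory and register directories such as month=202012.
msck repair table score4;
show partitions score4;
select * from score4;
EOF

# Run the script only where the hive CLI is installed (for example on node03).
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL_FILE"
else
  echo "hive CLI not found; statements saved to $HQL_FILE"
fi
```

Comparing the two show partitions results is the point of the exercise: the first should be empty, and the second should list the partition again once the repair has run.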

 

 

 

Why do you need to execute msck repair table score4?

1. First of all, you need to understand what Hive data is and where it is stored.

Hive data is divided into table data and metadata.

Table data: the data in a Hive table is stored as files in HDFS (Hadoop Distributed File System). Each table generally has its own directory in HDFS under Hive's warehouse directory; external tables are the exception, since their data can live in a location outside it.

Metadata: this stores the table name, the table's columns, partitions, and properties (such as whether it is an external table), the directory where the table's data is located, and so on. It is generally kept in a relational database; in this configuration it is written to the metastore database in MySQL.

 

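2. The role of the msck repair table score4 command

Its main role is to find partition data that exists in HDFS but has not been registered as a partition in the metadata, add the corresponding partitions, and in this way synchronize the metadata with the data. In this exercise, the recreated score4 table has no partition information even though the month=202012 directory still exists in HDFS, so a query returns no rows until the partition is registered again.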
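The idea behind the repair can be shown with a toy model. This is only a sketch and not how Hive implements the command: a local directory stands in for the table's HDFS directory, a text file stands in for the partition list in the metastore, and the month=202011 partition is a made-up example.

```shell
# Toy model only: nothing here talks to Hive or HDFS.
WORK="$(mktemp -d)"
TABLE_DIR="$WORK/score4"            # stands in for the table directory in HDFS
METASTORE="$WORK/partitions.txt"    # stands in for the metastore's partition list

# Two partition directories exist in the "file system", but only one is registered.
mkdir -p "$TABLE_DIR/month=202011" "$TABLE_DIR/month=202012"
echo "month=202011" > "$METASTORE"

# "Repair": register every partition directory that the metadata does not list yet.
for dir in "$TABLE_DIR"/month=*; do
  part="$(basename "$dir")"
  if ! grep -qxF "$part" "$METASTORE"; then
    echo "$part" >> "$METASTORE"
    echo "added partition: $part"
  fi
done

# The partition list now matches the directories on disk.
cat "$METASTORE"
```

The data files themselves are never touched; only the list of known partitions changes, which is why the command is safe to run after recreating an external table.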


Origin blog.csdn.net/debimeng/article/details/111088901