When a data warehouse table is deleted by mistake

In day-to-day work, deleting a table by mistake is unlikely, but it does happen sooner or later, and it is the last thing anyone wants. So what should you do when it happens? There are two cases:

1- The table is still there, but its data was deleted by mistake

2- Everything is gone: both the table structure and the data

 

First case

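a- Copy the data from the HDFS trash (recycle bin) back to the table. Deleted files are kept under the deleting user's .Trash/Current directory, with the original path preserved below it; older trash checkpoints sit in timestamped directories next to Current.

hdfs dfs -cp /user/USER_NAME/.Trash/Current/user/hive/warehouse/db_name.db/table_name /user/hive/....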

b- Run the load data command

load data inpath 'path...' into table table_name [partition(xxx)]

If the table still cannot be queried from Zeppelin or other platforms after copying the data back and loading it, try creating a temporary table first, loading the data into the temporary table, and then running insert overwrite on the original table with a select from the temporary table.
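
The temporary-table workaround might look roughly like this in HiveQL. This is only a sketch for an unpartitioned table; table_name_tmp and the staging path /tmp/recovered/table_name are hypothetical placeholders, not names from the original environment.

```sql
-- Hypothetical names: adjust db_name, table_name and the path to your environment.

-- 1. Create an empty temporary table with the same schema as the original.
CREATE TABLE db_name.table_name_tmp LIKE db_name.table_name;

-- 2. Load the recovered files into the temporary table.
--    Note: LOAD DATA INPATH moves the files out of the source directory.
LOAD DATA INPATH '/tmp/recovered/table_name' INTO TABLE db_name.table_name_tmp;

-- 3. Rewrite the original table from the temporary one.
INSERT OVERWRITE TABLE db_name.table_name
SELECT * FROM db_name.table_name_tmp;

-- 4. Drop the temporary table once the original table has been checked.
DROP TABLE db_name.table_name_tmp;
```

For a partitioned table, steps 2 and 3 would additionally need a partition(...) clause or dynamic partitioning.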

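Note that this recovery relies entirely on the HDFS trash mechanism being turned on, and it is off by default: fs.trash.interval in core-site.xml defaults to 0, which disables the trash. The value is the number of minutes deleted files are kept, so even with the trash on, recovery only works within that window. If you are the one maintaining HDFS, you can look up how to turn it on (a quick Baidu search will do).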

 

Second case

I ran into the second case a few days ago: both the table structure and the data were gone. Because it was a DW-layer table, the data could be regenerated by rerunning the Azkaban task, but with the table itself gone, the table structure had to be restored first.

At first I wanted to piece the create table statement together bit by bit from the SQL that generates the data. But since the data-generating statement already exists, there is a shortcut: running create table tmp_table_name as followed by the data-generating statement produces a table with the same structure. After that it's simple: show create table tmp_table_name gives the create table statement, and swapping the table name back yields the statement for the deleted table.
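
As a rough sketch of that shortcut: the select below is a made-up generating query (ods.orders, user_id and order_cnt are hypothetical), standing in for whatever SQL the Azkaban task actually runs.

```sql
-- Hypothetical generating query; substitute the real SQL from the scheduled job.
CREATE TABLE tmp_table_name AS
SELECT user_id,
       count(*) AS order_cnt
FROM ods.orders
GROUP BY user_id;

-- Print the DDL of the temporary table. Replace tmp_table_name with the
-- original table name before running the statement to recreate the table.
SHOW CREATE TABLE tmp_table_name;

-- Remove the temporary table when done.
DROP TABLE tmp_table_name;
```

The printed DDL also carries the LOCATION and TBLPROPERTIES of the temporary table, which should be removed or adjusted. Column comments, and possibly the partitioning and storage format of the original table, are not recovered this way and have to be added back by hand.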

Origin blog.csdn.net/weixin_39445556/article/details/108143605