During an earlier round of changes to the Hive warehouse, a field was added to a Hive table for a temporary need. After a few days of thought, I decided the business did not actually need this field. Note that in a Hive partitioned table, an ADD COLUMNS statement needs CASCADE; otherwise, when you query a given day's partition, the new column's data will not be found for the partitions that already existed:
alter table ods_teach_online_coursewares ADD COLUMNS (ccdl_begtime string COMMENT 'Teaching start time') CASCADE;
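For contrast, a minimal sketch of what happens without CASCADE (same table and column as above; this describes Hive's default RESTRICT behavior for ALTER TABLE on a partitioned table):

```sql
-- Without CASCADE (i.e. the default, RESTRICT), only the table-level
-- schema and newly created partitions pick up the new column; partitions
-- that already exist keep their old column list, so ccdl_begtime reads
-- back as NULL for those days even after data is loaded into them again.
alter table ods_teach_online_coursewares
  ADD COLUMNS (ccdl_begtime string COMMENT 'Teaching start time');
```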
The main topic here is the reverse situation: columns were added to a partitioned table, turned out not to be needed, and now have to be removed. There are two ways to handle this:
Method 1 (my usual way, via SQL):
For example, suppose table 1 needs column 1 and column 2 removed. First create a copy of the table with columns 1 and 2 left out, then run the following on the Hive command line:
set hive.exec.dynamic.partition.mode=nonstrict;  -- this must be set first

When the select names columns explicitly, the partition column day must also be listed explicitly, as the last column:

insert overwrite table ods_teach_online_coursewares_bak partition(day)
select province_id, province_name, city_id, city_name, county_id, county_name,
       school_id, school_name, grade, class_id, class_name, subject_id, subject_name,
       book_id, book_name, unit_id, unit_name, ccl_coursewares_id, coursewares_name,
       is_collect, pid, courseware_creator, creator_name, creator_icon,
       courseware_owner, owner_name, owner_icon, ccl_id, ccl_begtime, ccl_endtime,
       duration, ccdl_type, resource_count, ccl_type, day
from ods_teach_online_coursewares
distribute by day;
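One caveat: depending on the Hive version, dynamic partitioning itself may also need to be switched on (older releases default it to off). A hedged sketch of the two settings typically paired together before such an insert:

```sql
set hive.exec.dynamic.partition=true;            -- enable dynamic partition inserts (off by default on old Hive)
set hive.exec.dynamic.partition.mode=nonstrict;  -- allow all partition columns to be dynamic
```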
If you are copying every column of the table, rather than only a subset, the statement can be written as:

insert overwrite table tmp_test partition(day)
select * from dm_login_class_user_count_distribution_semester
distribute by day;
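Once the _bak copy is verified, you would normally swap it into place. A hedged sketch of that final step, using the table names from the example above (assumes the original table's data is no longer needed once the swap is confirmed):

```sql
-- Sanity-check the copy first: per-partition row counts should match the source.
select day, count(*) from ods_teach_online_coursewares_bak group by day;

-- Then swap the copy into place. Renames are cheap metadata operations,
-- but dropping the old table later is irreversible.
alter table ods_teach_online_coursewares rename to ods_teach_online_coursewares_old;
alter table ods_teach_online_coursewares_bak rename to ods_teach_online_coursewares;
```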
Method 2: use a combination of the hadoop fs -cp command and Hive's MSCK REPAIR command.

1 Create the target table:
  create table tmp_test1 like dm_login_class_user_count_distribution_semester;
2 Copy the original table's HDFS data into the target table's HDFS directory:
  hadoop fs -cp hdfs://Galaxy/user/hive/warehouse/dev_treasury.db/dm_login_class_user_count_distribution_semester/* hdfs://Galaxy/user/hive/warehouse/dev_treasury.db/tmp_test1/
3 Enter the Hive environment and run:
  MSCK REPAIR TABLE tmp_test1;
4 Verify that the data is loaded (the query should go against the new table tmp_test1, not the original):
  > select * from tmp_test1 where day='2016-12-12' limit 1;
  OK
  2016-12-12  4  3301  0  EDUCATION_STAFF  769  896  0  2016-12-12