Hive: creating an external table and associating it with HDFS files

  • Hadoop and Hive are already installed, and Hive stores its metadata in a MySQL database. Only the association between external tables and HDFS is discussed here. Dropping an external table removes only its metadata and has no effect on the files on HDFS.
  1. Create a partition directory in HDFS and save the data files in it. The date is used as the partition, and a data file looks like this:
-rw-r--r--   3 bigdata supergroup      16031 2018-02-07 09:40 /user/bigdata/dataflowpre/20180207/10.37.167.204_25_1_2018020709.1517967113614

The format of the data in the file is as follows:

2;Lily;1991;Shanghai
3;Jack;1992;Guangxi
4;Jenny;1999;Xinjiang
5;Jay;1995;Xizang
6;Tom;1990;Beijing
7;Lily;1991;Shanghai
8;Jack;1992;Guangxi
9;Jenny;1999;Xinjiang
10;Jay;1995;Xizang
  2. Create a Hive external table according to the data format of the source file on HDFS. The CREATE statement here does not use a LOCATION clause to associate the table with the HDFS files directly.
create external table t4 (seq int,name string,year int, city string) partitioned by (day string) row format delimited fields terminated by '\073' stored as textfile;

The column separator in the source file is a semicolon, and the semicolon is a special symbol in Hive (the statement terminator). Using a literal semicolon in the table creation statement causes an error, while using the semicolon's ASCII code in octal, '\073', works. Checking the table structure of t4 shows that the table has been created successfully.
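The structure check mentioned above can be sketched with standard Hive statements. This is a minimal example, not the author's exact commands. The output depends on the environment, so none is shown.

```sql
-- Show the columns and the partition key (day) of t4
describe t4;

-- Show detailed metadata; for an external table the Table Type field
-- should read EXTERNAL_TABLE
describe formatted t4;

-- Show the DDL Hive stored for the table, including the field delimiter
show create table t4;
```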

  3. Associate the HDFS files. Hive does not automatically associate the partition directories under a given HDFS directory, so this has to be done manually. Adding a partition to the already created table associates it with the files on HDFS. Syntax:
alter table <table_name> add partition (<partition_column1>=<partition_value1>,...) location '<path of the files on HDFS>';

Associate the partition for 20180207:

alter table t4 add partition (day=20180207) location '/user/bigdata/dataflowpre/20180207';
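To confirm that the partition was registered, one option is to ask the metastore directly. This is a minimal sketch using standard Hive statements. It quotes the partition value as a string because the day column is declared as string.

```sql
-- List the partitions registered for t4; day=20180207 should appear
-- once the ALTER TABLE statement has succeeded
show partitions t4;

-- Show the metadata of that partition, including the HDFS Location
-- it points to
describe formatted t4 partition (day='20180207');
```

Because the directory is named 20180207 rather than day=20180207, automatic discovery with MSCK REPAIR TABLE would most likely not pick it up, which is consistent with adding the partition explicitly as above.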

Run the following query to view the table. The data from the HDFS files is now visible through it.

select * from t4;

In the query result, the first column (seq) is null. The reason is not yet clear and will be investigated later. As a next step, create a new table in which the seq column is declared as string.
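The follow-up check could look like the sketch below. The table name t4_str is hypothetical, and everything except the type of seq is copied from the definition of t4. For text-backed tables Hive returns NULL when a field cannot be converted to the declared column type, so reading the field as a string shows the raw value. This only helps to narrow down the cause and does not establish it.

```sql
-- Hypothetical table t4_str: same layout as t4, but seq is read as a string
create external table t4_str (seq string, name string, year int, city string)
partitioned by (day string)
row format delimited fields terminated by '\073'
stored as textfile;

-- Point the new table's partition at the same HDFS directory used for t4
alter table t4_str add partition (day='20180207')
location '/user/bigdata/dataflowpre/20180207';

-- Inspect the raw first field; a length larger than the visible digits
-- would hint at hidden characters in front of the number
select seq, length(seq) from t4_str where day = '20180207' limit 10;
```

Since both tables are external, pointing their partitions at the same directory does not copy or modify the underlying files.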

