1. Background
I have recently been building a data warehouse from scratch, which requires synchronizing every table of a MySQL database to the Hive data warehouse with Sqoop. The first run is a full import, and every later run is an incremental import into Hive. To plan the incremental imports, I want to check whether each table in MySQL has an update_time field.
2. Query
1. COLUMNS table
Provides information about the columns of tables. It lists every column of each table together with the details of that column.
SELECT * FROM information_schema.`COLUMNS`;
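On a server with many schemas, SELECT * over COLUMNS returns a very large result. A narrower variant might look like the sketch below. It assumes the schema of interest is data_exchange (the name used later in this post) and picks a handful of standard COLUMNS fields that are useful when planning an import.

```sql
# Sketch: list the columns of one schema only (schema name is an assumption)
SELECT c.TABLE_NAME,
       c.COLUMN_NAME,
       c.DATA_TYPE,        # e.g. int, varchar, datetime
       c.IS_NULLABLE,      # 'YES' or 'NO'
       c.COLUMN_KEY        # 'PRI' marks primary-key columns
FROM information_schema.`COLUMNS` c
WHERE c.TABLE_SCHEMA = 'data_exchange'
ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;
```

Ordering by ORDINAL_POSITION keeps the columns in the order in which they are defined in each table.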
2. TABLES table
Provides information about the tables in the database (including views). For each table it records the schema it belongs to, the table type, the storage engine, the creation time, and other details.
SELECT * FROM information_schema.`TABLES`;
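In practice it is usually enough to look at the base tables of a single schema. The following is a minimal sketch under the assumption that the schema is data_exchange, as later in this post. Note that for InnoDB the TABLE_ROWS value is only an estimate.

```sql
# Sketch: base tables of one schema with a few descriptive fields
SELECT t.TABLE_NAME,
       t.TABLE_TYPE,       # 'BASE TABLE' or 'VIEW'
       t.ENGINE,
       t.TABLE_ROWS,       # approximate row count for InnoDB
       t.CREATE_TIME
FROM information_schema.`TABLES` t
WHERE t.TABLE_SCHEMA = 'data_exchange'
  AND t.TABLE_TYPE = 'BASE TABLE'
ORDER BY t.TABLE_NAME;
```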
3. Field query
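SELECT t.TABLE_NAME, c.COLUMN_NAME FROM information_schema.`TABLES` t
INNER JOIN information_schema.`COLUMNS` c
# Join on schema and table name so that same-named tables in other databases do not match
ON c.TABLE_SCHEMA = t.TABLE_SCHEMA AND c.TABLE_NAME = t.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'
# Keep only tables that have an update_time field
AND c.COLUMN_NAME = 'update_time'
# The database to query
AND t.TABLE_SCHEMA = 'data_exchange'
# The database also contains other tables, so filter with a fuzzy (LIKE) match
AND t.TABLE_NAME LIKE '%dwd\_jz\_0000%'
# Every row has the same TABLE_TYPE here, so sort by table name instead
ORDER BY t.TABLE_NAME;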
The query shows that the tables pushed into the data_exchange database all contain the update_time field. Since that is the case, it is easy for Sqoop to import the data incrementally.
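The field query only lists tables that do have the field, so a table without it simply does not show up. One way to double-check is the inverse query sketched below, which returns the tables that lack update_time. It reuses the schema name and name pattern from this post. The LEFT JOIN with an IS NULL filter is an illustrative choice, not the only way to write it.

```sql
# Sketch: tables in data_exchange that do NOT have an update_time field
SELECT t.TABLE_NAME
FROM information_schema.`TABLES` t
LEFT JOIN information_schema.`COLUMNS` c
       ON c.TABLE_SCHEMA = t.TABLE_SCHEMA
      AND c.TABLE_NAME = t.TABLE_NAME
      AND c.COLUMN_NAME = 'update_time'
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND t.TABLE_SCHEMA = 'data_exchange'
  AND t.TABLE_NAME LIKE '%dwd\_jz\_0000%'
  # No matching COLUMNS row means the field is missing
  AND c.COLUMN_NAME IS NULL
ORDER BY t.TABLE_NAME;
```

An empty result means every matching table has the field. Any table that is returned would need a different incremental strategy or a full re-import.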