Using DataX to import data from MySQL to MySQL

DataX is Alibaba's open-source data synchronization tool for synchronizing data between heterogeneous data sources. GitHub address: https://github.com/alibaba/DataX. Enterprises store offline data in the data warehouse, but the warehouse has no direct connection to the business systems. This practice mainly uses DataX to import data from the warehouse into MySQL, so that the data can serve the business and the outflow of data from the warehouse can be managed.

The usual way to import data from the data warehouse into MySQL is to query Hive and store the result in a file. If the amount of data is relatively large, the file is first split into multiple files by a fixed number of rows, and the files are then traversed and loaded into MySQL one by one. Although this method is simple, its disadvantage is that every import requirement needs its own job and generates temporary files each time, and MySQL load takes up a lot of resources. DataX was chosen because it can import from HDFS into MySQL directly. It is fast, supports both full and incremental imports, and can write to sharded tables, which saves a lot of implementation cost.
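For comparison, the file-splitting step of the manual approach could look roughly like the sketch below. It is only an illustration: the function name `split_by_rows`, the `part_00000.txt` naming scheme, and the assumption that the Hive export is a UTF-8 text file with one record per line are all hypothetical and not taken from the original workflow.

```python
import itertools
from pathlib import Path


def split_by_rows(src, out_dir, rows_per_file):
    """Split a text file into numbered chunk files of at most rows_per_file lines."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with open(src, "r", encoding="utf-8") as f:
        for index in itertools.count():
            # Read the next batch of lines without loading the whole file.
            lines = list(itertools.islice(f, rows_per_file))
            if not lines:
                break
            chunk = out_dir / f"part_{index:05d}.txt"
            chunk.write_text("".join(lines), encoding="utf-8")
            chunks.append(chunk)
    return chunks
```

Each chunk would then still have to be loaded into MySQL by a separate job, which is exactly the per-requirement overhead described above.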
Using DataX is very simple. Here is a demo job that imports from one MySQL database into another:

{
	"job": {
		"setting": {
			"speed": {
				"channel": 3
			},
			"errorLimit": {
				"record": 0,
				"percentage": 0.02
			}
		},
		"content": [{
			"reader": {
				"name": "mysqlreader",
				"parameter": {
					"username": "",
					"password": "",
					"column": [
						"coupon_id",
						"effective_type",
						"effective_price",
						"save_type",
						"save_price",
						"receive_way",
						"effective_way",
						"leave_code",
						"now()"
					],
					"connection": [{
						"table": [
							""
						],
						"jdbcUrl": [
							""
						]
					}]
				}
			},
			"writer": {
				"name": "mysqlwriter",
				"parameter": {
					"writeMode": "insert",
					"username": "",
					"password": "",
					"column": [
						"coupon_id",
						"effective_type",
						"effective_price",
						"save_type",
						"save_price",
						"receive_way",
						"effective_way",
						"leave_code",
						"enter_time"
					],
					"connection": [{
						"jdbcUrl": "",
						"table": [
							""
						]
					}]
				}
			}
		}]
	}
}
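In the job above, reader and writer columns are paired by position, so the reader's ninth entry `now()` ends up in the writer's `enter_time` column. A quick sanity check before launching a job might look like the following sketch. The helper name `check_columns` is hypothetical, and the check only compares column counts; it does not validate types or table schemas.

```python
def check_columns(job):
    """Return (reader_column, writer_column) pairs; raise if the counts differ."""
    pairs = []
    for item in job["job"]["content"]:
        reader_cols = item["reader"]["parameter"]["column"]
        writer_cols = item["writer"]["parameter"]["column"]
        if len(reader_cols) != len(writer_cols):
            raise ValueError(
                "reader has %d columns but writer has %d"
                % (len(reader_cols), len(writer_cols))
            )
        # Columns are matched by position, not by name.
        pairs.extend(zip(reader_cols, writer_cols))
    return pairs
```

After loading the job file with `json.load`, calling `check_columns(job)` on the demo config would pair `now()` with `enter_time` as its last entry.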


The username, password, table name, and jdbcUrl are left blank in the demo; fill them in for your own environment.
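If the blank fields are filled in by a script rather than by hand, a minimal sketch could look like this. The function `fill_connection` and the dictionary keys `username`, `password`, `table`, and `jdbc_url` are assumptions made for illustration. The one detail taken from the demo itself is that the reader's `jdbcUrl` is a list while the writer's `jdbcUrl` is a plain string.

```python
import copy


def fill_connection(job, reader, writer):
    """Return a copy of the job with credentials, tables and JDBC URLs filled in.

    reader and writer are dicts with the (hypothetical) keys
    username, password, table and jdbc_url.
    """
    filled = copy.deepcopy(job)  # leave the template untouched
    content = filled["job"]["content"][0]

    r = content["reader"]["parameter"]
    r["username"] = reader["username"]
    r["password"] = reader["password"]
    r["connection"][0]["table"] = [reader["table"]]
    r["connection"][0]["jdbcUrl"] = [reader["jdbc_url"]]  # list in the reader

    w = content["writer"]["parameter"]
    w["username"] = writer["username"]
    w["password"] = writer["password"]
    w["connection"][0]["table"] = [writer["table"]]
    w["connection"][0]["jdbcUrl"] = writer["jdbc_url"]  # string in the writer
    return filled
```

The result can be written back out with `json.dump` and passed to the launcher as a normal job file.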

Startup script: python datax.py xxxxx.json
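When the job is triggered from a scheduler instead of a shell, the same command can be wrapped in Python. This is a sketch under assumptions: it assumes `datax.py` lives in the `bin` directory of the DataX installation and that `python` on the PATH is an interpreter the launcher supports. The names `build_command` and `run_job` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(datax_home, job_file):
    """Build the argument list equivalent to: python datax.py xxxxx.json"""
    launcher = Path(datax_home) / "bin" / "datax.py"  # assumed install layout
    return ["python", str(launcher), str(job_file)]


def run_job(datax_home, job_file):
    """Run the job and return the launcher's exit code (0 is taken as success)."""
    result = subprocess.run(build_command(datax_home, job_file))
    return result.returncode
```

For example, `run_job("/opt/datax", "xxxxx.json")` would launch the demo job, where `/opt/datax` is only a placeholder path.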

If anything here is wrong, please correct me. If you have any questions, you can join the QQ group: 340297350.

 


Origin blog.csdn.net/xianpanjia4616/article/details/86749080