[DataX] Migrating data from PostgreSQL to TDengine

Environment preparation

Running DataX requires Java 1.8 and a Python environment; both Python 2 and Python 3 are supported. I use Python 3.6 here.
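A quick pre-flight check can confirm both prerequisites are on the PATH before going further (a minimal sketch; if your Python is installed as `python` rather than `python3`, adjust the tool name):

```shell
# Pre-flight check: DataX needs a JDK (1.8) and Python available on the PATH.
for tool in java python3; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING" >&2
  fi
done
```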

Download DataX

download link

Unzip DataX

tar -zxvf datax.tar.gz

Write a DataX job

Write a job file with a .json suffix:

{
	"job": {
		"setting": {
			"speed": {
				// Transfer rate limit in bytes/s. DataX will try to reach, but not exceed, this speed.
				"byte": 10485762
			},
			// Error limits
			"errorLimit": {
				// Maximum number of failed records; the job fails once this is exceeded.
				"record": 0,
				// Maximum percentage of failed records: 1.0 means 100%, 0.02 means 2%.
				"percentage": 0.02
			}
		},
		"content": [{
			"reader": {
				"name": "postgresqlreader",
				"parameter": {
					// Database user name
					"username": "***",
					// Database password
					"password": "***",
					"column": [
						"data_time", "product_id", "type", "content", "create_time", "device_id", "id"
					],
					// Primary key used to split the read into parallel tasks
					//"splitPk": "id",
					"connection": [{
						"table": [
							"source table to migrate"
						],
						//"querySql": [
						//	"SELECT 'device_log_djzttcc_d1' AS tbname,data_time AS _ts, product_id AS productid, type , content , device_id AS deviceid from logdev_d1;"
						//],
						"jdbcUrl": [
							"jdbc:postgresql://ip:port/database_name"
						]
					}]
				}
			},
			"writer": {
				"name": "tdengine30writer",
				"parameter": {
					"username": "***",
					"password": "***",
					"column": [
						"_ts",
						"productid",
						"type",
						"content",
						"createtime",
						"deviceid",
						"id"
					],
					"connection": [{
						"table": [
							"target table"
						],
						"jdbcUrl": "jdbc:TAOS://ip:port/database_name"
					}],
					"batchSize": 100,
					"ignoreTagsUnmatched": true
				}
			}
		}]
	}
}

One thing to note: the alibaba/DataX repository only ships a TDengine 2.0 writer. To write to TDengine 3.0, you need to compile tdengine30writer from the taosdata/DataX repository and copy it into the DataX plugin directory.
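The build-and-install step can be sketched as below. The repository URL, the Maven invocation, and the assembled plugin path are assumptions based on how DataX plugins are normally packaged; check the taosdata/DataX README for the exact steps.

```shell
# Sketch: build the tdengine30writer plugin from the taosdata fork and
# copy it into an existing DataX installation. Paths and URLs are assumptions.
build_tdengine30writer() {
  datax_home=${1:?usage: build_tdengine30writer <DATAX_HOME>}
  git clone https://github.com/taosdata/DataX.git
  (cd DataX && mvn -U clean package -DskipTests)
  # The DataX assembly normally places each plugin under target/datax/plugin/...
  cp -r DataX/tdengine30writer/target/datax/plugin/writer/tdengine30writer \
        "$datax_home/plugin/writer/"
}

# Example invocation (not run here):
# build_tdengine30writer /opt/datax
```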

Terminology and points to modify

  • reader and writer name the read and write plugins. The officially supported data sources are:

| Type | Data source | Reader | Writer |
|---|---|---|---|
| RDBMS (relational databases) | MySQL | ✓ | ✓ |
| | Oracle | ✓ | ✓ |
| | OceanBase | ✓ | ✓ |
| | SQLServer | ✓ | ✓ |
| | PostgreSQL | ✓ | ✓ |
| | DRDS | ✓ | ✓ |
| | Universal RDBMS (all relational databases) | ✓ | ✓ |
| Alibaba Cloud data warehouse storage | ODPS | ✓ | ✓ |
| | ADS | | ✓ |
| | OSS | ✓ | ✓ |
| | OCS | | ✓ |
| NoSQL data stores | OTS | ✓ | ✓ |
| | Hbase0.94 | ✓ | ✓ |
| | Hbase1.1 | ✓ | ✓ |
| | Phoenix4.x | ✓ | ✓ |
| | Phoenix5.x | ✓ | ✓ |
| | MongoDB | ✓ | ✓ |
| | Hive | ✓ | ✓ |
| | Cassandra | ✓ | ✓ |
| Unstructured data stores | TxtFile | ✓ | ✓ |
| | FTP | ✓ | ✓ |
| | HDFS | ✓ | ✓ |
| | Elasticsearch | | ✓ |
| Time-series databases | OpenTSDB | ✓ | |
| | TSDB | ✓ | ✓ |
| | TDengine2.0 | ✓ | ✓ |
| | TDengine3.0 | ✓ | ✓ |
  • connection specifies the table(s) to read or write and the JDBC URL of the database.
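When columns need renaming or filtering during migration, the reader's connection can use querySql instead of table plus column, as the commented-out section of the job file hints. A minimal sketch of that variant (the table name is a placeholder; the SELECT aliases must line up with the writer's column list):

```json
"connection": [{
	// querySql replaces "table" and the reader-level "column" list;
	// each alias must match an entry in the writer's "column".
	"querySql": [
		"SELECT data_time AS _ts, product_id AS productid, type, content, create_time AS createtime, device_id AS deviceid, id FROM source_table"
	],
	"jdbcUrl": [
		"jdbc:postgresql://ip:port/database_name"
	]
}]
```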

Execute the script

Run DataX with the job file you wrote:

python datax.py {job file}.json

Origin blog.csdn.net/baidu_29609961/article/details/131830309