1.Description environnementale
1.1 Serveur SQLS source
Version | IP | port |
---|---|---|
Microsoft SQL Server 2017 | 192.168.140.160 | 1433 |
1.2 Cibler GreatSQL
Version | IP | port |
---|---|---|
GreatSQL-8.0.32 | 192.168.139.86 | 3308 |
2. Environnement d'installation
2.1 Installer l'environnement SQLServer
Description de l'environnement : Démarrer la base de données à l'aide d'un miroir avec Docker
2.1.1Installer le menu fixe
1. Installez les progiciels de base
$ yum install -y wget net-tools nfs-utils lrzsz gcc gcc-c++ make cmake libxml2-devel openssl-devel curl curl-devel unzip sudo ntp libaio-devel wget vim ncurses-devel autoconf automake zlib-devel python-devel epel-release openssh-server socat ipvsadm conntrack yum-utils
2. Configurer la source yum domestique docker-ce (Alibaba Cloud)
$ yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
3. Installez les packages de dépendances Docker
$ yum install -y device-mapper-persistent-data lvm2
4.Installer docker-ce
$ yum install docker-ce -y
5. Démarrez le conteneur
$ systemctl start docker && systemctl enable docker
2.1.2 Extraire l'image
$ docker pull mcr.microsoft.com/mssql/server:2017-latest
2.1.3 Exécution du conteneur
$ docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=********" \
-p 1433:1433 --name sqlserver2017 \
-d mcr.microsoft.com/mssql/server:2017-latest
N'oubliez pas de définir ici le mot de passe sur un mot de passe complexe
Explication des paramètres :
-
-e "ACCEPT_EULA=Y" : Sélectionnez la licence du contrat par défaut
-
-e "SA_PASSWORD=********" : définissez le mot de passe de connexion. Le mot de passe ne peut pas être trop court ou trop simple, sinon il ne répondra pas à la spécification du mot de passe sqlserver et le conteneur cessera de fonctionner.
-
-p 1433:1433 : le port hôte est mappé au port du conteneur (le premier est l'hôte)
-
--name sqlserver2017 : alias du conteneur
-
-d : exécuté en arrière-plan
-
mcr.microsoft.com/mssql/server:2017-latest:nom de l'image:étiquette
2.1.4 Utilisation de la base de données
1. Entrez dans le conteneur
$ docker exec -it sqlserver2017 bash
2. Connectez-vous à la base de données
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "********"
3. Interrogez la base de données
1> select name from sys.Databases;
2> go
4.Créer une base de données
1> create database testdb;
2> go
5. Créez un tableau et insérez des données
use testdb
create table t1(id int)
go
Insert into t1 values(1),(2)
go
2.2 Installer l'environnement GreatSQL
Utilisez l'image Docker pour installer, extrayez simplement l'image GreatSQL directement
$ docker pull greatsql/greatsql
et créer un conteneur GreatSQL
$ docker run -d --name greatsql --hostname=greatsql -e MYSQL_ALLOW_EMPTY_PASSWORD=1 greatsql/greatsql
2.3 Installer datax
L'installation de DataX nécessite des environnements dépendants
-
JDK (1.8 ou supérieur, 1.8 recommandé)
-
Python (Python2.6.X et supérieur recommandé)
Étapes d'installation : décompresser et utiliser. Cependant, si vous exécutez le travail sans effectuer d'autres opérations après la décompression, une erreur sera signalée.
$ cd /soft
$ ll
total 3764708
-rw-r--r-- 1 root root 853734462 Dec 9 04:06 datax.tar.gz
$ tar xf datax.tar.gz
$ python /soft/datax/bin/datax.py /soft/datax/job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-07-19 11:19:17.483 [main] WARN ConfigParser - 插件[streamreader,streamwriter]加载失败,1s后重试... Exception:Code:[Common-00], Describe:[您提供的配置文件存在错误信息,请检查您的作业配置 .] - 配置信息错误,您提供的配置文件[/soft/datax/plugin/reader/._mysqlreader/plugin.json]不存在. 请检查您的配置文件.
2023-07-19 11:19:18.488 [main] ERROR Engine -
经DataX智能分析,该任务最可能的错误原因是:
com.alibaba.datax.common.exception.DataXException: Code:[Common-00], Describe:[您提供的配置文件存在错误信息,请检查您的作业配置 .] - 配置信息错误,您提供的配置文件[/soft/datax/plugin/reader/._mysqlreader/plugin.json]不存在. 请检查您的配置文件.
at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)
at com.alibaba.datax.common.util.Configuration.from(Configuration.java:95)
at com.alibaba.datax.core.util.ConfigParser.parseOnePluginConfig(ConfigParser.java:153)
at com.alibaba.datax.core.util.ConfigParser.parsePluginConfig(ConfigParser.java:125)
at com.alibaba.datax.core.util.ConfigParser.parse(ConfigParser.java:63)
at com.alibaba.datax.core.Engine.entry(Engine.java:137)
at com.alibaba.datax.core.Engine.main(Engine.java:204)
Pour résoudre l'erreur : supprimez tous les fichiers commençant par ._ dans le répertoire plugin et les répertoires plugin/reader et plugin/writer.
Besoin de supprimer les fichiers cachés dans trois répertoires
- brancher/
- plugin/lecteur/
- plugin/écrivain/
$ rm -rf /opt/app/datax/plugin/._*
$ rm -rf /opt/app/datax/plugin/reader/._*
$ rm -rf /opt/app/datax/plugin/writer/._*
Exécutez un scénario de test pour vérifier si datax est installé avec succès
$ python /soft/datax/bin/datax.py /soft/datax/job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-07-19 11:22:12.298 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-07-19 11:22:12.305 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.251-b08
jvmInfo: Linux amd64 4.19.25-200.1.el7.bclinux.x86_64
cpu num: 48
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-07-19 11:22:12.320 [main] INFO Engine -
{"content":[{"reader":{"name":"streamreader",
"parameter":{"column":[
{"type":"string","value":"DataX"},
{"type":"long","value":19890604},
{"type":"date","value":"1989-06-04 00:00:00"},
{"type":"bool","value":true},
{"type":"bytes","value":"test"}
],"sliceRecordCount":100000}
},"writer":{"name":"streamwriter","parameter":{"encoding":"UTF-8","print":false}}}],
"setting":{"errorLimit":{"percentage":0.02,"record":0},
"speed":{"byte":10485760}}}
2023-07-19 11:22:12.336 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-07-19 11:22:12.337 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-07-19 11:22:12.338 [main] INFO JobContainer - DataX jobContainer starts job.
2023-07-19 11:22:12.339 [main] INFO JobContainer - Set jobId = 0
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do prepare work .
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-07-19 11:22:12.353 [job-0] INFO JobContainer - Job set Max-Byte-Speed to 10485760 bytes.
2023-07-19 11:22:12.354 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2023-07-19 11:22:12.354 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2023-07-19 11:22:12.371 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-07-19 11:22:12.375 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-07-19 11:22:12.376 [job-0] INFO JobContainer - Running by standalone Mode.
2023-07-19 11:22:12.384 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-07-19 11:22:12.388 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-07-19 11:22:12.388 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-07-19 11:22:12.396 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-07-19 11:22:12.697 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[302]ms
2023-07-19 11:22:12.698 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-07-19 11:22:22.402 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.033s | Percentage 100.00%
2023-07-19 11:22:22.402 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-07-19 11:22:22.402 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do post work.
2023-07-19 11:22:22.403 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do post work.
2023-07-19 11:22:22.403 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-07-19 11:22:22.403 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-07-19 11:22:22.404 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
PS Scavenge | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
2023-07-19 11:22:22.404 [job-0] INFO JobContainer - PerfTrace not enable!
2023-07-19 11:22:22.404 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.033s | Percentage 100.00%
2023-07-19 11:22:22.406 [job-0] INFO JobContainer -
任务启动时刻 : 2023-07-19 11:22:12
任务结束时刻 : 2023-07-19 11:22:22
任务总计耗时 : 10s
任务平均流量 : 253.91KB/s
记录写入速度 : 10000rec/s
读出记录总数 : 100000
读写失败总数 : 0
3. Migration complète de SQLServer2GreatSQL
3.1 Créer des données de test à partir de la source (SQLServer)
$ docker exec -it 47bd0ed79c26 /bin/bash
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "********"
1> create database testdb
1> use testdb
1> insert into t1 values(1),(2),(3);
2> go
1> select * from t1;
2> go
id
\-----------
1
2
3
3.2 Créer une structure de table côté cible (GreatSQL)
greatsql> create database testdb;
greatsql> use testdb;
greatsql> create table t1 (id int primary key);
3.3 Écrire le fichier de travail Datax
$ cat /soft/datax/job/sqlserver_to_greatsql.json
{
"job": {
"content": [
{
"reader": {
"name": "sqlserverreader",
"parameter": {
"connection": [
{
"jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table": ["t1"]
}
],
"password": "********",
"username": "SA",
"column": ["*"]
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": "jdbc:mysql://10.17.139.86:3308/testdb",
"table": ["t1"]
}
],
"password": "******",
"session": [],
"username": "admin",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
3.4 Exécuter la tâche de migration Datax
$ python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_greatsql.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-11-28 09:58:44.087 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-28 09:58:44.104 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.181-b13
jvmInfo: Linux amd64 3.10.0-957.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-11-28 09:58:44.137 [main] INFO Engine -
{
"content":[
{"reader":{
"name":"sqlserverreader",
"parameter":{
"column":["*"],
"connection":[
{"jdbcUrl":["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table":["t1"]}],
"password":"*************",
"username":"SA"}},
"writer":{"name":"mysqlwriter","parameter":{"column":["*"],
"connection":[{"jdbcUrl":"jdbc:mysql://10.17.139.86:3308/testdb",
"table":["t1"]}],
"password":"********",
"session":[],
"username":"admin",
"writeMode":"insert"}}}],
"setting":{"speed":{"channel":"5"}}}
2023-11-28 09:58:44.176 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-11-28 09:58:44.179 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-11-28 09:58:44.180 [main] INFO JobContainer - DataX jobContainer starts job.
2023-11-28 09:58:44.183 [main] INFO JobContainer - Set jobId = 0
2023-11-28 09:58:44.542 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb.
2023-11-28 09:58:44.544 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置存在一定的风险. 因为您未配置读取数据库表的列,当您的表字段个数、类型有变动时,可能影响任务正确性甚至会运行出错。请检查您的配置并作出修改.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2023-11-28 09:58:45.099 [job-0] INFO OriginalConfPretreatmentUtil - table:[t1] all columns:[id].
2023-11-28 09:58:45.099 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置信息存在风险. 因为您配置的写入数据库表的列为*,当您的表字段个数、类型有变动时,可能影响任务正确性甚至会运行出错。请检查您的配置并作出修改.
2023-11-28 09:58:45.102 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id) VALUES(?)
], which jdbcUrl like:[jdbc:mysql://10..17.139.86:16310/testdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2023-11-28 09:58:45.103 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-11-28 09:58:45.103 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do prepare work
2023-11-28 09:58:45.104 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-28 09:58:45.104 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-11-28 09:58:45.105 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2023-11-28 09:58:45.112 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] splits to [1] tasks.
2023-11-28 09:58:45.114 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-28 09:58:45.135 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-11-28 09:58:45.139 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-11-28 09:58:45.142 [job-0] INFO JobContainer - Running by standalone Mode.
2023-11-28 09:58:45.151 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-28 09:58:45.157 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-28 09:58:45.158 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-28 09:58:45.173 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-28 09:58:45.181 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from t1
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 09:58:45.398 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from t1
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 09:58:45.454 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[284]ms
2023-11-28 09:58:45.455 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-28 09:58:55.175 [job-0] INFO StandAloneJobContainerCommunicator - Total 3 records, 3 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 09:58:55.175 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-11-28 09:58:55.175 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-28 09:58:55.176 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do post work.
2023-11-28 09:58:55.176 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-11-28 09:58:55.176 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-11-28 09:58:55.177 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.061s | 0.061s | 0.061s
PS Scavenge | 1 | 1 | 1 | 0.039s | 0.039s | 0.039s
2023-11-28 09:58:55.177 [job-0] INFO JobContainer - PerfTrace not enable!
2023-11-28 09:58:55.177 [job-0] INFO StandAloneJobContainerCommunicator - Total 3 records, 3 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 09:58:55.179 [job-0] INFO JobContainer -
任务启动时刻 : 2023-11-28 09:58:44
任务结束时刻 : 2023-11-28 09:58:55
任务总计耗时 : 10s
任务平均流量 : 0B/s
记录写入速度 : 0rec/s
读出记录总数 : 3
读写失败总数 : 0
3.5 Vérifier les données à l'extrémité cible
greatsql> select * from t1;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
3 rows in set (0.01 sec)
4. Migration incrémentielle de SQLServer vers GreatSQL
4.1 Créer des données de test côté source (SQLServer)
2> create table t2 (id int,createtime datetime);
3> go
1> insert into t2 values(1,GETDATE());
2> g
(1 rows affected)
1> insert into t2 values(2,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(3,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(4,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(5,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(6,GETDATE());
2> go
(1 rows affected)
1> select * from t2;
2> go
id createtime
---------- -----------------------
1 2023-11-28 02:18:20.790
2 2023-11-28 02:18:27.040
3 2023-11-28 02:18:32.103
4 2023-11-28 02:18:37.690
5 2023-11-28 02:18:41.450
6 2023-11-28 02:18:46.330
4.2 Écrire le fichier de travail de migration complet de Datax
$ cat sqlserver_to_greatsql_inc.json
{
"job": {
"content": [
{
"reader": {
"name": "sqlserverreader",
"parameter": {
"connection": [
{
"jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table": ["t2"]
}
],
"password": "********",
"username": "SA",
"column": ["*"]
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": "jdbc:mysql://10.17.139.86:3308/testdb",
"table": ["t2"]
}
],
"password": "!QAZ2wsx",
"session": [],
"username": "admin",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
4.3 Exécuter la tâche de migration complète Datax
$ python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_greatsql_inc.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-11-28 10:19:59.279 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-28 10:19:59.286 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.181-b13
jvmInfo: Linux amd64 3.10.0-957.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-11-28 10:19:59.302 [main] INFO Engine -
{"content":[{"reader":{"name":"sqlserverreader","parameter":{"column":[
"*"],"connection":[{"jdbcUrl":["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table":["t2"]}],"password":"*************","username":"SA"}},
"writer":{"name":"mysqlwriter","parameter":{"column":["*"],
"connection":[{"jdbcUrl":"jdbc:mysql://10..17.139.86:16310/testdb","table":["t2"]}],
"password":"********",
"session":[],
"username":"admin",
"writeMode":"insert"}}}],
"setting":{"speed":{"channel":"5"}}}
2023-11-28 10:19:59.319 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-11-28 10:19:59.321 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-11-28 10:19:59.321 [main] INFO JobContainer - DataX jobContainer starts job.
2023-11-28 10:19:59.324 [main] INFO JobContainer - Set jobId = 0
2023-11-28 10:19:59.629 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb.
2023-11-28 10:19:59.630 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置存在一定的风险. 因为您未配置读取数据库表的列,当您的表字段个数、类型有变动时,可能影响任务正确性甚至会运行出错。请检查您的配置并作出修改.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2023-11-28 10:20:00.027 [job-0] INFO OriginalConfPretreatmentUtil - table:[t2] all columns:[
id,createtime].
2023-11-28 10:20:00.027 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置信息存在风险. 因为您配置的写入数据库表的列为*,当您的表字段个数、类型有变动时,可能影响任务正确性甚至会运行出错。请检查您的配置并作出修改.
2023-11-28 10:20:00.029 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id,createtime) VALUES(?,?)
], which jdbcUrl like:[jdbc:mysql://10..17.139.86:16310/testdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2023-11-28 10:20:00.030 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-11-28 10:20:00.031 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do prepare work .
2023-11-28 10:20:00.031 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-28 10:20:00.032 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-11-28 10:20:00.032 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2023-11-28 10:20:00.037 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] splits to [1] tasks.
2023-11-28 10:20:00.038 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-28 10:20:00.060 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-11-28 10:20:00.063 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-11-28 10:20:00.066 [job-0] INFO JobContainer - Running by standalone Mode.
2023-11-28 10:20:00.073 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-28 10:20:00.080 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-28 10:20:00.080 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-28 10:20:00.093 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-28 10:20:00.101 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from t2
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:20:00.262 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from t2
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:20:00.334 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[243]ms
2023-11-28 10:20:00.335 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-28 10:20:10.087 [job-0] INFO StandAloneJobContainerCommunicator - Total 6 records, 54 bytes | Speed 5B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:20:10.088 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-11-28 10:20:10.088 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-28 10:20:10.089 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do post work.
2023-11-28 10:20:10.090 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-11-28 10:20:10.091 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-11-28 10:20:10.094 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.034s | 0.034s | 0.034s
PS Scavenge | 1 | 1 | 1 | 0.031s | 0.031s | 0.031s
2023-11-28 10:20:10.094 [job-0] INFO JobContainer - PerfTrace not enable!
2023-11-28 10:20:10.095 [job-0] INFO StandAloneJobContainerCommunicator - Total 6 records, 54 bytes | Speed 5B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:20:10.097 [job-0] INFO JobContainer -
任务启动时刻 : 2023-11-28 10:19:59
任务结束时刻 : 2023-11-28 10:20:10
任务总计耗时 : 10s
任务平均流量 : 5B/s
记录写入速度 : 0rec/s
读出记录总数 : 6
读写失败总数 : 0
4.4 Vérifier toutes les données migrées
greatsql> select * from t2;
+----+---------------------+
| id | createtime |
+----+---------------------+
| 1 | 2023-11-28 02:18:21 |
| 2 | 2023-11-28 02:18:27 |
| 3 | 2023-11-28 02:18:32 |
| 4 | 2023-11-28 02:18:38 |
| 5 | 2023-11-28 02:18:41 |
| 6 | 2023-11-28 02:18:46 |
+----+---------------------+
Vous pouvez également utiliser la table de somme de contrôle x pour vérifier. Pour les grandes tables, vous ne pouvez pas sélectionner la table entière *.
4.5 Insérer des données incrémentielles à partir de la source (SQLServer)
2> insert into t2 values(7,'202311-28 03:18:46.330');
3> go
Changed database context to 'jem_db'.
(1 rows affected)
1> insert into t2 values(8,'2023-11-28 03:20:46.330');
2> go
(1 rows affected)
1> insert into t2 values(9,'2023-11-28 03:25:46.330');
2> go
(1 rows affected)
1> insert into t2 values(10,'2023-11-28 03:30:46.330');
2> go
(1 rows affected)
1> select * from t2;
2> go
id createtime
----------- -----------------------
1 2023-11-28 02:18:20.790
2 2023-11-28 02:18:27.040
3 2023-11-28 02:18:32.103
4 2023-11-28 02:18:37.690
5 2023-11-28 02:18:41.450
6 2023-11-28 02:18:46.330
7 2023-11-28 03:18:46.330
8 2023-11-28 03:20:46.330
9 2023-11-28 03:25:46.330
10 2023-11-28 03:30:46.330
4.6 Écrire le fichier de travail de migration incrémentielle Datax
$ cat sqlserver_to_greatsql_inc.json
{
"job": {
"content": [
{
"reader": {
"name": "sqlserverreader",
"parameter": {
"connection": [
{
"jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table": ["t2"]
}
],
"password": "********",
"username": "SA",
"column": ["*"],
"where":"createtime > '${start_time}' and createtime < '${end_time}'"
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": "jdbc:mysql://10..17.139.86:16310/testdb",
"table": ["t2"]
}
],
"password": "!QAZ2wsx",
"session": [],
"username": "admin",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
4.7 Exécuter la tâche de migration incrémentielle Datax
$ python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_mysql_inc.json -p "-Dstart_time='2023-11-28 03:17:46.330' -Dend_time='2023-11-28 03:31:46.330'"
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-11-28 10:29:24.492 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-28 10:29:24.504 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.181-b13
jvmInfo: Linux amd64 3.10.0-957.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-11-28 10:29:24.524 [main] INFO Engine -
{"content":[{"reader":{"name":"sqlserverreader","parameter":{"column":["*"],
"connection":[{"jdbcUrl":["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table":["t2"]}],"password":"*************","username":"SA",
"where":"createtime > '2023-11-28 03:17:46.330' and createtime < '2023-11-28 03:31:46.330'"}},
"writer":{"name":"mysqlwriter","parameter":{"column":["*"],"connection":[{"jdbcUrl":"jdbc:mysql://10..17.139.86:16310/testdb","table":["t2"]}],
"password":"********",
"session":[],
"username":"admin",
"writeMode":"insert"}}}],
"setting":{"speed":{"channel":"5"}}}
2023-11-28 10:29:24.542 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-11-28 10:29:24.544 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-11-28 10:29:24.544 [main] INFO JobContainer - DataX jobContainer starts job.
2023-11-28 10:29:24.546 [main] INFO JobContainer - Set jobId = 0
2023-11-28 10:29:24.830 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb.
2023-11-28 10:29:24.831 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置存在一定的风险. 因为您未配置读取数据库表的列,当您的表字段个数、类型有变动时,可能影响任务正确性甚至会运行出错。请检查您的配置并作出修改.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2023-11-28 10:29:25.113 [job-0] INFO OriginalConfPretreatmentUtil - table:[t2] all columns:[id,createtime].
2023-11-28 10:29:25.113 [job-0] WARN OriginalConfPretreatmentUtil - 您的配置文件中的列配置信息存在风险. 因为您配置的写入数据库表的列为*,当您的表字段个数、类型有变动时,可能影响任务正确性甚至会运行出错。请检查您的配置并作出修改.
2023-11-28 10:29:25.115 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id,createtime) VALUES(?,?)
], which jdbcUrl like:[jdbc:mysql://10..17.139.86:16310/testdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2023-11-28 10:29:25.116 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-11-28 10:29:25.117 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do prepare work .
2023-11-28 10:29:25.117 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-28 10:29:25.118 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-11-28 10:29:25.118 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2023-11-28 10:29:25.123 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] splits to [1] tasks.
2023-11-28 10:29:25.124 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-28 10:29:25.146 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-11-28 10:29:25.150 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-11-28 10:29:25.153 [job-0] INFO JobContainer - Running by standalone Mode.
2023-11-28 10:29:25.159 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-28 10:29:25.165 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-28 10:29:25.165 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-28 10:29:25.176 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-28 10:29:25.183 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from t2 where (createtime > '2023-11-28 03:17:46.330' and createtime < '2023-11-28 03:31:46.330')
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:29:25.344 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from t2 where (createtime > '2023-11-28 03:17:46.330' and createtime < '2023-11-28 03:31:46.330')
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:29:25.606 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[431]ms
2023-11-28 10:29:25.607 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-28 10:29:35.173 [job-0] INFO StandAloneJobContainerCommunicator - Total 4 records, 37 bytes | Speed 3B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:29:35.173 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-11-28 10:29:35.174 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-28 10:29:35.175 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do post work.
2023-11-28 10:29:35.175 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-11-28 10:29:35.177 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-11-28 10:29:35.179 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.052s | 0.052s | 0.052s
PS Scavenge | 1 | 1 | 1 | 0.024s | 0.024s | 0.024s
2023-11-28 10:29:35.180 [job-0] INFO JobContainer - PerfTrace not enable!
2023-11-28 10:29:35.181 [job-0] INFO StandAloneJobContainerCommunicator - Total 4 records, 37 bytes | Speed 3B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:29:35.183 [job-0] INFO JobContainer -
任务启动时刻 : 2023-11-28 10:29:24
任务结束时刻 : 2023-11-28 10:29:35
任务总计耗时 : 10s
任务平均流量 : 3B/s
记录写入速度 : 0rec/s
读出记录总数 : 4
读写失败总数 : 0
4.8 Vérifier les données incrémentielles sur la cible (GreatSQL)
greatsql> select * from t2;
+----+---------------------+
| id | createtime |
+----+---------------------+
| 1 | 2023-11-28 02:18:21 |
| 2 | 2023-11-28 02:18:27 |
| 3 | 2023-11-28 02:18:32 |
| 4 | 2023-11-28 02:18:38 |
| 5 | 2023-11-28 02:18:41 |
| 6 | 2023-11-28 02:18:46 |
| 7 | 2023-11-28 03:18:46 |
| 8 | 2023-11-28 03:20:46 |
| 9 | 2023-11-28 03:25:46 |
| 10 | 2023-11-28 03:30:46 |
+----+---------------------+
10 rows in set (0.00 sec)
Résumé de la migration incrémentielle : l'objectif de la migration incrémentielle est atteint en ajoutant des conditions de filtre. Principalement grâce aux conditions de filtrage, la totalité des données migrées est filtrée, puis la migration incrémentielle est effectuée de manière déguisée.
Profitez de GreatSQL :)
À propos de GreatSQL
GreatSQL est une base de données open source nationale indépendante adaptée aux applications financières. Elle possède de nombreuses fonctionnalités de base telles que des performances élevées, une fiabilité élevée, une grande facilité d'utilisation et une sécurité élevée. Elle peut être utilisée en remplacement facultatif de MySQL ou du serveur Percona. et est utilisé dans des environnements de production en ligne, entièrement gratuit et compatible avec MySQL ou Percona Server.
Liens connexes : Communauté GreatSQL Gitee GitHub Bilibili
Communauté GreatSQL :
Suggestions et commentaires de récompenses de la communauté : https://greatsql.cn/thread-54-1-1.html
Détails de la soumission primée du blog communautaire : https://greatsql.cn/thread-100-1-1.html
(Si vous avez des questions sur l'article ou si vous avez des idées uniques, vous pouvez accéder au site Web officiel de la communauté pour les poser ou les partager ~)
Groupe d'échange technique :
Groupe WeChat et QQ :
Groupe QQ : 533341697
Groupe WeChat : ajoutez GreatSQL Community Assistant (WeChat ID : wanlidbc
) comme ami et attendez que l'assistant de communauté vous ajoute au groupe.