Alibaba Open Source Project: Alibaba's de-Oracle data migration and synchronization tool

Background

Around 2008, Alibaba began experimenting with MySQL and developed Cobar/TDDL (now the Alibaba Cloud DRDS product), products built on MySQL sharding (splitting databases and tables). They solved the scalability problem that single-machine Oracle could not handle. That period also saw a wave of de-IOE projects, and the yugong project was born out of it.

 

Project Introduction

Name: yugong

Meaning: The Foolish Old Man Moves the Mountains

Language: pure Java

Positioning: database migration (currently mainly supports Oracle -> MySQL/DRDS)

 

 

Project Introduction

The entire data migration process is divided into two parts:

  1. Full migration
  2. Incremental migration

Process description:

  1. Start incremental data collection (create an incremental materialized view for each Oracle table)
  2. Perform the full copy
  3. Perform incremental replication (data verification can run in parallel)
  4. Stop writes to the source database and switch over to the new database

Architecture 


 

Notes:

  1. One JVM container hosts multiple instances, and each instance handles the migration task for one table
  2. Each instance consists of three parts:
    a. extractor (extracts data from the source database; has full and incremental implementations)
    b. translator (transforms the source data according to the needs of the target database)
    c. applier (writes the data to the target database; has full, incremental, and comparison implementations)
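
The three-part structure of an instance can be sketched as a small pipeline. This is only an illustration under stated assumptions: the names InstanceSketch, Extractor, Translator, Applier, and run are hypothetical and are not yugong's actual classes or API, and a row is modeled simply as a map from column name to value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of one migration instance; not yugong's real API. */
final class InstanceSketch {

    /** Pulls rows from the source; an empty list means the source is exhausted. */
    interface Extractor {
        List<Map<String, Object>> extract();
    }

    /** Converts one source row into the shape the target expects. */
    interface Translator {
        Map<String, Object> translate(Map<String, Object> row);
    }

    /** Writes a batch of translated rows to the target. */
    interface Applier {
        void apply(List<Map<String, Object>> rows);
    }

    /** Runs extract -> translate -> apply until the extractor returns an empty batch. */
    static int run(Extractor extractor, Translator translator, Applier applier) {
        int total = 0;
        List<Map<String, Object>> batch;
        while (!(batch = extractor.extract()).isEmpty()) {
            List<Map<String, Object>> translated = new ArrayList<>(batch.size());
            for (Map<String, Object> row : batch) {
                translated.add(translator.translate(row));
            }
            applier.apply(translated);
            total += translated.size();
        }
        return total; // number of rows handed to the applier
    }
}
```

Keeping the three roles behind separate interfaces means a full or an incremental extractor, or a comparing applier, could be swapped in without touching the loop, which matches the full/incremental/comparison split described in the notes above.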

Design

Full migration scheme

Full migration schemes commonly used in the industry include:

  1. Data file import/export, such as EXPDP/IMPDP, mysqldump/source, xtrabackup, etc.
  2. ETL-style data import/export, which mainly relies on the JDBC query interface

When yugong was first designed, flexibility and customizability for de-IOE data migration were the main considerations, so the final choice was to traverse the data through the JDBC interface.

Compared with data file import/export, its advantages are:

  • Flexible data synchronization
  • Support for heterogeneous data
  • Relatively simple to implement

Disadvantages:

  • The full pull must be combined with incremental replication, so some data will be synchronized more than once
  • Performance and impact: a one-shot full pull that runs for a long time while the database is changing heavily can cause the segment to grow too large
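
One common way to traverse a table through a query interface is to walk it in primary-key order, one bounded batch at a time. The sketch below simulates that idea in memory so it runs without a database; it is an assumption about how such a traversal could look, not yugong's actual implementation. FullCopySketch, nextBatch, and copyAll are hypothetical names, the source table is modeled as a NavigableMap keyed by a numeric primary key, and the SQL in the comment uses a made-up table t with columns id and payload.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;

/** Hypothetical sketch of a batched full copy in primary-key order. */
final class FullCopySketch {

    /**
     * Returns up to batchSize rows whose key is greater than lastKey, in key order.
     * Against Oracle over JDBC this step could be a query of the form:
     *   SELECT * FROM (SELECT id, payload FROM t WHERE id > ? ORDER BY id)
     *   WHERE ROWNUM <= ?
     */
    static Map<Long, String> nextBatch(NavigableMap<Long, String> source,
                                       long lastKey, int batchSize) {
        Map<Long, String> batch = new LinkedHashMap<>();
        for (Map.Entry<Long, String> e : source.tailMap(lastKey, false).entrySet()) {
            if (batch.size() == batchSize) {
                break;
            }
            batch.put(e.getKey(), e.getValue());
        }
        return batch;
    }

    /** Copies every row into target and returns the number of batches read. */
    static int copyAll(NavigableMap<Long, String> source,
                       Map<Long, String> target, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long lastKey = Long.MIN_VALUE; // assumes real keys are greater than this
        int batches = 0;
        while (true) {
            Map<Long, String> batch = nextBatch(source, lastKey, batchSize);
            if (batch.isEmpty()) {
                return batches;
            }
            for (Map.Entry<Long, String> e : batch.entrySet()) {
                target.put(e.getKey(), e.getValue()); // overwrite if already present
                lastKey = e.getKey();                 // resume point for the next batch
            }
            batches++;
        }
    }
}
```

Because each batch is a short, bounded read and rows are written as overwrites, the copy can resume from the last key seen, and running it again does not change the target.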

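Incremental scheme

Incremental schemes commonly used in the industry include:

  1. Scheduled dumps based on timestamps
  2. Oracle log files, e.g. LogMiner, OGG
  3. Oracle CDC (Change Data Capture)
  4. Oracle trigger mechanisms, e.g. DataBus, SymmetricDS
  5. Oracle materialized views
  6. ...

When yugong was first designed, the considerations were flexibility for de-IOE data migration, support for multiple Oracle versions, and a lower operational cost for DBAs, so Oracle materialized views were chosen as the incremental scheme.

Compared with the others, the materialized view scheme has these advantages:

  • Simple principle, easy to understand and learn; users can think of it as a fixed, simplified trigger model
  • Simple operations: after a one-time account grant by the DBA, the program can create a materialized view table on demand to subscribe to incremental changes
  • Relatively transparent: unlike timestamp-based SQL scans, it does not depend on the table design, and there is no need to care about the Oracle version, server storage, and so on

Disadvantages:

  • Performance and impact: like a trigger mechanism, it imposes some performance overhead on writes to the source database.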
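
On the consuming side, incremental replication comes down to replaying captured changes against the target. The sketch below is a minimal, hypothetical model of that replay: IncrementalApplySketch, Op, and Change are made-up names, the change record does not reflect the layout Oracle uses to store materialized view changes, and the target is an in-memory map keyed by primary key.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of replaying captured changes onto a target table. */
final class IncrementalApplySketch {

    enum Op { INSERT, UPDATE, DELETE }

    /** One captured change: the operation, the primary key, and the new value (null for DELETE). */
    static final class Change {
        final Op op;
        final long key;
        final String value;

        Change(Op op, long key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    /** Applies changes in order; INSERT and UPDATE are both written as upserts. */
    static void apply(List<Change> changes, Map<Long, String> target) {
        for (Change c : changes) {
            if (c.op == Op.DELETE) {
                target.remove(c.key);       // no-op if the row is already gone
            } else {
                target.put(c.key, c.value); // insert or overwrite
            }
        }
    }
}
```

Treating inserts and updates as upserts and deletes as no-ops on missing rows makes the replay idempotent. That is one way to tolerate the overlap between the full copy and incremental capture, where some rows are synchronized more than once.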

QuickStart

For a quick start, see the QuickStart page.

AdminGuide

For the admin deployment guide, see the AdminGuide page.

Performance

For yugong performance figures, see the Performance page.

 

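Related resources

  1. A brief introduction to yugong (slides): ppt
  2. Distributed Relational Database Service DRDS (evolved from Alibaba's Cobar/TDDL; the basic principle is MySQL sharding, i.e. splitting databases and tables)

Feedback

  1. QQ group: 537157866
  2. Email: [email protected]
  3. Sina Weibo: agapple0002
  4. Report an issue: issues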
