[Big Data] The architecture and practice of synchronizing Meituan DB data to the data warehouse

1. Background

In data warehouse modeling, the raw business-layer data that has not undergone any processing is called ODS (Operational Data Store) data. In Internet companies, common ODS data includes business log data (Log) and business DB data (DB). For business DB data, collecting business data from relational databases such as MySQL and then importing it into Hive is an important step in data warehouse production.

How can MySQL data be synchronized to Hive accurately and efficiently? A commonly used solution is batch fetching and Loading: directly connect to MySQL, Select the data in the table, save it to a local file as intermediate storage, and finally Load the file into the Hive table (a minimal sketch of this flow follows the list below). The advantage of this solution is that it is simple to implement, but as the business develops, its shortcomings are gradually exposed:

  • Performance bottleneck: As the business scale grows, the Select From MySQL → Save to Local File → Load to Hive data flow takes longer and longer and cannot meet the time requirements of downstream data warehouse production.
  • Selecting a large amount of data directly from MySQL puts heavy pressure on MySQL, easily causing slow queries and affecting normal services on the business line.
  • Since Hive's syntax does not support SQL primitives such as Update and Delete, it cannot properly handle data that is Updated or Deleted in MySQL.
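For illustration only, here is a minimal sketch of this legacy flow, assuming a hypothetical business table orders and a hypothetical target Hive table ods.orders; the actual tables and file paths are not given in the original text.

```sql
-- Step 1 (MySQL): dump the whole table to a file on the MySQL server host.
-- On a large table this query is slow and puts heavy load on MySQL.
SELECT id, user_id, amount, update_time
  INTO OUTFILE '/tmp/orders_dump.tsv'
  FIELDS TERMINATED BY '\t'
FROM orders;

-- (the dump file is then copied to the machine running the Hive client)

-- Step 2 (Hive): load the intermediate file into the target ODS table.
LOAD DATA LOCAL INPATH '/tmp/orders_dump.tsv'
OVERWRITE INTO TABLE ods.orders;
```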

In order to solve these problems completely, we gradually turned to a CDC (Change Data Capture) + Merge technical solution, that is, real-time Binlog collection plus offline processing of Binlog to restore the business data. Binlog is the binary log of MySQL, which records all data changes that occur in MySQL; the master-slave synchronization of a MySQL cluster is itself based on Binlog.

This article mainly introduces how to achieve accurate and efficient entry of DB data into the data warehouse from two aspects: real-time collection of Binlog, and offline processing of Binlog to restore business data.

2. Overall architecture

[Figure: overall architecture of Binlog-based MySQL-to-Hive synchronization]
The overall architecture is shown in the figure above. For real-time collection of Binlog, we adopted Alibaba's open source project Canal, which is responsible for pulling Binlog from MySQL in real time and performing the necessary parsing. After collection, the Binlog is temporarily stored in Kafka for downstream consumption. The real-time collection part is shown by the red arrows in the figure.

For the offline processing of Binlog, as shown by the black arrows in the figure, a MySQL table is restored on Hive through the following steps:

  • LinkedIn's open source project Camus is used to pull the Binlog data from Kafka onto Hive every hour.
  • For each ODS table, a one-time snapshot (Snapshot) first needs to be made to read the existing data in MySQL into Hive; the underlying implementation is a direct connection to MySQL to Select the data.
  • For each ODS table, a Merge is performed every day based on the existing stock data and the Binlog generated incrementally that day, to restore the business data.

Looking back at the problems of the batch fetch-and-Load solution introduced in the background, why can this solution solve them?

  • First, Binlog is generated in a streaming manner. Through real-time collection of Binlog, part of the data processing load is spread from a once-a-day batch job into a real-time stream, bringing significant improvements in both performance and the access pressure on MySQL.
  • Second, Binlog itself records the type of each data change (Insert/Update/Delete), so with some semantic processing, accurate restoration of the data can be achieved.

3. Binlog real-time collection

Real-time collection of Binlog includes two main modules:

  • The first is CanalManager, which is mainly responsible for allocating collection tasks, monitoring and alerting, metadata management, and integration with external dependent systems.
  • The second is CanalClient, which actually performs the Canal collection tasks.

[Figure: Binlog real-time collection architecture]
When a user submits a Binlog collection request for a certain DB, CanalManager first calls the relevant interfaces of the DBA platform to obtain information about the MySQL instance where the DB is located, in order to select the machine most suitable for Binlog collection. It then distributes the collection instance (Canal Instance) to an appropriate Canal server, that is, a CanalServer. When selecting a specific CanalServer, CanalManager considers factors such as load balancing and cross-datacenter transmission, giving priority to machines with lower load and located in the same region.

After CanalServer receives the collection request, it will register the collection information on ZooKeeper. Registration content includes:

  • A persistent node named after the Instance.
  • An ephemeral node named after the CanalServer's own ip:port, registered under the persistent node.

This serves two purposes:

  • High availability: When CanalManager distributes an Instance, it selects two CanalServers, one as the Running node and the other as the Standby node. The Standby node monitors the Instance; when the Running node fails, its ephemeral node disappears and the Standby node takes over, achieving failover for disaster recovery.
  • Interaction with CanalClient: Once a CanalClient detects which CanalServer is Running the Instance it is responsible for, it connects to that server and receives the Binlog data it sends.

Binlog subscription is at the granularity of a MySQL DB, and one DB's Binlog corresponds to one Kafka Topic. In the underlying implementation, all subscribed DBs under the same MySQL instance are processed by the same Canal Instance, because Binlog is generated at the granularity of the MySQL instance. The CanalServer discards Binlog data for unsubscribed DBs, and the CanalClient then distributes the received Binlog to Kafka at DB granularity.

4. Restore MySQL data offline

After Binlog collection is complete, the next step is to use the Binlog to restore the business data. The first problem to solve is synchronizing the Binlog from Kafka to Hive.

[Figure: offline processing of Binlog, from Kafka to Hive]

5. Kafka2Hive

The entire Kafka2Hive task is managed under the ETL framework of the Meituan Data Platform, including the expression of task primitives and the scheduling mechanism, which are similar to other ETL tasks. The underlying layer uses LinkedIn's open source project Camus, with targeted secondary development to perform the actual Kafka2Hive data transfer.

6. Secondary development of Camus

The Binlog stored on Kafka does not have a Schema, but the Hive table must have a Schema, and its partitions, fields, etc. must be designed to facilitate efficient downstream consumption. The first modification to Camus is to parse the Binlog on Kafka into a format that conforms to the target Schema.

The second modification to Camus is determined by Meituan's ETL framework. In our task scheduling system, upstream/downstream dependencies are currently analyzed only for tasks in the same scheduling queue; dependencies cannot be established across scheduling queues. In the overall MySQL2Hive process, the Kafka2Hive task needs to run once every hour (hourly queue), while the Merge task runs once a day (daily queue), and the start of the Merge task must strictly depend on the completion of the hourly Kafka2Hive tasks.

To solve this problem, we introduced the Checkdone task. Checkdone is a daily task, mainly responsible for detecting whether the previous day's Kafka2Hive tasks completed successfully. If they did, the Checkdone task succeeds, so that the downstream Merge task can be started correctly.

7. Checkdone detection logic

How does Checkdone perform the detection? After each Kafka2Hive task successfully completes its data transfer, Camus records the task's start time in the corresponding HDFS directory. Checkdone scans all of the previous day's timestamps; if the maximum timestamp has passed midnight (0:00), the previous day's Kafka2Hive tasks have all completed successfully and Checkdone's detection is done.

In addition, since Camus itself only reads from Kafka and writes HDFS files, the Hive partition must also be loaded before downstream queries can run. Therefore, the last step of the entire Kafka2Hive task is to load the Hive partition; only then is the whole task considered successfully executed.
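As a rough illustration of this last step (the statement actually issued by the production job is not shown in the article), registering a partition for the original_binlog.user table described below might look like the following; the date and HDFS path are hypothetical.

```sql
-- A minimal sketch, assuming the table_name/dt partition layout described in the
-- next section; a real job might call the metastore API instead of running HiveQL.
ALTER TABLE original_binlog.user
  ADD IF NOT EXISTS PARTITION (table_name = 'userinfo', dt = '2023-09-12')
  LOCATION '/path/to/original_binlog/user/table_name=userinfo/dt=2023-09-12';
```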

Each Kafka2Hive task is responsible for reading one specific Topic and writing the Binlog data into a table under the original_binlog database, which is the original_binlog.db shown in the earlier figure; it stores all Binlogs corresponding to one MySQL DB.

[Figure: HDFS directory structure after a Kafka2Hive task completes]
The figure above illustrates the directory structure of the files on HDFS after a Kafka2Hive task completes. If a MySQL DB is called user, its Binlog is stored in the original_binlog.user table. The ready directory stores, by day, the start times of all successfully executed Kafka2Hive tasks, for use by Checkdone. The Binlog of each table is organized into its own partition; for example, the Binlog of the userinfo table is stored in the partition table_name=userinfo. Under each table_name primary partition, dt secondary partitions are organized by date. The xxx.lzo and xxx.lzo.index files in the figure store the lzo-compressed Binlog data.
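The article does not give the DDL of this staging table. Purely as a hedged sketch of what it could look like, assuming the Binlog rows carry an event type, an execution time, and the changed row itself (these column names are assumptions, not the production schema):

```sql
-- A sketch only: the real schema produced by the customized Camus job is not shown.
CREATE TABLE IF NOT EXISTS original_binlog.user (
  event_type   STRING,   -- Insert / Update / Delete, taken from the Binlog event
  execute_time BIGINT,   -- time at which the change was executed in MySQL
  row_data     STRING    -- the changed row, e.g. serialized as JSON
)
PARTITIONED BY (table_name STRING, dt STRING)   -- primary / secondary partitions as in the figure
STORED AS TEXTFILE;  -- production files are lzo-compressed (xxx.lzo + xxx.lzo.index),
                     -- which would additionally require the LZO input format
```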

8. Merge

After Binlog is successfully put into the warehouse, the next step is to restore the MySQL data based on Binlog. The Merge process does two things. First, it stores the Binlog data generated that day into the Delta table, and then performs a primary key-based Merge with the existing stock data. The data in the Delta table is the latest data of the day. When a piece of data changes multiple times in a day, only the data after the last change is stored in the Delta table.
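A hedged HiveQL sketch of this first step, extracting only the last change per primary key from a day's Binlog; the table and column names (original_binlog.user, ods.userinfo_delta, event_type, execute_time, row_data, id, value) follow the hypothetical schema sketched above, not the production one.

```sql
-- Build the day's Delta table: keep only the last change for each primary key.
INSERT OVERWRITE TABLE ods.userinfo_delta
SELECT id, value
FROM (
  SELECT
    get_json_object(row_data, '$.id')    AS id,
    get_json_object(row_data, '$.value') AS value,
    ROW_NUMBER() OVER (PARTITION BY get_json_object(row_data, '$.id')
                       ORDER BY execute_time DESC) AS rn   -- 1 = latest change of the day
  FROM original_binlog.user
  WHERE table_name = 'userinfo'
    AND dt = '2023-09-12'
) t
WHERE rn = 1;
```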

When merging the Delta data with the stock data, a unique key is needed to decide whether two records are the same piece of data. If the same piece of data appears in both the stock table and the Delta table, it means the data has been updated, and the record in the Delta table is selected as the final result; otherwise, no change has occurred, and the record in the original stock table is retained as the final result. The result of the Merge is Insert Overwritten back into the original table, i.e. the origindb.table shown in the earlier figure.
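A hedged sketch of this Merge in HiveQL, assuming a stock table ods.userinfo keyed by id and the Delta table built above; the real table names and the unique-key configuration come from the platform, not from this example.

```sql
-- Primary-key Merge: a key present in the Delta table wins; all other stock rows
-- are carried over unchanged. Hive stages the result before overwriting, but a
-- production job may prefer writing to a temporary table first.
INSERT OVERWRITE TABLE ods.userinfo
SELECT
  CASE WHEN d.id IS NOT NULL THEN d.id    ELSE s.id    END AS id,
  CASE WHEN d.id IS NOT NULL THEN d.value ELSE s.value END AS value
FROM ods.userinfo s
FULL OUTER JOIN ods.userinfo_delta d
  ON s.id = d.id;
```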

9. Merge process example

An example is used below to specifically illustrate the Merge process.

[Figure: Merge example, combining the day's Delta data with the stock data]
The data table has two columns, id and value, where id is the primary key. When extracting the Delta data, if the same piece of data is updated multiple times, only the last update is selected; so for the data with id=1, the value after the last update, value=120, is recorded in the Delta table. After the Delta data and the stock data are merged, in the final result one new piece of data is inserted (id=4), two pieces of data are updated (id=1 and id=2), and one piece of data remains unchanged (id=3).

By default, we use the primary key of the MySQL table as the unique key for this judgment; a business can also configure a unique key different from the MySQL primary key according to its actual situation.

The above has introduced the overall architecture of Binlog-based data collection and ODS data restoration. The following sections introduce, from two aspects, the actual business problems we solved in practice.

10. Practice 1: Support for sub-databases and sub-tables

As the business scale expands, MySQL databases are increasingly split into sub-databases and sub-tables, and for many businesses the number of sub-tables is in the thousands. Data developers generally need to aggregate these data together for analysis. If we synchronized each sub-table manually and then aggregated them on Hive, the cost would be hard to accept. Therefore, we need to complete the aggregation of sub-tables at the ODS layer.

[Figure: aggregating sub-databases and sub-tables at the ODS layer]
First, during real-time Binlog collection, we support writing Binlogs from different DBs to the same Kafka Topic. When applying for Binlog collection, users can select multiple physical DBs belonging to the same business logic at the same time. Through this aggregation at the Binlog collection layer, the Binlogs of all sub-databases are written to the same Hive table, so that the downstream Merge only needs to read a single Hive table.

Second, the configuration of the Merge task supports regular-expression matching. By configuring regular expressions that match the business's table naming rules, the Merge task knows which MySQL tables' Binlogs it needs to aggregate, and selects the data in the corresponding partitions for execution.

In this way, through work at these two levels, the merging of sub-databases and sub-tables at the ODS layer is completed.

There is a technical optimization here: during Kafka2Hive, we process the table names according to the business's table-splitting rules and convert the physical table name into a logical table name. For example, the table name userinfo123 is converted to userinfo, and its Binlog data is stored in the table_name=userinfo partition of the original_binlog.user table. The purpose is to prevent the pressure on the underlying storage caused by an excessive number of small HDFS files and Hive partitions.
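The renaming itself happens inside the customized Camus job, whose code is not shown in the article. Purely to illustrate the naming rule, an equivalent Hive expression that strips a trailing shard number could look like this (the naming pattern is an assumption):

```sql
-- Maps a physical shard name to its logical name by stripping trailing digits:
-- 'userinfo123' -> 'userinfo'.
SELECT regexp_replace('userinfo123', '\\d+$', '');
```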

11. Practice 2: Support for Delete events

Delete operations are very common in MySQL. Since Hive does not support Delete, if data deleted in MySQL should also disappear from Hive, a "roundabout" approach is needed.

A Merge process that needs to handle Delete events uses the following two steps:

  • First, extract the data for which a Delete event occurred; since Binlog itself records the event type, this is easy to do. Then perform a left outer join (Left outer join) on the primary key between the existing stock data (table A) and the deleted data (table B). If a row joins to data on both sides, that piece of data has been deleted; therefore, the rows in the result whose table B side is NULL are the data that should be retained (see the sketch after this list).
  • Then, perform the regular Merge described above on the retained data obtained in the previous step.
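A hedged HiveQL sketch of the first step, reusing the hypothetical stock table ods.userinfo (table A) and deriving the deleted keys (table B) from the day's Delete events in the staging table sketched earlier; all names and the JSON layout are assumptions.

```sql
-- Retain only stock rows whose primary key had no Delete event that day;
-- the regular Merge described above is then run on this retained result.
SELECT a.*
FROM ods.userinfo a                                        -- table A: existing stock data
LEFT OUTER JOIN (
  SELECT DISTINCT get_json_object(row_data, '$.id') AS id  -- table B: deleted primary keys
  FROM original_binlog.user
  WHERE table_name = 'userinfo'
    AND dt = '2023-09-12'
    AND event_type = 'DELETE'                              -- event type recorded in Binlog
) b
  ON a.id = b.id
WHERE b.id IS NULL;                                        -- NULL on the B side = not deleted
```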

[Figure: handling Delete events with a left outer join before the Merge]

12. Summary and outlook

As the basis of data warehouse production, the Binlog-based MySQL2Hive service provided by the Meituan Data Platform basically covers all business lines within Meituan. It currently meets the data synchronization needs of most businesses and achieves accurate and efficient entry of DB data into the warehouse. In future development, we will focus on solving the single point of failure of CanalManager and building a cross-datacenter disaster recovery architecture to support business development more stably.

This article has introduced the architecture of this service from two aspects, Binlog streaming collection and Binlog-based ODS data restoration, and described some typical problems we encountered in practice along with their solutions. We hope it provides some reference value to other developers, and we welcome everyone to exchange ideas with us.


This article is reproduced from: blog.csdn.net/be_racle/article/details/132840867