Big Data Course C5 - Application Components of ZooKeeper

E-mail of the author of the article: [email protected] Address: Huizhou, Guangdong

 ▲ Purpose of this chapter

⚪ Master Canal, the incremental consumption component that relies on ZooKeeper;

⚪ Master Dubbo, the distributed service framework that relies on ZooKeeper;

⚪ Master Metamorphosis, the message middleware that relies on ZooKeeper;

⚪ Master Otter, the distributed database synchronization system that relies on ZooKeeper;

1. Canal (Alibaba)

1. Overview

1. Canal is an incremental subscription and consumption component, written in pure Java and based on the MySQL Binlog. Alibaba officially open-sourced it in January 2013.

2. The project homepage is https://github.com/alibaba/canal. It is continuously maintained by agapple, the project's lead and a veteran open source enthusiast.

3. The project name comes from the English word "canal" (a waterway or channel), suggesting the flow of data. Canal is a general-purpose component that uses the incremental Binlog of a MySQL database to provide database mirroring, real-time backup, and incremental data consumption.

4. Most early database synchronization services used MySQL triggers to capture incremental changes to the database. Starting in 2010, however, companies within the Alibaba Group gradually began to capture incremental changes by parsing database logs and to build data synchronization on top of that. This gave rise to incremental subscription and consumption services for databases, and the Canal project was born.

5. The working principle of Canal is relatively simple. Its core idea is to emulate the MySQL Slave interaction protocol: Canal disguises itself as a MySQL Slave and keeps sending dump requests to the Master. When the Master receives a dump request, it starts pushing the corresponding Binary Log to the Slave (that is, to Canal). Canal receives the Binary Log, parses it into Binary Log objects, and can then pass them on for secondary consumption. The basic working principle is shown in the figure below.
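The request/push cycle described above can be sketched in a few lines of Python. This is only an in-memory illustration, not Canal's code and not the MySQL replication protocol: FakeMaster, FakeCanal, BinlogEvent, and the integer position are all hypothetical names invented for the sketch, and the "Binary Log" is just a list.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class BinlogEvent:
    position: int   # offset of this event in the simulated binary log
    table: str
    action: str     # "INSERT", "UPDATE" or "DELETE"
    row: dict


class FakeMaster:
    """Stands in for a MySQL Master; it only keeps an in-memory event list."""

    def __init__(self) -> None:
        self._binlog: List[BinlogEvent] = []

    def write(self, table: str, action: str, row: dict) -> None:
        # Every change is appended to the log with the next free position.
        self._binlog.append(BinlogEvent(len(self._binlog), table, action, row))

    def dump(self, from_position: int) -> Iterator[BinlogEvent]:
        """Answer a dump request: stream every event at or after from_position."""
        yield from self._binlog[from_position:]


class FakeCanal:
    """Pretends to be a Slave: sends dump requests and remembers its position."""

    def __init__(self, master: FakeMaster) -> None:
        self._master = master
        self.next_position = 0

    def poll(self) -> List[str]:
        # Ask for everything not yet seen, "parse" it, and advance the position
        # so the next request resumes where this one stopped.
        consumed = []
        for event in self._master.dump(self.next_position):
            consumed.append(f"{event.action} {event.table} {event.row}")
            self.next_position = event.position + 1
        return consumed
```

The point of the sketch is the remembered position: because the consumer tracks how far it has read, each dump request returns only changes made since the previous one, which is what makes the consumption incremental.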

2. Canal Server active/standby switchover design

1. For disaster recovery, Canal is usually deployed with two or more Canal Servers responsible for the incremental data replication of a single MySQL database instance.

2. On the other hand, to limit the performance impact of Canal Server dump requests on the MySQL Master, only one instance across the different Canal Servers may be in the Running state at any time; the other instances stay in the Standby state. Canal must therefore be able to switch between active and standby automatically.

3. In Canal, the control of the entire active/standby switchover process mainly depends on ZooKeeper, as shown in the following figure:

a. Attempt to start: Before starting a Canal instance, each Canal Server first asks ZooKeeper whether it may start. It does so by trying to create the same ephemeral (temporary) node in ZooKeeper; whichever Canal Server creates the node successfully is the one that starts the instance. Take the instance "example": on startup, every Canal Server tries to create the node /otter/canal/destinations/example/running, and no matter how many Canal Servers start concurrently, ZooKeeper ensures that only one of them succeeds.

b. Start the instance: Suppose the Canal Server with IP address 10.20.144.51 creates the node successfully. It writes its own machine information to the node, {"active":true,"address":"10.20.144.51:11111","cid":1}, and starts the instance. The other Canal Servers, having failed to create the node, set their status to Standby and register a Watcher on the /otter/canal/destinations/example/running node to monitor changes to it.

c. Active/standby switchover: While a Canal Server is running, failures will inevitably occur that prevent it from working normally, and an active/standby switchover is then required. Because the running node is an ephemeral node, when the Canal Server in the Running state loses its connection to ZooKeeper because of a crash or a network problem, the /otter/canal/destinations/example/running node disappears after a period of time, once the session expires. All Canal Servers in the Standby state are already watching that node, so when ZooKeeper notifies them that it has disappeared, they repeat step a, which completes the active/standby switchover.
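Steps a to c can be sketched with a small in-memory stand-in for ZooKeeper. This is an illustration under stated assumptions, not Canal's implementation and not a real ZooKeeper client: FakeZooKeeper, CanalServer, create_ephemeral, watch, and expire_session are hypothetical names, and only two behaviors are modeled, namely that an ephemeral node vanishes when its owner's session expires and that a Watcher fires once.

```python
import json
from typing import Callable, Dict, List, Optional

RUNNING_PATH = "/otter/canal/destinations/example/running"


class FakeZooKeeper:
    """In-memory stand-in modeling only ephemeral nodes and one-shot watchers."""

    def __init__(self) -> None:
        self._nodes: Dict[str, tuple] = {}   # path -> (owner session, data)
        self._watchers: Dict[str, List[Callable[[], None]]] = {}

    def create_ephemeral(self, path: str, data: str, session: str) -> bool:
        if path in self._nodes:
            return False                     # another server already holds it
        self._nodes[path] = (session, data)
        return True

    def get(self, path: str) -> Optional[str]:
        node = self._nodes.get(path)
        return node[1] if node else None

    def watch(self, path: str, callback: Callable[[], None]) -> None:
        self._watchers.setdefault(path, []).append(callback)

    def expire_session(self, session: str) -> None:
        """Drop every ephemeral node owned by the session, then notify watchers."""
        gone = [p for p, (owner, _) in self._nodes.items() if owner == session]
        for path in gone:
            del self._nodes[path]
            for callback in self._watchers.pop(path, []):  # each fires once
                callback()


class CanalServer:
    def __init__(self, zk: FakeZooKeeper, address: str, cid: int = 1) -> None:
        self.zk, self.address, self.cid = zk, address, cid
        self.state = "Stopped"

    def try_start(self) -> None:
        # Step a: every server tries to create the same ephemeral node.
        data = json.dumps({"active": True, "address": self.address, "cid": self.cid})
        if self.zk.create_ephemeral(RUNNING_PATH, data, session=self.address):
            self.state = "Running"           # step b: the winner starts
        else:
            self.state = "Standby"           # step b: the losers watch the node
            # Step c: when the node disappears, go back to step a.
            self.zk.watch(RUNNING_PATH, self.try_start)
```

For example, if three servers (the second and third addresses are made up) call try_start() in turn, the first becomes Running and the others Standby; calling zk.expire_session("10.20.144.51:11111") then removes the node, the standbys are notified, and exactly one of them takes over while the other re-registers its watch.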

4. One of the most common problems in active/standby switchover design is "fake death" (suspended animation). This is the situation in which the network of the machine hosting the Canal Server is briefly interrupted, so ZooKeeper considers its session expired and releases the running node, even though the Canal Server's JVM has not exited and is still working normally.

5. In the design of Canal, in order to protect the Canal Server in the state of suspended animation, avoid instant running node failure causing instant


Origin blog.csdn.net/u013955758/article/details/131886198