The evolution of MySQL architecture: from master-slave replication to database and table sharding

Background

Rapid business growth has caused data volumes to expand quickly, and a stand-alone database can no longer keep up with the needs of Internet services.

The traditional approach of storing all data centrally on a single data node struggles to handle Internet-scale data in terms of capacity, performance, availability, and maintainability.

In terms of capacity, a single-machine database has limited capacity and is hard to expand.

In terms of performance, most relational databases use B+ tree indexes. Once the data volume exceeds a certain threshold, the index grows deeper, which increases the number of random disk I/Os per lookup and in turn degrades performance.
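
The link between data volume and index depth can be made concrete with a rough estimate. The sketch below is illustrative only: the fan-out (`keys_per_page`) and the number of rows per leaf page (`rows_per_leaf`) are assumed values, not figures from any particular MySQL configuration.

```python
def btree_height(total_rows: int, keys_per_page: int, rows_per_leaf: int) -> int:
    """Estimate the number of levels in a B+ tree index.

    keys_per_page and rows_per_leaf are illustrative assumptions.
    """
    if total_rows <= 0:
        return 0
    # Number of leaf pages needed to hold all rows (ceiling division).
    pages = -(-total_rows // rows_per_leaf)
    height = 1
    # Add one internal level at a time until a single root page remains.
    while pages > 1:
        pages = -(-pages // keys_per_page)
        height += 1
    return height


for rows in (10_000, 10_000_000, 10_000_000_000):
    print(rows, btree_height(rows, keys_per_page=1000, rows_per_leaf=16))
```

Under these assumptions the estimated height grows from 2 to 3 to 4 levels as the row count grows, and each extra level can mean one more page read per lookup.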

In terms of availability, services are usually designed to be stateless, which inevitably concentrates the system's storage pressure on the database layer. A single data node, or even a simple master-slave architecture, finds this load increasingly hard to bear.

In terms of operation and maintenance, when all data sits on one node, the time needed for backup and recovery grows with the data volume and becomes hard to control. The impact of any data loss also grows.

Master-slave replication

  • The master records data-changing operations (everything other than queries) in its binary log (binlog).
  • The slave copies those events into its relay log and replays them to keep its data in sync with the master.
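
The two steps above can be sketched as a toy in-memory model. The `Master` and `Slave` classes and their methods are hypothetical names used only for illustration; they are not MySQL APIs, and real replication works on binary log files over the network.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of (key, value) change events

    def write(self, key, value):
        # Apply the change locally and record it in the binlog.
        self.data[key] = value
        self.binlog.append((key, value))


class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = []
        self.applied = 0  # number of relay-log events already replayed

    def fetch(self, master):
        # Copy binlog events that have not been received yet into the relay log.
        self.relay_log.extend(master.binlog[len(self.relay_log):])

    def apply(self):
        # Replay the pending relay-log events in order.
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.write("user:1", "alice")
print(slave.data == master.data)  # the slave lags until it fetches and applies
slave.fetch(master)
slave.apply()
print(slave.data == master.data)
```

The gap between `write` on the master and `apply` on the slave is the replication lag discussed later in the article.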

Binlog formats

  • row: records the actual row changes in detail, including context information. The log files are large.
  • statement: records the SQL statements executed by the transaction.
  • mixed: a combination of the row and statement formats.

Asynchronous replication

In 2000, MySQL 3.23.15 introduced replication, and it was asynchronous. When the network or a machine fails, the master and slave data can become inconsistent.

Semi-synchronous replication

In 2010, MySQL 5.5 introduced semi-synchronous replication. With semi-synchronous replication, the master can commit a transaction as soon as one slave node returns an ACK, which ensures that at least one slave has received the data.
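
The difference between the two modes can be sketched with a toy function. The `replicate` function and the dictionary layout of `slaves` are assumptions made for illustration; the model ignores timeouts and networking.

```python
def replicate(event, slaves, mode):
    """Return True if the master may report the commit as successful.

    slaves is a list of dicts: {"reachable": bool, "log": list}.
    mode is "async" or "semi-sync". Toy model only.
    """
    acks = 0
    for slave in slaves:
        if slave["reachable"]:
            slave["log"].append(event)
            acks += 1
    if mode == "async":
        # The master does not wait for any slave.
        return True
    if mode == "semi-sync":
        # The master needs an acknowledgement from at least one slave.
        return acks >= 1
    raise ValueError(f"unknown mode: {mode}")


unreachable = [{"reachable": False, "log": []}]
print(replicate("tx1", unreachable, "async"))      # committed with no copy on any slave
print(replicate("tx1", unreachable, "semi-sync"))  # not acknowledged by any slave
```

In the asynchronous case the commit succeeds even though no slave holds the event, which is exactly the window in which a master failure loses data.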

Group replication

In 2016, MySQL introduced Group Replication in version 5.7.17. This solution uses the Paxos protocol to replicate within a group and ensure data consistency. The core of the Paxos protocol is majority agreement: a decision stands once more than half of the members agree.
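
The majority rule can be written down in a few lines. This is a minimal sketch of the arithmetic only, not of the Paxos protocol itself; both function names are hypothetical.

```python
def has_majority(acks: int, group_size: int) -> bool:
    """True if strictly more than half of the group has agreed."""
    return acks > group_size // 2


def max_tolerated_failures(group_size: int) -> int:
    """Largest number of members that can fail while a majority remains."""
    return (group_size - 1) // 2


for size in (3, 4, 5):
    print(size, max_tolerated_failures(size))
```

A group of 3 and a group of 4 both tolerate only one failure, which is why odd group sizes are the usual choice.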

Problems with master-slave replication

  • Replication lag. Master-slave replication is delayed, so a read issued right after a write may return stale data ("read after write" inconsistency). Falling back to the master and executing the SQL again whenever a read from the slave fails causes a performance problem. A common compromise is for the business layer to route the CRUD operations of core functions to the master so that they remain available and correct, and to accept short-term inconsistency for non-core functions, where it has little effect.
  • Routing. The business layer has to send each SQL statement to the right database, and when routing to slave nodes it also has to balance the load across them. This can be implemented in the business layer, either through a framework (such as sharding-jdbc) or by hand, but that approach is intrusive to the business code and makes existing legacy systems hard to retrofit. Alternatively, it can be implemented through database middleware (such as mycat or sharding-proxy): a middleware that implements the SQL standard is deployed, the rules are configured in the middleware, and each statement takes one extra network hop during execution.
  • High availability. Master-slave replication by itself does not guarantee that the system is highly available. A dedicated high-availability solution is needed to keep the database highly available.
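
The routing rules described above (writes and core-function reads go to the master, other reads are balanced across slaves) can be sketched as follows. `ReadWriteRouter` is a hypothetical class for illustration, not the API of sharding-jdbc or any middleware, and detecting reads by the `SELECT` prefix is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    def __init__(self, master, slaves, core_features=()):
        self.master = master
        self.core_features = set(core_features)
        # Round-robin iterator to spread reads across slaves.
        self._slave_cycle = itertools.cycle(slaves) if slaves else None

    def route(self, sql, feature=None):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read:
            return self.master  # all writes go to the master
        if feature in self.core_features:
            return self.master  # core reads avoid replication lag
        if self._slave_cycle is None:
            return self.master  # no slaves configured
        return next(self._slave_cycle)


router = ReadWriteRouter("master", ["slave1", "slave2"], core_features={"payment"})
print(router.route("UPDATE account SET balance = 0 WHERE id = 1"))
print(router.route("SELECT balance FROM account WHERE id = 1", feature="payment"))
print(router.route("SELECT title FROM article WHERE id = 1"))
```

Keeping this logic in one place, whether in a framework or in middleware, is what keeps it from leaking into every query in the business code.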

Highly available database

What is high availability?

High availability means keeping the time during which a service is unavailable as short as possible. It is generally measured against an SLA (Service Level Agreement).

1 year = 365 days = 8760 hours

The allowed downtime per year at each availability level is:

99% = 8760 * 1% = 8760 * 0.01 = 87.6 hours

99.9% = 8760 * 0.1% = 8760 * 0.001 = 8.76 hours

99.99% = 8760 * 0.01% = 8760 * 0.0001 = 0.876 hours = 0.876 * 60 ≈ 52.6 minutes

99.999% = 8760 * 0.001% = 8760 * 0.00001 = 0.0876 hours = 0.0876 * 60 ≈ 5.26 minutes
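
The same arithmetic can be reproduced for any availability target. This is a minimal sketch; the function name is hypothetical and the year is taken as 365 days, as in the figures above.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability level."""
    unavailable_fraction = 1 - availability_percent / 100
    return HOURS_PER_YEAR * 60 * unavailable_fraction


for level in (99, 99.9, 99.99, 99.999):
    print(level, round(allowed_downtime_minutes(level), 2))
```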

Why do we need high availability?

Failover lets the database recover automatically when a node fails. Combined with heartbeat checks and retries in the connection pool on the business side, connections are re-established after a disconnect and the business continues uninterrupted, which reduces the RTO (Recovery Time Objective) and RPO (Recovery Point Objective).

  • Disaster recovery: cold standby and hot standby. The difference between the two is whether the standby provides service while the system is running.
  • Master-slave: put simply, when the master node goes down, one of the slave nodes is automatically promoted to master.
  • Cluster: even if individual nodes go down, the cluster as a whole continues to serve requests normally.
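
The reconnect-and-retry behaviour on the business side can be sketched as below. `run_with_reconnect`, `connect`, and `operation` are hypothetical stand-ins for a real connection pool and query, and the fixed retry count and delay are assumptions.

```python
import time


def run_with_reconnect(connect, operation, retries=3, delay_seconds=0.0):
    """Run operation(conn), reconnecting and retrying on ConnectionError.

    connect and operation are caller-supplied callables.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            conn = connect()  # obtain a fresh connection on every attempt
            return operation(conn)
        except ConnectionError as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(delay_seconds)  # wait before the next attempt
    raise last_error
```

If a failover completes while the caller is still retrying, a later `connect()` call reaches the new master and the request succeeds without the user noticing the outage.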

Common strategies:

  • Multi-instance deployment
  • Cross-data-center deployment
  • Disaster recovery and high availability with three data centers in two locations

Manual switch

That is, when the master node goes down, a slave node is manually promoted to master.

Problems:

  • Data may be inconsistent.
  • Manual intervention is required.
  • It is intrusive to code and configuration: the other nodes must be reconfigured and the application's data source configuration must be changed.

MHA

MHA stands for MySQL Master High Availability. It is a MySQL high-availability framework developed by Yoshinori Matsunobu, who later became a Facebook engineer. It is written in Perl and can usually complete a master-slave switch within 30 seconds. During the switch, the master node's log data is copied over SSH.

MHA is responsible for the high availability of the MySQL master. When the master fails, MHA selects the candidate slave whose data is closest to the original master as the new master, and fills in the binlog events it is missing relative to the failed master. Once the data has been completed, the write VIP floats to the new master.
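
The selection step can be sketched as a toy. Both functions are hypothetical illustrations of the idea described above, not MHA's actual implementation, and progress is modelled as a simple count of applied binlog events.

```python
def pick_new_master(slaves):
    """slaves maps a node name to the number of binlog events it has applied."""
    if not slaves:
        raise ValueError("no candidate slaves")
    # max() returns the first name with the highest count if there is a tie.
    return max(slaves, key=slaves.get)


def missing_events(master_binlog, applied_count):
    """Events the chosen slave still has to apply to catch up with the failed master."""
    return master_binlog[applied_count:]


slaves = {"s1": 7, "s2": 9, "s3": 8}
master_binlog = [f"event-{i}" for i in range(10)]
candidate = pick_new_master(slaves)
print(candidate, missing_events(master_binlog, slaves[candidate]))
```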

Advantages:

  • Automatically detects failures and fails over according to the specific fault.
  • Good scalability: the number of data nodes can be expanded freely.

Disadvantages:

  • In extreme cases, split-brain may occur and multiple masters may appear.
  • SSH access has to be configured.
  • At least three servers are required.

MGR

MGR is supported natively by the database; you only need to configure the plug-in. If the master node fails, a slave is automatically elected as the new master with no manual intervention. It is based on group replication (the Paxos algorithm), which ensures data consistency.

Features of MGR

  • High consistency: replication is based on the distributed Paxos protocol, which ensures data consistency.
  • High fault tolerance: failures are detected automatically, and the database keeps working as long as a majority of the nodes are still up. Split-brain protection is built in.
  • High scalability: a newly added node automatically catches up through incremental synchronization until its data is consistent with the other nodes.
  • High flexibility: single-master and multi-master modes are provided. In single-master mode a new master is elected automatically when the master goes down. Multi-master mode supports writes on multiple nodes.

MySQL InnoDB Cluster is a complete database high-availability solution made up of several components:

  • MySQL Group Replication, which provides database scale-out and failover.
  • MySQL Router, a lightweight middleware that provides failover of application connection targets.
  • MySQL Shell, a new MySQL client with multiple interface modes, which can be used to set up group replication and Router.

Orchestrator

Orchestrator is a MySQL high-availability and replication topology management tool. It supports adjusting the replication topology, automatic failover, manual switchover, and more. A master-slave switch can be performed by dragging nodes directly in its UI.

Database and table sharding

Database and table sharding usually refers to vertical database splitting and horizontal table splitting. Vertical table splitting simply breaks a wide table into smaller tables and poses few technical challenges, so this article focuses on vertical database splitting and horizontal table splitting.

Vertical database splitting

Vertical database splitting means partitioning the database vertically, usually along business dimensions.

For example, in a typical microservice architecture the system is split vertically by business dimension into multiple services. An e-commerce website can be split into order, product, membership, payment, and other services.

After vertical database splitting, each business is simpler and has a single responsibility, and part of the database capacity problem is solved as well. However, it also introduces new technical complexity:

  • Distributed transactions. Operations that span databases need distributed transaction support, otherwise the system faces data inconsistency. Solution 1 is XA transactions. XA is a specification supported by the database itself and provides strong consistency, but its performance is relatively poor, so it does not suit scenarios that demand high performance. Solution 2 is flexible transactions: the database guarantees local transactions, and the global transaction is implemented by the business layer (for example through scheduled compensation, retry compensation, or manual intervention). Common flexible transaction solutions include TCC and message-queue-based transactions.
  • JOIN. After the split, tables are scattered across different databases and can no longer be joined directly in SQL. The business layer has to implement the aggregation itself, which increases development cost.
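
The business-layer aggregation mentioned in the JOIN item can be sketched as follows. The row layout (`id`, `product_id`, `name`, `quantity`) and the function name are assumptions for illustration; in practice the two lists would come from separate queries against the order and product databases.

```python
def join_orders_with_products(orders, products):
    """Join rows fetched from two different databases in application code."""
    # Index products by primary key so each order needs only one lookup.
    products_by_id = {p["id"]: p for p in products}
    joined = []
    for order in orders:
        product = products_by_id.get(order["product_id"])
        if product is None:
            continue  # inner-join semantics: skip orders with no matching product
        joined.append({
            "order_id": order["id"],
            "product_name": product["name"],
            "quantity": order["quantity"],
        })
    return joined


orders = [{"id": 1, "product_id": 10, "quantity": 2}]
products = [{"id": 10, "name": "keyboard"}]
print(join_orders_with_products(orders, products))
```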

Horizontal table splitting

Horizontal table splitting means dividing one table into multiple tables according to a certain rule. Each resulting table has exactly the same structure as the original, but the data is scattered among the tables. This is also called data sharding.

Horizontal table splitting solves the capacity and performance problems of a single table. At the same time, it introduces new technical complexity, mainly the following:

  • Routing. When the business layer operates on the database through SQL, which table should it go to? Option 1: range routing. The table is split by the value range of one column (the sharding key). For example, the main table is split by creation time, with each month's data stored in its own table. Range routing can distribute data unevenly, but the number of tables is easy to expand. Option 2: hash routing. The table is chosen by taking a column's value modulo the number of shards (field_value % table_num). Hash routing has the opposite trade-off: data is distributed more evenly, but when the number of tables grows the data has to be redistributed.
  • JOIN. After a table is split, its data is scattered across multiple tables. If the JOIN condition does not include the sharding key, every sharded table has to be joined, which causes performance problems.
  • COUNT. After a table is split, counting its total number of rows requires visiting every table and summing the results. A separate summary table solves this, but it must be updated on every insert or delete, and a single missed update leaves the data inconsistent.
  • ORDER BY. After a table is split, sorting requires visiting every table and then re-sorting the results in application code, which has obvious performance problems.
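
The two routing options can be sketched as small functions. The table naming scheme (`orders_2021_02`, `orders_2`) and the function names are assumptions for illustration; `moved_keys` shows which keys change tables when the table count changes under hash routing.

```python
from datetime import date


def range_route(created: date, base_name: str = "orders") -> str:
    """Range routing: one table per month of the creation date."""
    return f"{base_name}_{created.year}_{created.month:02d}"


def hash_route(field_value: int, table_num: int, base_name: str = "orders") -> str:
    """Hash routing: field_value % table_num picks the table."""
    return f"{base_name}_{field_value % table_num}"


def moved_keys(keys, old_table_num, new_table_num):
    """Keys whose table changes when the number of tables changes."""
    return [k for k in keys if k % old_table_num != k % new_table_num]


print(range_route(date(2021, 2, 23)))
print(hash_route(10, 4))
print(moved_keys(range(8), 4, 8))
```

With range routing, adding next month's table touches no existing rows. With hash routing, growing from 4 to 8 tables moves half of the keys in this example.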

Sharding solutions

  • In the business code layer. Routing can be handled by hand in SQL, but this couples tightly with the business code and is hard to maintain. It is usually handled by integrating a jar package instead, such as the mature open source project sharding-jdbc.
  • Database middleware. The middleware implements the SQL standard of the underlying database and holds the routing rules in its own configuration, so to the business code there is no difference between operating on the middleware and operating on the database directly.
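
What such a library or middleware hides from business code can be sketched with a toy facade. `ShardedTable` is a hypothetical class, not the API of sharding-jdbc or any middleware; shards are plain in-memory lists, and hash routing on an integer `id` column is assumed.

```python
import heapq


class ShardedTable:
    def __init__(self, table_num):
        # Each shard is a plain list standing in for one physical table.
        self.shards = [[] for _ in range(table_num)]

    def insert(self, row):
        self.shards[row["id"] % len(self.shards)].append(row)

    def get(self, row_id):
        # With the sharding key, only one shard is scanned.
        shard = self.shards[row_id % len(self.shards)]
        return next((r for r in shard if r["id"] == row_id), None)

    def count(self):
        # Without a summary table, every shard has to be visited.
        return sum(len(shard) for shard in self.shards)

    def order_by(self, column):
        # Sort each shard, then merge the sorted streams into one result.
        sorted_shards = [sorted(s, key=lambda r: r[column]) for s in self.shards]
        return list(heapq.merge(*sorted_shards, key=lambda r: r[column]))


table = ShardedTable(table_num=3)
for row_id, amount in [(1, 50), (2, 10), (3, 40), (4, 20), (5, 30)]:
    table.insert({"id": row_id, "amount": amount})
print(table.count())
print([r["amount"] for r in table.order_by("amount")])
```

The caller sees a single table, while `count` and `order_by` still pay the cost of touching every shard, as described in the previous section.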

Summary

The path from a single-node database to master-slave replication, to database high availability, and then to database and table sharding solves problems of performance, capacity, high availability, and operation and maintenance. However, it also brings distributed transactions, complex SQL that is hard to execute, SQL routing, and other issues.

Architecture design should follow the principles of "simplicity", "suitability", and "evolution" and match the current stage of business development. A system therefore does not need to be designed for sharding from the start. It should be modified and optimized once the data volume grows to the point of causing a performance bottleneck.

Original link: https://juejin.cn/post/6932298813453369358

Origin blog.csdn.net/weixin_48182198/article/details/114084739