How read-write separation is put into practice in business

(1) Reposted from Lagou Education: 45 Lectures on the Principles and Practice of Distributed Technology, Bing Yue

In this lecture, we look at how read-write separation is put into practice in business systems.
Read-write separation is a common technique in business development and a frequent interview topic. Today we cover which business scenarios need read-write separation, how it is implemented, and what to watch out for when applying it in production.

(2) When do you need read-write separation

Most Internet business scenarios are read-heavy and write-light. In typical businesses such as e-commerce, read requests can outnumber write requests by more than an order of magnitude. To keep database reads from becoming a bottleneck, and to protect the success rate of database writes, read-write separation is commonly used.
Read-write separation, as the name implies, separates operations on the read library from operations on the write library. From a CRUD perspective, the master library handles transactional operations such as inserts, updates, and deletes, while the slave library handles SELECT queries. In practice there can be one master and one slave, or one master with multiple slaves: a single master library with several slave libraries configured, so that reads are spread across the slaves and a higher level of read concurrency can be supported.
Read-write separation moves access pressure from the master library to the slave libraries. It fits best when a single database instance cannot sustain the concurrent read and write load and most requests are reads. If the business is write-heavy and read-light, as in some scenarios where data is updated constantly, read-write separation is not appropriate. Because relational databases such as MySQL InnoDB support transactions, their write performance is limited, so such workloads are generally served by higher-performance storage such as NoSQL.
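The routing rule described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the class name `ReadWriteRouter` and the node names are hypothetical, the nodes are plain strings rather than real database connections, and a statement is classified by its first keyword only.

```python
import itertools


class ReadWriteRouter:
    """Toy router: SELECTs rotate across slaves, everything else goes to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql):
        first_word = sql.lstrip().split(None, 1)[0].upper()
        if first_word == "SELECT":
            return next(self._slaves)
        # Treat anything that is not a plain SELECT as a write (safe default).
        return self.master


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
targets = [router.route(q) for q in (
    "INSERT INTO orders (id) VALUES (1)",
    "SELECT * FROM orders",
    "SELECT * FROM orders WHERE id = 1",
    "UPDATE orders SET status = 'paid' WHERE id = 1",
    "SELECT status FROM orders WHERE id = 1",
)]
print(targets)
# ['master', 'slave-1', 'slave-2', 'master', 'slave-1']
```

Sending every non-SELECT statement to the master is a deliberately conservative choice here: a misrouted read only costs some master capacity, while a misrouted write would fail or diverge on a slave.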

(3) MySQL master-slave replication technology

Read-write separation is based on the master-slave replication architecture. The following describes the master-slave replication technology in MySQL.

binlog log

Master-slave replication for MySQL with the InnoDB engine is implemented through the binary log (binlog). Apart from SELECT queries, the binlog records all operations that change data, including DDL and DML statements.
There are three binlog formats: Statement, Row, and Mixed.
· Statement format: replication based on SQL statements
In Statement format, the binlog records every SQL statement that modifies data; the slave library fetches these statements and replays them locally.
· Row format: replication based on row changes
Row format records each modification at the level of individual rows. It does not record the context of the SQL statement that was executed, only the changes to the row data. A batch update is therefore stored as one record per affected row, which can generate a large amount of log content.
· Mixed format: a mix of the two modes
Mixed format combines Statement and Row, and treats different SQL operations differently. By default, operations are recorded in Statement format; MySQL switches to Row format for statements that cannot be replicated safely as statements, such as those that call non-deterministic functions. Changes to table structure (DDL) are always recorded as statements.
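To make the size difference between Statement and Row format concrete, here is a toy Python model of a batch update. It does not reproduce the real binlog layout: the `orders` table, the two logging functions, and the dictionary-shaped row events are all assumptions made for illustration.

```python
# Toy table: order id -> status.
orders = {1: "new", 2: "new", 3: "new", 4: "paid"}


def log_statement_format(sql):
    """Statement format: one log entry holding the SQL text itself."""
    return [sql]


def log_row_format(table, old_status, new_status):
    """Row format: one log entry per changed row (before and after values)."""
    return [
        {"id": row_id, "before": old_status, "after": new_status}
        for row_id, status in sorted(table.items())
        if status == old_status
    ]


sql = "UPDATE orders SET status = 'closed' WHERE status = 'new'"
statement_events = log_statement_format(sql)
row_events = log_row_format(orders, "new", "closed")

print(len(statement_events))  # 1
print(len(row_events))        # 3
```

The same update yields one statement entry but three row entries, and the gap grows with the number of rows the update touches.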

(4) Master-slave replication process

The MySQL master-slave replication process is shown in the following figure:

[Figure: MySQL master-slave replication process]

· The master library writes changes to the binlog. After a slave library connects, the master library creates a log dump thread to send the binlog contents.
· When replication starts on the slave library, it creates an I/O thread that connects to the master library and requests the new binlog events. The I/O thread saves the events sent by the master's log dump thread in the local relay log.
· A SQL thread on the slave library then reads the relay log and replays its contents locally, applying the changes to the database storage, which eventually brings the slave's data into line with the master's.
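The three steps can be modelled with two small Python classes. This is a single-process sketch, not MySQL's implementation: the `Master` and `Slave` classes are hypothetical, the "threads" are ordinary method calls, and the binlog is a plain list of key-value changes.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))

    def dump(self, from_position):
        """Stand-in for the log dump thread: return events after a position."""
        return self.binlog[from_position:]


class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = []
        self.applied = 0  # number of relay log events already replayed

    def io_thread(self, master):
        """Fetch new binlog events and append them to the relay log."""
        self.relay_log.extend(master.dump(len(self.relay_log)))

    def sql_thread(self):
        """Replay relay log events that have not been applied yet."""
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.write("order:1", "created")
master.write("order:1", "paid")

slave.io_thread(master)
in_relay_only = dict(slave.data)  # still empty: events fetched but not replayed
slave.sql_thread()

print(in_relay_only)              # {}
print(slave.data == master.data)  # True
```

Between `io_thread` and `sql_thread` the slave holds the events but does not yet serve the new data, which is one source of the replication delay discussed later.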

(5) Issues to watch for with read-write separation

A distributed system implements read-write separation through master-slave replication. This relieves the performance bottleneck of read and write operations, but it also increases overall complexity. Let's look at the additional issues that arise once master-slave replication is introduced.

(6) Delay problem under master-slave replication

Because the master library and the slave library are two separate data sources, master-slave replication involves a delay. When data is written to the master library, it is also written to the binlog, and the slave library then synchronizes the data from the binlog. This extra log transfer and replay takes time. Under high concurrency in particular, data just written to the master library cannot be read from the slave library immediately; it may become visible only after tens or hundreds of milliseconds.
In business scenarios that require strong consistency, this delay causes real problems. Take order payment: the payment has completed and the master library has been updated, but the slave library has not yet received the change. A read from the slave library at that moment shows the order as unpaid, which is unacceptable to the business.
There are usually the following ways to deal with master-slave synchronization delay.
· Force sensitive business to read from the master library
Some business needs to read data immediately after writing it. During development, this kind of operation can usually be handled by forcing the read to go to the master library.
· Do not use read-write separation for key business
Business that is not sensitive to consistency, such as order comments and personal information in e-commerce, can use read-write separation. Business with higher consistency requirements, such as financial payments, should not, so that problems caused by delay are avoided.
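The stale read in the payment example, and the effect of forcing a read to the master, can be shown with a toy model. Everything here is an assumption for illustration: `LaggingCluster` is a hypothetical class, the slave only catches up when `sync()` is called, and `force_master` stands in for whatever hint or annotation a real data-access layer would provide.

```python
class LaggingCluster:
    """Toy master plus one slave that only catches up when sync() is called."""

    def __init__(self):
        self.master = {}
        self.slave = {}

    def write(self, key, value):
        self.master[key] = value  # the slave does not see this yet

    def sync(self):
        self.slave = dict(self.master)  # replication catches up

    def read(self, key, force_master=False):
        source = self.master if force_master else self.slave
        return source.get(key)


cluster = LaggingCluster()
cluster.write("order:42", "paid")

stale = cluster.read("order:42")                     # slave has not synced yet
fresh = cluster.read("order:42", force_master=True)  # sensitive read goes to master
cluster.sync()
after_sync = cluster.read("order:42")

print(stale, fresh, after_sync)  # None paid paid
```

Forced master reads should stay limited to the operations that truly need them, since every such read adds back the load that read-write separation was meant to remove.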

(7) How to avoid data loss in master-slave replication

Suppose the master library goes down during master-slave synchronization before some data has reached the slave library. That data is then lost or inconsistent. This is an extreme scenario that rarely happens, but MySQL's design takes it into account.
MySQL master-slave replication can be asynchronous, semi-synchronous, or fully synchronous.
· Asynchronous replication
In asynchronous mode, the master library returns the result as soon as it has accepted and processed the client's write request, without checking whether the slave libraries have synchronized it. This leads to the problem described above: if the master library crashes, some operations may not have reached the slave libraries, and data is lost.
· Semi-synchronous replication
In semi-synchronous mode, the master library waits for at least one slave library to complete synchronization before finishing the write. After the master library executes the transaction submitted by the client, the slave library writes the log to its local relay log and returns a response to the master library. Only when the master library has confirmed this response does the write complete. Compared with asynchronous replication, semi-synchronous replication improves data safety and reduces the risk of losing data when the master library crashes, but it also makes writes on the master library slower.
· Fully synchronous replication
Fully synchronous replication applies when there are multiple slave libraries: after the master library executes a transaction, it waits for all slave libraries to complete synchronization before finishing the write. Because it must wait for every slave library to execute the transaction, its overall performance is the worst of the three.
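The trade-off between the three modes comes down to how many slave acknowledgements the master waits for. The sketch below expresses that in Python; the mode names, function names, and the acknowledgement times are made up for illustration and are not MySQL settings or measurements.

```python
def required_acks(mode, slave_count):
    """How many slave acknowledgements the master waits for before finishing a write."""
    if mode == "async":
        return 0
    if mode == "semi-sync":
        return min(1, slave_count)
    if mode == "full-sync":
        return slave_count
    raise ValueError(f"unknown mode: {mode}")


def commit_wait_ms(mode, slave_ack_ms):
    """Extra time the master waits, given each slave's acknowledgement time."""
    needed = required_acks(mode, len(slave_ack_ms))
    if needed == 0:
        return 0
    # The write finishes once the `needed` fastest slaves have answered.
    return sorted(slave_ack_ms)[needed - 1]


slave_ack_ms = [5, 20, 80]  # hypothetical acknowledgement times in milliseconds
waits = {
    mode: commit_wait_ms(mode, slave_ack_ms)
    for mode in ("async", "semi-sync", "full-sync")
}
print(waits)  # {'async': 0, 'semi-sync': 5, 'full-sync': 80}
```

Asynchronous mode waits for nothing, so a write can be acknowledged to the client while no slave holds it; semi-synchronous mode pays for the fastest slave, and fully synchronous mode pays for the slowest.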

(8) Summary

Today I covered the business scenarios for read-write separation and MySQL's master-slave replication technology, including the role of the binlog, the delay problem in master-slave replication, and the different synchronization mechanisms.
Read-write separation is only one means of optimizing performance in a distributed system, and not every read bottleneck calls for it. Other options include splitting databases and tables (sharding) and using NoSQL storage such as caches and file indexes. These topics will be covered in later lectures.

Origin blog.csdn.net/qq_41489540/article/details/113800377