Pitfalls of MySQL read-write separation and how to solve them | JD Cloud technical team

1. Master-slave architecture

Why separate reads from writes? In my view, the need arises once the business grows to a certain scale and forces the technical architecture to change. Read-write separation distributes read requests and write requests to different servers, which reduces the load on any single server, improves availability, and improves the performance of read requests.

The figure above shows a basic MySQL master-slave architecture with 1 master, 1 standby and 3 slaves. In this architecture the client does the load balancing itself: the database connection information is usually kept in the client's connection layer, so the client chooses which database to read from and write to.

The figure above shows a master-slave architecture with a proxy. The client connects only to the proxy, and the proxy decides where to route each request based on the request type and context.

The two architectures have the following characteristics:

1. Client direct connection: with one less layer of proxy forwarding, query performance is slightly better, the architecture is simple, and problems are easy to troubleshoot. However, the client has to know the details of the back-end deployment, so it is affected by events such as a master-standby switchover or a database migration and must adjust its connection information.

2. Proxy architecture: this is friendlier to the client, which does not need to know the back-end deployment details; connection maintenance and back-end information maintenance are handled by the proxy. It places higher demands on the back-end operations team, and the proxy itself must be highly available, so the overall architecture is more complex.

Whichever architecture is used, there is a delay between master and slave. If a read request is issued immediately after a transaction completes and is routed to a slave, it may well see the state from before the update. We call such a read an "expired read". Master-slave delay has many causes, which interested readers can look into on their own. There are strategies for reducing it, but it cannot be avoided 100%, and those strategies are outside the scope of this article. The question here is: given that master-slave delay exists and our reads go to the slaves, how should we deal with it?

First, a summary of the coping strategies:

  • Force reads to the master
  • The sleep scheme
  • Check that the slave has no delay
  • Wait for the master position
  • Wait for the GTID

Next, we discuss how to implement each of these schemes and what problems each one has.

2. Master-slave synchronization

Before introducing the solutions to master-slave delay, let's briefly review how master-slave synchronization works.

The figure above shows the complete process of synchronizing an update statement from node A to node B.

Standby B keeps a long-lived connection to master A, and master A has a dedicated thread that serves this connection. The complete process of synchronizing a transaction log is:

1. On standby B, use the change master command to set the IP, port, user name and password of master A, and the position from which to start requesting the binlog. This position consists of a file name and a log offset.

2. Execute the start slave command on standby B. The standby then starts two threads, shown as io_thread and sql_thread in the figure.

3. io_thread is responsible for establishing the connection to the master.

4. After verifying the user name and password, master A reads the binlog locally, starting at the position passed by standby B, and sends it to B. Standby B writes the binlog it receives to a local file called the relay log.

5. sql_thread reads the relay log, parses the commands in it, and executes them.

In the figure above, the shade of the red arrows indicates the degree of concurrency: the darker the color, the higher the concurrency. The master-slave delay therefore depends on how fast the standby's synchronization thread executes the relay log. Possible causes of master-slave delay are:

1. The master has high concurrency and high TPS, so the standby is under heavy pressure and executes the log slowly.

2. Large transactions. A transaction that takes 5 seconds to execute on the master also takes about 5 seconds on the standby. Deleting a large amount of data at once and running DDL on a large table are typical large transactions.

3. The parallel replication capability of the slave. Versions before MySQL 5.6 do not support parallel replication, which is the model in the figure above. Parallel replication is a complex topic, so I won't go into details here; readers can review it on their own.

3. Master-slave delay solution

1. Force reads to the master

This scheme classifies requests into two categories:

1. Requests that must get the latest results are forced to the master.

2. Requests that can tolerate stale data are routed to a slave.

This is the simplest scheme, but it has one drawback: if none of the requests can tolerate expired reads, all the load falls back on the master, which amounts to giving up read-write separation and its scalability.
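The classification above can be sketched as a small routing function. This is only an illustration under assumptions: the `choose_node` function, the `needs_latest` flag and the node names are hypothetical and do not come from the original article, and a real client would hold connection objects rather than strings.

```python
import itertools

# Hypothetical node names; a real client would hold connections or DSNs here.
MASTER = "master"
REPLICAS = ["replica-1", "replica-2", "replica-3"]

# Simple round-robin over the replicas for reads that tolerate stale data.
_replica_cycle = itertools.cycle(REPLICAS)


def choose_node(is_write: bool, needs_latest: bool) -> str:
    """Pick the node for one request.

    Writes, and reads that must see the latest data, go to the master.
    Reads that can tolerate stale data are spread across the replicas.
    """
    if is_write or needs_latest:
        return MASTER
    return next(_replica_cycle)
```

If every read is marked `needs_latest`, every call returns the master, which is exactly the drawback described above.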

2. The sleep scheme

The sleep scheme executes a command such as select sleep(1) before each query on the slave. This approach has two problems:

1. If the master-slave delay is greater than 1s, stale data is still read.

2. If the slave could already have returned the correct result after 0.5s, the request still has to wait the full 1s.

This scheme looks unreliable and unprofessional, but it does have real usage scenarios.

In an earlier project we had this scenario: we wrote to the master, then sent an MQ message, and the consumer called our query interface when it received the message. Because we used read-write separation, the consumer sometimes could not find the data. We recommended that the consumer delay consumption of the message, for example by 30ms, and the problem was solved. This is similar to the sleep scheme, except that the sleep is placed in the caller.

3. Check that the slave has no delay

  1) Check with a command

The show slave status command is executed on the slave. Its result contains a Seconds_Behind_Master field that gives the master-slave delay; note that the unit is seconds. This scheme checks whether the value is 0: if it is, run the query directly; if not, wait until the delay becomes 0.

Because this value has only second-level precision while some of our requests operate at the millisecond level, this check is not particularly accurate.
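As a rough sketch of this check, the function below inspects one row of show slave status that has already been fetched into a dict. The function name and the dict-shaped input are assumptions made for illustration; how the row is fetched depends on the client library in use.

```python
from typing import Mapping, Optional


def caught_up_by_seconds(status: Mapping[str, Optional[int]]) -> bool:
    """Return True only when Seconds_Behind_Master is exactly 0.

    `status` is assumed to be one row of SHOW SLAVE STATUS as a dict.
    A NULL (None) value means the delay is unknown, for example because
    replication is not running, so it is treated as "not caught up".
    """
    lag = status.get("Seconds_Behind_Master")
    return lag is not None and int(lag) == 0
```

A caller would poll this until it returns True before running the query, which is where the second-level precision becomes a limitation.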

  2) Compare positions

The figure above shows part of the output of show slave status.

  • Master_Log_File and Read_Master_Log_Pos give the latest master position that has been read
  • Relay_Master_Log_File and Exec_Master_Log_Pos give the latest position that the standby has executed

If Master_Log_File equals Relay_Master_Log_File and Read_Master_Log_Pos equals Exec_Master_Log_Pos, every log that has been received has also been executed, so there is no delay between master and slave.
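The position comparison can be sketched as follows, again working on one row of show slave status held in a dict. The function name and the dict input are illustrative assumptions, not part of the original article.

```python
from typing import Mapping


def caught_up_by_position(status: Mapping[str, object]) -> bool:
    """Compare the position read from the master with the position executed.

    Both the binlog file name and the offset within it must match for
    everything that has been received to count as executed.
    """
    same_file = status["Master_Log_File"] == status["Relay_Master_Log_File"]
    same_pos = int(status["Read_Master_Log_Pos"]) == int(status["Exec_Master_Log_Pos"])
    return same_file and same_pos
```

Note that this only compares what the slave has received with what it has executed; it says nothing about logs the master has not sent yet.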

  3) Compare GTID sets

  • Auto_Position: 1 means that the GTID protocol is used between master and slave
  • Retrieved_Gtid_Set: the set of all GTIDs that the slave has received
  • Executed_Gtid_Set: the set of all GTIDs that the slave has executed

By comparing whether the Retrieved_Gtid_Set and Executed_Gtid_Set sets are consistent, it is determined whether there is a delay between the master and the slave.
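A minimal sketch of the GTID comparison is shown below. It makes two assumptions that are not in the original article. First, instead of strict equality it checks that every received GTID has been executed, because Executed_Gtid_Set can also contain transactions that did not arrive through the relay log. Second, it assumes the classic `uuid:1-5:7` text format with already-normalized intervals; the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Intervals = List[Tuple[int, int]]


def parse_gtid_set(text: str) -> Dict[str, Intervals]:
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: [(1, 5), (7, 7)], ...}."""
    result: Dict[str, Intervals] = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single number such as "7" is the interval 7-7.
            ranges.append((int(start), int(end or start)))
    return result


def gtid_subset(inner: str, outer: str) -> bool:
    """Return True if every transaction in `inner` also appears in `outer`.

    Assumes intervals are normalized (adjacent ranges already merged),
    so one inner interval must fit inside a single outer interval.
    """
    outer_sets = parse_gtid_set(outer)
    for uuid, ranges in parse_gtid_set(inner).items():
        covered = outer_sets.get(uuid, [])
        for start, end in ranges:
            if not any(s <= start and end <= e for s, e in covered):
                return False
    return True
```

With these helpers, the check would be `gtid_subset(status["Retrieved_Gtid_Set"], status["Executed_Gtid_Set"])`, where `status` is a row of show slave status.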

Comparing positions or GTID sets is more accurate than sleep: before querying, we can check whether all received logs have been executed. But although the accuracy improves, the check is still not precise. Why?

First, review the states that a transaction's binlog goes through:

1. The master finishes executing the transaction, writes the binlog, and responds to the client.

2. The binlog is sent from the master to the standby, and the standby receives it.

3. The standby executes the binlog.

All of the no-delay checks above verify that the logs received by the standby have been executed. But from the binlog states above, we can see that some logs may be in a state where the master has committed them and acknowledged them to the client, while the standby has not yet received them.

Suppose the master has executed three transactions, trx1, trx2 and trx3, where:

  • trx1 and trx2 have been sent to the slave and executed there
  • trx3 has been executed on the master and acknowledged to the client, but has not yet been sent to the slave

If a query is now executed on slave B, the position check above reports no master-slave delay, yet trx3 cannot be found. Strictly speaking, this is an "expired read". Is there a way to solve this problem?

To solve this problem, we can introduce semi-synchronous replication, or semi-sync replication (reference: https://dev.mysql.com/doc/refman/8.0/en/replication-semisync.html ).

You can run:

show variables like '%rpl_semi_sync_master_enabled%'
show variables like '%rpl_semi_sync_slave_enabled%'


These two commands are used to check whether semi-synchronous replication is enabled on both master and slave.

semi-sync is designed as follows:

1. When a transaction commits, the master sends the binlog to the slave.

2. The slave receives the binlog and sends an ack back to the master to confirm receipt.

3. Only after receiving the ack does the master return the transaction completion confirmation to the client.

In other words, with semi-sync enabled, every transaction that has been acknowledged to the client is guaranteed to have had its binlog received by a slave. Combining semi-sync with the position check therefore ensures that a query on the slave avoids expired reads.

However, semi-sync combined with the position check only works with one master and one standby. With one master and multiple slaves, the master returns the completion confirmation to the client as soon as it receives an ack from any one slave. A query on a slave then falls into one of two cases:

  • If the query lands on the slave that sent the ack, it returns the correct data
  • If the query lands on another slave, that slave may not have received the log yet, so an expired read can still occur

The schemes that compare positions or GTID sets have another potential problem: during business peaks, the master's position or GTID set advances very quickly, so the two values being compared may never become equal, and the slave may be unable to serve query requests for a long time.

The schemes above fall somewhat short in reliability and accuracy. Next, we introduce two schemes that are more reliable and accurate.

4. Wait for the master position

To understand this scheme, first consider the following command:

select master_pos_wait(file, pos[, timeout]);

The command works as follows:

1. It is executed on the slave.

2. The parameters file and pos are the binlog file name and position on the master.

3. The timeout parameter is optional. Setting it to a positive integer N means the function waits at most N seconds.

Possible values of the result M:

  • M > 0: the slave had to wait for M log events, counted from the start of the command, to reach the binlog position given by file and pos
  • NULL: an exception occurred in the standby's synchronization thread during execution
  • -1: the wait exceeded N seconds
  • 0: the position had already been executed when the command started

To issue a query immediately after a transaction completes, follow these steps:

1. As soon as the transaction completes, execute show master status to get the master's current File and Position.

2. Select a slave to execute the query.

3. Execute select master_pos_wait(File, Position, 1) on that slave.

4. If the return value is >= 0, execute the query on this slave.

5. Otherwise, execute the query on the master.
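The steps above can be sketched in client code as follows. This is an illustration under assumptions: `run_on_master` and `run_on_replica` are hypothetical callables that execute one SQL statement and return the first row as a tuple, and the `%s` placeholder style is the one used by common Python MySQL drivers. Step 2, choosing the slave, is assumed to have happened when `run_on_replica` was created.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical helper type: runs one statement and returns the first row.
RunSql = Callable[[str, Sequence], Optional[Tuple]]


def read_after_write(run_on_master: RunSql, run_on_replica: RunSql,
                     query: str, params: Sequence = (),
                     timeout_s: int = 1) -> Optional[Tuple]:
    # Step 1: right after the transaction, capture the master's position.
    # File and Position are the first two columns of SHOW MASTER STATUS.
    file, pos = run_on_master("SHOW MASTER STATUS", ())[:2]

    # Step 3: block on the slave until it has executed up to that position.
    row = run_on_replica("SELECT MASTER_POS_WAIT(%s, %s, %s)",
                         (file, pos, timeout_s))
    waited = row[0] if row else None

    # Step 4: a result >= 0 means the slave has reached the position.
    if waited is not None and waited >= 0:
        return run_on_replica(query, params)

    # Step 5: NULL (replication problem) or -1 (timeout): use the master.
    return run_on_master(query, params)
```

Injecting the two callables keeps the sketch independent of any particular driver or connection pool.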

Here we assume the query waits at most 1s on the slave. If master_pos_wait returns a value greater than or equal to 0 within 1s, the query on this slave is guaranteed to see the data of the transaction that was just executed.

Step 5 is the fallback for this kind of scheme: the slave's delay is uncontrollable and we cannot wait indefinitely, so on timeout we give up and query the master.

Some readers may object that if every slave is delayed by more than 1s, all the load falls on the master. That is true, but since we require that no expired reads occur, there are only two options: give up on timeout, or go to the master. Which one to choose depends on the specific business.

5. Wait for the GTID

If GTID mode is enabled on the database, there is a corresponding GTID-based scheme:

 select wait_for_executed_gtid_set(gtid_set, 1);


The logic of this command is:

1. Wait until the transactions executed on this database include the gtid_set passed in, then return 0.

2. Return 1 on timeout.

In the previous scheme, we had to execute show master status on the master after the transaction completed. Starting from MySQL 5.7.6, the server can return the transaction's GTID to the client when the transaction completes, so the wait-for-GTID scheme saves one query.

The wait-for-GTID process then becomes:

1. After the transaction completes, parse its GTID from the returned packet and record it as gtid1.

2. Select a slave to execute the query.

3. Execute select wait_for_executed_gtid_set(gtid1, 1) on that slave.

4. If it returns 0, execute the query on this slave.

5. Otherwise, execute the query on the master.
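A sketch of these steps, under the same assumptions as before: `run_on_master` and `run_on_replica` are hypothetical callables that execute one SQL statement and return the first row as a tuple, and the `%s` placeholder style follows common Python MySQL drivers. How the client obtains `gtid` from the returned packet depends on the driver and is left out here.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical helper type: runs one statement and returns the first row.
RunSql = Callable[[str, Sequence], Optional[Tuple]]


def read_after_write_gtid(run_on_master: RunSql, run_on_replica: RunSql,
                          gtid: str, query: str, params: Sequence = (),
                          timeout_s: int = 1) -> Optional[Tuple]:
    # Step 3: wait on the slave until it has executed the given GTID.
    row = run_on_replica("SELECT WAIT_FOR_EXECUTED_GTID_SET(%s, %s)",
                         (gtid, timeout_s))

    # Step 4: 0 means the GTID has been executed on this slave.
    if row is not None and row[0] == 0:
        return run_on_replica(query, params)

    # Step 5: 1 (timeout) or anything unexpected: query the master instead.
    return run_on_master(query, params)
```

Compared with the position-based version, there is no extra round trip to the master before the wait.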

As with waiting for the master position, the final fallback is to query the master, and the choice should be made with the business as a whole in mind.

MySQL does not provide a command for parsing the GTID from the returned packet after a transaction completes. Instead, the client can call the function provided by the MySQL C API ( https://dev.mysql.com/doc/c-api/8.0/en/mysql-session-track-get-first.html ) to get the GTID. For the server to include the GTID in the returned packet, the session_track_gtids system variable must be set accordingly.

4. Summary

This article briefly introduced the read-write separation architecture and how to deal with master-slave delay when using it. Some master-slave delay is almost always present in day-to-day operation. Of the schemes introduced above, some look unreliable and others involve compromises, but all of them have practical application scenarios, and we need to choose a suitable one according to our own business.

That said, expired reads are fundamentally caused by the one-writer, multiple-reader design. In practice there may be other database solutions that scale horizontally without such waiting, but they often achieve this at the cost of write performance, so we need to make a trade-off between read performance and write performance.

If there are any inaccuracies or mistakes in the article, please correct me.

Author: Jingdong Retail is still smart

Source: JD Cloud developer community. Please credit the source when reprinting.


Origin my.oschina.net/u/4090830/blog/10107388