Interview questions

Have you implemented MySQL read-write separation? How is read-write separation implemented in MySQL? What is the principle behind MySQL master-slave replication? How do you solve the MySQL master-slave synchronization delay problem?

What the interviewer is thinking

Once a system reaches high concurrency, read-write separation is a must. Why? Because most Internet companies' websites and apps are read-heavy and write-light. So writes go to a single master, the master has multiple slaves attached, and reads are served from those slaves. That way the system can support much higher read concurrency.

Analysis of Interview Questions

How is read-write separation implemented in MySQL?

It is actually quite simple, and it is built on the master-slave replication architecture. We set up one master and attach multiple slaves to it. All writes go to the master, and the master automatically synchronizes the data to the slaves. Reads are then served from the slaves.
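
The routing rule described above can be sketched in a few lines. This is only an illustration, not how any particular middleware does it: the class name ReadWriteRouter, the node names, and the "starts with SELECT" test are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Toy router: writes go to the master, reads rotate over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # round-robin over the slaves

    def route(self, sql):
        # Assumption: anything that starts with SELECT is a read,
        # everything else is treated as a write.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._slaves) if is_read else self.master


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
print(router.route("INSERT INTO orders (id) VALUES (1)"))  # master
print(router.route("SELECT * FROM orders"))                # slave-1
print(router.route("SELECT * FROM orders"))                # slave-2
```

A real router also has to deal with transactions and with reads that must see the latest write, which this sketch ignores.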

What is the principle of MySQL master-slave replication?

The master writes its changes to the binlog. After a slave connects to the master, an IO thread on the slave copies the master's binlog to the slave and writes it into a relay log. Then an SQL thread on the slave reads the binlog events from the relay log and executes them, that is, it replays the SQL locally. This ensures that the slave's data ends up the same as the master's.
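
As a rough mental model of this pipeline, here is a toy simulation. It is not MySQL's implementation: the classes Master and Slave, the in-memory lists standing in for the binlog and relay log, and the step methods are all made up for the example.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of change events

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))  # record the change in the binlog


class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = []
        self.fetched = 0  # binlog events copied so far by the IO thread
        self.applied = 0  # relay log events replayed so far by the SQL thread

    def io_thread_step(self, master):
        # Copy new binlog events from the master into the local relay log.
        new_events = master.binlog[self.fetched:]
        self.relay_log.extend(new_events)
        self.fetched += len(new_events)

    def sql_thread_step(self):
        # Replay pending relay log events one by one, in order.
        while self.applied < len(self.relay_log):
            key, value = self.relay_log[self.applied]
            self.data[key] = value
            self.applied += 1


master, slave = Master(), Slave()
master.write("a", 1)
master.write("b", 2)
print(slave.data)                 # {}
slave.io_thread_step(master)
slave.sql_thread_step()
print(slave.data == master.data)  # True
```

The first print shows the window in which a write is already on the master but not yet visible on the slave.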

There is a very important point here: the process by which a slave synchronizes data from the master is serialized. Operations that ran in parallel on the master are executed serially on the slave. Because the slave copies the log from the master and then executes the SQL serially, under high concurrency the slave's data lags behind the master's. As a result, data that has just been written to the master often cannot be read from a slave yet, and it may take tens or even hundreds of milliseconds before it becomes readable.

There is another problem as well. If the master suddenly goes down before some data has been synchronized to the slaves, that data will not exist on any slave and may be lost.

So MySQL has two mechanisms in this area. One is semi-synchronous replication, which addresses data loss when the master fails. The other is parallel replication, which addresses master-slave synchronization delay.

Semi-synchronous replication, also called semi-sync replication, works like this: after the master writes to its binlog, it immediately pushes the change to the slaves. Each slave writes the log to its own local relay log and then returns an ack to the master. The master considers the write complete only after it has received an ack from at least one slave.
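
The ack rule can be illustrated with a small model. This is a sketch under simplifying assumptions, not MySQL's code: SemiSyncSlave, its reachable flag, and semi_sync_write are hypothetical names, and timeouts and networking are not modeled.

```python
class SemiSyncSlave:
    def __init__(self, reachable=True):
        self.reachable = reachable
        self.relay_log = []

    def receive(self, event):
        # Return True (an ack) only once the event is in the local relay log.
        if not self.reachable:
            return False
        self.relay_log.append(event)
        return True


def semi_sync_write(binlog, slaves, event):
    """Append to the binlog, then report success only after at least one ack."""
    binlog.append(event)
    acks = sum(1 for s in slaves if s.receive(event))
    return acks >= 1


binlog = []
slaves = [SemiSyncSlave(reachable=False), SemiSyncSlave()]
print(semi_sync_write(binlog, slaves, "INSERT ..."))  # True
```

Note that the ack is sent once the event is in the relay log, not once it has been replayed, so a read from that slave may still miss the row for a short time.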

Parallel replication means that the slave starts multiple threads, reads the log entries of different databases from the relay log in parallel, and replays the entries of different databases in parallel. This is parallelism at the database level.
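
A minimal sketch of the idea follows: events are grouped by database, each group is replayed in order, and different groups run on different worker threads. The function name replay_in_parallel and the (database, key, value) event shape are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def replay_in_parallel(relay_log, workers=4):
    """relay_log is a list of (database, key, value) events in commit order."""
    per_db = defaultdict(list)
    for db, key, value in relay_log:
        per_db[db].append((key, value))  # order within one database is kept

    def replay(events):
        state = {}
        for key, value in events:  # one database's events are replayed serially
            state[key] = value
        return state

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {db: pool.submit(replay, events) for db, events in per_db.items()}
        return {db: f.result() for db, f in futures.items()}


log = [("shop", "k", 1), ("blog", "k", 10), ("shop", "k", 2)]
print(replay_in_parallel(log))  # {'shop': {'k': 2}, 'blog': {'k': 10}}
```

Because all events of one database land on the same worker, a single very busy database gains nothing from this scheme.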

MySQL master-slave synchronization delay problem (the key part)

We have in fact dealt with a production bug caused by master-slave synchronization delay. It counted as a minor production incident.

The scenario was this. A colleague wrote code that inserted a row, queried it back, and then updated it. During peak hours in production, write concurrency reached 2000/s, and the master-slave replication delay at that point was roughly a few tens of milliseconds. In production we found that every day there was some data whose status we expected to be updated but which was not updated during peak periods. Users reported this to customer service, and customer service reported it to us.

We ran the MySQL command:

show slave status

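and checked Seconds_Behind_Master, which shows how far the slave is behind the master. Note that this field is reported in whole seconds, so a delay of a few tens of milliseconds does not show up in it directly.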
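
If the output of show slave status is loaded into a dictionary, the check can be scripted. The sketch below assumes the row has already been fetched by some client library. The helper name replication_lag is hypothetical, while the three field names are columns of the command's output.

```python
def replication_lag(status_row):
    """Return the lag in seconds, or None if replication is not running."""
    io_ok = status_row.get("Slave_IO_Running") == "Yes"
    sql_ok = status_row.get("Slave_SQL_Running") == "Yes"
    if not (io_ok and sql_ok):
        return None
    lag = status_row.get("Seconds_Behind_Master")  # NULL arrives as None
    return None if lag is None else int(lag)


row = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
    "Seconds_Behind_Master": 3,
}
print(replication_lag(row))  # 3
```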

Generally speaking, if the master-slave delay is more serious, there are the following solutions:

Split the database: divide one master into multiple masters. The write concurrency on each master drops by several times, and the master-slave delay becomes negligible.
Turn on the parallel replication supported by MySQL so that multiple databases are replicated in parallel. If the write concurrency of one particular database is extremely high, for example a single database reaching 2000/s, parallel replication does not help, because the parallelism is per database.
Rewrite the code. Whoever writes the code should be careful: a query issued immediately after an insert may not find the data.
If you really must insert first, query the row immediately, and then immediately perform further operations based on the result, configure that query to connect directly to the master. This method is not recommended. If you do this everywhere, read-write separation loses its purpose.
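
The insert-then-read pattern from the incident, and the effect of sending the read to the master, can be reproduced with two dictionaries standing in for the master and a lagging slave. The function names here are invented for the illustration.

```python
def insert_order(master, order_id):
    master[order_id] = "created"  # writes always go to the master


def update_after_read(master, read_from, order_id):
    # Look the row up first, then update it on the master.
    if order_id not in read_from:
        return False  # row not visible yet, so the update is skipped
    master[order_id] = "paid"
    return True


master, slave = {}, {}  # the slave has not caught up yet
insert_order(master, 1)
print(update_after_read(master, slave, 1))   # False
print(update_after_read(master, master, 1))  # True
```

The first call reads from the lagging slave and silently skips the update, which is the bug described earlier. The second call reads from the master and succeeds. Where the logic allows it, dropping the read and issuing the update directly by key avoids the problem without sending any reads to the master.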