An in-depth explanation of MySQL concurrency control and transaction principles

In today's Internet businesses, the most widely used database is still the relational database MySQL. I say "still" because domestic databases have made great progress in recent years, with distributed databases such as TiDB and OceanBase, but they have not yet achieved dominant coverage. So at this stage we still need to keep learning MySQL, both to handle problems encountered at work and to prepare for database questions in interviews.

This article covers the core topics of concurrency control, transactions, and storage engines in the MySQL database. The knowledge graph for this material is shown in the figure below:
[Figure: knowledge graph of the topics covered in this article]

Concurrency control

Concurrency control is a huge topic. Whenever multiple requests in a software system modify the same data at the same time, concurrency control problems arise; thread-safety issues in Java are one example. In MySQL, concurrency control is mainly about how the database coordinates concurrent reads and writes of table data.

For example, suppose there is a table useraccount with the following structure:

[Figure: structure of the useraccount table]
Now suppose the following two SQL statements are sent to the database at the same time:

SQL-A:

update useraccount t set t.account=t.account+100 where username='wudimanong';

SQL-B:

update useraccount t set t.account=t.account-100 where username='wudimanong';

After both statements have executed, the correct result is account=100, because the +100 and the -100 cancel out. Under concurrency without any control, however, the following situation may occur:

[Figure: the two updates interleaving and producing an incorrect balance]
So how does MySQL handle concurrency control? Like most concurrency control schemes, MySQL uses a lock mechanism.
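To make the problem concrete, here is a minimal Python sketch that replays one bad interleaving of SQL-A and SQL-B as plain read-modify-write steps, then repeats the two updates under a single lock. It is only an illustration of the anomaly and of why locking helps, not a model of how MySQL executes these statements. The function names are made up for this example, and the starting balance of 100 is taken from the expected result stated above.

```python
import threading

def run_interleaved():
    """Replay one bad interleaving of SQL-A and SQL-B with no locking at all."""
    account = 100
    read_a = account        # SQL-A reads 100
    read_b = account        # SQL-B also reads 100, before SQL-A has written
    account = read_a + 100  # SQL-A writes 200
    account = read_b - 100  # SQL-B writes 0 and silently discards SQL-A's update
    return account

def run_with_lock():
    """Run both updates from separate threads, serialized by one lock."""
    state = {"account": 100}
    lock = threading.Lock()

    def update(delta):
        with lock:  # the read and the write happen as one indivisible step
            state["account"] = state["account"] + delta

    threads = [threading.Thread(target=update, args=(d,)) for d in (100, -100)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["account"]

print(run_interleaved())  # 0
print(run_with_lock())    # 100
```

With the lock in place the order of the two updates no longer matters: either order ends at 100.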

1. MySQL lock type

In MySQL, concurrency control is mainly achieved through "read-write locks".

Read lock: also called a shared lock. Multiple read requests can hold the lock at the same time and read the data without blocking each other.

Write lock: also called an exclusive lock. A write lock excludes all other requests for the lock, which block until the write completes and the lock is released.

Read-write locks allow reads to run in parallel with other reads, but do not allow a read to run in parallel with a write, or a write in parallel with another write. The transaction isolation discussed later is built on read-write locks!
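The compatibility rule described above can be sketched in a few lines of Python. This is an illustrative model only; the names SHARED, EXCLUSIVE, is_compatible, and can_grant are hypothetical and do not correspond to any MySQL API.

```python
SHARED, EXCLUSIVE = "S", "X"

def is_compatible(held, requested):
    """Two locks can coexist only when both are shared (read) locks."""
    return held == SHARED and requested == SHARED

def can_grant(held_locks, requested):
    """A request is granted at once only if it is compatible with every held lock."""
    return all(is_compatible(held, requested) for held in held_locks)

print(can_grant([SHARED, SHARED], SHARED))  # True: read and read run in parallel
print(can_grant([SHARED], EXCLUSIVE))       # False: a write must wait for readers
print(can_grant([EXCLUSIVE], SHARED))       # False: a read must wait for the writer
print(can_grant([], EXCLUSIVE))             # True: nothing is held, so the write proceeds
```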

2. MySQL lock granularity

Read locks and write locks are a classification by lock type. The granularity at which these locks can be applied is mainly the table and the row, which gives table locks and row locks.

Table lock: the most basic locking strategy in MySQL. It locks the entire table, so the overhead of maintaining the lock is minimal, but it reduces the concurrency of reads and writes on the table. If a user performs a write operation (insert, delete, update) on the table under a table lock, they must first obtain a write lock on the whole table, and other users' reads and writes to that table are blocked in the meantime. Statements such as "alter table" typically use table locks.

Row lock: row locks support concurrent reads and writes to the greatest extent, but the database's overhead for maintaining the locks is relatively large. Row locks are the locking strategy we use most often in day-to-day work. In general, row-level locks in MySQL are implemented by the individual storage engines rather than at the MySQL server level (table locks are implemented at the server level).
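The difference in granularity can be illustrated with a small Python sketch that asks whether two writes would contend for the same lock. It is a toy model under simplifying assumptions: the helper names are hypothetical, and 'another_user' is an invented second row used only for the comparison.

```python
def lock_target(table, row_key, granularity):
    """Return the resource that a write to (table, row_key) has to lock."""
    if granularity == "table":
        return (table,)        # one lock covers every row in the table
    return (table, row_key)    # one lock per row

def writes_conflict(write_a, write_b, granularity):
    """Two writes block each other when they need the same lock."""
    return lock_target(*write_a, granularity) == lock_target(*write_b, granularity)

write_a = ("useraccount", "wudimanong")
write_b = ("useraccount", "another_user")  # hypothetical second row

print(writes_conflict(write_a, write_b, "table"))  # True: the whole table is locked
print(writes_conflict(write_a, write_b, "row"))    # False: different rows, no blocking
print(writes_conflict(write_a, write_a, "row"))    # True: same row still conflicts
```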

3. Multi-version concurrency control (MVCC)

MVCC stands for MultiVersion Concurrency Control. Most transactional storage engines in MySQL (such as InnoDB) do not implement plain row-level locking alone. If they did, the following would happen: "while a row A is being updated by one user (who holds a row-level write lock), any other user reading that row (which requires a read lock) would be blocked". In practice this is not what happens, because the storage engine is designed to improve concurrent performance. By keeping multiple versions of the data, MVCC separates reads from writes, so data can be read without locks and reads can run in parallel with writes.

Take the MVCC implementation of the InnoDB storage engine as an example:

A simplified way to describe InnoDB's MVCC is that each row carries two hidden columns. One holds the creation time of the row and the other holds its expiration (deletion) time. What they store is not an actual time value but a system version number. Every time a new transaction starts, the system version number is incremented automatically; the system version number at the start of a transaction becomes that transaction's version number, and it is compared against the version numbers of each row the transaction queries.

The two main mechanisms that MVCC relies on in MySQL are the undo log and the read view.

  • undo log: records multiple versions of a row of data.
  • read view: determines which version of the data is visible to the current transaction.

The undo log is covered later, in the discussion of transactions. The read and write principle of MVCC is illustrated in the figure below:

[Figure: schematic of MVCC reads and writes]
The figure above illustrates the InnoDB storage engine under the REPEATABLE READ transaction isolation level. MVCC is implemented by storing two additional system version numbers (the row creation version number and the row deletion version number), so most read operations do not need to take a read lock. This design makes reads simpler and faster.

So how does a read under MVCC make sure it reads the correct data? Taking InnoDB as an example, a SELECT checks every row against the following two conditions:

  • The creation version number of the row must be less than or equal to the version number of the current transaction. This ensures that the rows read by the transaction either existed before the transaction started, or were inserted or modified by the transaction itself.
  • The delete version number of the row must be either undefined or greater than the version number of the current transaction. This ensures that the rows read by the transaction were not deleted before the transaction started.
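The two conditions above can be written as a single predicate. The following Python sketch models only the simplified version-number rules described in this article; RowVersion and is_visible are hypothetical names, and None stands for an undefined delete version number.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    account: int
    create_version: int
    delete_version: Optional[int] = None  # None means "undefined" (never deleted)

def is_visible(row, txn_version):
    """Apply the two SELECT conditions to one row version."""
    # Condition 1: the row was created at or before this transaction's version.
    created_in_time = row.create_version <= txn_version
    # Condition 2: the row is not deleted, or was deleted by a later version.
    not_deleted = row.delete_version is None or row.delete_version > txn_version
    return created_in_time and not_deleted

print(is_visible(RowVersion(100, create_version=1), txn_version=1))                    # True
print(is_visible(RowVersion(100, create_version=1, delete_version=2), txn_version=1))  # True
print(is_visible(RowVersion(200, create_version=2), txn_version=1))                    # False
print(is_visible(RowVersion(100, create_version=1, delete_version=2), txn_version=2))  # False
```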

Only records that satisfy both conditions are returned as query results! Take the logic shown in the figure as an example. When the write request changes the account to 200, InnoDB inserts a new record (account=200) and uses the current system version number as its row creation version number (createVersion=2). At the same time, the current system version number is set as the delete version number of the original row (deleteVersion=2). There are now two versions of this piece of data, as follows:

[Figure: the two row versions, account=100 (deleteVersion=2) and account=200 (createVersion=2)]

If the write operation has not finished, the writing transaction is not yet visible to other users. According to the SELECT check conditions, only the record with account=100 qualifies, so the query returns the record with account=100!
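The write path in this example can be sketched in Python as follows. This is a toy model of the simplified version-number scheme, not InnoDB's actual storage format. The function names are hypothetical, and both the original row's createVersion of 1 and the reader's version of 1 are assumptions, since the article only gives the value 2 for the writer.

```python
def update_account(versions, new_account, writer_version):
    """Stamp the live version as deleted, then append the new version."""
    for v in versions:
        if v["delete_version"] is None:
            v["delete_version"] = writer_version
    versions.append(
        {"account": new_account, "create_version": writer_version, "delete_version": None}
    )

def snapshot_read(versions, reader_version):
    """Return the account values that pass both SELECT conditions."""
    return [
        v["account"]
        for v in versions
        if v["create_version"] <= reader_version
        and (v["delete_version"] is None or v["delete_version"] > reader_version)
    ]

# Assumed starting point: one row, created by version 1, never deleted.
versions = [{"account": 100, "create_version": 1, "delete_version": None}]
update_account(versions, 200, writer_version=2)

print(len(versions))                             # 2
print(snapshot_read(versions, reader_version=1)) # [100]
```

Under these two rules alone, a reader whose version is 2 or higher would get [200]. Deciding whether a write that has not yet committed may be seen is the job of the read view mentioned earlier, which this sketch does not model.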

The above is the basic principle of MVCC in the InnoDB storage engine. Note, however, that MVCC only works under two transaction isolation levels: REPEATABLE READ and READ COMMITTED. The other two isolation levels are not compatible with MVCC: READ UNCOMMITTED always reads the latest version of a row rather than the version that matches the current transaction, and SERIALIZABLE locks every row it reads, which runs counter to the MVCC idea.

Summary

Due to time constraints, not every detail is written up here. The second half, covering MySQL transactions and MySQL storage engines, is not included in this article.

Origin blog.csdn.net/weixin_48655626/article/details/109117233