Which locks does the insert statement encounter?

insert… select statement

example:

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c` (`c`)
) ENGINE=InnoDB;
insert into t values(null, 1,1);
insert into t values(null, 2,2);
insert into t values(null, 3,3);
insert into t values(null, 4,4);
create table t2 like t;

Under the repeatable read isolation level, with binlog_format=statement, executing the statement insert into t2(c,d) select c,d from t; requires locking all the rows and gaps of table t.

Reason: again, it comes down to consistency between the log and the data. For example, consider this execution sequence:
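The original figure is not reproduced here; based on the surrounding description, it shows two concurrent sessions, roughly:

-- session A (concurrent with session B)
insert into t values(-1,-1,-1);

-- session B (concurrent with session A)
insert into t2(c,d) select c,d from t;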

The actual effect is that if session B executes first, its statement adds the next-key lock (-∞,1] on the primary key index of table t, so session A's insert statement is allowed to execute only after session B's statement completes.

But without the lock, it could happen that session B's insert statement executes first yet is written to the binlog afterwards. So, with binlog_format=statement, the binlog records this statement sequence:

insert into t values(-1,-1,-1);
insert into t2(c,d) select c,d from t;

When these statements are replayed on the standby, the row with id=-1 is also written into table t2, producing an inconsistency between master and standby.

Of course, when insert … select executes, the target table is not locked in its entirety; only the resources that need to be accessed are locked.

insert loop write

Suppose we now have this requirement: insert a row into table t2 whose value of c is the maximum value of c in table t plus 1. We can write the SQL statement like this:

insert into t2(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);

The lock range of this statement is the two next-key locks (3,4] and (4, supremum] on index c of table t, plus the row lock on id=4 on the primary key index.

Its execution process: scan the first row of table t in descending order of index c, get the result, and write it into table t2. Then look at this statement's slow query log (slow log).
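The original slow-log screenshot is not reproduced here; the relevant part of such an entry looks roughly like this (the timings are illustrative, not from the source):

# Query_time: 0.000342  Lock_time: 0.000141  Rows_sent: 0  Rows_examined: 1
insert into t2(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);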

Through this slow query log, we see Rows_examined=1, indicating that the number of rows scanned by this statement is 1.

So what happens if we insert such a row into table t itself?

insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);

The value of Rows_examined at this time is 5.
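The original EXPLAIN screenshot is not reproduced here; you can reproduce it yourself by running:

explain insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);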

In the EXPLAIN output you can see the words "Using temporary" in the Extra field, indicating that this statement uses a temporary table. In other words, during execution, the contents of table t need to be read out and written into the temporary table. The rows field shows 1, which is a result of limit 1.

Checking the value of Innodb_rows_read before and after executing this statement shows that it increases by 4. Because the temporary table uses the Memory engine by default, these four rows read are all from table t, which means a full table scan was performed on table t.
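A way to verify this yourself (assuming no other activity on the instance, since Innodb_rows_read is a global counter):

show status like 'Innodb_rows_read';
insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);
show status like 'Innodb_rows_read';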

In this way, we clarified the entire execution process:

  1. Create a temporary table with two fields, c and d.
  2. Scan table t by index c, reading c=4, 3, 2, 1 in turn, going back to the primary key each time, and write the values of c and d into the temporary table. At this point, Rows_examined=4.
  3. Because of the limit 1 in the semantics, only the first row of the temporary table is taken and inserted into table t. Rows_examined increases by 1 to become 5.

This statement causes a full table scan on table t and adds shared next-key locks to all the gaps on index c. Therefore, while this statement executes, other transactions cannot insert data into this table.

Why does executing this statement need a temporary table? Because this situation updates the same data it is traversing: if the rows read were written directly back into the original table, the traversal might read rows that were just inserted, and if those newly inserted rows took part in the calculation logic, the result would not match the statement's semantics.

Optimization method: first insert into a temporary table temp_t, so that only one row needs to be scanned; then take that row of data from temp_t and insert it into table t.
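In SQL, this optimization can be written roughly like this (temp_t is a user-created temporary table; the Memory engine choice is optional):

create temporary table temp_t(c int, d int) engine=memory;
insert into temp_t (select c+1, d from t force index(c) order by c desc limit 1);
insert into t(c,d) select c,d from temp_t;
drop table temp_t;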

insert unique key conflict

For tables with unique keys, unique key conflicts are common when inserting data. Here is a simple example of a unique key conflict.

This example is also executed under the repeatable read isolation level.
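The original session figure is not reproduced here; assuming table t contains the rows (1,5,5) and (2,10,10), the scenario is roughly:

-- session A
begin;
insert into t values(10,10,10);
-- ERROR 1062: Duplicate entry '10' for key 'c'

-- session B (blocked: its insert intention lock on c=6 conflicts
-- with session A's next-key lock on (5,10])
insert into t values(6,6,6);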

When the insert statement executed by session A hits a unique key conflict, it does not merely report an error and return; it also adds a lock on the conflicting index. Session A then holds a shared next-key lock (read lock) on (5,10] on index c. Functionally, this prevents that row from being deleted by another transaction.

Note: on a unique index conflict or a primary key index conflict, a next-key lock is added.

Let's take a look at the deadlock scenario caused by the unique key conflict:
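The original timeline figure is not reproduced here; the sequence it shows is roughly this (T1 < T2 < T3):

-- T1, session A:
begin;
insert into t values(null,5,5);

-- T2, session B:
insert into t values(null,5,5);   -- blocked
-- T2, session C:
insert into t values(null,5,5);   -- blocked

-- T3, session A:
rollback;
-- one of session B / session C then reports:
-- ERROR 1213: Deadlock found when trying to get lock; try restarting transaction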

When session A executes the rollback statement, session C detects a deadlock almost at the same moment and returns an error. The logic of this deadlock is as follows:

  • At time T1, session A starts and executes the insert statement, adding a record lock on c=5 of index c. Note that because this is a unique index, the next-key lock degenerates into a record lock.
  • At time T2, session B executes the same insert statement, finds a unique key conflict, and adds a read lock; similarly, session C also adds a read lock on the record c=5 of index c. Both enter a lock wait.
  • At time T3, session A rolls back. Now session B and session C both try to continue the insert, for which they must acquire write locks. Each has to wait for the other's row lock, so a deadlock occurs.

insert into … on duplicate key update

The example above reports an error directly on the unique key conflict. What if it is rewritten as: insert into t values(11,10,10) on duplicate key update d=100;

This adds an exclusive next-key lock (write lock) on (5,10] on index c.

The semantics of insert into … on duplicate key update are: insert a row of data, and if that hits a unique key constraint, execute the update statement that follows instead.

Note that if the row violates the uniqueness constraint on multiple indexes, the row that conflicts on the first index, in index order, is the one modified.

Suppose table t now contains the two rows (1,1,1) and (2,2,2); let's look at the effect of the following statement:
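The statement in the missing figure conflicts with both indexes at once; it is of this form (the exact values are a reconstruction):

insert into t values(2,1,100) on duplicate key update d=100;

Here id=2 conflicts with the row (2,2,2) on the primary key, and c=1 conflicts with the row (1,1,1) on the unique index c.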

The primary key id is judged first, so MySQL decides that this statement conflicts with the row id=2 and modifies that row. Note that the affected rows returned by this statement is 2, which easily causes misunderstanding. Only one row is actually updated, but in the code implementation both the insert and the update consider themselves successful: the update count increases by 1, and the insert count also increases by 1.

 

Content source: Lin Xiaobin, "45 Lectures on MySQL Actual Combat"

Origin: blog.csdn.net/qq_24436765/article/details/112990536