"MySQL 45 Lectures" study notes: why does the insert statement take so many locks? (Lecture 40)

1. Overview

In the previous article, I mentioned that MySQL optimizes the auto-increment lock as much as possible: for ordinary statements that use auto-increment ids, the auto-increment lock is released as soon as the id has been allocated, without waiting for the whole statement to finish.

Therefore, the ordinary insert statement is a very lightweight operation. This conclusion, however, only holds for "ordinary insert statements": some insert statements are "special cases" that need to take extra locks during execution, or that cannot release the auto-increment lock immediately after allocating the id.

So in today's article, let's talk about this topic together.

2. insert ... select statements

Let's start from yesterday's question. The structures of tables t and t2 and the data initialization statements are as follows; today's examples are all based on these two tables.

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c` (`c`)
) ENGINE=InnoDB;

insert into t values(null, 1,1);
insert into t values(null, 2,2);
insert into t values(null, 3,3);
insert into t values(null, 4,4);

create table t2 like t;

Now, let's look at why, at the repeatable read isolation level with binlog_format=statement, executing:

insert into t2(c,d) select c,d from t;

requires locking all the rows and gaps of table t.

The question we really need to consider here is the consistency between the data and the log. Look at this execution sequence:

Figure 1: concurrent insert scenario
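The image for Figure 1 is not included in these notes. Based on the description that follows, the two sessions presumably run like this (a sketch reconstructed from the text, not the original screenshot):

-- T1, session B:
insert into t2(c,d) select c,d from t;   -- takes next-key locks on table t, including (-inf,1]

-- T2, session A (while session B's statement is still running):
insert into t values(-1,-1,-1);          -- blocked until session B's statement completes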

The actual behavior is: if session B executes first, then because its statement adds the (-∞,1] next-key lock on table t's primary key index, session A's insert statement is not allowed to execute until session B's statement has completed.

But if there were no lock, it could happen that session B's insert statement executes first, yet is written to the binlog after session A's. In that case, with binlog_format=statement, the binlog would record this sequence of statements:

insert into t values(-1,-1,-1);
insert into t2(c,d) select c,d from t;

When this sequence is replayed on the standby, the row with id=-1 would also be written into table t2, making the primary and the standby inconsistent.

3. insert loop write

Of course, executing insert ... select does not lock the entire source table; it only locks the resources that need to be accessed.

1) Slow query log: inserting a row into table t2

Suppose we have this requirement: insert a row into table t2, whose value of c is the maximum value of c in table t plus 1.

We can write the SQL statement like this:

insert into t2(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);

The locking scope of this statement is the two next-key locks (3,4] and (4, supremum] on index c of table t, plus the record lock on the row id=4 on the primary key index.

Its execution flow is also simple: read table t in descending order of index c, scanning the first row only, and write the result into table t2.

So the whole statement scans a total of one row.
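On MySQL 8.0 you can verify this locking scope while the statement's transaction is still open by querying performance_schema.data_locks (a sketch; the exact output layout depends on the server version):

begin;
insert into t2(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);
select object_name, index_name, lock_type, lock_mode, lock_data
  from performance_schema.data_locks;
rollback;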

The slow query log (slow log) of this statement is shown in Figure 2:

Figure 2: slow query log when inserting data into table t2

In this slow query log we can see Rows_examined=1, which verifies that executing this statement scans exactly one row.
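To reproduce this observation, the slow query log must be capturing all statements. A minimal setup (defaults vary by version; long_query_time here is set per session) looks like:

set global slow_query_log = 1;   -- enable the slow query log
set long_query_time = 0;         -- log every statement in this session
insert into t2(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);
-- the slow-log entry for this statement should report Rows_examined: 1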

2) Slow query log: inserting a row into table t

Now, what if we want to insert this row of data into table t itself:

insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);

What is the execution flow of this statement, and how many rows does it scan?

If we look at the slow query log this time, we find something odd.

Figure 3: slow query log when inserting data into table t

You can see that this time the value of Rows_examined is 5.

3) The explain result

As I mentioned in a previous article, I hope you can learn to use the explain result to mentally replay the execution of a whole statement. Let's try it together today.

Figure 4 shows the explain result of this statement.

Figure 4: the explain result of the statement

The words "Using temporary" in the Extra field indicate that the statement uses a temporary table. In other words, during execution, the contents read from table t are first written into a temporary table.

Since the rows field in this explain result is 1, we might guess the execution flow of this statement is: read the result of the subquery (scanning 1 row), write it into the temporary table, then read it back out of the temporary table (scanning 1 row) and write it into table t. If so, the statement would scan 2 rows, not 5.

In fact, that guess is wrong: rows=1 in the explain result is simply the effect of limit 1.
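For reference, the explain in Figure 4 would be obtained like this (the Extra value is as described in the text; the full column layout is version-dependent):

explain insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);
-- rows: 1           (only reflects the limit 1, not the rows actually scanned)
-- Extra: Using temporary   (the select result goes through a temporary table)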

4) Checking the change in Innodb_rows_read

From another angle, we can check how many rows InnoDB actually scanned. Figure 5 shows the value of Innodb_rows_read before and after executing the statement.

Figure 5: the change in Innodb_rows_read

You can see that the value of Innodb_rows_read increases by 4 across the execution of this statement. Since the temporary table uses the Memory engine by default, these 4 rows are all read from table t; in other words, a full table scan was done on table t.
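The measurement in Figure 5 can be repeated like this (Innodb_rows_read is a server-wide counter, so run it on an otherwise idle instance):

show status like 'Innodb_rows_read';
insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);
show status like 'Innodb_rows_read';   -- the value should have increased by 4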

Now we can piece together the whole execution flow:

1. Create a temporary table with two fields, c and d.
2. Scan table t by index c, fetching c=4, 3, 2, 1 in turn, going back to the primary key each time to read the values of c and d and write them into the temporary table. At this point, Rows_examined = 4.
3. Because of the limit 1 semantics, only the first row is taken from the temporary table and inserted into table t. This adds 1 to Rows_examined, making it 5.

In other words, this statement causes a full table scan on table t, and adds shared next-key locks on all the gaps of index c. So while this statement is executing, other transactions cannot insert rows into this table.

As for why the execution needs a temporary table: this kind of statement reads from the same data it is updating, and if the rows read were written straight back into the original table, the scan might run into the freshly inserted rows during traversal; if those newly inserted rows took part in the computation, it would violate the statement's semantics.

Since this statement's implementation does not push the limit 1 down into the subquery, executing it has to traverse the whole of table t. A relatively simple optimization is the method described earlier: first insert into a temporary table temp_t, so that only one row needs to be scanned, and then take that row out of temp_t and insert it into table t.

And since the amount of data involved here is very small, you can use an in-memory temporary table for this optimization. The statement sequence written with a memory temporary table is as follows:

create temporary table temp_t(c int,d int) engine=memory;
insert into temp_t  (select c+1, d from t force index(c) order by c desc limit 1);
insert into t select * from temp_t;
drop table temp_t;

4. insert and unique key conflicts

1) Locking on a unique key conflict

The previous two examples both used insert ... select; the next example I want to introduce is the most common kind of insert statement: one that hits a unique key conflict.

For a table with a unique key, unique key conflicts when inserting data are a common situation. First, a simple example of unique key conflict locking:

Figure 6: locking on a unique key conflict

This example also runs at the repeatable read isolation level. You can see that session B's insert statement enters a lock-wait state.

That is, when session A's insert statement hit the unique key conflict, it did not simply report an error and return; it also added a lock on the conflicting index entry. As we said before, a next-key lock is defined by its right boundary value. At this point, session A holds a shared next-key lock (read lock) on (5,10] of index c.
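The image for Figure 6 is missing from these notes. A hypothetical reconstruction consistent with the text, assuming table t at this point contains rows with c=5 and c=10 (the concrete statements and values here are assumptions, not the original figure):

-- session A:
begin;
insert into t values(11,10,10);   -- ERROR 1062: duplicate entry on unique key c; session A
                                  -- nevertheless keeps a shared next-key lock on (5,10] of index c

-- session B:
insert into t values(12,9,9);     -- blocked: 9 falls inside the locked range (5,10]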

As for why a read lock needs to be added here, I honestly have not found a reasonable explanation. Judging from the effect, it prevents the conflicting row from being deleted by other transactions.

By the way, the official documentation has an error in its description here: it says that if the conflict is on the primary key index a record lock is added, and only on a unique index a next-key lock. In fact, on a conflict both kinds of index get a next-key lock.

(Note: I found this documentation bug while checking the docs as I wrote this article; it has been reported upstream and verified.)

2) Unique key conflict causing a deadlock

In the comments of a previous article, a reader asked about deadlocks that occur when concurrently inserting into a table with multiple unique indexes. However, since no reproduction steps or crash-scene information were provided, I could not analyze it. So when you post a question in the comments, please try to include reproduction steps or scene details, so that we can analyze the problem together.

Here I will first share a classic deadlock scenario with you; if you run into other deadlock scenarios caused by unique key conflicts, you are also welcome to leave me a message.


Figure 7: unique key conflict causing a deadlock

When session A executes the rollback statement, session C discovers the deadlock and returns almost at the same time. The logic of this deadlock is as follows:


1. At time T1, session A starts and executes its insert statement; at this point it adds a record lock on c=5 in index c. Note that because c is a unique index, the next-key lock degenerates into a record lock (if your memory of this is fuzzy, look back at the locking rules described in article 21).
2. At time T2, session B executes the same insert statement, finds a unique key conflict, and adds a shared (read) lock on the c=5 record; similarly, session C also adds a read lock on the c=5 record in index c.
3. At time T3, session A rolls back. Now session B and session C both try to continue their inserts, which requires adding a write lock. Each of the two sessions must wait for the other's row lock, so a deadlock appears.

The state changes during this process are shown in Figure 8.

Figure 8: state transition diagram of the deadlock
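Following the three steps above, the timeline of Figure 7 can presumably be reconstructed like this (the inserted values are assumptions consistent with the c=5 record lock described in step 1):

-- T1, session A:
begin;
insert into t values(null,5,5);   -- record lock on c=5 (unique index, so the next-key lock degenerates)

-- T2, session B:
insert into t values(null,5,5);   -- unique key conflict: waits for a shared lock on c=5

-- T2, session C:
insert into t values(null,5,5);   -- also waits for a shared lock on c=5

-- T3, session A:
rollback;                         -- B and C both acquire the S lock, then each needs the X lock
                                  -- while the other holds S: a deadlock is detected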

5. insert into … on duplicate key update

In the example above, the statement reported an error directly after the unique key conflict. If it is instead rewritten as

insert into t values(11,10,10) on duplicate key update d=100;

then an exclusive next-key lock (write lock) is added on (5,10] of index c.

The semantics of insert into ... on duplicate key update are: insert a row, and if that violates a unique key constraint, execute the update clause that follows instead.

Note that if more than one unique index column conflicts, the indexes are taken in order, and only the row conflicting with the first index is modified.

Suppose table t now already contains the two rows (1,1,1) and (2,2,2). Let's look at the effect of executing the following statement:

Figure 9: two unique keys conflicting at the same time

You can see that the primary key id is judged first: MySQL decides that this statement conflicts with the row id=2, so it modifies the row with id=2.

Note that the affected rows returned by this statement is 2, which easily leads to misunderstanding. In reality only one row was updated, but in the implementation both the insert and the update count themselves as successful: the update count is incremented by 1, and the insert count is incremented by 1 as well.
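The statement in Figure 9 is not reproduced in these notes. A statement consistent with the description, conflicting on both unique keys at once (the exact values are an assumption), would be:

-- t currently contains (1,1,1) and (2,2,2)
insert into t values(2,1,100) on duplicate key update d=100;
-- conflicts with id=2 on the primary key and with c=1 on unique index c;
-- the primary key is judged first, so the row id=2 is updated to (2,2,100),
-- and the statement reports 2 rows affected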

6. Summary

In today's article, I introduced several special cases of the insert statement.

  1. insert ... select is a common way to copy data between two tables. Be aware that, at the repeatable read isolation level, this statement adds next-key locks on the records and gaps that the select scans in the source table.
  2. If the objects of the insert and the select are the same table, a loop write may result. In that case, we need to introduce a user temporary table for optimization.
  3. If an insert statement hits a unique key conflict, it adds a shared next-key lock (S lock) at the conflicting unique value. So when a transaction hits an error because of a unique key conflict, commit or roll it back as soon as possible, to avoid holding the lock for too long.

Finally, I will leave you with a question.

What methods do you usually use to copy data between two tables, and what do you pay attention to? In your application scenario, what advantage does your method have over the alternatives?

You can write your experience and analysis in the comments; I will select interesting comments to analyze with you at the end of the next article. Thank you for listening, and you are welcome to share this article with more friends to read together.

7. Answer to the previous issue's question

We have already answered the previous issue's question within the article above.

Some readers mentioned that if other threads operate on the source table while the original insert ... select is executing, it could lead to logic errors. In fact, it cannot: in the cases where no locks are taken, the select is a consistent snapshot read.

During the execution of a statement, its consistent read view does not change, so even if other transactions modify the source table, they cannot affect the data this statement sees.

Comment-section shout-out: @Long Jie's answer was very accurate.


Origin www.cnblogs.com/luoahong/p/11763351.html