MySQL in Practice 45 Lectures study notes: Are transactions isolated or not, in the end? (Lecture 8)

First, today's content summary

In the third article, when I explained transaction isolation levels, I said that under the repeatable read isolation level, a transaction T creates a view (read-view) when it starts. While T is running, even if other transactions modify the data, T still sees exactly what it saw at startup. In other words, a transaction running under repeatable read seems aloof, untouched by the outside world.

However, in the previous article, when I covered row locks, I also mentioned that if a transaction wants to update a row and another transaction happens to hold the row lock on it, it can no longer be so aloof: it gets blocked and enters a waiting state. The question is, once this transaction finally acquires the row lock and goes to update the data, what value does it read?

Let me give you an example. The statements below create a table and initialize it with just two rows.

mysql> CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  `k` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
insert into t(id, k) values(1,1),(2,2);

Figure 1 Transaction A, B, C execution flow

Here, we need to pay attention to when a transaction actually starts.

The begin / start transaction command is not the starting point of a transaction. The transaction really starts only when the first statement that operates on an InnoDB table is executed after it. If you want to start a transaction immediately, you can use the command start transaction with consistent snapshot.

In the first start-up mode, the consistent view is created when the first snapshot-read statement is executed;
in the second start-up mode, the consistent view is created when start transaction with consistent snapshot is executed.

Also note that throughout this column, unless otherwise specified, our examples use the default autocommit=1.

In this example, transaction C does not explicitly use begin / commit, which means its update statement is a transaction in itself and is committed automatically when the statement completes. Transaction B queries the row after updating it. Transaction A queries inside a read-only transaction, and its query comes after transaction B's query in time.

At this point, if I told you that transaction B finds k=3 while transaction A finds k=1, would you feel a little dizzy?

So in today's article I want to work through this problem with you. I hope that unraveling it will give you a better understanding of InnoDB transactions and locks.

In MySQL, there are two concepts called "view":

One is the view. It is a virtual table defined by a query statement; the query is executed and the result generated when the view is called. The syntax for creating one is create view ..., and it is queried the same way as a table.
The other is the consistent read view that InnoDB uses to implement MVCC, which supports the implementation of the RC (Read Committed) and RR (Repeatable Read) isolation levels.

It has no physical structure; its role is to define "what data can I see" while a transaction is running.

In the third article, "Transaction Isolation: Why can't I see what you changed?", I explained the implementation logic of MVCC. Today, to illustrate the difference between queries and updates, I will explain it another way, taking the read view apart. You can combine the explanations in these two articles for a deeper understanding of MVCC.

Second, how does the "snapshot" work in MVCC?

Under the repeatable read isolation level, a transaction "takes a snapshot" when it starts. Note that this snapshot is of the entire database.

At this point, you may say this does not seem realistic. If a database holds 100G, starting a transaction would mean MySQL copying out 100G of data, which would be painfully slow. Yet my transactions usually execute quickly.

In fact, we do not need to copy the 100G of data. Let's first look at how this snapshot is implemented. Every transaction in InnoDB has a unique transaction ID, called the transaction id. It is requested from InnoDB's transaction system when the transaction begins, and is allocated in strictly increasing order.

1, a row of data in a table may actually have multiple versions (row trx_id)

Each row of data also has multiple versions. Every time a transaction updates data, it generates a new data version and assigns its own transaction id to that version as the version's transaction ID, denoted row trx_id. Meanwhile, the old data version is retained, and the new version carries information that leads directly back to it.

In other words, a row in a table may actually have multiple versions (row), each with its own row trx_id.

Figure 2 shows the state of one record after it has been updated by several transactions in succession.

Figure 2 Row state changes

The dashed box in the figure contains four versions of the same row. The latest version is V4, where k is 22. It was written by the transaction whose transaction id is 25, so its row trx_id is also 25.

1, where is the undo log?

You may ask: didn't an earlier article say that an update statement generates an undo log (rollback log)? So where is the undo log?

In fact, the three dashed arrows in Figure 2 are the undo log. V1, V2, and V3 do not physically exist; they are computed whenever needed from the current version and the undo log. For example, when V2 is needed, it is computed from V4 by applying U3 and then U2.
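To make the idea of rebuilding old versions concrete, here is a minimal Python sketch. It assumes, as a simplification, that each undo record simply stores the previous value of k and the previous row trx_id. Only V4 (k=22, row trx_id 25) and V3's value 11 come from the text; the other values and IDs are made up for illustration, and rebuild_version is a hypothetical name, not an InnoDB function.

```python
# Current physical row: V4 with k=22, written by transaction 25 (from Figure 2).
CURRENT = (22, 25)  # (k, row trx_id)

# Undo records, newest first. Each one holds the state to restore.
# Assumption: V3=(11, 17), V2=(10, 15), V1=(1, 10); only V3's k=11 is from the text.
UNDO_LOG = [
    (11, 17),  # U3: takes V4 back to V3
    (10, 15),  # U2: takes V3 back to V2
    (1, 10),   # U1: takes V2 back to V1
]


def rebuild_version(steps_back):
    """Apply the first `steps_back` undo records to the current row."""
    k, row_trx_id = CURRENT
    for prev_k, prev_trx_id in UNDO_LOG[:steps_back]:
        k, row_trx_id = prev_k, prev_trx_id
    return k, row_trx_id


v3 = rebuild_version(1)  # V4 -> U3 -> V3
v2 = rebuild_version(2)  # V4 -> U3 -> U2 -> V2
print(v3, v2)
```

The point of the sketch is that only the newest version is stored as a row; every older version is derived on demand by walking the undo records backwards.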

Now that we understand multiple versions and row trx_id, let's think again about how InnoDB defines that "100G" snapshot.

By the definition of repeatable read, when a transaction starts, it can see the results of all transactions that have already committed. After that, while this transaction is running, updates made by other transactions are invisible to it.

Thus, a transaction only needs to declare at startup: "The moment I start is the cutoff:

1, if a data version was generated before I started, I recognize it;

2, if it was generated after I started, I do not recognize it, and I must find its previous version."

Of course, if the "previous version" is not visible either, it has to keep looking further back. Also, if the data was updated by the transaction itself, it still recognizes it.

2, what does "active" mean?

In the implementation, InnoDB constructs an array for each transaction to hold all the transaction IDs that are "active" at the moment the transaction starts. "Active" means started but not yet committed.

The smallest transaction ID in the array is the low-water mark; the largest transaction ID that has been created in the system so far, plus 1, is the high-water mark.

This view array and the high-water mark together make up the current transaction's consistent view (read-view).

The visibility of a data version is determined by comparing its row trx_id against this consistent view.

2, at the moment the current transaction starts, what are the possibilities for a data version's row trx_id?

This view array divides all row trx_id values into several different cases.

Figure 3 Data version visibility rules

Thus, at the moment the current transaction starts, a data version's row trx_id falls into one of the following cases:

1. If it falls in the green part (below the low-water mark), the version was generated by an already committed transaction or by the current transaction itself, and the data is visible;

2. If it falls in the red part (at or above the high-water mark), the version was generated by a transaction that starts in the future, and it is definitely invisible;
3. If it falls in the yellow part (between the two water marks), there are two cases:

a. If row trx_id is in the array, the version was generated by a transaction that has not yet committed, and it is invisible;
b. If row trx_id is not in the array, the version was generated by a transaction that has already committed, and it is visible.
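The rules above can be sketched as a small Python function. This is only an illustration of the three zones, not InnoDB's actual code: ReadView and is_visible are hypothetical names, and the example IDs (active transactions 18 and 20, own ID 20) are made up.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ReadView:
    active_ids: FrozenSet[int]  # started but not committed when the view was created
    low_water: int              # smallest ID in active_ids
    high_water: int             # largest ID created so far, plus 1
    own_id: int                 # ID of the transaction that owns this view


def is_visible(row_trx_id: int, view: ReadView) -> bool:
    if row_trx_id == view.own_id:
        return True                             # a transaction always sees its own changes
    if row_trx_id < view.low_water:
        return True                             # green: committed before the view existed
    if row_trx_id >= view.high_water:
        return False                            # red: started after the view was created
    return row_trx_id not in view.active_ids    # yellow: visible only if already committed


# Made-up view: transactions 18 and 20 are active, 20 is the owner, 19 has committed.
view = ReadView(active_ids=frozenset({18, 20}), low_water=18, high_water=21, own_id=20)
for trx_id in (17, 18, 19, 20, 25):
    print(trx_id, is_visible(trx_id, view))
```

Note that the check needs only the version's row trx_id and the view; it never looks at the data itself, which is why creating a view is cheap.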

For example, for the data in Figure 2, if a transaction's low-water mark is 18, then when it accesses this row it computes V3 from V4 through U3, so in its eyes the value of this row is 11.

You see, with this declaration, aren't later updates in the system irrelevant to what this transaction sees? After a later update, the version generated must fall under case 2 or 3(a) above. For this transaction, those new data versions do not exist, so its snapshot is "static".

So now you know: InnoDB takes advantage of the fact that "all data has multiple versions" to achieve the ability to "create a snapshot in seconds".

3. Why does transaction A's statement return k=1? (manual analysis)

Next, let's go back to the three transactions in Figure 1 and analyze why transaction A's statement returns k=1.

1, manual analysis

Here, let's make the following assumptions:

1. Before transaction A starts, there is only one active transaction in the system, with ID 99;
2. The transaction IDs of A, B, and C are 100, 101, and 102 respectively, and these four are the only transactions currently in the system;
3. Before the three transactions start, the row trx_id of the row (1,1) is 90.

Thus, transaction A's view array is [99,100], transaction B's view array is [99,100,101], and transaction C's view array is [99,100,101,102].
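As a rough check of these arrays, the following Python sketch models a toy transaction system under the assumptions listed above. TrxSystem and begin are hypothetical names used only for this illustration; the sketch also assumes that each transaction's own ID is placed in its array, as the arrays above suggest.

```python
class TrxSystem:
    """Toy transaction system: hands out increasing IDs and tracks active ones."""

    def __init__(self, next_id, active):
        self.next_id = next_id
        self.active = set(active)

    def begin(self):
        """Start a transaction and return (trx_id, view_array, low_water, high_water)."""
        trx_id = self.next_id
        self.next_id += 1
        self.active.add(trx_id)
        view_array = sorted(self.active)   # includes the new transaction itself
        low_water = view_array[0]
        high_water = self.next_id          # largest ID created so far, plus 1
        return trx_id, view_array, low_water, high_water


# Assumption from the text: only transaction 99 is active before A starts.
system = TrxSystem(next_id=100, active={99})
view_a = system.begin()
view_b = system.begin()
view_c = system.begin()
print(view_a, view_b, view_c, sep="\n")
```

Under these assumptions, transaction A's high-water mark works out to 101, which is the value used when judging versions written by B (101) and C (102).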

To simplify the analysis, I first remove the other distracting statements and draw only the operations related to transaction A's query logic:

Figure 4 Transaction A's query logic

As the figure shows, the first effective update is from transaction C, which changes the data from (1,1) to (1,2). At this point, the row trx_id of the latest version is 102, and version 90 has become a historical version.

The second effective update is from transaction B, which changes the data from (1,2) to (1,3). At this point, the latest version (that is, its row trx_id) is 101, and 102 has become a historical version.

You may have noticed that when transaction A queries, transaction B has in fact not yet committed, but the version (1,3) it generated has already become the current version. This version must be invisible to transaction A, otherwise it would be a dirty read.
Now transaction A is about to read the data, and its view array is [99,100]. Of course, reading always starts from the current version. So transaction A's query reads the data like this:

  1. When it finds (1,3), it sees row trx_id=101, which is at or above the high-water mark, so it is in the red zone and invisible;
  2. Next, it finds the previous historical version and sees row trx_id=102, which is also at or above the high-water mark, so it is in the red zone and invisible;
  3. Going further back, it finally finds (1,1), whose row trx_id=90 is below the low-water mark, so it is in the green zone and visible.

With this flow, although the row was modified in the meantime, whenever transaction A queries it, it sees the same result for this row. That is why we call it a consistent read.
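The read process just described can be traced with a short Python sketch. It is a simplified model, not InnoDB code: consistent_read is a hypothetical name, the version chain is written out as a plain list, and the high-water marks (101 for A, 102 for B) are derived from the assumptions above.

```python
# Version chain of the row id=1, newest first: (k, row trx_id).
VERSIONS = [(3, 101), (2, 102), (1, 90)]


def consistent_read(versions, active_ids, high_water, own_id):
    """Walk from the newest version backwards and return the first visible k."""
    low_water = min(active_ids)
    for k, row_trx_id in versions:
        if row_trx_id == own_id or row_trx_id < low_water:
            return k                      # own change, or green zone
        if row_trx_id >= high_water:
            continue                      # red zone: try the previous version
        if row_trx_id not in active_ids:
            return k                      # yellow zone, already committed
    return None                           # no visible version


# Transaction A: view array [99, 100], high-water mark 101, own ID 100.
k_seen_by_a = consistent_read(VERSIONS, {99, 100}, 101, 100)
# Transaction B: view array [99, 100, 101], high-water mark 102, own ID 101.
k_seen_by_b = consistent_read(VERSIONS, {99, 100, 101}, 102, 101)
print(k_seen_by_a, k_seen_by_b)
```

Transaction A skips the two newest versions and lands on (1,1), while transaction B stops at the newest version because it wrote that version itself.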

4, why does transaction A's statement return k=1? (code-logic analysis)

This judgment rule is translated directly from the code logic, but as you can see, using it for manual visibility analysis is cumbersome.
So let me translate it for you. For a transaction's view, apart from its own updates always being visible, a data version falls into one of three cases:

1. The version is uncommitted: invisible;
2. The version is committed, but was committed after the view was created: invisible;
3. The version is committed, and was committed before the view was created: visible.

Now let's use this rule to determine the query result in Figure 4. Transaction A's view array was generated when transaction A started. At this point:

(1,3) has not been committed, which is case 1: invisible;
(1,2) has been committed, but after the view array was created, which is case 2: invisible;
(1,1) was committed before the view array was created: visible.
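The time-based version of the rule is short enough to write down directly. In the Python sketch below, the logical timestamps are assumptions chosen only to respect the order of events in the example (A's view is created at time 1, C commits at time 3, B is still uncommitted when A queries); visible is a hypothetical helper name.

```python
from typing import Optional


def visible(commit_time: Optional[int], view_created_at: int, own: bool = False) -> bool:
    """Time-based visibility: own changes, or committed before the view was created."""
    if own:
        return True
    if commit_time is None:
        return False                          # case 1: uncommitted
    return commit_time < view_created_at      # case 3 if True, case 2 if False


VIEW_A_CREATED_AT = 1  # assumed logical time at which A's view was created

# Versions of the row, newest first: ((id, k), commit time or None if uncommitted).
versions = [
    ((1, 3), None),  # written by B, still uncommitted when A queries
    ((1, 2), 3),     # written by C, committed at time 3, after A's view
    ((1, 1), 0),     # committed before any of the three transactions started
]

seen_by_a = next(row for row, commit_time in versions
                 if visible(commit_time, VIEW_A_CREATED_AT))
print(seen_by_a)
```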

You see, once the numeric comparisons are removed and we judge only by chronological order, isn't the analysis much easier? So from now on we will use this rule for analysis.

Third, the update logic

Attentive readers may have a doubt: if transaction B's update statement followed consistent read, wouldn't the result seem wrong?

1, how is transaction B's update statement handled (current read)?

Look at Figure 5: transaction B's view array was generated first, and transaction C committed afterwards. Shouldn't (1,2) be invisible to B? How can it compute (1,3)?

Figure 5 Transaction B's update logic

Yes, if transaction B queried the data before its update, the query would indeed return k=1.

But when it goes to update the data, it cannot update on top of a historical version, or transaction C's update would be lost. Therefore, transaction B's set k=k+1 is performed on the basis of (1,2).

So a rule applies here: an update always reads before it writes, and this read can only read the current value. It is called a "current read".

Therefore, at update time, the current read gets (1,2), and after the update a new data version (1,3) is generated, whose row trx_id is 101.

So when transaction B executes its query statement, it sees that its own version number is 101 and the latest data version is also 101. This is its own update, which it can use directly, so the query gets k=3.
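The read-then-write rule can be illustrated with a few lines of Python. This is a toy model of the version chain only; update_k_plus_one is a hypothetical name, and locking is left out on purpose.

```python
# Version chain of the row id=1, newest first: (k, row trx_id).
versions = [(1, 90)]


def update_k_plus_one(versions, trx_id):
    """Model of "set k = k + 1": a current read of the newest version, then a write."""
    latest_k, _ = versions[0]                    # current read: the read view is ignored
    versions.insert(0, (latest_k + 1, trx_id))   # the new version becomes the current one


update_k_plus_one(versions, 102)   # transaction C: (1,1) -> (1,2)
update_k_plus_one(versions, 101)   # transaction B: (1,2) -> (1,3)

# What B would have written had it started from its consistent-read value k=1:
lost_update_value = 1 + 1
print(versions, lost_update_value)
```

Had B based its update on the snapshot value, it would have written k=2 and silently discarded C's increment, which is exactly the lost update the current read avoids.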

Here we have introduced a concept called the current read. In fact, besides update statements, a select statement that takes a lock is also a current read.

2. how is transaction B's update statement handled (two-phase locking)?

So, if we modify transaction A's query select * from t where id=1 by adding lock in share mode or for update, it can also read the data version 101, and the returned value of k is 3. These two select forms add a read lock (S lock, shared lock) and a write lock (X lock, exclusive lock) respectively.

Going one step further, suppose transaction C does not commit immediately but becomes the following transaction C'. What happens then?

Figure 6 Transaction A, B, C' execution flow

What is different about transaction C' is that it does not commit immediately after the update, and transaction B's update statement is issued before it commits. As mentioned earlier, although transaction C' has not committed, the version (1,2) has already been generated and is currently the latest version. So how is transaction B's update statement handled?

This is where the "two-phase locking protocol" from the previous article comes in. Transaction C' has not committed, which means the write lock on version (1,2) has not been released. Transaction B performs a current read, so it must read the latest version and must lock it. It is therefore blocked and has to wait until transaction C' releases the lock before continuing its current read.

Figure 7 Transaction B's update logic (with transaction C')
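The interaction between the row lock and the current read can be walked through in a single-threaded Python sketch. RowLock is a hypothetical toy class, not an InnoDB structure: it only records who holds the lock and who is waiting, and it hands the lock over when the owner commits.

```python
class RowLock:
    """Toy exclusive row lock that is released only when its owner commits."""

    def __init__(self):
        self.owner = None
        self.waiters = []

    def acquire(self, trx):
        """Return True if trx now holds the lock, False if it has to wait."""
        if self.owner is None or self.owner == trx:
            self.owner = trx
            return True
        self.waiters.append(trx)
        return False

    def commit(self, trx):
        """Two-phase locking: release at commit, then hand the lock to the next waiter."""
        if self.owner == trx:
            self.owner = self.waiters.pop(0) if self.waiters else None


versions = [(1, 90)]   # (k, row trx_id), newest first
lock = RowLock()
events = []

# Transaction C' updates the row and does not commit yet.
lock.acquire("C'")
versions.insert(0, (versions[0][0] + 1, 102))

# Transaction B's update needs a current read, so it must lock the row first.
if not lock.acquire("B"):
    events.append("B waits")

# C' commits: its lock is released and B can continue.
lock.commit("C'")
if lock.owner == "B":
    events.append("B resumes")
    versions.insert(0, (versions[0][0] + 1, 101))   # current read of (1,2), writes (1,3)

print(events, versions)
```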

At this point, we have tied together consistent reads, current reads, and row locks.

Now, let's go back to the question at the beginning of the article:

How is a transaction's repeatable read ability implemented?

The core of repeatable read is the consistent read; when a transaction updates data, however, it can only use a current read. If the row lock on the current record is held by another transaction, it has to enter a lock wait.

3, what is the main difference between read committed and repeatable read?

The logic of read committed is similar to that of repeatable read. Their main difference is:

Under the repeatable read isolation level, a consistent view is created only once, at the start of the transaction, and all later queries in the transaction share this consistent view;
under the read committed isolation level, a new view is computed before each statement is executed.

So let's see: under the read committed isolation level, what values of k should the queries of transaction A and transaction B return?

It should be explained that "start transaction with consistent snapshot;" means creating a consistent snapshot, starting from this statement, that lasts for the entire transaction. Under the read committed isolation level this usage is therefore meaningless and is equivalent to an ordinary start transaction.

Below is the state diagram under read committed. You can see that the time at which the view arrays of the two query statements are created has changed; these are the read view boxes in the figure. (Note: here we are still using the logic of transaction C, which commits directly, not transaction C'.)

Figure 8 Transaction state diagram under the read committed isolation level

Now transaction A's view array is created when its query statement is executed, and the versions (1,2) and (1,3) were both generated before this view array was created. But at this moment:

(1,3) has not been committed, which is case 1: invisible;
(1,2) has been committed, which is case 3: visible.

So this time transaction A's query returns k=2. Obviously, transaction B's query returns k=3.
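The only thing that changes between the two isolation levels is when the view is created, which the following Python sketch tries to make explicit. The logical timestamps are assumptions that merely preserve the order of events in the example (A starts at 1, B starts at 2, C commits at 3, B queries at 5, A queries at 6), and read_k and query are hypothetical names.

```python
# Versions of the row id=1, newest first: (k, writer, commit time or None if uncommitted).
VERSIONS = [(3, "B", None), (2, "C", 3), (1, "init", 0)]


def read_k(reader, view_created_at):
    """Return the first version that is the reader's own or committed before the view."""
    for k, writer, commit_time in VERSIONS:
        if writer == reader:
            return k
        if commit_time is not None and commit_time < view_created_at:
            return k
    raise LookupError("no visible version")


def query(reader, level, trx_start, stmt_start):
    """RR reuses the view created at transaction start; RC builds one per statement."""
    view_created_at = trx_start if level == "RR" else stmt_start
    return read_k(reader, view_created_at)


results = {
    ("A", "RR"): query("A", "RR", trx_start=1, stmt_start=6),
    ("A", "RC"): query("A", "RC", trx_start=1, stmt_start=6),
    ("B", "RR"): query("B", "RR", trx_start=2, stmt_start=5),
    ("B", "RC"): query("B", "RC", trx_start=2, stmt_start=5),
}
print(results)
```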

Fourth, summary

An InnoDB row has multiple versions, each data version has its own row trx_id, and each transaction or statement has its own consistent view. An ordinary query is a consistent read, and a consistent read determines the visibility of a data version from its row trx_id and the consistent view.

For repeatable read, a query only recognizes data committed before the transaction started;
for read committed, a query only recognizes data committed before the statement started.

A current read always reads the latest committed version.

You can also think about why table structures do not support "repeatable read". It is because a table structure has no corresponding row data and no row trx_id, so it can only follow current-read logic.

Of course, MySQL 8.0 can already store table structures in the InnoDB dictionary, so perhaps repeatable read for table structures will be supported in the future.

Now it's question time again. I use the table structure and initialization statements below as the test environment, with the transaction isolation level set to repeatable read. Now I want to set c to 0 for "all rows where field c equals id", but I ran into a "strange" situation where the rows could not be changed. Please construct this situation and explain the principle behind it.

mysql> CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  `c` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
insert into t(id, c) values(1,1),(2,2),(3,3),(4,4);

  

After reproducing it, please think further: could you run into this situation in real business development? Would your application code fall into this "pit", and how did you solve it?

You can write your thoughts and views in the comments section, and I will discuss this question with you at the end of the next article. Thank you for listening, and feel free to share this article with more friends.

Fifth, last issue's question

The question I left you in the previous article was: how to delete the first 10,000 rows of a table. Most comments chose the second approach: in one connection, execute delete from T limit 500 in a loop 20 times.
Indeed, the second approach is the better one.

In the first approach (directly executing delete from T limit 10000), a single statement runs for a long time and holds locks for a long time; a large transaction also causes replication lag between master and slave.
In the third approach (executing delete from T limit 500 in 20 connections at the same time), lock conflicts are created artificially.
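The second approach, one connection deleting in small committed batches, might look like the Python sketch below. To keep the example self-contained it uses the standard-library sqlite3 module as a stand-in for MySQL, which is an assumption of this sketch. Because delete ... limit is not available in a default SQLite build, the batch is expressed with a rowid subquery instead; against MySQL the statement would simply be delete from T limit 500. The function name delete_in_batches is hypothetical.

```python
import sqlite3


def delete_in_batches(conn, batch_size=500, batches=20):
    """Delete up to batch_size * batches rows, committing after every batch."""
    deleted = 0
    for _ in range(batches):
        cur = conn.execute(
            "DELETE FROM T WHERE rowid IN (SELECT rowid FROM T LIMIT ?)",
            (batch_size,),
        )
        conn.commit()               # end the transaction so each batch stays short
        deleted += cur.rowcount
        if cur.rowcount < batch_size:
            break                   # fewer rows than a full batch: nothing left to delete
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO T (id) VALUES (?)", ((i,) for i in range(12000)))
conn.commit()

removed = delete_in_batches(conn)
remaining = conn.execute("SELECT COUNT(*) FROM T").fetchone()[0]
print(removed, remaining)
```

Committing after each batch is the design choice that matters here: every transaction touches at most 500 rows, so locks are held only briefly.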

Sixth, selected comments

There is a lot of theory in this lecture, so it is worth summarizing:

1. InnoDB supports the RC and RR isolation levels by means of the consistent view (consistent read view).

2. A transaction takes a snapshot at startup, and this snapshot is of the entire database.

"Of the entire database" means that within a transaction, modifications anywhere in the database are invisible to it (for snapshot reads).
If a transaction selects from table t and another transaction executes DDL on table t, then depending on the timing, it is either blocked by a lock or gets an error (see Chapter 6).

3. How do transactions implement MVCC?

(1) Each transaction has a transaction ID, called the transaction id (strictly increasing).
(2) When a transaction starts, it finds the largest committed transaction ID and records it as up_limit_id.
(3) When a transaction updates a row, for example changing id=1 to id=2, it writes id=1 and the row's previous row trx_id into the undo log, changes the value of id to 2 on the data page, and records the transaction id of the modifying statement in the row header.
(4) Then a rule is set: when a transaction wants to read a row, it must first compare its up_limit_id with the row's transaction id.
If up_limit_id >= transaction id, the row can be read. If up_limit_id < transaction id, the data can only be fetched from the undo log. When looking in the undo log, the comparison is made again, and data is returned only when up_limit_id > transaction id.

4. What is a current read? A current read always reads before writing and can only read the current value, hence the name. It also updates the up_limit_id within the transaction to the transaction's own transaction id.

5. Why can RR achieve repeatable read while RC cannot? There are two cases:

(1) For snapshot reads, RR does not update up_limit_id within a transaction, while RC updates up_limit_id before every snapshot read to the transaction id of the latest committed transaction, so RC cannot read repeatably.
(2) For current reads, RR relies on record lock + gap lock, while RC has no gap lock, so RC cannot read repeatably.

Origin www.cnblogs.com/luoahong/p/11606766.html