MySQL in Practice 45 | What should you do when an auto-increment id runs out?

MySQL has many kinds of auto-increment ids. Each one is defined with an initial value and then grows by a fixed step. Natural numbers have no upper bound, but in a computer any number stored in a fixed number of bytes does. For example, an unsigned integer (unsigned int) is 4 bytes, so its upper limit is 2^32-1.
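As a quick check of the limit quoted above, here is a small Python sketch that computes the largest unsigned value that fits in a given number of bytes. The helper name `max_unsigned` is made up for this illustration.

```python
def max_unsigned(num_bytes: int) -> int:
    """Largest unsigned integer that fits in num_bytes bytes."""
    return 2 ** (8 * num_bytes) - 1


# 4 bytes is unsigned int, 8 bytes is bigint unsigned;
# 6 bytes is included because some InnoDB ids are stored in 6 bytes.
for n in (4, 6, 8):
    print(n, "bytes ->", max_unsigned(n))
```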

Since an auto-increment id has an upper limit, it can be used up. But what happens when it is?

In this article we look at several auto-increment ids inside MySQL and analyze what happens when each of them reaches its upper limit.

Auto-increment id defined on a table

When it comes to auto-increment ids, the first thing you probably think of is the auto-increment column in a table definition, that is, the auto-increment primary key id I introduced in article 39, "Why is the auto-increment primary key not continuous?".

The logic when a table's auto-increment value reaches its upper limit is: when the next id is requested, the value obtained stays the same.

We can verify this with the following sequence of statements:

create table t(id int unsigned auto_increment primary key) auto_increment=4294967295;
insert into t values(null);
-- inserts one row successfully, with id 4294967295
show create table t;
/* CREATE TABLE `t` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4294967295;
*/

insert into t values(null);
-- Duplicate entry '4294967295' for key 'PRIMARY'

As you can see, after the first insert succeeds, the table's AUTO_INCREMENT is unchanged (still 4294967295). The second insert therefore gets the same auto-increment id, and when it tries to execute it fails with a primary key conflict error.
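The behaviour described above can be imitated with a toy model in Python. This is only a sketch of the observed behaviour, not MySQL's implementation; the class `Table` and the exception `DuplicateKeyError` are names invented for the illustration.

```python
UINT32_MAX = 2**32 - 1  # upper limit of INT UNSIGNED


class DuplicateKeyError(Exception):
    """Stands in for MySQL's duplicate-entry error."""


class Table:
    """Toy table with an INT UNSIGNED AUTO_INCREMENT primary key."""

    def __init__(self, auto_increment: int = 1):
        self.auto_increment = auto_increment  # next id to hand out
        self.rows = set()                     # primary key values already stored

    def insert(self) -> int:
        new_id = self.auto_increment
        if new_id in self.rows:
            raise DuplicateKeyError(
                f"Duplicate entry '{new_id}' for key 'PRIMARY'")
        self.rows.add(new_id)
        # The counter stops at the type's upper limit instead of wrapping.
        self.auto_increment = min(new_id + 1, UINT32_MAX)
        return new_id


t = Table(auto_increment=UINT32_MAX)
print(t.insert())            # first insert succeeds
try:
    t.insert()               # second insert is handed the same id
except DuplicateKeyError as err:
    print(err)
```

The key line is the `min(...)`: the counter saturates rather than wrapping around, so the failure is loud instead of silent.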

2^32-1 (4294967295) is not a particularly large number; a table with frequent inserts and deletes can use it up. So when you create a table, check whether it could reach this limit, and if so, define the column as an 8-byte bigint unsigned.
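To get a feel for how quickly a 4-byte id can be consumed, here is a rough back-of-the-envelope calculation in Python. The insert rate of 1,000 rows per second is purely an assumption for the example, and the helper `days_until_exhausted` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def days_until_exhausted(id_bytes: int, inserts_per_second: float) -> float:
    """Days needed to consume every value of an unsigned id of id_bytes bytes."""
    total_ids = 2 ** (8 * id_bytes)
    return total_ids / inserts_per_second / SECONDS_PER_DAY


rate = 1_000  # assumed sustained insert rate, rows per second
print(f"int unsigned:    {days_until_exhausted(4, rate):.1f} days")
print(f"bigint unsigned: {days_until_exhausted(8, rate) / 365:.0f} years")
```

Deleted rows do not give their ids back, which is why a table with heavy insert and delete traffic can hit the limit even if it never holds many rows at once.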

InnoDB system auto-increment row_id

If you create an InnoDB table without specifying a primary key, InnoDB creates an invisible row_id, 6 bytes long, for you. InnoDB maintains a global value, dict_sys.row_id. For every InnoDB table without a primary key, each time a row is inserted, the current value of dict_sys.row_id is used as the row_id of the new row, and then dict_sys.row_id is incremented by 1.

In the code, row_id is actually an 8-byte unsigned long integer (bigint unsigned). However, InnoDB's design leaves only 6 bytes for row_id, so only the last 6 bytes are written to the table. As a result, the row_id values written to a table have two characteristics:

  1. The row_id values written to a table range from 0 to 2^48-1.
  2. When dict_sys.row_id = 2^48, if another insert requests a row_id, the last 6 bytes of the value it gets are 0.

In other words, the row_id written to a table ranges from 0 to 2^48-1. After the upper limit is reached, the next value is 0, and the cycle continues.
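Keeping only the last 6 bytes of an 8-byte counter is the same as masking with 2^48-1. The following Python sketch shows the effect around the boundary; `stored_row_id` is a name chosen for this illustration, not an InnoDB function.

```python
ROW_ID_MASK = 2**48 - 1  # keep only the low 6 bytes


def stored_row_id(dict_sys_row_id: int) -> int:
    """Value written to the table for a given value of the global counter."""
    return dict_sys_row_id & ROW_ID_MASK


for counter in (0, 2**48 - 1, 2**48, 2**48 + 1):
    print(counter, "->", stored_row_id(counter))
```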

Of course, 2^48-1 is already a very large value, but a MySQL instance that runs long enough could still reach it. In InnoDB's logic, once row_id = N has been obtained, the row is written to the table; if a row with row_id = N already exists in the table, the newly written row overwrites it.

To verify this conclusion, you can modify the system's auto-increment row_id with gdb. Note that changing a variable with gdb is only a convenient way to reproduce the problem and should be done only in a test environment.


Figure 1: Statement sequence for verifying row_id exhaustion

Figure 2: Result of verifying row_id exhaustion

As you can see, after I used gdb to set dict_sys.row_id to 2^48, the inserted row a=2 appears as the first row of table t, because its row_id is 0. The row a=3 inserted next gets row_id=1 and overwrites the earlier row a=1, because the row_id of row a=1 was also 1.
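The experiment can be replayed with a toy model in Python. This is a sketch of the behaviour described in the text, not of InnoDB's code: the dictionary `dict_sys` stands in for the global counter, the dictionary `t` stands in for table t keyed by stored row_id, and the starting counter value 1 is chosen so that row a=1 gets row_id 1, as the text states.

```python
ROW_ID_MASK = 2**48 - 1          # only the low 6 bytes are stored

dict_sys = {"row_id": 1}         # stand-in for the global dict_sys.row_id
t = {}                           # table t: stored row_id -> value of column a


def insert(a: int) -> None:
    row_id = dict_sys["row_id"] & ROW_ID_MASK
    t[row_id] = a                # a row that already has this row_id is replaced
    dict_sys["row_id"] += 1


def select_all() -> list:
    # Return rows in row_id order, like a scan of the clustered index.
    return [t[row_id] for row_id in sorted(t)]


insert(1)                        # stored with row_id = 1
dict_sys["row_id"] = 2**48       # the step done with gdb in the experiment
insert(2)                        # stored with row_id = 0
insert(3)                        # stored with row_id = 1, replacing a = 1
print(select_all())
```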

From this perspective, we should proactively create an auto-increment primary key in InnoDB tables. When a table's auto-increment id reaches its limit, further inserts fail with a primary key conflict error, which is the more acceptable outcome.

After all, overwriting data means data loss, which affects reliability; a primary key conflict error means the insert fails, which affects availability. In general, reliability takes precedence over availability.

Xid

In article 15, "Q&A (1): Questions about logs and indexes", when I explained how the redo log and binlog work together, I mentioned that they share a common field called Xid. In MySQL it identifies a transaction.

So how is Xid generated inside MySQL?

MySQL maintains a global variable, global_query_id. Each time a statement is executed, the variable's value is assigned to Query_id, and then the variable is incremented by 1. If the current statement is the first statement executed by a transaction, MySQL also assigns Query_id to that transaction's Xid.

global_query_id is a purely in-memory variable and is reset to zero on restart. So within the same database instance, the Xids of different transactions can be the same.

However, MySQL generates a new binlog file after a restart, which guarantees that Xid is unique within a single binlog file.
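The interaction between the in-memory counter and binlog files can be sketched in Python as follows. The class `Server` and its methods are invented for this illustration, and the counter is assumed to start from 0 after a restart, as the text describes.

```python
class Server:
    """Toy model: Xid comes from an in-memory counter; binlog files are numbered."""

    def __init__(self):
        self.global_query_id = 0
        self.binlog_file_no = 1
        self.binlog = []  # (binlog file number, Xid) of each logged transaction

    def run_transaction(self, statements: int) -> int:
        xid = self.global_query_id          # Query_id of the first statement
        self.global_query_id += statements  # each statement consumes one Query_id
        self.binlog.append((self.binlog_file_no, xid))
        return xid

    def restart(self) -> None:
        self.global_query_id = 0            # the memory-only counter is lost
        self.binlog_file_no += 1            # a new binlog file is started


server = Server()
before = server.run_transaction(statements=2)
server.restart()
after = server.run_transaction(statements=3)
print(before == after)                                 # same Xid across the restart
print(len(set(server.binlog)) == len(server.binlog))   # unique per binlog file
```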

Although a restart cannot cause two identical Xids to appear in the same binlog file, if global_query_id reaches its upper limit it continues counting from 0. In theory, the same Xid could then appear twice in one binlog file.


Because global_query_id is defined as 8 bytes, its upper limit is 2^64-1. For a duplicate to occur, the sequence would have to be:

  1. Execute a transaction; suppose its Xid is A.
  2. Then execute 2^64 queries, so that global_query_id wraps back to A.
  3. Start another transaction; its Xid is also A.

However, 2^64 is so large that you can treat this possibility as purely theoretical.
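The three steps above amount to modular arithmetic on an 8-byte counter. The Python sketch below checks the wrap-around and estimates how long 2^64 statements would take; the value of `A` and the rate of one million statements per second are arbitrary assumptions for the example.

```python
QUERY_ID_MODULUS = 2**64  # an 8-byte unsigned counter wraps at 2^64


def query_id_after(start: int, queries: int) -> int:
    """Value of global_query_id after `queries` more statements."""
    return (start + queries) % QUERY_ID_MODULUS


A = 12345  # arbitrary Xid chosen for the example
print(query_id_after(A, 2**64) == A)

qps = 1_000_000  # assumed sustained statements per second
years = 2**64 / qps / (365 * 86_400)
print(f"{years:,.0f} years at {qps:,} statements/s")
```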

InnoDB trx_id

Xid and InnoDB's trx_id are two concepts that are easily confused.

Xid is maintained by the server layer. InnoDB uses Xid internally so that InnoDB transactions can be associated with the server layer. InnoDB's own trx_id, however, is maintained separately.

In fact, you should be very familiar with trx_id. It is the transaction id we used in article 8, "Are transactions isolated or not?", when discussing transaction visibility.

InnoDB maintains a global variable, max_trx_id. Each time a new trx_id is needed, InnoDB takes the current value of max_trx_id and then increments max_trx_id by 1.

The core idea of InnoDB data visibility is this: each row records the trx_id of the transaction that updated it. When a transaction reads a row, it decides whether the row is visible by comparing its consistent view with the row's trx_id.

For a transaction that is currently executing, you can see its trx_id in the information_schema.innodb_trx table.

The question I left you at the end of the previous article was about the trx_id found in the innodb_trx table. Now let's look at a transaction scenario:


Figure 3: trx_id of a transaction

In session B, I query these two fields from the innodb_trx table. The second field, trx_mysql_thread_id, is the thread id. It is displayed to show that the transaction seen by both queries belongs to thread id 5, which is the thread of session A.

As you can see, the trx_id displayed at time T2 is a very large number, while the trx_id displayed at time T4 is 1289, which looks like a normal number. Why is that?

In fact, at time T1, session A has not yet performed any update and is a read-only transaction. For read-only transactions, InnoDB does not allocate a trx_id. In other words:

  1. At time T1, the trx_id value is actually 0. The very large number is only for display. I will explain how it is generated in a moment.
  2. Only when session A executes the insert statement at time T3 does InnoDB actually allocate a trx_id. So at time T4, session B sees a trx_id of 1289.

Note that, besides obvious modifying statements, a select statement with for update appended also makes the transaction not read-only.

In the comments on the previous article, some readers ran experiments and found that trx_id increases by more than 1. This is because:

  1. Besides the transaction itself, update and delete statements also mark old data for deletion, that is, they put the data into the purge queue to wait for physical deletion later. This operation also increments max_trx_id by 1, so a single transaction adds at least 2.
  2. InnoDB background operations, such as collecting table index statistics, also start internal transactions, so you may see that trx_id does not increase in steps of 1.

So where does the very large number seen at time T2 come from?

In fact, this number is computed by the system on the fly for each query. The algorithm is: convert the pointer address of the current transaction's trx variable to an integer, then add 2^48. This algorithm guarantees the following two points:

  1. Because a read-only transaction's pointer address does not change while it executes, the trx_id found for that transaction is the same whether it is queried from the innodb_trx table or the innodb_locks table.
  2. If several read-only transactions run concurrently, each transaction's trx pointer address is certainly different, so different concurrent read-only transactions show different trx_id values.
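The display rule and the two guarantees above can be sketched in Python. This follows the description in the text rather than InnoDB's source: the class `Trx` and the function `trx_id_for_display` are invented names, and Python's built-in `id()` is used as a stand-in for the C pointer address of the trx structure.

```python
READ_ONLY_OFFSET = 2**48


class Trx:
    """Toy transaction object; id stays 0 until the transaction writes."""

    def __init__(self):
        self.id = 0


def trx_id_for_display(trx: Trx) -> int:
    if trx.id != 0:
        return trx.id  # read-write transaction: show the real trx_id
    # Read-only transaction: derive a number from the object's identity.
    # id(trx) plays the role of the pointer address of the trx structure.
    return id(trx) + READ_ONLY_OFFSET


a, b = Trx(), Trx()
print(trx_id_for_display(a) == trx_id_for_display(a))  # stable for one transaction
print(trx_id_for_display(a) != trx_id_for_display(b))  # distinct for concurrent ones
a.id = 1289                                            # the transaction starts writing
print(trx_id_for_display(a))
```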

So why add 2^48?

Adding 2^48 to the displayed value ensures that the trx_id shown for a read-only transaction is relatively large, which normally distinguishes it from the ids of read-write transactions. However, trx_id follows logic similar to row_id: it is also defined as 8 bytes. So in theory a read-write transaction could still show the same trx_id as a read-only transaction. The probability is very low, though, and it does no real harm, so it can be ignored.

Another question is: what is the benefit of not allocating a trx_id to read-only transactions?

  • One benefit is that it reduces the size of the active-transaction array in the transaction view. Read-only transactions that are currently running do not affect data visibility decisions, so when creating a transaction's consistent view, InnoDB only needs to copy the trx_id values of read-write transactions.
  • Another benefit is that it reduces the number of trx_id requests. In InnoDB, even an ordinary select statement corresponds to a read-only transaction while it executes. With the read-only transaction optimization, ordinary queries do not need to request a trx_id, which greatly reduces lock contention when concurrent transactions request trx_id.

Since read-only transactions are not allocated a trx_id, a natural consequence is that trx_id grows more slowly.

However, max_trx_id is persisted and is not reset to 0 on restart. So in theory, if a MySQL service runs long enough, max_trx_id can reach the upper limit of 2^48-1 and then start again from 0.

Once this state is reached, MySQL keeps exhibiting a dirty-read bug. Let's reproduce it.

First, we need to set the current max_trx_id to 2^48-1. Note that this example uses the repeatable read isolation level. The procedure is as follows:


Figure 4: Reproducing the dirty read

Because we have set the system's max_trx_id to 2^48-1, the low water level of transaction TA started in session A is 2^48-1.

At time T2, the transaction id of the first update statement executed by session B is 2^48-1, and the transaction id of the second update statement is 0. The data version generated by this second update has trx_id 0.

At time T3, when session A executes the select statement and checks visibility, it finds that the trx_id of the version with c=3 is less than the low water level of transaction TA, and so treats the data as visible.

But this is a dirty read.

Because the low water level keeps increasing while transaction ids start counting from 0 again, from this moment on all queries in the system are subject to dirty reads.
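The comparison that goes wrong can be reproduced with a few lines of Python. This sketch models only the part of the visibility check used in the example (a version whose trx_id is below the low water level is treated as already committed) and a trx_id counter that wraps at 2^48; the class `Engine` and the function `is_visible` are illustrative names, not InnoDB code.

```python
TRX_ID_MODULUS = 2**48  # trx_id wraps after 2^48 - 1


class Engine:
    def __init__(self, max_trx_id: int):
        self.max_trx_id = max_trx_id

    def allocate_trx_id(self) -> int:
        trx_id = self.max_trx_id
        self.max_trx_id = (self.max_trx_id + 1) % TRX_ID_MODULUS
        return trx_id


def is_visible(row_trx_id: int, low_water: int) -> bool:
    """Versions written by a trx_id below the low water level count as committed."""
    return row_trx_id < low_water


engine = Engine(max_trx_id=2**48 - 1)
low_water_ta = engine.max_trx_id          # TA takes its consistent view at T1

first_update = engine.allocate_trx_id()   # session B: version with c = 2
second_update = engine.allocate_trx_id()  # session B: version with c = 3

print(first_update, is_visible(first_update, low_water_ta))
print(second_update, is_visible(second_update, low_water_ta))
```

The second update happened after TA took its view, yet its wrapped trx_id of 0 compares as "older" than the low water level, which is exactly the dirty read.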

Moreover, max_trx_id is not reset to 0 when MySQL restarts, so restarting MySQL does not make the bug go away.

So does this bug also exist only in theory?

Suppose a MySQL instance runs at a TPS of 500,000. At that load, the situation arises after 17.8 years. With a higher TPS, the time is naturally shorter. Since MySQL first became popular, though, probably no instance has reached this limit. Still, as long as a MySQL instance keeps serving long enough, this bug will inevitably appear.
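The 17.8-year figure can be checked with simple arithmetic. The Python sketch below assumes that every transaction consumes exactly one trx_id and that a year has 365 days; both are simplifications made for the example.

```python
tps = 500_000                    # transactions per second, from the text
trx_ids = 2**48                  # number of values before max_trx_id wraps
seconds_per_year = 365 * 86_400

years = trx_ids / tps / seconds_per_year
print(f"{years:.2f} years")
```

Since update and delete statements consume more than one trx_id per transaction, the real time at this load would be shorter than this estimate.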

Of course, the more practical value of this example is that it deepens our understanding of the low water level and data visibility. You can also take this opportunity to review article 8, "Are transactions isolated or not?".

thread_id

Next, let's look at the thread id (thread_id). In fact, thread id is the most common auto-increment id in MySQL. When we inspect threads, the first column of show processlist is thread_id.

The logic of thread_id is easy to understand: the system keeps a global variable, thread_id_counter, and each time a new connection is created, thread_id_counter is assigned to the new connection's thread variable.

thread_id_counter is defined as 4 bytes, so after reaching 2^32-1 it resets to 0 and continues increasing. However, you will never see two identical thread_id values in show processlist.


This is because MySQL uses a unique-array logic when assigning a thread_id to a new thread. The code looks like this:

do {
  new_id= thread_id_counter++;
} while (!thread_ids.insert_unique(new_id).second);

This code is simple and elegant; I believe you can understand it at a glance.
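For readers who want to experiment with the loop, here is an equivalent sketch in Python. It is a model of the logic only: the class `ThreadIdAllocator` is an invented name, and a Python set plays the role of the unique array of thread ids.

```python
THREAD_ID_MODULUS = 2**32  # thread_id_counter is 4 bytes


class ThreadIdAllocator:
    def __init__(self, counter: int = 0):
        self.counter = counter
        self.in_use = set()  # thread ids of connections that are still open

    def allocate(self) -> int:
        while True:
            new_id = self.counter
            self.counter = (self.counter + 1) % THREAD_ID_MODULUS
            if new_id not in self.in_use:  # same role as insert_unique(...).second
                self.in_use.add(new_id)
                return new_id

    def release(self, thread_id: int) -> None:
        self.in_use.discard(thread_id)


allocator = ThreadIdAllocator(counter=2**32 - 1)
allocator.in_use.add(0)          # pretend a long-lived connection still holds id 0
print(allocator.allocate())      # last value before the counter wraps
print(allocator.allocate())      # id 0 is taken, so it is skipped
```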

Summary

In this article I introduced how different auto-increment ids in MySQL behave after reaching their upper limits. For a database system that may need to provide 24/7 service, considering these boundaries is essential.

Each auto-increment id has its own usage scenario, and its behavior after reaching the upper limit differs:

  1. When a table's auto-increment id reaches its upper limit, the value no longer changes when a new id is requested, so further inserts fail with a primary key conflict error.
  2. When row_id reaches its upper limit, it returns to 0 and increases again; if the same row_id appears, the later data overwrites the earlier data.
  3. Xid only needs to avoid duplicate values within the same binlog file. Duplicates are possible in theory, but the probability is negligible.
  4. InnoDB's max_trx_id is persisted across MySQL restarts, so the dirty read in this article's example is a bug that is bound to occur; fortunately, it leaves us plenty of time.
  5. thread_id is the one we use most often, and it is also the auto-increment id with the best-handled logic.

Of course, MySQL has other auto-increment ids as well, such as table_id and the binlog file sequence number, which I leave for you to verify and explore.

Different auto-increment ids have different upper limits, and the upper limit depends on the declared type length. The id of this column was declared with an upper limit of 45, so today's article is our last technical article.

Since there is no next id, there is no after-class question either. Today let's switch to a lighter topic: please tell me what you think after reading this column.

These "thoughts" can be how your understanding of some topic changed before and after reading the column, a good learning method you picked up while following it, or complaints and expectations for the future.

Reproduced from: https://juejin.im/post/5d060a58e51d4556db694a07
