Normal Index vs. Unique Index

query performance

Suppose we have an int column named value with no duplicate values, and we run the following query against it:

SELECT value FROM table WHERE value = 8;

If it is an ordinary index, after finding the record with value = 8, InnoDB continues scanning until it encounters the first record that does not satisfy value = 8.

If it is a unique index, the search stops as soon as the record with value = 8 is found, since the key cannot repeat.

So the unique index does look slightly better than the ordinary index, but the difference is negligible, because MySQL loads data in units of a "page": the entire page is read into memory at once.

So when the record with value = 8 is found, the whole data page it lives on is already in memory. For the ordinary index, the extra step of "reading and checking the next record" then costs only one pointer move and one comparison.

Of course, if this record happens to be the last record on the data page, then reading the next record requires loading the next data page, which is somewhat more expensive.

However, for integer keys a data page can hold nearly a thousand entries (the default page size is 16KB), so this situation is rare. Averaged out, and given that operations on memory are very fast, the extra cost is negligible for a modern CPU.
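The difference between the two lookups can be sketched as follows. This is a toy model, not InnoDB code: one data page is modeled as a sorted list, and all names (`lookup`, `examined`) are illustrative.

```python
# Hypothetical sketch: an equality lookup inside one already-loaded page.
# A unique index stops at the first hit; a normal index reads one more
# record to confirm the run of equal keys has ended.
import bisect

def lookup(page, target, unique):
    """Return (matching values, number of records examined)."""
    i = bisect.bisect_left(page, target)   # locate the first candidate
    examined, matches = 0, []
    while i < len(page) and page[i] == target:
        examined += 1
        matches.append(page[i])
        if unique:                         # unique index: stop at first hit
            break
        i += 1                             # normal index: check next record
    if not unique and i < len(page):
        examined += 1                      # one extra check to see the mismatch
    return matches, examined

page = [1, 3, 5, 8, 9, 12]                 # values in one data page
print(lookup(page, 8, unique=True))        # ([8], 1)
print(lookup(page, 8, unique=False))       # ([8], 2) -- one extra in-memory check
```

Both variants touch the same single page; the normal index merely does one more in-memory comparison, which is why the gap is negligible.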

update process

When a data page needs to be updated, there are two cases. If the page is already in memory, InnoDB updates it directly. If it is not, InnoDB records the update in the change buffer (without affecting data consistency), so the page does not have to be read from disk right away. The next time a query needs this data page, InnoDB reads it into memory and then applies the operations recorded for it in the change buffer. This keeps the data logically correct.

Note that despite its name, the change buffer is persistent: it has a copy in memory and is also written to disk.

Applying the operations in the change buffer to the original data page to obtain the up-to-date result is called merge. Besides being triggered when the data page is accessed, merge is also performed periodically by a background thread, and during a normal database shutdown.

Clearly, if an update can first be recorded in the change buffer, disk reads are reduced and the statement executes noticeably faster. Moreover, since reading a data page into memory occupies space in the buffer pool, this approach also avoids unnecessary memory use and improves memory utilization.

Under what conditions can the change buffer be used?

For a unique index, every update must first check whether it violates the unique constraint. For example, before inserting a record, InnoDB must check whether an equal key already exists in the table, and that check requires reading the data page into memory. And once the page is in memory anyway, updating it directly is faster, so there is no point in using the change buffer.

Therefore, updates to a unique index cannot use the change buffer; in practice only ordinary indexes can.

So let's compare their performance. Suppose we want to insert a new record with value = 4 into the table.

In the first case, the target page for this record is already in memory. InnoDB then proceeds as follows:

  • For a unique index, find the position between 3 and 5, check that there is no conflict, insert the value, and the statement finishes;
  • For an ordinary index, find the position between 3 and 5, insert the value, and the statement finishes.

From this point of view, the performance difference between the ordinary index and the unique index for an in-memory update is just one extra check, which costs only a tiny amount of CPU time.

However, this is not the focus of our attention.

The second case is that the target page for this record is not in memory. InnoDB then proceeds as follows:

  • For a unique index, the data page must be read into memory, the absence of a conflict verified, and the value inserted before the statement finishes;
  • For an ordinary index, the update is recorded in the change buffer and the statement finishes.

Reading data from disk into memory involves random IO access, which is one of the most costly operations in the database. The change buffer improves update performance significantly because it reduces random disk access.
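The two flows can be condensed into a small decision function. This is a hypothetical model counting disk reads, with illustrative names only:

```python
# Sketch of the insert flows above: a unique index must load the page to
# check for duplicates, while an ordinary index can buffer the change.
def insert(value, page_in_memory, unique):
    """Return (disk_reads, where the change went)."""
    disk_reads = 0
    if unique:
        if not page_in_memory:
            disk_reads += 1             # must read the page to verify uniqueness
        return disk_reads, "page"       # conflict check done, insert into page
    if page_in_memory:
        return disk_reads, "page"       # page cached: update it directly
    return disk_reads, "change buffer"  # not cached: record the change instead

print(insert(4, page_in_memory=False, unique=True))   # (1, 'page')
print(insert(4, page_in_memory=False, unique=False))  # (0, 'change buffer')
print(insert(4, page_in_memory=True, unique=False))   # (0, 'page')
```

The only row where the indexes diverge is the page-not-in-memory case, and that is exactly where the random IO is saved.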

Wait a moment, let me sort this out.

The buffer pool also has a dirty-page mechanism to reduce disk I/O: when an update occurs on a page that is in memory, the page is modified in place, so the in-memory copy diverges from the on-disk copy. Such a page is called a dirty page, and it is flushed back to disk at an appropriate time.

The change buffer, by contrast, records the update action itself when the target page is not in memory, and applies it the next time the page is read. Applying the update immediately would force an extra I/O to read the page, while the later read incurs an I/O regardless; folding the two together saves one disk access.

To sum up: the change buffer defers the update operation itself, while the buffer pool's dirty pages defer writing the whole page back.

Can the change buffer speed things up in every ordinary-index scenario?

Since merge is the moment the data is actually updated, and the main purpose of the change buffer is to cache change actions, the more changes a data page accumulates in the change buffer before it is merged (that is, the more times the page is updated), the greater the benefit.

Therefore, for write-heavy, read-light workloads, the probability that a page is accessed immediately after being written is small, and the change buffer works best. Billing and logging systems are typical examples of this pattern.

Conversely, if a workload queries each record immediately after writing it, then even though the update is first recorded in the change buffer, the data page is accessed soon afterward and merge is triggered right away. The number of random IO accesses is not reduced, while the cost of maintaining the change buffer is added. For this access pattern, the change buffer is counterproductive.
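The workload dependence can be demonstrated with a toy simulator. This is an assumed model (names like `simulate` and the eviction-free cache are simplifications, not InnoDB behavior):

```python
# Toy comparison of workloads with and without a change buffer.
# ops: sequence of ("w", page) writes and ("r", page) reads.
def simulate(ops, use_change_buffer):
    cached, reads, cb_records = set(), 0, 0
    for kind, page in ops:
        if kind == "w":
            if page not in cached:
                if use_change_buffer:
                    cb_records += 1        # buffer the change: no disk read now
                else:
                    reads += 1             # must load the page to update it
                    cached.add(page)
        else:
            if page not in cached:
                reads += 1                 # random IO to load the page
                cached.add(page)
    return reads, cb_records               # (disk reads, buffer maintenance)

write_mostly = [("w", p) for p in (1, 2, 3, 4)]       # log-style workload
read_after_write = [("w", 1), ("r", 1)]               # query right after write

print(simulate(write_mostly, True), simulate(write_mostly, False))        # (0, 4) (4, 0)
print(simulate(read_after_write, True), simulate(read_after_write, False))  # (1, 1) (1, 0)
```

In the write-mostly workload the change buffer eliminates all four disk reads; in the read-after-write workload it saves nothing while adding maintenance work, matching the side effect described above.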


Reprinted from: blog.csdn.net/chara9885/article/details/131614832