MySQL advanced - index optimization and query optimization (3)

9. How to add an index to a string

9.1 Prefix index

MySQL supports prefix indexes. By default, if you create an index without specifying a prefix length, the index will contain the entire string.

mysql> alter table teacher add index index1(email);
# or
mysql> alter table teacher add index index2(email(6));
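
The walk-throughs below assume a query like the following (the exact statement is not shown in the original; the email value '[email protected]' comes from the example steps):

mysql> select id, name, email from teacher where email='[email protected]';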

If you use index1 (that is, the index contains the entire email string), the execution flow is as follows:

  1. From the index1 index tree, find the record whose index value is '[email protected]' and obtain the value ID2;
  2. Go to the primary key index, find the row whose primary key value is ID2, verify that the email value matches, and add this row to the result set;
  3. Fetch the next record at the position just found in the index1 tree, find that the condition email='[email protected]' is no longer satisfied, and end the loop.

During this process, data is retrieved from the primary key index only once, so the system considers that only one row was scanned.

If you use index2 (i.e. the email(6) prefix index), the execution flow is as follows:

  1. From the index2 index tree, find the records whose index value is 'zhangs'; the first one found yields ID1;
  2. Go to the primary key index, find the row whose primary key value is ID1, determine that its email value is not '[email protected]', and discard this row;
  3. Fetch the next record at the position just found on index2, find that it is still 'zhangs', take out ID2, fetch the entire row from the primary key index, determine that this time the value matches, and add this row to the result set;
  4. Repeat the previous step until the value obtained on index2 is no longer 'zhangs', and end the loop.

In other words, with a prefix index, if you define the length well, you can save space without adding too much extra query cost. We discussed selectivity earlier: the higher the selectivity, the better, because higher selectivity means fewer duplicate key values.
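
If you decide to use a prefix index, you can estimate a good prefix length by comparing the selectivity of different prefixes. A minimal sketch (the candidate lengths are chosen for illustration):

mysql> select
         count(distinct left(email, 4)) as L4,
         count(distinct left(email, 5)) as L5,
         count(distinct left(email, 6)) as L6,
         count(distinct email) as total
       from teacher;
# Pick the smallest prefix length whose distinct count is close to the total.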

9.2 The impact of prefix indexes on covering indexes

Conclusion:
Using a prefix index means the query can no longer be optimized with a covering index: the engine cannot tell from the prefix whether the full value matches, so it must always go back to the primary key index to verify. This is another factor to consider when deciding whether to use a prefix index.
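
A hedged illustration using the two indexes from section 9.1:

mysql> select id, email from teacher where email='[email protected]';
# With index1 (full email), this query can be answered from the secondary index
# alone -- a covering index, with no table lookup.
# With index2 (email(6)), InnoDB must go back to the primary key index to fetch
# and verify the complete email value, even though the query only needs
# id and email.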

10. Index pushdown

Index Condition Pushdown (ICP) is a feature introduced in MySQL 5.6. It is an optimization that uses the index to filter data at the storage engine layer, reducing both the number of times the storage engine accesses the base table and the number of times the MySQL server accesses the storage engine.

10.1 Scan process before and after using ICP

The index scan process without ICP:

Storage layer: for each index record that satisfies the index key conditions, the entire corresponding row is fetched and returned to the server layer.

Server layer: applies the remaining WHERE conditions to the returned rows, filtering until the last row has been processed.

The index scan process with ICP:

  • storage layer:

First, the range of index records that satisfy the index key conditions is determined, and then the index filter conditions are applied directly on the index. Only index records that pass the index filter trigger a table lookup for the entire row, which is then returned to the server layer; index records that fail the index filter are discarded, with no table lookup and nothing returned to the server layer.

  • server layer:

Applies the table filter conditions to the returned rows for final filtering.

Cost difference before and after using ICP: before ICP, the storage layer returned many rows that would only be filtered out later by the index filter conditions; after ICP, records that fail the index filter are discarded directly at the storage layer, saving the cost of the table lookup and of passing them to the server layer.
How much ICP speeds things up depends on what proportion of the data is filtered out by ICP inside the storage engine.
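
A classic illustration, adapted from the MySQL reference manual (the people table, its address column, and the composite index are assumptions here, not part of the original example):

mysql> alter table people add index zip_last_first(zipcode, lastname, firstname);
mysql> select * from people
       where zipcode = '95054'
         and lastname like '%etrunia%'
         and address like '%Main Street%';
# Without ICP: the engine locates all index entries with zipcode='95054',
# fetches every full row, and the server filters on lastname and address.
# With ICP: the engine also evaluates lastname LIKE '%etrunia%' on the index
# entries themselves, and only does a table lookup for entries that pass;
# the address condition is still filtered at the server layer.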

10.2 Conditions for using ICP

Conditions for using ICP:

① It can only be used with secondary indexes.

② The type value (join type) shown in the EXPLAIN output is range, ref, eq_ref, or ref_or_null.

③ Not all WHERE conditions can be filtered by ICP: if a condition's column is not part of the index, the full rows still have to be read and sent to the server layer for WHERE filtering.

④ ICP can be used with both the MyISAM and InnoDB storage engines.

⑤ MySQL 5.6 does not support ICP on partitioned tables; support starts in version 5.7.

⑥ When a query is served by a covering index, ICP is not used, since there are no full-row reads for it to save.
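
You can verify whether ICP is being used from the Extra column of EXPLAIN, and toggle it via the optimizer switch (a sketch reusing the hypothetical people table from above):

mysql> explain select * from people
       where zipcode = '95054' and lastname like '%etrunia%';
# The Extra column shows "Using index condition" when ICP is applied.
mysql> set optimizer_switch = 'index_condition_pushdown=off';  # disable ICP
mysql> set optimizer_switch = 'index_condition_pushdown=on';   # re-enable it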

11. Ordinary index vs unique index

From a performance perspective, should you choose a unique index or an ordinary index? What is the basis for the choice?

Suppose we have a table whose primary key column is id. The table contains a field k with an index on it, and assume the values in field k are unique. The table creation statement is:

mysql> create table test(
    id int primary key,
    k int not null,
    name varchar(16),
    index (k)
)engine=InnoDB;

The (id, k) values of rows R1~R5 in the table are (100,1), (200,2), (300,3), (500,5), and (600,6) respectively.

11.1 Query process

Assume the query to be executed is select id from test where k=5.

  • For an ordinary index, after finding the first record (5,500) that satisfies the condition, the engine must fetch the next record, continuing until it hits the first record that does not satisfy k=5.
  • For a unique index, since the index guarantees uniqueness, the search stops as soon as the first matching record is found.
  • So how large is the performance gap caused by this difference? The answer: negligible. InnoDB reads data from disk in units of pages (16KB by default), so when the record with k=5 is found, its page is already in memory; the extra "fetch the next record" for an ordinary index is just one pointer move and one comparison.

11.2 Update process

To illustrate the impact of ordinary and unique indexes on the performance of update statements, let's first introduce the change buffer.

When a data page needs to be updated: if the page is in memory, it is updated directly; if not, InnoDB caches the update operation in the change buffer, without compromising data consistency, so the page does not need to be read from disk at that moment. When a later query needs to access this page, the page is read into memory and the change-buffer operations related to it are applied, which guarantees the correctness of the data logic.

The process of applying the operations in the change buffer to the original data page to obtain the up-to-date result is called merge. Besides being triggered when the data page is accessed, merge is also performed periodically by background threads and during a normal shutdown of the database.

If an update operation can be recorded in the change buffer first, disk reads are reduced and the statement executes noticeably faster. Moreover, since reading a page into memory occupies space in the buffer pool, this approach also avoids unnecessary memory use and improves memory utilization.

However, the change buffer cannot be used for updates to a unique index: checking uniqueness requires reading the target page into memory anyway, and once the page is in memory it is cheapest to update it directly. In fact, only ordinary indexes can use the change buffer.
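
A hedged illustration using the test table above, inserting a new record whose target index page for k may or may not be in memory:

mysql> insert into test(id, k, name) values (400, 4, 'r4');
# index(k) is an ordinary index: if its target page is in memory it is updated
# directly; otherwise the change is recorded in the change buffer and no disk
# read happens now.
# If k had a unique index instead, InnoDB would have to read the page into
# memory to check for duplicates, so the change buffer could not be used.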

12. Other query optimization strategies

12.1 The difference between EXISTS and IN

It is not always obvious in which cases EXISTS should be used and in which cases IN. Is the selection criterion whether the query can use the table's index?
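
As a sketch (tables A and B and column cc are hypothetical), the two forms being compared are:

mysql> select * from A where cc in (select cc from B);
mysql> select * from A where exists (select 1 from B where B.cc = A.cc);

A common rule of thumb, assuming cc is indexed on the driven table, is "let the small table drive the big one": IN tends to fit when B is the smaller table, while EXISTS tends to fit when A is the smaller one.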

12.2 COUNT(*) vs COUNT(specific field) efficiency

Question: MySQL provides three ways to count the number of rows in a table: SELECT COUNT(*), SELECT COUNT(1), and SELECT COUNT(specific field). How do these three compare in query efficiency?
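
For reference, the three forms on the test table above would be:

mysql> select count(*) from test;
mysql> select count(1) from test;
mysql> select count(name) from test;  # counts only rows where name is not NULL

Note that InnoDB processes COUNT(*) and COUNT(1) in exactly the same way, with no performance difference, while COUNT(field) additionally has to check each value for NULL.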

12.3 About SELECT *

In table queries, it is recommended to specify the fields explicitly instead of using * as the field list; that is, use SELECT <field list>. Reasons:
① During parsing, MySQL converts "*" into all the column names by querying the data dictionary, which consumes extra resources and time.
② A covering index cannot be used (see the sketch below).
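
A minimal sketch of reason ② on the test table above:

mysql> select * from test where k = 5;      # needs name, so it must go back to the table
mysql> select id, k from test where k = 5;  # can be served entirely from index(k)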

12.4 Impact of LIMIT 1 on optimization

This applies to SQL statements that scan the full table: if you are certain the result set contains only one row, adding LIMIT 1 stops the scan as soon as a result is found, which speeds up the query.

If the table already has a unique index on the field, the query can be answered through the index without scanning the full table, so there is no need to add LIMIT 1.
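
A hedged sketch on the test table above (name has no index there, so the first query would otherwise scan the whole table):

mysql> select * from test where name = 'r4' limit 1;  # stops at the first match
mysql> select * from test where id = 500;             # unique primary key: LIMIT 1 adds nothing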

12.5 Use COMMIT more

Whenever possible, use COMMIT in your programs as often as is appropriate. This improves program performance, because the resources released by COMMIT reduce the overall demand on the system.

Resources released by COMMIT:

  • Rollback segment information used to recover data
  • Locks acquired by program statements
  • Space in the redo / undo log buffer
  • Internal overhead of managing the above three resources
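
A minimal sketch of committing in batches during a large cleanup (the table name and batch size are hypothetical); each COMMIT releases the resources listed above instead of holding them for one huge transaction:

mysql> set autocommit = 0;
mysql> delete from operation_log where created_at < '2023-01-01' limit 1000;
mysql> commit;
# Repeat the DELETE ... LIMIT / COMMIT pair until no rows are affected.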
