MySQL tuning, part 2: index optimization


1. Preliminaries: index fundamentals

1. Index usage / advantages / classification

        Index advantages:
    1>. Greatly reduce the amount of data the server needs to scan.
    2>. Help the server avoid sorting and temporary tables.
    3>. Turn random I/O into sequential I/O.


        Index purpose:
    1>. Quickly find the rows that match a WHERE clause.
    2>. When several indexes could be used, MySQL usually picks the one that finds the fewest rows.
    3>. If the table has a multi-column index, the optimizer can use any leftmost prefix of that index to find rows.
    4>. When tables are joined, retrieve rows from the other tables.
    5>. Find the MIN or MAX value of a specific indexed column.
    6>. If sorting or grouping is done on a leftmost prefix of an available index, use that index to sort and group the table.
    7>. In some cases, a query can be optimized to retrieve values from the index without reading the data rows.


        Index classification:
    primary key index, unique index, ordinary index, full-text index, composite index

2. Index lookup terminology

        Back to the table: The index being used is an ordinary (secondary) index, and the query selects columns that are not in that index. Locating the data then takes two steps: first find the primary key value through the ordinary index, then use that primary key value to find the row record through the primary key index (clustered index). Going back to the table reduces query efficiency because it requires two index lookups.
        Covering index: Triggered when the queried fields are the primary key plus the indexed fields, and the where condition uses the indexed column. This works because an ordinary index stores the primary key id together with the indexed field value. For example, if the id column of the user table is the primary key and the name column has an ordinary index, then select id, name from user where name='zhangsan'; is answered from the index alone.
        Leftmost match: Applies when the index is a composite index. The index can be used when the query condition includes the first column given when the index was created. Follow the leftmost prefix rule when querying a composite index, and put the most frequently queried columns on the left when building one.
        Index pushdown: Index condition pushdown (ICP) was introduced in MySQL 5.6. Before it, when a query used a non-primary-key index, the storage engine retrieved the rows through the index and returned them to the MySQL server, and the server then checked whether each row met the conditions.
   With index pushdown, when some of the conditions apply to indexed columns, the MySQL server passes those conditions to the storage engine. The storage engine checks each index entry against them, and only when the entry satisfies the conditions does it read the row and return it to the MySQL server.
   Index pushdown reduces the number of times the storage engine accesses the base table, and it also reduces the number of rows the MySQL server receives from the storage engine.
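
The effect of index pushdown can be observed with EXPLAIN. The following is a minimal sketch, assuming a scratch schema and a hypothetical table user_icp with a composite index on (name, age); the table and index names are illustrative and do not come from the original text. The optimizer may pick a different plan on an empty or very small table, so treat the comments as what is typically seen on a table with enough rows.

```sql
-- Hypothetical demo table; all names here are assumptions for illustration.
CREATE TABLE user_icp (
    id    INT PRIMARY KEY AUTO_INCREMENT,
    name  VARCHAR(32) NOT NULL,
    phone CHAR(11)    NOT NULL,
    age   INT         NOT NULL,
    KEY idx_name_age (name, age)
) ENGINE = InnoDB;

-- Pushdown enabled (the default): the age condition can be checked against
-- the index entry inside the storage engine, before the full row is read.
-- When pushdown is used, the Extra column shows "Using index condition".
SET optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT * FROM user_icp WHERE name LIKE 'zhang%' AND age = 22;

-- Pushdown disabled: every row matching the name prefix is read from the
-- table and the server filters on age; Extra typically shows "Using where".
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM user_icp WHERE name LIKE 'zhang%' AND age = 22;

-- Restore the default for the session.
SET optimizer_switch = 'index_condition_pushdown=on';
```

Comparing the two Extra values is a quick way to confirm whether a given query benefits from pushdown.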

3. Index data structure

The specific data structure of the index varies depending on the storage engine.

        Memory storage engine:
   This storage engine is memory based and uses a hash table as its index format. Hash storage requires loading all of the data into memory, which consumes memory space. For equality queries, hash lookup is extremely fast.

        MyISAM/InnoDB storage engines:
   Both of these storage engines use a B+ tree as the index data structure. The B+ tree is an optimization of the B tree. The main changes: 1>. Each node of a B+ tree can hold more keys, which reduces the height of the tree and divides the key range into more intervals; the more intervals, the faster the lookup. 2>. Non-leaf nodes store only keys, while leaf nodes store keys and data. 3>. Leaf nodes are linked to each other by pointers, so sequential queries perform well.
   The index of the MyISAM storage engine is a non-clustered index: the index file and the data file are separate. The data field of a leaf node stores the physical address of the row for that key, and the row data is then found through that address.
   The primary key index of the InnoDB storage engine is a clustered index: the index file is not separate from the data file. The data file itself is the index file, and the complete row is stored in the data field of the leaf node.
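
To see which structure an engine actually uses, SHOW INDEX can be run against tables created with different engines. This is a small sketch under the assumption of a scratch schema; user_mem and user_inno are hypothetical table names chosen for the example.

```sql
-- Hypothetical tables; names are assumptions for illustration only.
CREATE TABLE user_mem (
    id   INT PRIMARY KEY,
    name VARCHAR(32) NOT NULL,
    KEY idx_name (name)          -- MEMORY builds hash indexes by default
) ENGINE = MEMORY;

CREATE TABLE user_inno (
    id   INT PRIMARY KEY,        -- clustered index: leaf nodes hold the full row
    name VARCHAR(32) NOT NULL,
    KEY idx_name (name)          -- secondary B+ tree: leaf nodes hold name + id
) ENGINE = InnoDB;

-- The Index_type column reports HASH for user_mem and BTREE for user_inno.
SHOW INDEX FROM user_mem;
SHOW INDEX FROM user_inno;
```

A MEMORY table can also be given a tree index explicitly with USING BTREE in the index definition when range queries are needed.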


4. Index matching method

For example, create a composite index on (name, phone, age) for the user table.
      Full value match: match all columns in the index -- explain select * from user where name = 'zhangsan' and phone = '13333333333' and age = '22';
      Leftmost prefix match: match only the first few columns -- explain select * from user where name = 'zhangsan' and phone = '13333333333'; explain select * from user where name = 'zhangsan';
      Column prefix match: match the beginning of a column's value -- explain select * from user where name like 'zhang%';
      Range match: find data within a range -- explain select * from user where name > 'zhangsan';
      Exact match on one column and range match on another: match the first column exactly and a range of the second -- explain select * from user where name = 'zhangsan' and phone > '13333333333';
      Index-only query: the query accesses only the index, which is equivalent to a covering index.
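
The matching methods above can be tried against a concrete table. The sketch below assumes a scratch schema and a hypothetical table user_match; the extra address column is an assumption added so that select * is not itself covered by the index. The plans named in the comments are what is typically seen; the optimizer may choose differently depending on version and statistics.

```sql
-- Hypothetical demo table; all names are assumptions for illustration.
CREATE TABLE user_match (
    id      INT PRIMARY KEY AUTO_INCREMENT,
    name    VARCHAR(32) NOT NULL,
    phone   CHAR(11)    NOT NULL,
    age     INT         NOT NULL,
    address VARCHAR(64) NOT NULL,   -- not part of the index
    KEY idx_name_phone_age (name, phone, age)
) ENGINE = InnoDB;

-- Leftmost column present: the index can be used. The key_len column of
-- EXPLAIN grows as more leading columns of the index take part in the lookup.
EXPLAIN SELECT * FROM user_match WHERE name = 'zhangsan';
EXPLAIN SELECT * FROM user_match
WHERE name = 'zhangsan' AND phone = '13333333333' AND age = 22;

-- Leftmost column missing: the conditions do not form a leftmost prefix,
-- so the index cannot be used to locate rows and a full scan is typical.
EXPLAIN SELECT * FROM user_match WHERE phone = '13333333333';
EXPLAIN SELECT * FROM user_match WHERE phone = '13333333333' AND age = 22;
```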

2. Hash, (non-)clustered, and covering indexes

1. Hash index

   Features:
   1>. It is implemented on a hash table, so only queries that exactly match all columns of the index are valid; range queries are not supported.
   2>. In MySQL, only the Memory storage engine explicitly supports hash indexes.
   3>. The hash index itself only needs to store the hash value, so the index structure is very compact and lookups are very fast.
   4>. The hash index contains only the hash value and the row pointer, not the field values, so the values in the index cannot be used to avoid reading rows.
   5>. Hash index data is not stored in index-value order, so it cannot be used for sorting.
   6>. The hash index does not support lookups on part of the indexed columns, because the hash value is computed from the entire contents of the indexed columns.
   7>. Accessing data through a hash index is very fast unless there are many hash collisions. When collisions occur, the storage engine must traverse all row pointers in the linked list and compare row by row until every matching row is found.

   Application scenario:
   When many URLs are stored and most lookups are by URL, hash the URLs with CRC32 and index the hash value, so that a small index is enough to complete the lookup.
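
The CRC32 approach can be sketched as follows, assuming a hypothetical table url_demo with a url_crc column that the application fills in on insert; the table and column names are illustrative. CRC32() is a built-in MySQL function that returns a 32-bit unsigned value.

```sql
-- Hypothetical table; names are assumptions for illustration only.
CREATE TABLE url_demo (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    url     VARCHAR(255) NOT NULL,
    url_crc INT UNSIGNED NOT NULL,    -- holds CRC32(url)
    KEY idx_url_crc (url_crc)         -- small integer index instead of a long string index
) ENGINE = InnoDB;

INSERT INTO url_demo (url, url_crc)
VALUES ('http://www.mysql.com', CRC32('http://www.mysql.com'));

-- Look up through the integer index, and repeat the URL itself in the
-- WHERE clause so that rows whose hashes collide are filtered out.
SELECT id
FROM url_demo
WHERE url_crc = CRC32('http://www.mysql.com')
  AND url = 'http://www.mysql.com';
```

Keeping the url condition matters: different URLs can share a CRC32 value, and without it the query could return the wrong rows.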

2. (Non) clustered index

   Clustered index: It is a data storage method in which data rows are stored compactly together with adjacent key values.
   Advantages: Related data is kept together; data access is faster because the index and the data are stored in the same tree; queries that use a covering index scan can directly use the primary key value in the leaf node.
   Disadvantages: 1>. Clustered data mainly improves the performance of I/O-intensive applications. If the data is all in memory, the clustered index has no advantage.
              2>. Insertion speed depends heavily on insertion order; inserting in primary key order is the fastest.
              3>. Updating clustered index columns is expensive, because each updated row is forced to move to a new location.
              4>. Tables based on a clustered index may face page splits when new rows are inserted or when a primary key update requires rows to be moved.
              5>. A clustered index may slow down full table scans, especially when rows are sparse or the data is stored non-contiguously because of page splits.
   Non-clustered index: data files are stored separately from index files.

3. Covering Index

   Covering indexes were described above. Note that the Memory storage engine does not support them.
   Features:
   1>. Index entries are usually much smaller than data rows. If only the index needs to be read, MySQL accesses far less data.
   2>. Because the index is stored in column-value order, an I/O-intensive range query needs far less I/O than randomly reading each row from disk.
   3>. Some storage engines, such as MyISAM, cache only the index in memory and rely on the operating system to cache the data. Accessing the data therefore requires a system call, which may cause serious performance problems.
   4>. Because of InnoDB's clustered index, covering indexes are particularly useful for InnoDB tables: a covered query avoids the second lookup in the primary key index.
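
Whether a query is covered can be checked with EXPLAIN. The sketch below assumes a scratch schema and a hypothetical table user_cov modeled on the user example (id as primary key, an ordinary index on name); the names are illustrative.

```sql
-- Hypothetical demo table; all names are assumptions for illustration.
CREATE TABLE user_cov (
    id    INT PRIMARY KEY AUTO_INCREMENT,
    name  VARCHAR(32) NOT NULL,
    phone CHAR(11)    NOT NULL,
    KEY idx_name (name)
) ENGINE = InnoDB;

-- Covered: id and name are both stored in the entries of idx_name,
-- so no row lookup is needed. Extra shows "Using index".
EXPLAIN SELECT id, name FROM user_cov WHERE name = 'zhangsan';

-- Not covered: phone is not in idx_name, so each match goes back to the
-- table through the primary key. Extra does not show "Using index".
EXPLAIN SELECT id, name, phone FROM user_cov WHERE name = 'zhangsan';
```

If the second query is frequent, extending the index to (name, phone) would make it covered as well, at the cost of a larger index.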

3. Index optimization

1. Specific optimization details

   1>. Avoid expressions on indexed columns in queries; do the calculation in the business layer rather than the database layer.
   2>. Prefer primary key lookups over other indexes, because a primary key lookup never has to go back to the table.
   3>. Use prefix indexes. If a field is long, index only the leading part of the column: alter table table name add key (field name (6));
   4>. Use index scans for sorting. As far as possible, let the same index satisfy both sorting and row lookup. When the column order of the index matches the order by clause exactly and all columns are sorted in the same direction, MySQL can use the index to sort the results (the leftmost prefix requirement must be met).
   5>. union all, in, and or can all use indexes; in is recommended.
   6>. A range column can use the index, but the columns after the range column cannot; the index is used for at most one range column (<, <=, >, >=, between).
   7>. A forced (implicit) type conversion on an indexed column causes a full table scan.
   8>. Fields that are updated very frequently, or that have low selectivity, are not suitable for indexing. Updates change the B+ tree, so indexing frequently updated fields greatly reduces database performance. Generally, an index is worth creating when selectivity is above 80%; selectivity can be calculated as count(distinct(column name))/count(*).
   9>. Indexed columns should not allow null; otherwise you may get unexpected results.
   10>. When joining tables, it is best not to join more than three, and the data types of the joined fields must be consistent.
   11>. Use limit whenever you can.
   12>. Keep the number of indexes on a single table within 5, and do not put more than 5 columns in a composite index.
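
Several of these points (1, 3, 7, and 8) can be checked directly. The following is a minimal sketch, assuming a scratch schema and a hypothetical table user_tips; the table, column, and index names are illustrative, and the prefix length 6 is simply the value used in point 3. The plans named in the comments are what is typically seen on a table with enough rows.

```sql
-- Hypothetical demo table; all names are assumptions for illustration.
CREATE TABLE user_tips (
    id    INT PRIMARY KEY AUTO_INCREMENT,
    name  VARCHAR(32)  NOT NULL,
    phone CHAR(11)     NOT NULL,
    email VARCHAR(128) NOT NULL,
    KEY idx_phone (phone)
) ENGINE = InnoDB;

-- Point 1: an expression on the indexed column hides it from the index.
EXPLAIN SELECT * FROM user_tips WHERE id + 1 = 5;   -- scans the table
EXPLAIN SELECT * FROM user_tips WHERE id = 4;       -- primary key lookup

-- Points 3 and 8: compare the selectivity of a 6-character prefix with
-- that of the full column before settling on a prefix length.
SELECT COUNT(DISTINCT LEFT(email, 6)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT email) / COUNT(*)          AS full_selectivity
FROM user_tips;
ALTER TABLE user_tips ADD KEY idx_email_prefix (email(6));

-- Point 7: phone is a string column. Comparing it with a number forces a
-- conversion of the column values, so idx_phone cannot be used for lookup.
EXPLAIN SELECT * FROM user_tips WHERE phone = 13333333333;
EXPLAIN SELECT * FROM user_tips WHERE phone = '13333333333';  -- can use idx_phone
```

If the prefix selectivity is much lower than the full-column selectivity, a longer prefix is probably needed.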

Summary

Index optimization targets specific tables and the queries run against them; there is no need to optimize prematurely. When optimizing indexes, it is especially important to check the execution plan.

Origin blog.csdn.net/weixin_49442658/article/details/112369023