MySQL --- Index

1. Index

B+ Tree principles

1. Data Structure

  B-Tree refers to a balanced tree, that is, a balanced multiway search tree. It is a search tree in which all leaf nodes are at the same level.

A B-Tree of order m has the following characteristics:

  1. The root node has at least two children (unless it is itself a leaf).

  2. Each intermediate node contains k-1 elements and k children, where m/2 <= k <= m.

  3. Each leaf node contains k-1 elements, where m/2 <= k <= m.

  4. All leaf nodes are at the same level.

  5. The elements in each node are in ascending order, and the k-1 elements of a node exactly partition the key ranges of the elements held by its k children.

  A B+ Tree is built on the B-Tree by adding sequential access pointers between leaf nodes. It keeps the balance of the B-Tree and improves range query performance through those sequential pointers.

A B+ Tree of order m has the following characteristics:

  1. An intermediate node with k children contains k elements (k-1 in a B-Tree). These elements hold no data and are used only for indexing; all data is stored in the leaf nodes.

  2. The leaf nodes together contain all the elements, along with pointers to the records for those elements, and the leaf nodes are linked sequentially in ascending key order.

  3. Every element of an intermediate node also exists in its child nodes, where it is the maximum (or minimum) element of the child node.

B+ Tree advantages:

  1. A single node stores more elements, so a query needs fewer I/O operations.

  2. Every query must reach a leaf node, so query performance is stable.

  3. All leaf nodes form an ordered linked list, which makes range queries convenient.
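
  The third advantage can be illustrated with a minimal Python sketch. The Leaf class and range_scan function are hypothetical names introduced here, not part of any MySQL API. For brevity the scan starts at the first leaf; a real B+ Tree would first descend from the root to the leaf that contains the lower bound.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    """Hypothetical leaf node: sorted keys plus a pointer to the next leaf."""
    keys: List[int]
    next: Optional["Leaf"] = None


def range_scan(first: Leaf, low: int, high: int) -> List[int]:
    """Collect keys in [low, high] by walking the leaf chain from `first`."""
    result = []
    leaf = first
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # keys are ordered, so the scan can stop early
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


# Three leaves linked in ascending key order.
c = Leaf([7, 8, 9])
b = Leaf([4, 5, 6], c)
a = Leaf([1, 2, 3], b)
keys_in_range = range_scan(a, 3, 7)
```

  Here keys_in_range holds the keys 3 through 7, collected without going back to any intermediate node.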

2. Operation

  A search starts with a binary search in the root node to find the pointer whose key range contains the key, then repeats the same step in the node that the pointer refers to until it reaches a leaf node. A final binary search in the leaf node finds the data corresponding to the key.
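
  The search procedure can be sketched in Python as follows. This is a minimal illustration, not InnoDB's implementation: the Node class and the search function are hypothetical, and the sketch assumes the common layout in which an internal node with n keys has n+1 children and keys equal to a separator go to the right child.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical B+ Tree node; a node without children is a leaf."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)
    values: List[str] = field(default_factory=list)  # used by leaves only


def search(node: Node, key: int) -> Optional[str]:
    # Descend: a binary search in each internal node picks the child to follow.
    while node.children:
        node = node.children[bisect_right(node.keys, key)]
    # Binary search inside the leaf for the exact key.
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


leaf1 = Node(keys=[1, 3], values=["a", "c"])
leaf2 = Node(keys=[5, 7], values=["e", "g"])
leaf3 = Node(keys=[9, 11], values=["i", "k"])
root = Node(keys=[5, 9], children=[leaf1, leaf2, leaf3])

found = search(root, 7)    # key present in the second leaf
missing = search(root, 4)  # key absent from the tree
```

  Every lookup, successful or not, visits exactly one node per level, which is why the cost depends on the height of the tree.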

  Insert and delete operations can break the balance of the tree, so after an insertion or deletion the tree may need to split, merge, or rotate nodes to restore balance.

3. Comparison with red-black trees

  A red-black tree, which is also a balanced tree, can be used to implement an index as well, but file systems and database systems generally use the B+ Tree as their index structure, mainly for the following two reasons:

(1) Fewer lookups

  The time complexity of a search in a balanced tree depends on the tree height h: O(h) = O(log_d N), where d is the out-degree of each node. The out-degree of a red-black tree is 2, while the out-degree of a B+ Tree is generally very large, so a red-black tree is much taller than a B+ Tree holding the same data and needs more lookups.
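
  To make the height difference concrete, here is a small Python sketch. The function name min_height and the out-degree of 500 are assumptions chosen for illustration; the result for out-degree 2 is a lower bound, since a red-black tree is not perfectly balanced.

```python
def min_height(n: int, d: int) -> int:
    """Smallest h such that d**h >= n, i.e. the ceiling of log_d(n)."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= d
        h += 1
    return h


n = 1_000_000
binary_height = min_height(n, 2)   # out-degree of a red-black tree
wide_height = min_height(n, 500)   # assumed out-degree of a B+ Tree node
```

  For one million keys this gives a height of 20 with out-degree 2 and a height of 3 with out-degree 500.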

(2) Use of disk read-ahead

  To reduce disk I/O, a disk read does not fetch only what is strictly needed; each read also reads ahead. During read-ahead the disk reads sequentially. A sequential read needs no seek, only a short rotation time, so it is very fast.

  The operating system generally divides memory and disk into fixed-size blocks, each called a page, and memory and disk exchange data in units of pages. The database system sets the size of an index node to the size of a page, so that one I/O can load a whole node. Thanks to read-ahead, adjacent nodes can also be loaded in advance.
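
  A rough back-of-the-envelope calculation shows what one node per page means for fan-out. The 16 KB page size is InnoDB's default; the key size, pointer size and row size below are assumptions for illustration, and page headers are ignored.

```python
PAGE_SIZE = 16 * 1024  # bytes; InnoDB's default page size
KEY_SIZE = 8           # assumed: a BIGINT key
POINTER_SIZE = 6       # assumed size of a child-page pointer
ROW_SIZE = 1024        # assumed average row size in a leaf page

fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)  # children per internal node
rows_per_leaf = PAGE_SIZE // ROW_SIZE


def rows_reachable(height: int) -> int:
    """Rows addressable by a tree with `height` levels, leaves included."""
    return fanout ** (height - 1) * rows_per_leaf


three_level_rows = rows_reachable(3)
```

  Under these assumptions a node has 1170 children, and a tree only three levels high can address about 21.9 million rows.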

MySQL Index

  Indexes are implemented in the storage engine layer, not in the server layer, so different storage engines have different index types and implementations.

1. B+ Tree index

  This is the default index type of most MySQL storage engines.

  Because the index uses a B+ Tree as its data structure, a query does not need a full table scan; it only needs to search the tree, so lookups are much faster. Besides lookups, the index can also be used for sorting and grouping.

  Multiple columns can be specified as index columns; together they form the index key.

  InnoDB divides B+ Tree indexes into the primary index and secondary indexes. The leaf nodes of the primary index store the complete data record in their data field; this kind of index is called a clustered index (the primary key index). Because row data cannot be stored in two different places, a table can have only one clustered index.

  A secondary index is also called a non-clustered index. The key of each node in its tree comes from the indexed column of the table. For example, if the user table has a name column and an index is added on it, the index keys are built from the values of name. If several columns of a table are indexed separately, the resulting indexes are independent of each other. Each time an index is created on a column, the values of that column are copied to build the index, so adding an index makes the table larger and takes up more disk space.

  The leaf nodes of a secondary index store the primary key value in their data field, so a lookup through a secondary index first finds the primary key value and then looks it up in the clustered index.

  There is one exception in which the second lookup in the clustered index is not needed: a covering index query, where the index already contains every column the query asks for. This is often achieved with a composite (multi-column) index. When a column is indexed, its values are copied into the index; if two columns are combined into one index, the values of both columns are copied into that index.
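
  The two-step lookup and the covering exception can be modelled with plain Python dictionaries. This is only a sketch: the dictionaries stand in for B+ Trees, and the table contents, idx_name and find_by_name are hypothetical names introduced for illustration.

```python
# Clustered index: primary key -> full row (the row lives in the leaf).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
}
# Secondary index on `name`: indexed value -> primary key.
idx_name = {"alice": 1, "bob": 2}

lookups = []  # records which structures a query touched


def find_by_name(name, columns):
    pk = idx_name[name]
    lookups.append("idx_name")
    available = {"name": name, "id": pk}  # what the secondary index holds
    if all(c in available for c in columns):
        # Covered: the secondary index alone answers the query.
        return {c: available[c] for c in columns}
    lookups.append("clustered")  # second lookup by primary key
    row = clustered[pk]
    return {c: row[c] for c in columns}


covered = find_by_name("bob", ["id", "name"])
touched_when_covered = list(lookups)
lookups.clear()
not_covered = find_by_name("bob", ["age"])
touched_when_not_covered = list(lookups)
```

  Asking only for id and name touches the secondary index alone, while asking for age forces the extra trip to the clustered index.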

2. Hash index

A hash index can look up a value in O(1) time, but it loses ordering:

  • It cannot be used for sorting or grouping.
  • It supports only exact-match lookups, not partial-match or range lookups.
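
  The trade-off can be sketched in Python, using a dict as a stand-in for a hash index and a sorted list as a stand-in for an ordered index. The data and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

rows = {17: "q", 3: "c", 42: "z", 8: "h", 25: "y"}

hash_index = dict(rows)      # exact match in O(1) on average, no ordering
ordered_keys = sorted(rows)  # ordered index: supports ranges and sorting


def exact(key):
    return hash_index.get(key)


def range_keys(low, high):
    """Keys in [low, high], found with two binary searches on the ordered index."""
    start = bisect_left(ordered_keys, low)
    end = bisect_right(ordered_keys, high)
    return ordered_keys[start:end]


def range_keys_hash(low, high):
    """With only a hash index, every key has to be examined and then sorted."""
    return sorted(k for k in hash_index if low <= k <= high)


hit = exact(42)
in_range = range_keys(5, 30)
```

  Both range functions return the same keys, but the hash version has to visit every entry, while the ordered version touches only the matching slice.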

  The InnoDB storage engine has a special feature called the "adaptive hash index": when an index value is used very frequently, InnoDB builds a hash index on top of the B+ Tree index, which gives the B+ Tree index some of the advantages of a hash index, such as fast hash lookups.

Index Tuning

1. Independent columns

  In a query, an indexed column must stand alone: it cannot be part of an expression or an argument of a function, otherwise the index cannot be used.

  For example, the following query cannot use the index on the actor_id column:

SELECT actor_id FROM mytable WHERE actor_id + 1 = 5;
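
  The effect can be observed with a small experiment. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, not MySQL, so the plan text differs from MySQL's EXPLAIN output; the table layout and the index name idx_actor_id are assumptions. It compares the query above with the equivalent form that leaves the column on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, actor_id INTEGER)")
conn.execute("CREATE INDEX idx_actor_id ON mytable (actor_id)")
conn.executemany(
    "INSERT INTO mytable (actor_id) VALUES (?)", [(i,) for i in range(100)]
)


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


expr_sql = "SELECT actor_id FROM mytable WHERE actor_id + 1 = 5"
plain_sql = "SELECT actor_id FROM mytable WHERE actor_id = 4"

expr_plan = plan(expr_sql)    # column wrapped in an expression
plain_plan = plan(plain_sql)  # column on its own

same_rows = conn.execute(expr_sql).fetchall() == conn.execute(plain_sql).fetchall()
```

  In SQLite the first plan is reported as a SCAN over all entries and the second as a SEARCH on the index, although both queries return the same row.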

2. Multi-column index

  When a query uses multiple columns as conditions, a single multi-column index performs better than several single-column indexes. For example, the following query is best served by one index on both actor_id and film_id:

SELECT film_id, actor_id FROM sakila.film_actor
WHERE actor_id = 1 AND film_id = 1;
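
  One way to picture a multi-column index is as a list of key tuples kept in sorted order. The Python sketch below models that idea with hypothetical data and function names; it is not how MySQL stores the index.

```python
from bisect import bisect_left

# (actor_id, film_id) pairs kept sorted, like the keys of a two-column index.
composite = sorted([(1, 1), (1, 23), (1, 25), (2, 3), (2, 31), (3, 17)])


def has_pair(actor_id, film_id):
    """Both conditions are resolved by one binary search on the composite key."""
    i = bisect_left(composite, (actor_id, film_id))
    return i < len(composite) and composite[i] == (actor_id, film_id)


def films_of(actor_id):
    """A condition on the leading column alone can still use the ordering."""
    i = bisect_left(composite, (actor_id,))
    films = []
    while i < len(composite) and composite[i][0] == actor_id:
        films.append(composite[i][1])
        i += 1
    return films


pair_found = has_pair(1, 1)
actor_two_films = films_of(2)
```

  With two separate single-column indexes, each condition would produce its own candidate set that still has to be intersected; the composite key answers both conditions in one search.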

3. Order of index columns

  Put the most selective column first in the index.

  Index selectivity is the ratio of the number of distinct index values to the total number of records. Its maximum value is 1, in which case every record has its own unique index value. The higher the selectivity, the more efficient the query.
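
  The ratio is simple to compute. The sketch below uses two made-up columns, staff_id and customer_id, as sample data.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    return len(set(values)) / len(values)


# Hypothetical column contents for an 8-row table.
staff_id = [1, 2, 1, 2, 1, 2, 1, 2]
customer_id = [10, 11, 12, 13, 14, 15, 10, 11]

staff_selectivity = selectivity(staff_id)
customer_selectivity = selectivity(customer_id)
```

  Here customer_id (0.75) is more selective than staff_id (0.25), so by the rule above it would go first in a two-column index.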

4. Prefix index

  For BLOB and TEXT columns, and for very long VARCHAR columns, you must use a prefix index, which indexes only the leading characters of the value.

  The prefix length should be chosen based on index selectivity: long enough to stay close to the selectivity of the full column, but as short as possible.
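
  One way to pick the length is to increase it until the prefix selectivity gets close to that of the full column. The Python sketch below shows the idea; the 0.95 threshold, the function names and the sample data are assumptions for illustration.

```python
def prefix_selectivity(values, length):
    """Selectivity when only the first `length` characters are indexed."""
    return len({v[:length] for v in values}) / len(values)


def shortest_prefix(values, target_ratio=0.95, max_len=64):
    """Smallest length whose selectivity reaches target_ratio of the full column's."""
    full = len(set(values)) / len(values)
    for length in range(1, max_len + 1):
        if prefix_selectivity(values, length) >= target_ratio * full:
            return length
    return max_len


# Hypothetical column contents.
cities = ["london", "longview", "los angeles", "lisbon", "lima", "lyon", "london"]
chosen_length = shortest_prefix(cities)
```

  For this sample a prefix of 4 characters already distinguishes as many values as the full column does, while 3 characters still confuse "london" with "longview".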

5. Covering index

  The index contains the values of all the columns to be queried.

It has the following advantages:

  • An index is usually much smaller than the rows it indexes, so reading only the index greatly reduces the amount of data accessed.
  • Some storage engines (MyISAM, for example) cache only indexes in memory and rely on the operating system to cache the data. Accessing only the index therefore avoids system calls.
  • For the InnoDB engine, if a secondary index covers the query, the primary index does not need to be accessed.

Advantages of indexes

  • They greatly reduce the number of rows the server needs to scan.
  • They help the server avoid sorting and grouping, and avoid creating temporary tables. (A B+ Tree index is ordered and can serve ORDER BY and GROUP BY operations. Temporary tables are created mainly during sorting and grouping; when no sorting or grouping is needed, no temporary table is needed either.)
  • They turn random I/O into sequential I/O. (A B+ Tree index is ordered, so adjacent data is stored together.)

When to use indexes

  • For very small tables, a simple full table scan is in most cases more efficient than using an index.
  • For medium to large tables, indexes are very effective.
  • For very large tables, the cost of building and maintaining indexes grows with the table. In that case a technique is needed that can directly pick out the group of data a query needs instead of matching record by record; partitioning is one example.

Origin www.cnblogs.com/yjxyy/p/11131614.html