MySQL Hash Indexes and B-Tree Indexes

B+Tree
The B+Tree is an optimization of the B-Tree that makes it better suited to implementing index structures on external storage. The InnoDB storage engine implements its index structures with B+Trees.

As the B-Tree structure diagram above shows, each B-Tree node contains not only key values but also the data itself. The storage space of a page is limited, so if the data values are large, each node (that is, each page) can hold only a small number of keys. When a large amount of data is stored, this makes the B-Tree deeper, which increases the number of disk I/O operations per query and hurts query efficiency. In a B+Tree, all data records are stored in key order in leaf nodes at the same level, and non-leaf nodes store only key information. This greatly increases the number of keys each node can hold and reduces the height of the B+Tree.

Compared with the B-Tree, the B+Tree differs in the following ways:

Non-leaf nodes store only key information.
All leaf nodes are linked together by a chain of pointers.
Data records are stored only in the leaf nodes.
Applying this optimization to the earlier B-Tree: because the non-leaf nodes of a B+Tree store only key information, and assuming each disk block can hold four keys plus their pointers, the resulting B+Tree structure is as shown in the following figure:

 

 


A B+Tree usually has two head pointers: one points to the root node, and the other points to the leaf node with the smallest key. All leaf nodes (the data nodes) are linked to one another in a chain, forming a ring structure. Two kinds of search can therefore be performed on a B+Tree: a range search or paged scan over the primary key, which walks the leaf chain, and a random search, which starts from the root node.
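The two access paths can be sketched in a few lines of Python. This is only an illustration, not InnoDB's implementation: the `Leaf` class, the one-level root, and the sample keys are all hypothetical, and the leaf chain is shown as a simple forward-linked list, with `leaf1` playing the role of the head pointer to the smallest leaf.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    keys: List[int]
    next: Optional["Leaf"] = None  # pointer to the next leaf in key order


# Three leaves chained in key order (hypothetical sample data).
leaf3 = Leaf([20, 25, 30])
leaf2 = Leaf([10, 12, 15], leaf3)
leaf1 = Leaf([1, 3, 5], leaf2)

# A one-level "root": separator keys route a search to the right leaf.
root_keys = [10, 20]
root_children = [leaf1, leaf2, leaf3]


def find_leaf(key: int) -> Leaf:
    """Random search: start at the root and descend to the leaf for `key`."""
    return root_children[bisect_right(root_keys, key)]


def point_lookup(key: int) -> bool:
    """Return True if `key` is stored in the tree."""
    leaf = find_leaf(key)
    i = bisect_left(leaf.keys, key)
    return i < len(leaf.keys) and leaf.keys[i] == key


def range_scan(low: int, high: int) -> List[int]:
    """Range search: find the first leaf, then follow the leaf chain."""
    result = []
    leaf = find_leaf(low)
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return result
            if k >= low:
                result.append(k)
        leaf = leaf.next
    return result


print(point_lookup(12))    # True
print(range_scan(4, 21))   # [5, 10, 12, 15, 20]
```

The range scan touches the root only once; after that it moves from leaf to leaf along the chain without going back up the tree.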

The example above contains only 22 data records, which may be too few to show the advantages of the B+Tree, so here is an estimate:

The page size of the InnoDB storage engine is 16 KB. The primary key of a table is typically an INT (4 bytes) or a BIGINT (8 bytes), and a pointer is typically 4 or 8 bytes as well. A page (one B+Tree node) therefore holds roughly 16 KB / (8 B + 8 B) = 1K key entries (this is an estimate; for ease of calculation, K is taken as 10^3). A B+Tree index of depth 3 can thus maintain about 10^3 * 10^3 * 10^3 = 1 billion records.
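The estimate can be reproduced with a short Python sketch. It makes the same simplifying assumptions as the text (8-byte keys, 8-byte pointers, every level holding the same number of entries) and ignores page headers, per-record overhead, and real row sizes, so the result is an upper-bound estimate only. The constant and function names are illustrative.

```python
PAGE_SIZE = 16 * 1024   # InnoDB page size from the text, in bytes
KEY_SIZE = 8            # BIGINT key
POINTER_SIZE = 8        # assumed pointer size

# Entries that fit in one page under these assumptions.
keys_per_page = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)


def max_records(depth: int, fanout: int = keys_per_page) -> int:
    """Upper bound on indexed records if every level holds `fanout` entries."""
    return fanout ** depth


print(keys_per_page)     # 1024
print(max_records(3))    # 1073741824, roughly 10^9
```

Using the exact value 1024 instead of the rounded 10^3 gives about 1.07 billion, which matches the order of magnitude in the text.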

In practice the nodes may not be completely full, so the height of a B+Tree in a database is generally 2 to 4 levels. The InnoDB storage engine is designed to keep the root node resident in memory, which means that finding the row for a given key value takes at most 1 to 3 disk I/O operations.

B+Tree indexes in a database can be divided into clustered indexes and secondary indexes. The B+Tree in the example figure above is implemented in the database as a clustered index: the leaf nodes of a clustered index B+Tree store the row data of the entire table. A secondary index differs from a clustered index in that its leaf nodes do not contain all the data of the row; instead they store the clustered index key of the corresponding row, that is, the primary key. When data is queried through a secondary index, InnoDB first traverses the secondary index to find the primary key, and then uses the primary key to find the complete row in the clustered index.
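The two-step lookup through a secondary index can be sketched as follows. Plain Python dictionaries stand in for the two B+Trees, and the table, column names, and `find_by_name` function are hypothetical; the point is only the order of the two lookups.

```python
# Clustered index: primary key -> full row (hypothetical sample table).
clustered_index = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 35},
}

# Secondary index on `name`: indexed value -> primary key only.
secondary_index_name = {"alice": 1, "bob": 2, "carol": 3}


def find_by_name(name):
    """Two-step lookup: secondary index first, then the clustered index."""
    pk = secondary_index_name.get(name)   # step 1: find the primary key
    if pk is None:
        return None
    return clustered_index[pk]            # step 2: fetch the full row


print(find_by_name("bob"))   # {'id': 2, 'name': 'bob', 'age': 25}
```

A query that needs only the primary key can stop after step 1, because the secondary index already holds it.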

 

Hash
A hash table is a data structure that is accessed directly by key value. It maps each key to a position in a table and accesses the record at that position, which speeds up lookups. The mapping function is called the hash function, and the array that stores the records is called the hash table.

A hash table of (key, value) pairs is actually quite simple. The key is converted to an integer by a fixed algorithm called the hash function, that integer is taken modulo the length of the array, and the result is used as the array index. The value is then stored in the array slot at that index.

When the hash table is queried, the hash function is applied to the key again to obtain the corresponding array index, and the value is read from that slot. This takes full advantage of the array's ability to locate an element directly by its index.
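The store and lookup steps described above can be sketched as a toy Python class. The hash function (a sum of character codes) is a deliberately simple assumption, and keeping a small list in each slot to handle collisions is one possible choice, not something the text prescribes.

```python
class SimpleHashTable:
    """Toy hash table: hash the key, then take it modulo the array length."""

    def __init__(self, size: int = 8):
        self.size = size
        # Each slot holds a list of (key, value) pairs that share an index.
        self.slots = [[] for _ in range(size)]

    def _index(self, key: str) -> int:
        # Hypothetical hash function: sum of character codes.
        return sum(ord(c) for c in key) % self.size

    def put(self, key: str, value) -> None:
        bucket = self.slots[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key: str):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None


table = SimpleHashTable()
table.put("ab", 1)
table.put("ba", 2)          # same character sum as "ab", so same slot
print(table.get("ab"))      # 1
print(table.get("ba"))      # 2
```

Both `put` and `get` compute the index the same way, so a lookup goes straight to the slot where the value was stored.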

The biggest advantage of a hash table is that it greatly reduces the time needed to store and find data, bringing it close to constant time. The cost is higher memory consumption. With more and more memory available, trading space for time is usually worthwhile. Being easy to code is another of its features. Hash tables are divided into "open hashing" and "closed hashing", according to how they resolve collisions.

 

Differences between Hash indexes and B-Tree indexes
Because of its particular structure, a Hash index is very efficient at retrieval: a lookup can locate the entry in a single step, whereas a B-Tree index has to go from the root node through the branch nodes to the leaf page, which takes several I/O accesses. For equality lookups, a Hash index can therefore be much faster than a B-Tree index. It also has the following limitations.

(1) A Hash index can satisfy only "=", "IN" and "<=>" queries; it cannot be used for range queries.
A Hash index compares the hash values produced by the hash operation, so it can be used only for equality filtering and not for range-based filtering. The ordering of the hash values produced by the hash algorithm is not guaranteed to match the ordering of the original key values.

(2) A Hash index cannot be used to avoid sorting the data.
A Hash index stores the hash values computed from the keys, and the ordering of those hash values does not necessarily match the ordering of the key values before hashing, so the database cannot use the index to avoid any sort operation.
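Points (1) and (2) both follow from the fact that hashing does not preserve order. The sketch below uses a hypothetical `toy_hash` function, chosen only so that its values are easy to verify by hand; real hash functions behave similarly with respect to ordering.

```python
def toy_hash(key: int, buckets: int = 10) -> int:
    """Hypothetical hash function with values that are easy to check by hand."""
    return (key * 7) % buckets


keys = [1, 2, 3]
hashes = [toy_hash(k) for k in keys]
print(hashes)   # [7, 4, 1]: ascending keys give descending hash values

# Build a hash index: hash value -> list of keys with that hash.
hash_index = {}
for k in keys:
    hash_index.setdefault(toy_hash(k), []).append(k)


def equality_lookup(key: int) -> bool:
    """Equality works: hash the key and look in exactly one bucket."""
    return key in hash_index.get(toy_hash(key), [])


def range_lookup(low: int, high: int) -> list:
    """A range cannot be mapped to a range of hash values, so every entry
    must be examined, and the result must be sorted separately."""
    found = [k for bucket in hash_index.values() for k in bucket
             if low <= k <= high]
    return sorted(found)


print(equality_lookup(2))    # True
print(range_lookup(2, 3))    # [2, 3]
```

The equality lookup inspects one bucket, while the range lookup has to visit all of them and then sort, which is exactly the work an ordered index would have saved.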

(3) A Hash index cannot be used with only part of the index key.
For a composite index, the Hash index computes a single hash value over all the index columns combined, rather than computing a hash value for each column separately. A query that supplies only the first one or few columns of the composite index therefore cannot use the Hash index.
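Point (3) can be illustrated with a sketch in which the index is keyed on the hash of the whole column tuple. The table, the column names, and both lookup functions are hypothetical, and Python's built-in `hash` stands in for the engine's hash function.

```python
# Hypothetical rows with a composite index on (last_name, first_name).
rows = [
    {"id": 1, "last_name": "Smith", "first_name": "Anna"},
    {"id": 2, "last_name": "Smith", "first_name": "Ben"},
    {"id": 3, "last_name": "Jones", "first_name": "Anna"},
]

# The hash is computed over the whole key tuple, not per column.
composite_hash_index = {}
for row in rows:
    h = hash((row["last_name"], row["first_name"]))
    composite_hash_index.setdefault(h, []).append(row)


def lookup_full_key(last_name: str, first_name: str) -> list:
    """Both columns known: one hash computation finds the bucket."""
    candidates = composite_hash_index.get(hash((last_name, first_name)), [])
    return [r["id"] for r in candidates
            if (r["last_name"], r["first_name"]) == (last_name, first_name)]


def lookup_prefix(last_name: str) -> list:
    """Only the leading column known: its hash says nothing about the hash
    of the full tuple, so the index is useless and every row is examined."""
    return [r["id"] for r in rows if r["last_name"] == last_name]


print(lookup_full_key("Smith", "Ben"))   # [2]
print(lookup_prefix("Smith"))            # [1, 2]
```

An ordered index on the same two columns could serve the prefix query, because all entries with the same leading column are stored next to each other.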

(4) A Hash index can never avoid reading the table data.
As described above, a Hash index applies the hash operation to the index key and stores the resulting hash value, together with the corresponding row pointer, in a hash table. Because different index keys can produce the same hash value, a query cannot be answered from the Hash index alone, not even one that only counts the records matching a given key. The actual data in the table must still be read and compared to obtain the correct result.

(5) When many hash values are equal, a Hash index does not necessarily perform better than a B-Tree index.
If a Hash index is created on an index key with low selectivity, a large number of row pointers will be stored under the same hash value. Locating one particular record then becomes expensive and wastes many accesses to the table data, resulting in poor overall performance.
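Points (4) and (5) can be seen together in a small sketch. The `toy_hash` function, the table, and the `status` column are hypothetical; row positions in a list stand in for row pointers, and the `visited` counter shows how many rows have to be read to answer a lookup.

```python
from collections import defaultdict


def toy_hash(value: str, buckets: int = 8) -> int:
    """Hypothetical hash function: sum of character codes modulo `buckets`."""
    return sum(ord(c) for c in value) % buckets


# Hypothetical table with a low-selectivity column.
rows = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "done"},
    {"id": 3, "status": "done"},
    {"id": 4, "status": "done"},
    {"id": 5, "status": "open"},
]

# Hash index: hash value -> list of row positions (stand-ins for pointers).
index = defaultdict(list)
for pos, row in enumerate(rows):
    index[toy_hash(row["status"])].append(pos)


def lookup(value: str):
    """Return (matching ids, number of table rows that had to be read)."""
    visited = 0
    matches = []
    for pos in index.get(toy_hash(value), []):
        visited += 1                         # the row itself must be read...
        if rows[pos]["status"] == value:     # ...to rule out hash collisions
            matches.append(rows[pos]["id"])
    return matches, visited


print(lookup("new"))    # ([1], 2): "open" collides with "new"
print(lookup("done"))   # ([2, 3, 4], 3): one hash value, many pointers
```

With this hash function "new" and "open" land in the same bucket, so a row that does not match still has to be read, and the repeated value "done" puts three pointers under a single hash value.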
----------------
Disclaimer: This is an original article by CSDN blogger "DoDo-Baron", released under the CC 4.0 BY-SA license. When reproducing it, please include a link to the original source and this statement.
Original link: https://blog.csdn.net/Baron0071/article/details/86089914

Origin www.cnblogs.com/2661314cn/p/12590303.html