MySQL (Analysis of InnoDB): Hash Algorithm and Adaptive Hash Index

One, hash algorithm

  • Hashing is a common technique with O(1) average lookup time. It is not limited to indexes: almost every database application uses hash-based data structures

 

Two, the hash table

  • The hash table (also called a scatter table) is an improvement on the direct-address table, so first look at the direct-address table. When the universe U of keys is relatively small, direct addressing is a simple and effective technique. Suppose an application uses a dynamic set in which each element has a key taken from the universe U={0,1,...,m-1}. Also assume that no two elements have the same key
  • An array (i.e., the direct-address table) T[0...m-1] is used to represent the dynamic set, where each position (also called a slot or bucket) corresponds to one key in the universe U. Figure 5-38 illustrates this approach. Slot k points to the element of the set whose key is k. If the set has no element with key k, then T[k]=NULL
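The direct-address table described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class name `DirectAddressTable` and its method names are made up for this example, and `None` stands in for NULL.

```python
class DirectAddressTable:
    """One slot per possible key in the universe U = {0, 1, ..., m-1}."""

    def __init__(self, m):
        # T[k] is None when the set has no element with key k
        self.slots = [None] * m

    def insert(self, key, value):
        self.slots[key] = value      # the key itself is the slot number

    def search(self, key):
        return self.slots[key]       # O(1): a single array access

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(m=8)
table.insert(3, "element with key 3")
print(table.search(3))
print(table.search(5))               # no element with key 5
```

Every operation is a single array access, but the array always has m slots, no matter how few keys are actually stored.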

  • Direct addressing has an obvious problem. If the universe U is very large, storing a table T of size |U| is impractical, or even impossible, given the memory available on a typical computer. And if the set K of keys actually stored is small relative to U, most of the space allocated to T is wasted
  • This is where the hash table comes in. With hashing, the element with key k is stored in slot h(k); that is, a hash function h computes the slot position from the key k. The function h maps the universe U of keys to the slots of the hash table T[0...m-1], as shown in the figure below

 

 

  • The hash table solves the problems of direct addressing very well, but one small problem remains. As shown in the figure above, two keys may be mapped to the same slot. This situation is called a collision. Databases generally use the simplest collision resolution technique, which is called chaining
  • With chaining, all elements that hash to the same slot are placed in a linked list. As shown in the figure below, slot j holds a pointer to the head of the linked list of all elements that hash to j. If there is no such element, slot j is NULL
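A minimal sketch of collision resolution by chaining, assuming integer keys and a simple remainder hash. The class `ChainedHashTable` is hypothetical, and a Python list stands in for each linked list.

```python
class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        # One chain per slot; an empty chain plays the role of NULL.
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        return key % self.m          # slot number for this key

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:             # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))   # otherwise extend the chain

    def search(self, key):
        # Walk only the chain of the slot the key hashes to.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None


t = ChainedHashTable(m=7)
t.insert(3, "a")
t.insert(10, "b")                    # 10 % 7 == 3, so this collides with key 3
print(t.search(3), t.search(10), t.search(17))
```

Keys 3 and 10 share slot 3, yet both remain retrievable because a lookup compares keys along the chain. A lookup therefore costs as much as the chain is long, which is why a well-spread hash function matters.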

 

 

  • The last thing to consider is the hash function. The hash function h must hash well: ideally it avoids collisions altogether, and where they cannot be avoided it should keep them to a minimum. Generally, keys are converted to natural numbers and then hashed with the division method, the multiplication method, or universal hashing. Databases generally use the division method
  • In the division method, the key k is mapped to one of the m slots by taking the remainder of k divided by m, that is, the hash function is h(k) = k mod m
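The division method, h(k) = k mod m, is a one-liner. The sketch below uses made-up keys to show why the choice of m matters: keys that share a common factor with m pile into a few slots, while a prime m spreads the same keys across all slots.

```python
def division_hash(k, m):
    """Division method: map key k to one of m slots."""
    return k % m


# Hypothetical workload: 16 keys that are all multiples of 4.
keys = list(range(0, 64, 4))

slots_m8 = {division_hash(k, 8) for k in keys}   # m = 8 shares the factor 4 with the keys
slots_m7 = {division_hash(k, 7) for k in keys}   # m = 7 is prime

print(sorted(slots_m8))   # [0, 4]
print(sorted(slots_m7))   # [0, 1, 2, 3, 4, 5, 6]
```

With m = 8 only two of the eight slots are ever used, so the chains grow long. With m = 7 every slot is used.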

 

Three, the hash algorithm in the InnoDB storage engine

  • InnoDB uses hashing for dictionary-style lookups. Collisions are resolved by chaining, and the hash function uses the division method

Take the buffer page in InnoDB as an example

  • In the hash table for buffer pool pages, each page in the buffer pool has a chain pointer that points to the next page with the same hash value. For the division method, m is chosen as a prime number slightly larger than 2 times the number of buffer pool pages
  • For example, if the parameter innodb_buffer_pool_size is currently 10M, the buffer pool holds a total of 640 pages of 16KB each:
    • The hash table for the buffer pool pages needs 640*2=1280 slots. Because 1280 is not a prime number, a somewhat larger prime is taken, here 1399. At startup a hash table with 1399 slots is therefore allocated and used to hash lookups of pages in the buffer pool
  • So how does InnoDB look up a page in the buffer pool? The above is only the general algorithm. How is the page being searched for converted into a natural number?
    • It is actually very simple. Each InnoDB tablespace has a space_id, and what the user queries is a particular 16KB page within a tablespace, identified by its offset. InnoDB shifts the space_id to the left by 20 bits and then adds the space_id and the offset, that is, the key K=(space_id<<20)+space_id+offset. K is then hashed into a slot by the division method
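The arithmetic in this example can be checked with a short sketch. The function names `fold` and `slot` are made up for illustration, and m = 1399 is taken directly from the text. The rule InnoDB uses to arrive at that particular prime is not covered here, so the code only confirms that 1399 is prime and larger than 1280 (the smallest prime above 1280 would be 1283).

```python
PAGE_SIZE = 16 * 1024                 # 16KB pages
buffer_pool_size = 10 * 1024 * 1024   # innodb_buffer_pool_size = 10M


def is_prime(n):
    """Trial division; fine for numbers this small."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


pages = buffer_pool_size // PAGE_SIZE   # number of pages in the buffer pool
min_slots = 2 * pages                   # twice the page count
m = 1399                                # slot count quoted in the text


def fold(space_id, offset):
    # Parentheses matter: "+" binds tighter than "<<" in Python (and in C).
    return (space_id << 20) + space_id + offset


def slot(space_id, offset, m=m):
    return fold(space_id, offset) % m   # division method


print(pages, min_slots, is_prime(m))    # 640 1280 True
print(fold(1, 3), slot(1, 3))           # 1048580 729
```

For the page at offset 3 of the tablespace with space_id 1, the key is 1048576 + 1 + 3 = 1048580, and 1048580 mod 1399 puts it in slot 729.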

Four, adaptive hash index

  • The adaptive hash index is implemented with the hash table discussed above. The difference is that it is created and used only by the database itself; the DBA cannot intervene in it
  • Because a hash index maps keys to a hash table through the hash function, dictionary-style lookups on it are very fast
  • The "SHOW ENGINE INNODB STATUS" command shows the current usage of the adaptive hash index:
    • The output includes the size and usage of the adaptive hash index, and the number of adaptive hash index searches per second
    • hash searches/s and non-hash searches/s: indicate how often lookups were served by the hash index and how often they were not, which shows how effective the hash index is

  • The adaptive hash index is controlled by the InnoDB storage engine itself, so the information here is for reference only
  • Note: a hash index can only be used for equality queries. It cannot be used for range lookups
-- The following query can use the adaptive hash index
select * from table where index_col='xxx';
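Why equality works and ranges do not can be seen in a toy model. This is not InnoDB code: a Python `dict` stands in for a hash index, a sorted list stands in for an ordered index such as a B+ tree, and the keys are made up.

```python
import bisect

rows = {17: "r17", 3: "r3", 42: "r42", 8: "r8"}

hash_index = dict(rows)          # hash index: key -> row, no ordering between keys
sorted_keys = sorted(rows)       # ordered index stand-in: keys kept in sorted order


def equality_lookup(key):
    return hash_index.get(key)   # a single hash probe


def range_lookup_ordered(lo, hi):
    # Binary search finds both ends of the range, then a contiguous slice is read.
    i = bisect.bisect_left(sorted_keys, lo)
    j = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[i:j]


def range_lookup_hash(lo, hi):
    # Hashing scatters neighbouring keys, so every key has to be examined.
    return sorted(k for k in hash_index if lo <= k <= hi)


print(equality_lookup(42))
print(range_lookup_ordered(5, 20), range_lookup_hash(5, 20))
```

Both range functions return the same keys, but the hash version has to visit every entry. Hashing gives no shortcut for a range, so a range condition has to be answered from an ordered index.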

The innodb_adaptive_hash_index parameter

  • This parameter controls whether the adaptive hash index is enabled
  • The default is ON in MySQL 8.0 and earlier; MySQL 8.4 changed the default to OFF

 

 

 


Origin blog.csdn.net/m0_46405589/article/details/113815441