Understanding hash algorithms

While working on an algorithm problem on LeetCode that needed a hash function, I found a video to help me understand the topic.

Basic concepts of hash function

Generally, searching means comparing key values: if the current key is not equal to the one we want, we move to the next position and keep searching until we find it. So can we instead go straight from the key value to the record we want? To realize this idea, we establish a correspondence between the key value of a record and its storage location, so that a search no longer needs comparisons between key values.
Let's take an example to illustrate.
Here the student number is the key value, and storage address = key value - 32001. This mapping from key value to storage address is called a hash function.
However, every key value must correspond to a storage address, and in practice it is not always possible to find a hash function as simple as the one in this example.
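The student-number example can be sketched in a few lines of Python. This is only a minimal illustration: the offset 32001 comes from the text, while the function name, the table size, and the roster entries are made up here.

```python
BASE = 32001  # offset from the example: storage address = key value - 32001

def direct_hash(student_number):
    """Map a student number straight to a storage address."""
    return student_number - BASE

# Hypothetical roster: these numbers and names are assumptions for illustration.
table = [None] * 50
for number, name in [(32001, "A"), (32002, "B"), (32050, "C")]:
    table[direct_hash(number)] = (number, name)

# Lookup needs no key comparisons: compute the address and read the slot.
record = table[direct_hash(32050)]
```

Because every key gets its own address, this kind of function never produces a conflict, but it only works when the keys are dense and the range is known in advance.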


Storage address = the last digit of the key value (this is the hash function we constructed). The hash table is as follows:

Address  Key      Name
0
1        5666551  Xiao Li
2        5666552  Xiao Li
3
4        5666554  Xiaoli
5        5666554  Xiaozhou
6

According to the table, the key values of Xiaoli and Xiaozhou give the same hash address (4), so their storage addresses should also be the same. But one location cannot hold two people, which creates a conflict, so Xiaozhou looks for a free position further down.
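To see where the conflict comes from, here is a small sketch that applies the last-digit hash function to the records in the table above and groups them by address. The function and variable names are my own.

```python
from collections import defaultdict

def last_digit_hash(key):
    """Storage address = last digit of the key value."""
    return key % 10

# Records copied from the table in the text.
records = [
    (5666551, "Xiao Li"),
    (5666552, "Xiao Li"),
    (5666554, "Xiaoli"),
    (5666554, "Xiaozhou"),
]

addresses = [last_digit_hash(key) for key, _ in records]

# Group names by hash address; any address with more than one name is a conflict.
by_address = defaultdict(list)
for key, name in records:
    by_address[last_digit_hash(key)].append(name)

collisions = {addr: names for addr, names in by_address.items() if len(names) > 1}
```

Here `collisions` contains only address 4, shared by Xiaoli and Xiaozhou, which is exactly the conflict described above.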
Definition: when the data elements of a lookup table are stored in a limited contiguous space according to a chosen hash function and a chosen method of handling conflicts, the result is a hash table.

Basic methods for dealing with conflicts

Dealing with conflicts means the following: when a data element is to be inserted into the hash table and the hash address given by the hash function is already occupied, the next hash address is computed according to a fixed rule, and so on, until an available address is found to store the element.

1. Open address method:
Let Hi = (H(key) + di) % m, i = 1, 2, 3, ..., m-1, where H(key) is the hash function, m is the hash table length, and di is the increment sequence.
   1. If di = 1, 2, 3, ..., m-1, it is called linear probing (commonly used).
   2. If di = 1^2, -(1^2), 2^2, -(2^2), ..., ±k^2, it is called quadratic probing (commonly used).
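The two increment sequences can be written out as a short sketch. The helper names and the parameter `k_max` are assumptions of mine, not part of the original formula. Note that Python's `%` returns a non-negative result for a negative left operand, which is what keeps the addresses inside the table for the negative quadratic offsets.

```python
def linear_offsets(m):
    """d_i = 1, 2, ..., m-1"""
    return list(range(1, m))

def quadratic_offsets(k_max):
    """d_i = 1^2, -(1^2), 2^2, -(2^2), ..., +k^2, -(k^2)"""
    offsets = []
    for k in range(1, k_max + 1):
        offsets += [k * k, -(k * k)]
    return offsets

def probe_addresses(home, offsets, m):
    """H_i = (H(key) + d_i) % m for each offset d_i."""
    return [(home + d) % m for d in offsets]

# Example: home address 4 in a table of length 11.
linear = probe_addresses(4, linear_offsets(11)[:4], 11)   # 5, 6, 7, 8
quadratic = probe_addresses(4, quadratic_offsets(2), 11)  # 5, 3, 8, 0
```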
Linear probing
Using the modulo method, take the remainder of each key value divided by 7 as its hash address.
At first there are no conflicts. Next comes the problem of an address conflict after taking the remainder.
67 and 18 both have the hash address 4 (67 % 7 = 4 and 18 % 7 = 4), so this is where the conflict-handling method is needed: add 1 to the hash address and place the key that arrived second at that position.
If the address obtained after adding 1 is still occupied, add 2, 3, and so on to the original hash address in turn until an unoccupied address is found, as happens with 28 in this example.
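The insertion procedure can be sketched as follows, assuming H(key) = key % 7. The keys 84, 18, 67, and 28 appear in the text; the key 15 and the insertion order are my own additions so that 28 has to step forward twice.

```python
def linear_probe_insert(table, key):
    """Insert key using H(key) = key % m and linear probing; return the slot used."""
    m = len(table)
    home = key % m
    for d in range(m):              # d = 0 is the home address, then +1, +2, ...
        slot = (home + d) % m
        if table[slot] is None:     # first free slot wins
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 7
# Hypothetical insertion order; 15 is not from the original article.
slots = {key: linear_probe_insert(table, key) for key in [84, 15, 18, 67, 28]}
```

With this order, 18 takes address 4 and 67 is pushed to 5, while 28 finds addresses 0 and 1 occupied and ends up at 2.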
Comparing quadratic probing with linear probing
Linear probing works as in the previous example. The process is straightforward, so it is not explained again.
The quadratic probing process is as follows.
If the position given by the remainder of the key divided by 11 is vacant, the key is placed directly at that position. If the position is occupied, first try the original address plus 1. If that does not work, try the original address minus 1. If that position is also occupied, try the original address plus 4, and so on. In this example, quadratic probing needs fewer comparisons than linear probing.
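A rough way to check the comparison is to insert the same keys with both increment sequences and count how many slots are examined. The key list below is a common textbook-style example that I am assuming here; it is not necessarily the one from the original figures, and the result will differ for other keys.

```python
def insert_all(keys, m, offsets):
    """Insert keys with H(key) = key % m; return (table, number of slots examined)."""
    table = [None] * m
    probes = 0
    for key in keys:
        home = key % m
        for d in [0] + offsets:         # home address first, then each increment
            probes += 1
            slot = (home + d) % m
            if table[slot] is None:
                table[slot] = key
                break
        else:
            raise RuntimeError("no free slot found for key %d" % key)
    return table, probes

M = 11
KEYS = [47, 7, 29, 11, 16, 92, 22, 8, 3]    # assumed example keys

linear_offsets = list(range(1, M))                           # 1, 2, ..., m-1
quadratic_offsets = [sign * k * k
                     for k in range(1, M // 2 + 1)
                     for sign in (1, -1)]                    # 1, -1, 4, -4, ...

linear_table, linear_probes = insert_all(KEYS, M, linear_offsets)
quadratic_table, quadratic_probes = insert_all(KEYS, M, quadratic_offsets)
```

For this key list the only difference is the last key, 3: linear probing walks through addresses 3, 4, 5, 6, while quadratic probing tries 3, 4, and then 2.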
2. Chain address method
Store all keys that the given hash function maps to the same hash address in the same linked list, and keep each linked list ordered by key.
84 % 7 = 0, so 84 is placed at hash table address 0. The special case is what to do when two keys have the same address:
18 and 67 both leave a remainder of 4 when divided by 7, so both belong at hash table address 4. 67 is therefore placed one node behind 18, so that 18 and 67 sit on the same chain. This is called the chain address method.
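Here is a minimal sketch of the chain address method, assuming H(key) = key % 7. A Python list stands in for each linked list, and `bisect.insort` keeps it ordered by key; the function names and the key order are my own choices.

```python
import bisect

def build_chains(keys, m):
    """Put each key into the chain at address key % m, keeping chains sorted."""
    chains = [[] for _ in range(m)]
    for key in keys:
        bisect.insort(chains[key % m], key)
    return chains

def contains(chains, key):
    """Search only the chain at the key's hash address."""
    return key in chains[key % len(chains)]

chains = build_chains([84, 18, 67, 28], 7)
```

After this, `chains[4]` holds 18 followed by 67, and `chains[0]` holds 28 followed by 84. No key ever moves to a different address, which is the main difference from the open address method.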
3. The public overflow area method
Positions 7, 8, and 9 are the public overflow area. When 18 and 67 conflict, 18 is put in the public overflow area, that is, at position 7.
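The public overflow area can be sketched with a single array: addresses 0 to 6 form the base table (assuming H(key) = key % 7) and addresses 7 to 9 form the overflow area. The function names and the insertion order (67 before 18) are assumptions chosen so that 18 is the key that overflows, as in the text.

```python
BASE_SIZE = 7       # base table: addresses 0..6
OVERFLOW_SIZE = 3   # public overflow area: addresses 7..9

def insert(table, key):
    """Place key at key % BASE_SIZE, or in the first free overflow slot on conflict."""
    home = key % BASE_SIZE
    if table[home] is None:
        table[home] = key
        return home
    for slot in range(BASE_SIZE, BASE_SIZE + OVERFLOW_SIZE):
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("public overflow area is full")

def find(table, key):
    """Check the home address first, then scan the overflow area."""
    home = key % BASE_SIZE
    if table[home] is None:
        return None
    if table[home] == key:
        return home
    for slot in range(BASE_SIZE, BASE_SIZE + OVERFLOW_SIZE):
        if table[slot] == key:
            return slot
    return None

table = [None] * (BASE_SIZE + OVERFLOW_SIZE)
slots = [insert(table, key) for key in [84, 67, 18]]
```

All conflicting keys share one overflow area regardless of their home address, so a lookup that misses at the home address has to scan that area sequentially.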

How to find elements in a hash table

1. Use the given hash function to compute the hash address from the key being searched for.
2. If there is no data element at that address, the search fails.
3. If there is a data element at that address, compare its key value with the search key:
   if they are equal, the search succeeds;
   otherwise, use the conflict-handling method to compute the next possible storage address and repeat from step 2.
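The search steps above can be sketched for a table built with linear probing and H(key) = key % m. The table contents below are a hand-written, hypothetical example, and the sketch assumes that elements are never deleted.

```python
def search(table, key):
    """Return the slot holding key, or None if the search fails."""
    m = len(table)
    home = key % m                  # step 1: compute the hash address
    for d in range(m):
        slot = (home + d) % m
        if table[slot] is None:     # step 2: empty address, search fails
            return None
        if table[slot] == key:      # step 3: keys are equal, search succeeds
            return slot
        # otherwise: move on to the next address given by linear probing
    return None                     # every slot examined without a match

# Hypothetical table of length 7, filled using linear probing.
table = [84, 15, 28, None, 18, 67, None]

found_67 = search(table, 67)    # home address 4 holds 18, next address 5 holds 67
found_28 = search(table, 28)    # home address 0, then 1, then 2
missing = search(table, 25)     # addresses 4 and 5 are occupied, 6 is empty
```

The search must use the same hash function and the same conflict-handling rule as the insertion, otherwise it would follow a different sequence of addresses.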


Origin blog.csdn.net/qq_44231964/article/details/103898484