Hash table: linear probing method

I recently reviewed hash tables while studying data structures and found that I did not know how to calculate the average search length for unsuccessful searches under equal probability. After looking up some material I finally worked it out, so I am recording it here for later reference.

Let's take a look at a hash table question from the 2010 National Master's Entrance Unified Examination for Computer Science and Technology.

Question 1:

Hash the key sequence (7, 8, 30, 11, 18, 9, 14) into a hash table. The hash table is stored in a one-dimensional array with subscripts starting from 0. The hash function is H(key) = (key × 3) MOD 7. Collisions are resolved by linear probing, and the load factor is required to be 0.7.

(1) Please draw the constructed hash table.

(2) Calculate the average search length (ASL) for successful searches and for unsuccessful searches under equal probability.

Solution:

(1) First, define the load factor. The load factor measures how full the hash table is once all keys have been inserted: load factor = number of keys / length of the hash table. From the question, the table length is L = 7 / 0.7 = 10, so the hash table to construct is a one-dimensional array with subscripts 0~9. Applying the hash function to each key gives the following table of hash values.

H(Key) = (key × 3) MOD 7. For example, when key = 7, H(7) = (7 × 3) % 7 = 21 % 7 = 0; the other keys are computed the same way.

Key      7   8  30  11  18   9  14
H(Key)   0   3   6   5   5   6   0

(Table 1)
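
The table length and the values in Table 1 can be checked with a few lines of Python. This is only a small sketch; the names `keys`, `h`, `table_length` and `hash_values` are my own and do not come from the exam question. `Fraction` is used so that 7 / 0.7 is computed exactly instead of in floating point.

```python
from fractions import Fraction

keys = [7, 8, 30, 11, 18, 9, 14]

def h(key):
    """Hash function from the question: H(key) = (key * 3) MOD 7."""
    return (key * 3) % 7

# Table length = number of keys / load factor = 7 / 0.7
table_length = Fraction(len(keys)) / Fraction(7, 10)
print(table_length)   # 10

hash_values = [h(k) for k in keys]
print(hash_values)    # [0, 3, 6, 5, 5, 6, 0]
```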

Resolving collisions by linear probing gives the following hash table (- marks an empty slot):

Address   0   1   2   3   4   5   6   7   8   9
Key       7  14   -   8   -  11  30  18   9   -

(Table 2)

The construction of the hash table is explained below. Note the keys 7 and 14, 30 and 9, and 11 and 18 in Table 1. The two keys in each of these pairs have the same H(Key) value, so they map to the same address and will collide when the table is built. These collisions must be resolved, and the question specifies linear probing. The table is built step by step as follows:

The first key, 7, has address 0, so it goes to subscript 0 of the hash table array. That position is empty, so there is no collision and the key is stored directly.

The second key, 8, has address 3, so it goes to subscript 3. That position is empty, so there is no collision and the key is stored directly.

The third key, 30, has address 6, so it goes to subscript 6. That position is empty, so there is no collision and the key is stored directly.

The fourth key, 11, has address 5, so it goes to subscript 5. That position is empty, so there is no collision and the key is stored directly.

The fifth key, 18, has address 5, but position 5 already holds key 11, so there is a collision. Linear probing checks the next position, 6, which already holds key 30, so we advance by 1 again to position 7. Position 7 is empty, so 18 is stored there and the collision is resolved.

The sixth key, 9, has address 6, but position 6 already holds key 30, so there is a collision. The next position, 7, already holds key 18, so we advance by 1 again to position 8. Position 8 is empty, so 9 is stored there.

The seventh key, 14, has address 0, but position 0 already holds key 7, so there is a collision. The next position, 1, is empty, so 14 is stored there.

At this point all keys have been inserted and the hash table is complete, as shown in Table 2.
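
The construction steps above can be sketched in Python as follows. The function name `build_table` and the `probes` dictionary are my own, and I assume the usual linear probing rule that the probe sequence wraps around modulo the table length (10). The wrap-around never actually triggers for these seven keys.

```python
def build_table(keys, size=10):
    """Insert keys using H(key) = (key * 3) % 7 and linear probing."""
    table = [None] * size          # None marks an empty slot
    probes = {}                    # key -> number of positions examined
    for key in keys:
        start = (key * 3) % 7      # home address from the hash function
        for i in range(size):
            pos = (start + i) % size   # assumed: wrap around the table
            if table[pos] is None:
                table[pos] = key
                probes[key] = i + 1
                break
        else:
            raise RuntimeError("hash table is full")
    return table, probes

table, probes = build_table([7, 8, 30, 11, 18, 9, 14])
print(table)   # [7, 14, None, 8, None, 11, 30, 18, 9, None]
print(probes)  # {7: 1, 8: 1, 30: 1, 11: 1, 18: 3, 9: 3, 14: 2}
```

The printed list matches Table 2, and the probe counts are the numbers used in part (2).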

(2) Average search length for successful searches under equal probability:

This can be read off from the construction process in part (1), because a successful search for a key retraces the positions probed when that key was inserted.

Key 7 was stored on the first probe, so finding it takes 1 comparison. The same holds for 8, 30 and 11. Key 18 was stored after 3 probes (positions 5, 6 and 7), so finding it takes 3 comparisons. Key 9 also takes 3 (positions 6, 7 and 8). Key 14 was stored after 2 probes (positions 0 and 1), so finding it takes 2. The counts are shown in Table 3.

Key     7   8  30  11  18   9  14
Count   1   1   1   1   3   3   2

(Table 3)

So ASLsuccess = (1+1+1+1+3+3+2)/7 = 12/7.
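
As a check, the successful-search counts can be recomputed by actually searching the table from Table 2. This is a minimal sketch: `TABLE`, `probes_to_find` and `asl_success` are names I made up, and the probe sequence is again assumed to wrap modulo the table length.

```python
from fractions import Fraction

# Hash table from Table 2; None marks an empty slot.
TABLE = [7, 14, None, 8, None, 11, 30, 18, 9, None]
KEYS = [7, 8, 30, 11, 18, 9, 14]

def probes_to_find(key, table=TABLE):
    """Number of comparisons needed to find a key that is in the table."""
    size = len(table)
    start = (key * 3) % 7
    for i in range(size):
        pos = (start + i) % size
        if table[pos] == key:
            return i + 1
        if table[pos] is None:     # reached an empty slot: key is absent
            break
    raise KeyError(key)

counts = [probes_to_find(k) for k in KEYS]
print(counts)                      # [1, 1, 1, 1, 3, 3, 2]

asl_success = Fraction(sum(counts), len(KEYS))
print(asl_success)                 # 12/7
```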

Average search length for unsuccessful searches under equal probability:

Now consider the unsuccessful case, using Table 2. An unsuccessful search starts at the key's hash address and probes forward until it reaches an empty slot, so the number of comparisons is the distance from the starting address to the first empty slot, counting the empty slot itself. Because the hash function is MOD 7, the starting address can only be one of 0~6. Assuming each of these is equally likely, the number of comparisons for a failed search starting at each address 0 to 6 is:

Address 0: the first empty slot is address 2 (probing 0, 1, 2), so the number of comparisons is 3.

Address 1: the first empty slot is address 2 (probing 1, 2), so the number of comparisons is 2.

Address 2: address 2 itself is empty, so the number of comparisons is 1.

Address 3: the first empty slot is address 4 (probing 3, 4), so the number of comparisons is 2.

Address 4: address 4 itself is empty, so the number of comparisons is 1.

Address 5: the first empty slot is address 9 (probing 5, 6, 7, 8, 9), so the number of comparisons is 5. Note that only the starting address is restricted to 0~6. The probe sequence itself continues through the whole table, addresses 0~9, which is how keys 18 and 9 ended up at addresses 7 and 8 during construction.

Address 6: the first empty slot is again address 9 (probing 6, 7, 8, 9), so the number of comparisons is 4.

Therefore, the number of comparisons for an unsuccessful search from each starting address is shown in the following table:

Address   0   1   2   3   4   5   6
Count     3   2   1   2   1   5   4

(Table 4)

So ASLunsuccess = (3+2+1+2+1+5+4)/7 = 18/7.
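
The unsuccessful-search counts can be checked the same way. In this sketch (the names `TABLE`, `probes_until_empty` and `asl_unsuccess` are mine), each starting address 0~6 is probed forward through the 10-slot table until an empty slot is found, and the empty slot is counted as one comparison.

```python
from fractions import Fraction

# Hash table from Table 2; None marks an empty slot.
TABLE = [7, 14, None, 8, None, 11, 30, 18, 9, None]

def probes_until_empty(start, table=TABLE):
    """Comparisons made by a failed search that starts at address `start`."""
    size = len(table)
    for i in range(size):
        if table[(start + i) % size] is None:
            return i + 1           # the empty slot itself is counted
    return size                    # full table: every slot was examined

# The hash function is MOD 7, so only addresses 0..6 can be starting points.
counts = [probes_until_empty(addr) for addr in range(7)]
print(counts)                      # [3, 2, 1, 2, 1, 5, 4]

asl_unsuccess = Fraction(sum(counts), 7)
print(asl_unsuccess)               # 18/7
```

Note that the divisor is 7, the number of possible starting addresses, not 10, the table length.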
