Chapter 7: Searching (Hash Tables)

  Chapter 6 covered several basic data structures, some of which support search operations. Chapter 7 is devoted to the search algorithms themselves, along with various optimizations.

  Binary search and sequential search are the most familiar methods and are widely used, but both have obvious drawbacks. Binary search runs in O(log2 N) time, which is efficient, but it only works on a sorted list. To overcome the limitations of these algorithms (1. the data must be stored in order; 2. dynamic insertion is awkward; 3. the data set may be too large; and so on), the chapter goes on to introduce the binary sort tree (binary search tree), the balanced binary tree, the B-tree, the B+ tree, and the hash table (hash lookup). The algorithms become progressively deeper and harder.
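
To make the sorted-list restriction concrete, here is a minimal binary search sketch. The function name `binarySearch` is an assumption for illustration and does not come from the chapter; the halving step is only valid because the input is assumed to be sorted in ascending order.

```cpp
#include <vector>

// Returns the index of `target` in `a`, or -1 if it is absent.
// Assumes `a` is sorted in ascending order; on unsorted data the
// halving step below may discard the half that contains the target.
int binarySearch(const std::vector<int>& a, int target) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;        // avoids overflow of lo + hi
        if (a[mid] == target) return mid;
        if (a[mid] < target) lo = mid + 1;   // target can only be in the right half
        else hi = mid - 1;                   // target can only be in the left half
    }
    return -1;
}
```

Each iteration halves the remaining range, which is where the O(log2 N) bound comes from.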

  A binary sort tree supports dynamic insertion. By its defining property, inserting data into the tree is equivalent to keeping it sorted (an in-order traversal yields the keys in ascending order). Insertion is essentially a search, so its time complexity is O(log2 N) on average, as long as the tree stays reasonably balanced. An earlier practice problem used another property of the binary sort tree: the same set of values, inserted in a different order, may produce a differently shaped tree. The task was to decide whether two input sequences build the same tree.
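
The same-tree idea can be sketched as follows. This is only an illustration under assumptions: the names `Node`, `insert`, `build`, `sameTree` and `destroy` are hypothetical, and this is not the code submitted for that practice problem.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Standard binary sort tree insertion: smaller keys go left, larger keys go right.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    return root;  // duplicate keys are ignored
}

// Builds a tree by inserting the keys in the given order.
Node* build(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) root = insert(root, k);
    return root;
}

// Two trees are the same if they have the same shape and the same key at every node.
bool sameTree(const Node* a, const Node* b) {
    if (a == nullptr || b == nullptr) return a == b;
    return a->key == b->key &&
           sameTree(a->left, b->left) &&
           sameTree(a->right, b->right);
}

// Frees every node of the tree.
void destroy(Node* root) {
    if (root == nullptr) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}
```

For instance, the orders 3 1 4 2 and 3 4 1 2 build the same tree, while 3 2 4 1 builds a different one, because 2 becomes the left child of the root instead of 1.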

In that problem ("Is It the Same Binary Search Tree"), the key step is the search performed during insertion. It also exposes a weakness of the binary search tree: when the data set is large, search time grows with n. The hash table lookup method (hash search) is said to achieve O(1) time complexity in theory, but that claim is debatable. Different element values can have the same hash value, and such collisions have to be resolved by placing the elements at different addresses, so a lookup may need to probe or traverse for a while before it finds the address it wants.

  There is also a worst case. Suppose collisions are handled by separate chaining and n elements are inserted one after another. The n values are all different but share the same hash value, so they all land in the same slot (for example 2, 4, 6, 8, 10, ... under a hash function such as key % 2). Each insertion first locates that address, then checks that the value is not already in the table, and finally appends the new element to the end of the linked list at that address. Inserting the first element therefore takes 1 comparison, the second takes 1 + 1, ..., and the n-th takes n. In total, inserting n elements costs 1 + 2 + ... + n = n(n + 1)/2 = O(n^2) constant-time operations. Insertion in this case is O(n^2) overall, and a lookup has to traverse the chain, so O(1) search is out of reach.
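
The count above can be checked with a small sketch. The function `countInsertSteps` and its cost model (1 step to locate the bucket, plus 1 comparison per chain node examined) are assumptions chosen to match the counting in the paragraph; keys are assumed to be non-negative.

```cpp
#include <list>
#include <vector>

// Inserts `keys` into a chained hash table with `buckets` slots (hash = key % buckets)
// and returns the number of steps: 1 to locate the bucket, plus 1 per chain node compared.
long long countInsertSteps(const std::vector<int>& keys, int buckets) {
    std::vector<std::list<int>> table(buckets);
    long long steps = 0;
    for (int key : keys) {
        std::list<int>& chain = table[key % buckets];
        ++steps;                                   // locate the bucket
        bool found = false;
        for (int existing : chain) {
            ++steps;                               // compare with one chain node
            if (existing == key) { found = true; break; }
        }
        if (!found) chain.push_back(key);          // append to the end of the chain
    }
    return steps;
}
```

With the keys 2, 4, 6, 8, 10 and 2 buckets, every key hashes to slot 0 and the total is 1 + 2 + 3 + 4 + 5 = 15 = 5(5 + 1)/2 steps. With the keys 1 to 5 and 5 buckets there are no collisions, and the total is only 5.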

  However, a large number of collisions can be avoided when the hash table is built. The collision-handling methods are open addressing and separate chaining; open addressing in turn includes linear probing and quadratic probing. Each has its strengths and weaknesses, and the right one depends on the application. The exercise for this chapter uses quadratic probing (with positive increments only).
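
To compare the two open addressing variants, the sketch below lists the first few addresses each one tries for a given key. The helper names `linearProbes` and `quadraticProbes` are hypothetical, and the hash function is assumed to be key % size with a non-negative key.

```cpp
#include <vector>

// Linear probing: the d-th attempt tries (key % size + d) % size.
std::vector<int> linearProbes(int key, int size, int count) {
    std::vector<int> out;
    for (int d = 0; d < count; ++d) {
        out.push_back((key % size + d) % size);
    }
    return out;
}

// Quadratic probing with positive increments only:
// the d-th attempt tries (key % size + d * d) % size.
std::vector<int> quadraticProbes(int key, int size, int count) {
    std::vector<int> out;
    for (int d = 0; d < count; ++d) {
        long long step = static_cast<long long>(d) * d;  // avoid overflow of d * d
        out.push_back(static_cast<int>((key % size + step) % size));
    }
    return out;
}
```

For key 15 and table size 5, linear probing visits 0, 1, 2, 3, 4, while quadratic probing visits 0, 1, 4, 4, 1. The quadratic sequence never reaches slots 2 and 3, so an insertion can fail even when the table still has free slots.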

 Hashing

The task of this problem is simple: insert a sequence of distinct positive integers into a hash table, and output the positions of the input numbers. The hash function is defined to be H(key) = key % TSize where TSize is the maximum size of the hash table. Quadratic probing (with positive increments only) is used to solve the collisions.

Note that the table size is better to be prime. If the maximum size given by the user is not prime, you must re-define the table size to be the smallest prime number which is larger than the size given by the user.

Input Specification:

Each input file contains one test case. For each case, the first line contains two positive numbers: MSize (≤ 10^4) and N (≤ MSize) which are the user-defined table size and the number of input numbers, respectively. Then N distinct positive integers are given in the next line. All the numbers in a line are separated by a space.

Output Specification:

For each test case, print the corresponding positions (index starts from 0) of the input numbers in one line. All the numbers in a line are separated by a space, and there must be no extra space at the end of the line. In case it is impossible to insert the number, print "-" instead.

Sample Input:

4 4
10 6 4 15

Sample Output:

0 1 4 -

 When I first read the problem I did not notice that a colliding key is not simply placed in the next free slot; only later did I see that collisions must be resolved with quadratic probing. Working through it was refreshing. My idea is fairly direct: record the hash address of each key in an array, so that traversing the array gives each key's address or shows that the key cannot be inserted.

Version that computes and prints each result while reading the input:

 

#include <iostream>
using namespace std;

// Visited flags, one per table slot. The table size can grow to 10007
// (the smallest prime >= 10^4), so 10000 entries would be too few.
int visit[10010] = {0};

int prime(int a) // returns a if a is prime, 0 otherwise
{
    int i;
    if (a <= 1) return 0; // important: one test point uses the minimum size
    for (i = 2; i < a; i++)
    {
        if (a % i == 0)
            return 0; // not prime
    }
    return a; // prime
}

void hashlocate(int n, int c) // n keys, table size c (prime)
{
    int i, h, d, data;

    for (i = 0; i < n; i++)
    {
        cin >> data;
        h = data % c; // home address (index) of the key
        d = 1;        // probe step; it restarts for every key
        if (i) cout << " "; // no space before the first output

        while (d < c && visit[h]) // keep probing while the address is taken
        {
            h = (data + d * d) % c; d++;
        }

        if (!visit[h]) // free address found: print it
        {
            cout << h;
            visit[h] = 1; // mark the slot as used
        }
        else
            cout << '-'; // still no free slot after probing: cannot insert
    }
}

int main()
{
    int m, n, c;
    cin >> m >> n;
    c = m;
    while (prime(c) == 0) // if the given size is not prime, move up to the next prime
    {
        c++;
    }

    hashlocate(n, c);
    return 0;
}

One step of the problem is finding the next prime after a given number. At first this seemed troublesome, but it turned out that a small change to the code that tests whether a number is prime is enough, so my approach computes it with a while loop.
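
The next-prime step can also be isolated into two small functions. This is a sketch under assumptions: the names `isPrime` and `nextPrime` are hypothetical, and testing divisors only up to the square root is a common optimization, not what the listing above does.

```cpp
// Trial division: n is prime if no i with 2 <= i and i * i <= n divides it.
bool isPrime(int n) {
    if (n <= 1) return false;  // 0 and 1 are not prime
    for (int i = 2; static_cast<long long>(i) * i <= n; ++i) {
        if (n % i == 0) return false;
    }
    return true;
}

// Smallest prime that is greater than or equal to n.
int nextPrime(int n) {
    while (!isPrime(n)) ++n;
    return n;
}
```

Under this sketch, a requested size of 1 becomes 2, a size of 4 becomes 5, and the largest allowed size, 10000, becomes 10007.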

Version that saves the addresses in an auxiliary array and outputs them afterwards:

#include <iostream>
using namespace std;

int visit[10010] = {0}; // visited flags; sized for a table of up to 10007 slots

void Hashlocate(int m, int n) // m is the (prime) table size, n the number of keys
{
    int i, h, d;
    int *hl = new int[n];   // input keys
    int *addr = new int[n]; // address found for each key, or -1
    for (i = 0; i < n; i++)
    {
        cin >> hl[i];
        h = hl[i] % m; // home address (index) of the key
        d = 1;         // probe step; it restarts for every key

        while (d < m && visit[h]) // keep probing while the address is taken
        {
            h = (hl[i] + d * d) % m; d++;
        }

        if (!visit[h]) // free address found
        {
            addr[i] = h; visit[h] = 1; // save the address and mark the slot as used
        }
        else addr[i] = -1; // no free slot after quadratic probing: cannot insert
    }

    for (i = 0; i < n; i++)
    {
        if (i) cout << " "; // no space before the first output
        if (addr[i] != -1) cout << addr[i];
        else cout << '-'; // the key could not be inserted
    }

    delete[] hl;
    delete[] addr;
}

int prime(int a) // returns a if a is prime, 0 otherwise
{
    int i;
    if (a <= 1) return 0; // important: one test point uses the minimum size
    for (i = 2; i < a; i++)
    {
        if (a % i == 0)
            return 0;
    }
    return a;
}

int main()
{
    int m, n;
    cin >> m >> n;
    while (prime(m) == 0) // if the given size is not prime, move up to the next prime
    {
        m++;
    }

    Hashlocate(m, n);
    return 0;
}

The difficulty lies in computing the probe increments. Each input key needs a hash address, and on a collision quadratic probing is used to find another position. I thought about this for a long time and eventually combined a while loop, the auxiliary visit array, and the table length as the loop bound; after a few rounds of debugging the sample data passed. Another detail held me up for a long time without my noticing it: the test point with the minimum size. Because I had ignored the minimum value 1, which is not prime, the prime-checking function did not handle it and that test point kept failing. Only at the end did I find the cause. I had not thought the problem through carefully, and I need to pay attention to this in the future.

 

 

Origin www.cnblogs.com/chenzhenhong/p/10962451.html