Data structure: hash table

Brief introduction

   A hash table is a data structure that is accessed directly by key. In other words, the key of each record is mapped to a position in a table, which speeds up lookups. The mapping function is called a hash function, and the array that stores the records is called a hash table.

  storage location = f(key)

  Here the correspondence f is the hash function. Hashing stores the records in one contiguous block of storage, and that block is the hash table.

  Storing a (key, value) pair in a hash table:

  1. A fixed algorithm, the hash function, converts the key into an integer.

  2. That integer is taken modulo the array length, and the result is used as the array index.

  3. The value is stored in the array slot at that index.
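
  The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `TABLE_SIZE`, `simple_hash`, and `slot_index` are hypothetical names, and summing character codes is a toy stand-in for a real hash function.

```python
TABLE_SIZE = 8  # assumed array length, chosen only for illustration


def simple_hash(key: str) -> int:
    """Step 1: convert the key to an integer (toy hash: sum of character codes)."""
    return sum(ord(ch) for ch in key)


def slot_index(key: str, size: int = TABLE_SIZE) -> int:
    """Step 2: reduce the integer modulo the array length to get an index."""
    return simple_hash(key) % size


# Step 3: store the value in the array slot at that index.
table = [None] * TABLE_SIZE
table[slot_index("cat")] = "meow"
```

  This sketch ignores collisions: two keys that map to the same index would overwrite each other.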

    

  The structure consists of an array in which each member holds a pointer to the head of a linked list. That list may be empty, or it may hold many elements.

  Elements are assigned to different lists according to some characteristic of the element. A lookup uses the same characteristic to find the right list and then searches that list for the element.
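
  The array-of-lists layout described above might look like the following minimal sketch. `ChainedHashTable` is a hypothetical name, Python's built-in `hash()` stands in for the "characteristic" of the element, and plain Python lists are used in place of linked lists to keep the example short.

```python
class ChainedHashTable:
    """Toy hash table: an array of buckets, each bucket a list of (key, value) pairs."""

    def __init__(self, size: int = 8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # The hash of the key decides which list the element belongs to.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Find the right list first, then scan it for the key.
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default
```

  With `size=2`, the integer keys 1 and 3 land in the same list, yet both remain retrievable because the list is scanned for an exact key match.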

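  Hash table lookups are very fast: O(1) on average, provided the hash function spreads the keys evenly. In the worst case, when many keys land in the same list, a lookup degrades to O(n). A hash establishes a mapping between the content of the data and its storage address.

   Hashing: the method by which elements are placed into the array.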


 

Hash collision

  Hashing inevitably raises one problem: different keys may produce the same hash address, that is, the same array index. This is known as a hash collision.

  So how do we deal with collisions?

  1. Open addressing: use a systematic probing method to find another free slot in the array and store the data there, instead of at the index produced by the hash function, because that position is already occupied.

  2. Chaining: the array stores linked lists rather than the data items themselves. When a collision occurs, the new item is simply added to the list at that array index.

  3. Public overflow area: set aside a separate storage area dedicated to colliding items. This method suits cases with little data and few collisions.

  4. Rehashing: prepare several hash functions. If the first hash function produces a collision, use the second; if that collides as well, use the third, and so on.
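
  As an illustration of the first method in the list above (finding another slot in the array), here is a minimal sketch that uses linear probing, which is only one of several possible probing strategies. `LinearProbingTable` is a hypothetical name, the table has a fixed size, and deletion is deliberately left out.

```python
class LinearProbingTable:
    """Toy fixed-size table that resolves collisions by probing the next slot."""

    def __init__(self, size: int = 8):
        self.slots = [None] * size  # each slot is None or a (key, value) pair

    def put(self, key, value) -> int:
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            i = (start + step) % size  # try the next slot, wrapping around
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i  # index where the item was actually stored
        raise RuntimeError("table is full")

    def get(self, key, default=None):
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            i = (start + step) % size
            if self.slots[i] is None:
                return default  # an empty slot ends the probe sequence
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return default
```

  In a table of size 4, the keys 1, 5, and 9 all hash to index 1, so they end up in slots 1, 2, and 3 respectively.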


Hash Functions 

  The probability of collisions depends closely on the hash function. Below we look at some functions commonly used to compute the hash address.

  (A) Direct addressing method

    Use the key itself, or a linear function of the key, as the hash address: address(key) = a*key + b. For example, if the student numbers at a school are known to run from 2000 to 4000, address(key) = key - 2000 can be used as the hash address.
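
    The student-number example above can be written as a short sketch. `direct_address`, `MIN_ID`, and `MAX_ID` are hypothetical names, and the stored string is a placeholder record.

```python
MIN_ID, MAX_ID = 2000, 4000  # ID range taken from the example above


def direct_address(key: int, a: int = 1, b: int = -MIN_ID) -> int:
    """address(key) = a*key + b; with a=1 and b=-2000 this is key - 2000."""
    return a * key + b


# One slot per possible ID, so no two IDs ever share an address.
table = [None] * (MAX_ID - MIN_ID + 1)
table[direct_address(2023)] = "placeholder record"
```

    The price of having no collisions is that the table needs a slot for every possible key, which is only practical when the key range is small and dense.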

  (B) Mid-square method

    Square the key and take the middle digits of the result as the hash address. For example, for the keys {421, 423, 436}, the squares are {177241, 178929, 190096}, and the middle two digits {72, 89, 00} can be used as the hash addresses.
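
    The squaring example above can be reproduced with a small sketch. `mid_square` is a hypothetical name, and returning the digits as a string is a choice made here so that the leading zero in "00" stays visible; `int()` converts it to a numeric address.

```python
def mid_square(key: int, digits: int = 2) -> str:
    """Square the key and return the middle `digits` digits of the result."""
    squared = str(key * key)
    start = (len(squared) - digits) // 2
    return squared[start:start + digits]


addresses = [mid_square(k) for k in (421, 423, 436)]
```

    For the three keys in the example, `addresses` is `["72", "89", "00"]`.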

  (C) Folding method

    Split the key into several parts and combine the parts in a fixed way, such as adding them, to form the hash address. For example, for the book number (SBN) 8903-241-23, address(key) = 89 + 03 + 24 + 12 + 3 can be used as the hash address.
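
    The book-number example above can be sketched as follows. `fold` is a hypothetical name; the sketch assumes the parts are two digits wide and are combined by simple addition, as in the example.

```python
def fold(key: str, width: int = 2) -> int:
    """Split the digits of the key into `width`-digit parts and add them."""
    digits = "".join(ch for ch in key if ch.isdigit())  # drop the hyphens
    parts = [digits[i:i + width] for i in range(0, len(digits), width)]
    return sum(int(part) for part in parts)


address = fold("8903-241-23")  # 89 + 03 + 24 + 12 + 3
```

    Here `address` is 131. In practice the sum would usually be reduced further, for example modulo the table length, so that it fits the array.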

  (D) Division (remainder) method

    If the maximum length m of the hash table is known, take the largest prime p that is not greater than m and reduce the key modulo p: address(key) = key % p.
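
    A minimal sketch of the remainder scheme described above. `largest_prime_at_most` and `division_hash` are hypothetical helper names, and the primality check uses plain trial division, which is adequate for small table sizes.

```python
import math


def largest_prime_at_most(m: int) -> int:
    """Return the largest prime p with p <= m, found by trial division."""
    for candidate in range(m, 1, -1):
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            return candidate
    raise ValueError("there is no prime <= m")


def division_hash(key: int, m: int) -> int:
    """address(key) = key % p, where p is the largest prime not greater than m."""
    return key % largest_prime_at_most(m)
```

    For a table of maximum length m = 100, p is 97, so the key 1000 maps to address 1000 % 97 = 30.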

 

To be updated.

 

Reference: https://blog.csdn.net/eson_15/article/details/51138588 https://www.jianshu.com/p/5a2a5f6de440

Origin www.cnblogs.com/FondWang/p/11910355.html