Basic Concepts of the Hash Table

What Is Hashing

Introduction: When we studied arrays, we used an element's index to access it. That access takes O(1) time, independent of the number n of elements in the array, and this is the core idea behind hashing.

Hashing method: Take the key K as the independent variable and compute the value H(K) with some function H (the hash function). This value is interpreted as the storage address of the node, and the node's key and attribute data (the value) are stored together in that memory cell. On retrieval, the address is computed with the same function and the corresponding data is found there.

Hash table: A storage structure built with the hashing method is called a hash table.

Example: Suppose the key set of a linear table is S = {and, begin, do, else, for, golang}, and the hash function H(K) is the position in the alphabet of the first letter of key K. The following hash table can then be built:

[Figure: the resulting hash table]
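The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names first_letter_hash, table, and lookup are hypothetical, keys are assumed to start with a lowercase or uppercase ASCII letter, and slot 0 is left unused so that addresses match alphabet positions.

```python
def first_letter_hash(key: str) -> int:
    """Return the 1-based alphabet position of the key's first letter (a -> 1)."""
    return ord(key[0].lower()) - ord("a") + 1


keys = ["and", "begin", "do", "else", "for", "golang"]

# 27 slots so that addresses 1..26 can be used directly; slot 0 stays empty.
table = [None] * 27
for key in keys:
    table[first_letter_hash(key)] = key


def lookup(key: str):
    """Recompute the address with the same function and check that slot."""
    stored = table[first_letter_hash(key)]
    return stored if stored == key else None


# Print only the occupied slots as "address key" pairs.
for address, key in enumerate(table):
    if key is not None:
        print(address, key)
```

Retrieval costs one hash computation and one array access, regardless of how many keys are stored.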

Of course, this example is very simple, and an attentive reader will quickly notice two problems:

  1. If S also contains a key such as apple, which has the same first letter as and, how is it stored? Both and and apple map to position 1. (This phenomenon is called a collision, or conflict.)
  2. More space is allocated than is strictly needed. In general, the hash table's space (H slots) is larger than the number of nodes (M) actually stored. This wastes some space but buys retrieval efficiency. α = M / H is called the load factor of the hash table.
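Both problems can be made concrete with a small sketch. It assumes the first-letter hash from the example, and it assumes the load factor is the number of stored nodes divided by the number of slots; the names first_letter_hash, by_address, and load_factor are hypothetical.

```python
from collections import defaultdict


def first_letter_hash(key: str) -> int:
    """Return the 1-based alphabet position of the key's first letter (a -> 1)."""
    return ord(key[0].lower()) - ord("a") + 1


# Problem 1: adding "apple" makes two keys compete for address 1.
keys = ["and", "apple", "begin", "do", "else", "for", "golang"]
by_address = defaultdict(list)
for key in keys:
    by_address[first_letter_hash(key)].append(key)

collisions = {addr: ks for addr, ks in by_address.items() if len(ks) > 1}
print(collisions)

# Problem 2: the original six keys occupy only 6 of the 26 letter slots.
stored_nodes = 6
slots = 26
load_factor = stored_nodes / slots
print(round(load_factor, 2))
```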

In fact, these are exactly the two topics discussed next: the choice of hash function and collision resolution strategies.

Hash function

The core of hashing is the choice of hash function. An ideal hash function distributes the nodes uniformly and produces few collisions.

For simplicity, we assume below that the keys passed to the hash function are integers. (If a key is not an integer, there are specific ways to convert it to one; after all, in the computer world everything is made up of strings of 0s and 1s.)
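As one possible illustration of such a conversion, a string key can be read as the big-endian integer formed by its UTF-8 bytes. This is only one of many schemes, and the function name key_to_int is hypothetical.

```python
def key_to_int(key: str) -> int:
    """Interpret the UTF-8 bytes of the key as one big-endian integer."""
    return int.from_bytes(key.encode("utf-8"), "big")


print(key_to_int("a"))   # a single byte: 97
print(key_to_int("go"))  # two bytes: 103 * 256 + 111
```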

Division method (remainder method)

Divide the key k by M (often the length of the hash table) and take the remainder as the hash address: h(k) = k mod M.
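A minimal sketch of the division method follows. The function name division_hash is hypothetical, and M = 13 is chosen only to match the worked example later in the article.

```python
def division_hash(k: int, m: int) -> int:
    """Hash address is the remainder of k divided by m."""
    return k % m


for k in (26, 36, 41, 38, 44):
    print(k, division_hash(k, 13))
```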

Multiplication method (multiply, take the fractional part, round down)

Multiply the key k by a constant A (0 < A < 1) and extract the fractional part of the product. Then multiply this value by an integer n and round the result down to obtain the hash address: h(k) = floor(n * (k * A mod 1)).
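The steps above can be sketched as follows. The function name multiplication_hash is hypothetical, and the default constant A = (sqrt(5) - 1) / 2 is an assumption taken from common practice, not from this article; any A with 0 < A < 1 fits the description.

```python
import math

# Assumed default constant, roughly 0.618; the article only requires 0 < A < 1.
A = (math.sqrt(5) - 1) / 2


def multiplication_hash(k: int, n: int, a: float = A) -> int:
    """Take the fractional part of k * a, scale it by n, and round down."""
    fractional = (k * a) % 1.0
    return math.floor(n * fractional)


# With a = 0.5 the arithmetic is easy to follow: 3 * 0.5 = 1.5, the
# fractional part is 0.5, and floor(10 * 0.5) = 5.
print(multiplication_hash(3, 10, 0.5))
```

Because the fractional part always lies in [0, 1), the result always lies in the range 0 to n - 1.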

There are many hash functions to choose from, each suited to particular scenarios. As software development engineers, it is enough for us to understand the idea behind them; which hash function to choose for which scenario is more likely a question for mathematicians to study.

A good hash function avoids collisions as far as possible, but collisions cannot be ruled out entirely, which brings us to collision resolution strategies.

Conflict Resolution Strategies

Collision resolution techniques fall into two categories: separate chaining (the "zipper" method) and open addressing. The difference is that separate chaining stores colliding keys outside the main hash table, while open addressing stores them in another slot of the table itself.

Separate chaining (zipper method)

A simple form of separate chaining defines each slot of the hash table as the head of a linked list. All records that hash to a particular slot are placed in that slot's list.

Suppose S = {and, apple, begin, do, dog, else, for, go, golang}, and the hash function is again the position in the alphabet of the first letter of key K. The following hash table can then be built:

[Figure: hash table built with separate chaining]

Open addressing
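A minimal sketch of separate chaining for this key set is shown below. It is an illustration only: the class ChainedHashTable and its methods are hypothetical names, Python lists stand in for the linked lists, and slot 0 is left unused so that addresses match alphabet positions.

```python
def first_letter_hash(key: str) -> int:
    """Return the 1-based alphabet position of the key's first letter (a -> 1)."""
    return ord(key[0].lower()) - ord("a") + 1


class ChainedHashTable:
    def __init__(self, slots: int = 27):
        # Each slot holds its own chain; a Python list stands in for a linked list.
        self.slots = [[] for _ in range(slots)]

    def insert(self, key: str) -> None:
        chain = self.slots[first_letter_hash(key)]
        if key not in chain:  # skip duplicates
            chain.append(key)

    def contains(self, key: str) -> bool:
        # Only the one chain the key hashes to has to be scanned.
        return key in self.slots[first_letter_hash(key)]

    def chain(self, address: int) -> list:
        return list(self.slots[address])


table = ChainedHashTable()
for key in ["and", "apple", "begin", "do", "dog", "else", "for", "go", "golang"]:
    table.insert(key)

print(table.chain(1))  # keys starting with "a"
print(table.chain(7))  # keys starting with "g"
```

Colliding keys simply share a chain, so an insert never fails, but a lookup costs time proportional to the length of that chain.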

Open addressing stores all records directly in the hash table itself. Each record's key k has a home position computed by the hash function, h(k). If we want to insert key k and its home position is already occupied by the key of another record (a collision), k is stored at some other address in the table. There are many algorithms for choosing that other address; here we use linear probing (sequential probing) as an example.

Given the key set {26, 36, 41, 38, 44, 15, 68, 12, 06, 51, 25} and a hash table of length L = 15, the process of building the hash table with linear probing as the collision resolution method is as follows.

Using the division method as the hash function with M = 13, the hash function is h(k) = k % 13. Insert the nodes in order:

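h(26) = 0, h(36) = 10, h(41) = 2, h(38) = 12, h(44) = 5. None of these collide, so each key goes to its home position (a dash marks an empty slot):

Address   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Key      26   -  41   -   -  44   -   -   -   -  36   -  38   -   -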

h(15) = 2, and address 2 already holds 41, so a collision occurs and we need to probe. With linear probing the next address, 3, is free, so 15 is placed in slot 3.

Address   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Key      26   -  41  15   -  44   -   -   -   -  36   -  38   -   -

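The remaining keys are handled the same way. 68 (h = 3) collides with 15 and goes to slot 4. 12 (h = 12) collides with 38 and goes to slot 13. 06 (h = 6) goes straight to slot 6. 51 (h = 12) finds 12 and 13 occupied and goes to slot 14. 25 (h = 12) finds 12, 13, and 14 occupied, wraps around to 0, which is also occupied, and goes to slot 1. The final hash table is as follows: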

Address   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Key      26  25  41  15  68  44   6   -   -   -  36   -  38  12  51
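The whole insertion process can be reproduced with a short sketch. The function names build_linear_probing_table and search are hypothetical. It assumes, as in the walkthrough, that the home position is k % 13 and that probing steps forward one slot at a time, wrapping around at the table length of 15.

```python
def build_linear_probing_table(keys, table_length=15, modulus=13):
    """Insert keys in order, resolving collisions by linear probing."""
    table = [None] * table_length
    for k in keys:
        home = k % modulus
        for step in range(table_length):
            address = (home + step) % table_length  # wrap past the last slot
            if table[address] is None:
                table[address] = k
                break
        else:
            raise RuntimeError("hash table is full")
    return table


def search(table, key, modulus=13):
    """Follow the same probe sequence; an empty slot means the key is absent."""
    home = key % modulus
    for step in range(len(table)):
        address = (home + step) % len(table)
        if table[address] is None:
            return None
        if table[address] == key:
            return address
    return None


keys = [26, 36, 41, 38, 44, 15, 68, 12, 6, 51, 25]
table = build_linear_probing_table(keys)
print(table)
print(search(table, 25))  # found after probing 12, 13, 14, 0, 1
```

Note that search must repeat the probe sequence used on insertion, which is why deleting a key from an open-addressing table cannot simply empty its slot.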

Summary

This article introduced the basic concepts and ideas behind hash tables. For software engineers, this knowledge is enough for project development. The more mathematical topics, such as analyzing the efficiency of hashing methods and the related optimizations, are not covered here.

Reference: "Data Structures and Algorithms", edited by Zhang

Origin www.cnblogs.com/yahuian/p/11575672.html