Hash table related concepts

Reference:
https://www.cnblogs.com/songdechiu/p/6954038.html
https://github.com/CyC2018/CS-Notes/blob/master/docs/notes/ Algorithm - symbol table .md

1. The first step is to use a hash function to convert a key into an array index. Ideally, different keys would be converted to different index values, but in practice several keys can hash to the same index value.

Therefore, the second step is collision handling. There are two common ways to handle collisions: separate chaining (the "zipper" method) and linear probing.
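The two steps can be seen in a minimal sketch. It assumes non-negative integer keys and modular hashing; the table size M = 7, the sample keys, and the function name index_for are made up for illustration.

```python
def index_for(key, m):
    """Map a non-negative integer key to an index in [0, m) by modular hashing."""
    return key % m


M = 7                      # hypothetical table size
keys = [10, 22, 31, 4, 15]  # hypothetical sample keys

# Step 1: convert every key into an array index.
indices = {k: index_for(k, M) for k in keys}

# Keys that share an index with an earlier key are collisions,
# which is exactly what step 2 has to deal with.
collisions = len(keys) - len(set(indices.values()))
```

Here 10 and 31 both land on index 3, and 22 and 15 both land on index 1, so five keys occupy only three distinct indices.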

A hash table is a classic example of an algorithm that trades off time against space.

If there were no memory limit, the algorithm could use the key directly as an index into a (possibly enormous) array, so every search would need only one memory access. This ideal situation is usually out of reach, because when the number of possible keys is large the memory required is too great.

If there were no time limit, the algorithm could use sequential search in an unordered array, which requires very little memory.

Hash tables use a moderate amount of space and time to strike a balance between these two extremes.
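As a rough illustration of the two extremes, the sketch below assumes small non-negative integer keys. KEY_SPACE, the stored pairs, and the function name sequential_search are hypothetical.

```python
KEY_SPACE = 1_000  # hypothetical: keys are integers in [0, KEY_SPACE)

# Extreme 1: one slot per possible key. Lookup is a single array access,
# but memory grows with the number of possible keys, not the keys in use.
direct_table = [None] * KEY_SPACE
direct_table[42] = "answer"
direct_table[7] = "seven"

# Extreme 2: store only the pairs in use. Memory is minimal,
# but a lookup may have to scan every pair.
pairs = [(42, "answer"), (7, "seven")]


def sequential_search(pairs, key):
    """Scan the unordered pairs one by one; return None on a miss."""
    for k, v in pairs:
        if k == key:
            return v
    return None
```

The direct table holds 1,000 slots to store two values, while the pair list holds two entries but needs a scan for each lookup.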

A good hash function needs to meet three conditions:

Consistency - equal keys must produce equal hash values

Efficiency - it is cheap to compute

Uniformity - it spreads all keys evenly over the indices
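One common way to aim for these three conditions with string keys is Horner's method with a small prime multiplier. The sketch below is only an illustration: the multiplier 31, the table size 97, and the name string_hash are choices made here, not something taken from the references.

```python
def string_hash(key, m):
    """Hash a string to an index in [0, m) using Horner's method."""
    h = 0
    for ch in key:
        # Every character contributes, which helps uniformity;
        # reducing modulo m at each step keeps the numbers small.
        h = (31 * h + ord(ch)) % m
    return h


M = 97  # hypothetical table size (a prime)
index_ab = string_hash("ab", M)
```

The function is consistent because it depends only on the characters of the key, and efficient because it makes a single pass over the string.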

Uniform hashing assumption

Assumption: the hash function distributes all keys uniformly and independently among the integer values between 0 and M-1.

This assumption is an ideal model that cannot actually be achieved, but it is the guiding principle when designing a hash function.
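Under this assumption each index receives about N / M keys on average. The simulation below models the ideal case with a seeded pseudo-random generator in place of a real hash function; the function name and the values of N and M are hypothetical.

```python
import random


def simulate_uniform_hashing(n, m, seed=0):
    """Count how many of n keys land on each of m indices when every
    index is chosen uniformly and independently (the idealized model)."""
    rng = random.Random(seed)
    counts = [0] * m
    for _ in range(n):
        counts[rng.randrange(m)] += 1
    return counts


counts = simulate_uniform_hashing(n=10_000, m=10)
average = sum(counts) / len(counts)  # always exactly N / M
```

The individual counts fluctuate around the average rather than matching it exactly, which is what the model predicts.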

Third, hash tables based on separate chaining

1. Introduction
After the hash function has converted keys into array indices, the second step is collision handling.

One of the most straightforward approaches is to have each of the M array elements point to a linked list, where each node of the list stores a key-value pair whose hash value is the index of that element.

This method is called separate chaining (the "zipper" method), because colliding elements are placed in the same list.
The idea is to choose M large enough that the lists are as short as possible, which keeps search efficient.

Search order: first find the corresponding list from the hash value, then traverse that list to find the key.
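The search order described above might be sketched as follows. This is a minimal illustration, not the implementation from the references: the class and method names are made up here, and Python's built-in hash() stands in for the hash function.

```python
class Node:
    """One linked-list node holding a key-value pair."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class SeparateChainingHashTable:
    def __init__(self, m=8):
        self.m = m               # number of lists
        self.heads = [None] * m  # heads[i] is the first node of list i

    def _index(self, key):
        # Assumption: the built-in hash() plays the role of the hash function.
        return hash(key) % self.m

    def put(self, key, value):
        i = self._index(key)
        node = self.heads[i]
        while node is not None:      # key already present: update it
            if node.key == key:
                node.value = value
                return
            node = node.next
        # Key not found: add a new node at the front of list i.
        self.heads[i] = Node(key, value, self.heads[i])

    def get(self, key, default=None):
        node = self.heads[self._index(key)]  # 1) pick the list
        while node is not None:              # 2) traverse it
            if node.key == key:
                return node.value
            node = node.next
        return default
```

With m=4, the integer keys 1 and 5 hash to the same index, so they end up as two nodes on the same list and both remain retrievable.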

Fourth, hash tables based on linear probing

1. Introduction
Another way to implement a hash table is to store N key-value pairs in an array of size M, where M > N. This approach relies on the empty slots in the array to resolve collisions.

When a collision occurs, the algorithm simply checks the next position in the array (index plus 1), wrapping around to the start when it reaches the end.

Each probe has three possible outcomes:

Search hit: the key at the position is the same as the key being searched for.

Search miss: the position is empty.

Keep searching: the key at the position is different from the key being searched for.
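The three outcomes map directly onto the loop in a small sketch. It assumes a fixed-size table with no resizing, keys that are never None, and Python's built-in hash() as the hash function; the class name is hypothetical.

```python
class LinearProbingHashTable:
    def __init__(self, m=16):
        self.m = m                 # array size M
        self.n = 0                 # number of stored pairs N
        self.keys = [None] * m     # None marks an empty slot
        self.values = [None] * m

    def _index(self, key):
        return hash(key) % self.m

    def put(self, key, value):
        i = self._index(key)
        while self.keys[i] is not None:
            if self.keys[i] == key:   # same key: overwrite its value
                self.values[i] = value
                return
            i = (i + 1) % self.m      # collision: try the next slot
        if self.n + 1 >= self.m:
            # Keep at least one empty slot so that searches terminate.
            raise RuntimeError("table full; resizing is omitted in this sketch")
        self.keys[i] = key
        self.values[i] = value
        self.n += 1

    def get(self, key, default=None):
        i = self._index(key)
        while self.keys[i] is not None:   # empty slot would be a search miss
            if self.keys[i] == key:       # search hit
                return self.values[i]
            i = (i + 1) % self.m          # different key: keep searching
        return default
```

With m=8, the integer keys 1, 9 and 17 all hash to index 1, so they are stored in slots 1, 2 and 3 in insertion order.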


As with separate chaining, the performance of a linear-probing hash table depends on the ratio N / M.

The meaning is different, however. With separate chaining N / M is the average list length, whereas here N / M is the fraction of the array that is occupied. The implementation resizes the array dynamically to keep this usage between 1/8 and 1/2.

As with separate chaining, dynamically resizing the array requires rehashing all the keys.
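A resizing policy along these lines could look like the sketch below. It is an assumption-laden illustration rather than the referenced implementation: the array doubles when it is half full and halves when usage drops to 1/8, keys are never None, and Python's built-in hash() is the hash function.

```python
class ResizingLinearProbingTable:
    def __init__(self, m=4):
        self.m, self.n = m, 0
        self.keys = [None] * m
        self.values = [None] * m

    def _insert(self, key, value):
        i = hash(key) % self.m
        while self.keys[i] is not None:
            if self.keys[i] == key:
                self.values[i] = value
                return
            i = (i + 1) % self.m
        self.keys[i], self.values[i] = key, value
        self.n += 1

    def _resize(self, new_m):
        # Indices depend on M, so every key has to be hashed again.
        old = [(k, v) for k, v in zip(self.keys, self.values) if k is not None]
        self.m, self.n = new_m, 0
        self.keys = [None] * new_m
        self.values = [None] * new_m
        for k, v in old:
            self._insert(k, v)

    def put(self, key, value):
        if self.n >= self.m // 2:       # would exceed 1/2: grow first
            self._resize(2 * self.m)
        self._insert(key, value)

    def get(self, key, default=None):
        i = hash(key) % self.m
        while self.keys[i] is not None:
            if self.keys[i] == key:
                return self.values[i]
            i = (i + 1) % self.m
        return default

    def delete(self, key):
        i = hash(key) % self.m
        while self.keys[i] is not None and self.keys[i] != key:
            i = (i + 1) % self.m
        if self.keys[i] is None:
            return                      # key not present
        self.keys[i] = self.values[i] = None
        self.n -= 1
        # Re-insert the rest of the cluster so no key becomes unreachable.
        i = (i + 1) % self.m
        while self.keys[i] is not None:
            k, v = self.keys[i], self.values[i]
            self.keys[i] = self.values[i] = None
            self.n -= 1
            self._insert(k, v)
            i = (i + 1) % self.m
        if 0 < self.n <= self.m // 8:   # usage fell to 1/8: shrink
            self._resize(self.m // 2)
```

Starting from M = 4, inserting the integer keys 1 to 5 grows the array to 16 slots, and deleting 5, 4 and 3 afterwards shrinks it back to 8.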

See the linked references for the code implementation details.

Origin blog.csdn.net/poppyl917/article/details/89408208