Python algorithms: hash tables

Today we will learn about hash tables, using Python syntax.

Hash table

A hash table is one of the most useful basic data structures.
Hash function: a hash function maps an input to a number. There is no discernible pattern to the numbers it outputs, but it must meet certain requirements:
1. It must be consistent: every time the input is the same, the result must be the same.
2. Ideally, it maps different inputs to different numbers.
We can find the exact storage location in the hash table for the following reasons:
1. The hash function always maps the same input to the same index
2. The hash function maps different inputs to different indexes
3. The hash function knows how big the array is and only returns a valid index
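
The properties above can be illustrated with a minimal sketch. The name `simple_hash` and the idea of summing character codes are assumptions made for illustration only; this is not how Python's built-in `hash` works.

```python
def simple_hash(key: str, size: int) -> int:
    """Toy hash function (hypothetical): map a string to an index in range(size)."""
    # Sum the character codes, then wrap the total into the valid index range.
    return sum(ord(ch) for ch in key) % size


size = 10
slots = [None] * size          # the array behind the hash table

index = simple_hash("tom", size)
slots[index] = 18              # store Tom's age at the computed index

# The same input always gives the same index, so the value can be found again.
assert simple_hash("tom", size) == index
print(slots[simple_hash("tom", size)])
```

This toy function is consistent and only returns valid indexes, but it does not always map different inputs to different indexes: "tom" and "mot" contain the same characters and therefore get the same index.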

A hash table is a data structure created by combining a hash function and an array. Unlike an array, which maps straight to memory, it contains additional logic: it uses the hash function to decide where each element is stored. It is also called a hash map, map, dictionary, or associative array. The hash table provided by Python is the dictionary, which can be created with the dict function.
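
A short example of using a Python dictionary as a hash table might look like this. The variable name `ages` and the keys are made up for illustration.

```python
# Create an empty hash table with dict() and add two entries.
ages = dict()
ages["tom"] = 18
ages["mary"] = 20

print(ages["tom"])        # look up a value by key: 18
print("mary" in ages)     # membership test: True
print(ages.get("bob"))    # get() returns None for a missing key
```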

Conflict (collision):

Example: we have a friend named Tom who is 18 years old, and we store his data in a hash table. We also have a friend named Mary who is 20 years old, and we store her data in the hash table as well. But we have another friend named Tom, who is 19 years old this year. If we again use tom as the key, the new entry is mapped to the same location as the first one, and afterwards we can only find that tom is 19 years old; the 18-year-old Tom has been overwritten. This is called a conflict (collision): two entries are assigned the same location in the hash table.

To solve this problem, we let both toms be mapped to the same location and store a list at that location. We can then find the information of both toms, but the lookup may be slower than looking up Mary, because the list has to be searched.
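
The list-per-location idea can be sketched as follows. The class `ChainedHashTable`, its method names, and the character-sum hash are hypothetical and chosen only for illustration. Python's built-in dict behaves differently: it overwrites the value when the same key is stored again.

```python
class ChainedHashTable:
    """Toy hash table (hypothetical) that stores a list of entries per position."""

    def __init__(self, size: int = 10) -> None:
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: str) -> int:
        # Illustrative hash: sum of character codes, wrapped to a valid index.
        return sum(ord(ch) for ch in key) % len(self.buckets)

    def put(self, key: str, value) -> None:
        # Append instead of overwriting, so earlier entries are kept.
        self.buckets[self._index(key)].append((key, value))

    def get_all(self, key: str) -> list:
        # Walk the list at this position; longer lists mean slower lookups.
        return [v for k, v in self.buckets[self._index(key)] if k == key]


table = ChainedHashTable()
table.put("tom", 18)
table.put("mary", 20)
table.put("tom", 19)

print(table.get_all("tom"))   # both Toms are kept
print(table.get_all("mary"))
```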

But we should also pay attention to the following:
1. The hash function is very important: it should map keys evenly to different positions of the hash table.
2. If a linked list stored in the hash table is very long, the speed of the hash table will drop sharply.

Performance:

The running time of simple search is linear time, O(n).
The running time of binary search is logarithmic time, O(log n).
On average, the time to perform an operation in the hash table is O(1), which is called constant time: regardless of whether the hash table contains 1 element or 10000 elements, the time required to obtain the data is the same.
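
The difference between these running times can be made concrete by counting inspected elements instead of measuring time. The helper names below are hypothetical, and the counts apply only to this particular input.

```python
def linear_search_steps(items: list, target) -> int:
    """Count how many elements a simple search inspects before finding target."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items: list, target) -> int:
    """Count how many elements binary search inspects in a sorted list."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


numbers = list(range(10000))
lookup = {n: True for n in numbers}   # the same keys stored in a dict

print(linear_search_steps(numbers, 9999))   # inspects every element
print(binary_search_steps(numbers, 9999))   # inspects roughly log2(10000) elements
print(lookup[9999])                         # dict lookup does not scan the elements
```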
When using a hash table, it is important to avoid the worst case. To do so, conflicts need to be avoided, which requires:
1. A low filling factor
2. A good hash function

Filling factor:

The filling factor (also called the load factor) is the number of elements in the hash table divided by the number of positions in the hash table. When the filling factor is greater than 1, the number of elements is greater than the number of positions, and positions need to be added to the hash table. This is called adjusting the length (resizing).
The smaller the filling factor, the lower the possibility of conflict, and the higher the performance of the hash table.
Rule of thumb: adjust the length once the filling factor is greater than 0.7.
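
Taking the filling factor to be the number of stored elements divided by the number of positions, adjusting the length could be sketched as below. The class `ResizingHashTable` is hypothetical, and doubling the array is an assumption made for this sketch, not something stated above.

```python
class ResizingHashTable:
    """Toy hash table (hypothetical) that grows once the filling factor exceeds 0.7."""

    MAX_FILLING_FACTOR = 0.7  # rule-of-thumb threshold

    def __init__(self, size: int = 4) -> None:
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def filling_factor(self) -> float:
        # Number of stored elements divided by number of positions.
        return self.count / len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # existing key: update in place
                return
        bucket.append((key, value))
        self.count += 1
        if self.filling_factor() > self.MAX_FILLING_FACTOR:
            self._resize()

    def _resize(self) -> None:
        # Double the number of positions (an assumption) and re-insert every
        # entry, because each index depends on the array length.
        old_entries = [entry for bucket in self.buckets for entry in bucket]
        self.buckets = [[] for _ in range(len(self.buckets) * 2)]
        for k, v in old_entries:
            self.buckets[hash(k) % len(self.buckets)].append((k, v))


table = ResizingHashTable()
for n in range(3):
    table.put(n, n * n)
print(len(table.buckets), table.filling_factor())
```

With 4 positions, the third insertion raises the filling factor to 0.75, so the table grows to 8 positions and the filling factor falls to 0.375.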

Good hash function:

A good hash function distributes the values evenly across the array; a bad hash function makes the values pile up, leading to a lot of conflicts.
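
The effect can be seen by counting how many keys land in each position. Both hash functions below are hypothetical and exist only to contrast a poor choice with a somewhat better one; neither is what Python uses internally.

```python
def bad_hash(key: str, size: int) -> int:
    """Hypothetical bad hash: uses only the first character of the key."""
    return ord(key[0]) % size


def better_hash(key: str, size: int) -> int:
    """Hypothetical better hash: mixes every character of the key."""
    total = 0
    for ch in key:
        total = total * 31 + ord(ch)
    return total % size


def bucket_sizes(hash_func, keys: list, size: int) -> list:
    # Count how many keys land in each position of the array.
    counts = [0] * size
    for key in keys:
        counts[hash_func(key, size)] += 1
    return counts


keys = ["tom", "tim", "ted", "tina", "troy", "tess", "theo", "toby"]
print(bucket_sizes(bad_hash, keys, 8))      # every key lands in one position
print(bucket_sizes(better_hash, keys, 8))   # keys are spread over several positions
```

Because every key starts with "t", the first function puts all of them in a single position, which turns each lookup into a search through one long list.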


Origin blog.csdn.net/Layfolk_XK/article/details/108306690