Hash lookup and a hash table implementation in Python

Copyright: sharing for study is welcome; please credit the source. https://blog.csdn.net/qq_23869697/article/details/90612439

Highly recommended reading: Problem Solving with Algorithms and Data Structures using Python, section 5.5 Hashing

Why hash lookup?

Lists and arrays are common linear structures in Python. When an array is created, a contiguous block of memory of fixed size is allocated to store the data.
A linked list is different: each node holds a data field and a pointer field, so the nodes do not have to be contiguous in memory. Wherever the next node lives, the current node always carries its location.
Back to the topic. With a contiguous, fixed-size block of memory, adding a number with append (amortized) and accessing an element by subscript both take O(1) time. But to check whether the list contains a given value, we have to scan it from beginning to end, which takes O(n) time. When n is very large, that query time becomes unacceptable.
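To make the contrast concrete, here is a minimal sketch of the three operations on a Python list. The helper name linear_contains and the sample numbers are made up for illustration; the built-in `in` operator on a list performs the same kind of scan.

```python
def linear_contains(items, target):
    """Linear search: compare target against each element in turn."""
    for item in items:
        if item == target:
            return True
    return False


numbers = [54, 26, 93, 17, 77, 31]
numbers.append(44)                       # append: O(1) amortized
third = numbers[2]                       # subscript access: O(1)
found = linear_contains(numbers, 44)     # membership: up to n comparisons, O(n)
missing = linear_contains(numbers, 99)   # worst case: every element is checked
```

In the worst case (the value is absent) every element is compared, which is where the O(n) cost comes from.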

Can the time complexity of the query also be brought down to O(1)?
The answer is yes: use a hash table.

A simple idea: given a value, can we compute where it should be stored directly from the value itself, and then store it there?
If so, a lookup becomes a single subscript access, which is O(1). Mathematics has the concept of a mapping, and we can define a mapping that associates each value with the position where it is stored.
Suppose we use the remainder as the mapping, i.e. y = x % len_list, where len_list is the number of items to be stored. With 10 values and x = 2, the storage position is y = 2 % 10 = 2.
So to check whether the list contains 2, we compute its index through the mapping above, take out the element at that index, and compare. The time complexity is only O(1).
In technical terms this mapping is called a hash function, although real hash functions are not as simple as my example.
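The remainder mapping can be sketched in a few lines. This is only an illustration under strong assumptions: keys are non-negative integers, and no two stored keys share a remainder (collisions are not handled here). The names SIZE, position, store and contains are hypothetical.

```python
SIZE = 10                    # number of slots (len_list in the text above)
slots = [None] * SIZE


def position(x):
    """The mapping y = x % len_list."""
    return x % SIZE


def store(x):
    # Assumption: nothing else has been stored at this position.
    slots[position(x)] = x


def contains(x):
    # One index computation plus one comparison: O(1).
    return slots[position(x)] == x


store(2)
store(17)
```

Note that contains(12) also inspects slot 2, finds the value 2 there, and correctly reports that 12 is absent. Storing 12, however, would overwrite 2, which is exactly the collision problem a real hash table has to deal with.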

Hash function

Designing a hash function means taking the space it uses into account.
Because values are mapped to positions, we also have to consider what happens when two or more values map to the same position, which is called a collision.
Finally, the hash function itself must not be too expensive to compute.
A perfect hash function is rarely achievable in practice; the guiding principles are to minimize the number of collisions, keep the computation cheap, and distribute items evenly across the hash table.
Common simple hash functions:
(1) Folding method
Take the phone number 436-555-4601. Split it into groups of two digits, 43, 65, 55, 46, 01, and add them up: 43 + 65 + 55 + 46 + 01 = 210. Suppose the hash table has 11 slots; 210 % 11 = 1, so 436-555-4601 hashes to slot 1.
(2) Mid-square method
Take 44. First compute 44^2 = 1936, then take the middle two digits, 93. Assuming the hash table has 11 slots, 93 % 11 = 5.
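Both methods can be sketched as small Python functions. The function names are made up for illustration, and the way the "middle two digits" are picked when the square has an odd number of digits is my own assumption, since the text only covers the four-digit case.

```python
def folding_hash(digits, size):
    """Folding method: split the digits into pairs, sum them, take the remainder."""
    digits = digits.replace("-", "")
    pieces = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    return sum(pieces) % size


def mid_square_hash(key, size):
    """Mid-square method: square the key, keep the middle two digits, take the remainder."""
    squared = str(key * key)
    mid = len(squared) // 2
    # Assumption: for an odd number of digits the pair leans to the left.
    middle = int(squared[max(mid - 1, 0):mid + 1])
    return middle % size


phone_slot = folding_hash("436-555-4601", 11)   # 43 + 65 + 55 + 46 + 1 = 210, 210 % 11
square_slot = mid_square_hash(44, 11)           # 44 * 44 = 1936, middle digits 93, 93 % 11
```

With the values from the text, phone_slot is 1 and square_slot is 5.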

Hash lookup

An important property of a hash table is that a lookup takes O(1) time, so it comes in handy whenever something needs to be looked up.
Duplicate checking / deduplication

  1. Given two sequences, find the elements in which they differ.
    Suppose both sequences have length n. Comparing them element by element with nested for loops takes O(n^2) time.
    With a hash table, traverse the first sequence to build the table, which takes O(n); then check whether each element of the second sequence is in the table, which also takes O(n). The total is O(2n), i.e. O(n), which is far more efficient.
  2. Determine whether a sequence contains duplicate elements.
    The idea is similar to 1.
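Both tasks can be sketched with Python's built-in set, which is itself hash-based. The function names are hypothetical, and reading task 1 as "elements of the second sequence that are missing from the first" is my interpretation of the description.

```python
def difference(first, second):
    """Elements of second that do not occur in first."""
    seen = set(first)                              # build the hash table: O(n)
    return [x for x in second if x not in seen]    # n lookups, O(1) each on average


def has_duplicates(items):
    """True as soon as some element shows up a second time."""
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False


only_in_second = difference([1, 2, 3, 4], [3, 4, 5, 6])
repeated = has_duplicates([1, 2, 3, 2])
```

Here only_in_second is [5, 6] and repeated is True. Each function makes a single pass over its input, compared with the nested loops of the O(n^2) approach.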

A hash table implementation in Python

Excerpt from http://interactivepython.org/courselib/static/pythonds/SortSearch/Hashing.html

class HashTable:
    def __init__(self):
        self.size = 11
        self.slots = [None] * self.size
        self.data = [None] * self.size

    def put(self, key, value):
        hashvalue = self.hashfunction(key, len(self.slots))
        if self.slots[hashvalue] is None:
            self.slots[hashvalue] = key
            self.data[hashvalue] = value
        else:
            if self.slots[hashvalue] == key:
                self.data[hashvalue] = value

            else:
                nextslot = self.rehash(hashvalue, len(self.slots))
                while self.slots[nextslot] is not None and self.slots[nextslot] != key:
                    nextslot = self.rehash(nextslot, len(self.slots))

                if self.slots[nextslot] is None:
                    self.slots[nextslot] = key
                    self.data[nextslot] = value
                else:
                    self.data[nextslot] = value

    def rehash(self, oldhash, size):
        return (oldhash + 1) % size

    def hashfunction(self, key, size):
        return key % size

    def get(self, key):
        startslot = self.hashfunction(key, len(self.slots))
        data = None
        found = False
        stop = False
        pos = startslot
        while self.slots[pos] is not None and not found and not stop:
            if self.slots[pos] == key:
                found = True
                data = self.data[pos]
            else:
                pos = self.rehash(pos, len(self.slots))
                # Back at the starting slot: every slot was probed and the key was not found
                if pos == startslot:
                    stop = True
        return data

    # Overload __getitem__ and __setitem__ so the table can be accessed with []
    def __getitem__(self, key):
        return self.get(key)

    def __setitem__(self, key, value):
        return self.put(key, value)


if __name__ == '__main__':
    H = HashTable()
    H[54] = "cat"
    H[26] = "dog"
    H[93] = "lion"
    H[17] = "tiger"
    H[77] = "bird"
    H[31] = "cow"
    H[44] = "goat"
    H[55] = "pig"
    H[20] = "chicken"

    print(H.slots)  # [77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
    print(H.data)  # ['bird', 'goat', 'pig', 'chicken', 'dog', 'lion', 'tiger', None, None, 'cow', 'cat']
    print(H[20])  # 'chicken'
    H[20] = 'duck'
    print(H[20])  # duck
    print(H[99])  # None
