Hash, hash function, hash algorithm analysis

Table of contents

Background

Hashing

Hash table

Perfect hash function

Applications

Example

The coolest application of hash functions: blockchain

Meaning

Essential characteristics

Design of hash functions

Folding method

Mid-square method

Non-numeric items

Hash collision resolution

Skip probing

Rehashing

Quadratic probing

Chaining: a compromise between hashing and lists, between O(1) and O(n)

Map abstract data type and its implementation

Abstract data type "map": ADT Map

Code example

Hash algorithm analysis

Sorting and searching summary


Background

Hashing

  • As discussed earlier, if the data items are sorted, binary search reduces the search complexity to O(log n)
  • Hashing goes further: it builds a new data structure that reduces the search complexity to O(1)
    • This requires more prior knowledge of where each data item is located

Hash table

  • Each storage location in a hash table is called a slot, and each slot has a unique name

Applying the remainder method (the data item modulo the table size) to each item gives its slot in the hash table

  • Load factor: the proportion of slots occupied by data items
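
The remainder method and the load factor can be sketched in a few lines. The table size of 11 and the sample items below are illustrative assumptions, not values taken from this post, and the function name is hypothetical.

```python
def remainder_hash(item, table_size):
    """Map an integer item to a slot index with the remainder method."""
    return item % table_size

table_size = 11                       # assumed table size
items = [54, 26, 93, 17, 77, 31]      # illustrative items that happen not to collide
slots = [None] * table_size
for item in items:
    slots[remainder_hash(item, table_size)] = item

# Load factor: occupied slots divided by the total number of slots
load_factor = sum(s is not None for s in slots) / table_size
print(slots)        # [77, None, None, None, 26, 93, 17, None, None, 31, 54]
print(load_factor)  # 6 of 11 slots are occupied
```

Searching for an item then only needs one hash computation and one slot lookup, which is where the O(1) behavior comes from as long as no two items land in the same slot.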

Perfect hash function

If a hash function maps every data item to a different slot, it is called a "perfect hash function"

But data items change often, so a perfect hash function is hard to design

A good hash function has the following characteristics:

  1. Few collisions (close to perfect)
  2. Low computational cost (small additional overhead)
  3. Data items spread evenly across the slots (saves space)


Applications

A hash function can serve as a "fingerprint function" for data, with these properties:

  1. Compressibility: the "fingerprint" obtained from data of any length has a fixed length
  2. Ease of calculation: it is easy to compute the "fingerprint" from the original data, while recovering the original data from the fingerprint is practically impossible
  3. Modification resistance: a minor change to the original data causes a large change in the "fingerprint"
  4. Collision resistance: even knowing the original data and its "fingerprint", it is very difficult to find (forge) different data with the same fingerprint

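Example

import hashlib

# In Python 3, hashlib works on bytes, so pass bytes literals (or encode str first)
hashlib.md5(b"hello world!").hexdigest()
hashlib.sha1(b"hello world!").hexdigest()
# The update method can also be used to feed the data in pieces
m = hashlib.md5()
m.update(b"hello world!")
m.update(b"this is part #2")
m.hexdigest()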

 

  • Store passwords as hash values rather than in plaintext
  • Detect file tampering
  • Lottery betting applications
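
As a rough illustration of tamper detection, one can record the fingerprint of the data once and compare it later. This sketch assumes SHA-256 from `hashlib`; the helper name `fingerprint` and the sample byte strings are hypothetical.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 fingerprint of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"transfer 100 to alice"   # hypothetical file contents
tampered = b"transfer 900 to alice"   # a one-character change

recorded = fingerprint(original)      # stored when the file is published

print(fingerprint(original) == recorded)  # True: the data is unchanged
print(fingerprint(tampered) == recorded)  # False: the change is detected
```

The fingerprint always has the same length regardless of how large the data is, so only a short string needs to be stored or published alongside the file.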

The coolest application of hash functions: blockchain

Meaning

A blockchain is a distributed database

Its nodes are connected through a network. Each node stores all the data of the entire database, and data written at any node is synchronized to the others

Essential characteristics

Decentralization: there is no control center or coordinating node. All nodes are equal, and no single party can control them

Proof of work: whoever contributes the most computational work gains control over modifications to the whole network

Isn't a hash value very easy to compute? Why is a massive amount of computation needed?

Computing a single hash is cheap, but a new block is only accepted when its hash meets a difficulty condition, and satisfying that condition takes a huge number of attempts. This difficulty limits the rate at which new blocks are generated, which makes synchronization across the distributed network easier
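
A toy version of this idea: keep trying nonces until the SHA-256 digest starts with a given number of zeros. This is only a sketch of the principle, not how any real blockchain encodes blocks; the function name `mine`, the payload string and the difficulty value are all assumptions.

```python
import hashlib

def mine(block_data: str, difficulty: int):
    """Search for a nonce whose digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

payload = "block #1: hypothetical payload"
nonce, digest = mine(payload, difficulty=3)
print(nonce, digest)
```

Each extra zero digit multiplies the expected number of attempts by 16, while checking a claimed nonce still costs a single hash. That asymmetry is what lets the network tune how fast blocks appear.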

Design of hash function

Folding method
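
The folding method is commonly described as: split the item into equal-sized pieces, add the pieces, then take the remainder by the table size. A minimal sketch under that assumption follows; the function name, the two-digit chunk size and the sample number are illustrative, not from this post.

```python
def folding_hash(number, table_size, chunk=2):
    """Split the digits into chunks, sum the chunks, take the remainder."""
    digits = str(number)
    pieces = [int(digits[i:i + chunk]) for i in range(0, len(digits), chunk)]
    return sum(pieces) % table_size

# 4365554601 -> 43 + 65 + 55 + 46 + 01 = 210, and 210 % 11 == 1
print(folding_hash(4365554601, 11))  # 1
```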

Mid-square method
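
The mid-square method is usually described as: square the item, take some of the middle digits, then take the remainder by the table size. The sketch below assumes two middle digits; the function name and the sample value are illustrative.

```python
def mid_square_hash(number, table_size, digits=2):
    """Square the number, take the middle digits, then take the remainder."""
    squared = str(number * number)
    start = (len(squared) - digits) // 2
    middle = int(squared[start:start + digits])
    return middle % table_size

# 44 * 44 = 1936, the middle two digits are 93, and 93 % 11 == 5
print(mid_square_hash(44, 11))  # 5
```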

Non-numeric items

Weighting each character by its position is a good way to tell anagrams apart, but it increases the amount of computation

The hash function must not become a computational burden on the storing and searching processes; otherwise one might as well use sequential search or binary search directly.
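
For strings, a simple approach is to sum the character codes; weighting by position is one way to separate anagrams. Both functions below are illustrative sketches with hypothetical names, assuming a table size of 11.

```python
def simple_hash(text, table_size):
    """Sum of character codes: anagrams always collide."""
    return sum(ord(ch) for ch in text) % table_size

def weighted_hash(text, table_size):
    """Weight each character code by its 1-based position in the string."""
    return sum(pos * ord(ch) for pos, ch in enumerate(text, start=1)) % table_size

print(simple_hash("cat", 11), simple_hash("act", 11))      # 4 4
print(weighted_hash("cat", 11), weighted_hash("act", 11))  # 3 5
```

The weighted version costs one extra multiplication per character, which is the added computation the text refers to.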

Hash collision resolution

Skip probing

Rehashing

The length of the hash table is usually set to a prime number, so that items are distributed evenly and a skip-probe sequence can reach every slot
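
To see why a prime table length helps, the sketch below lists the slots that repeated rehashing visits, assuming the rehash rule is (position + skip) % table size. The function name and the skip value of 3 are assumptions for illustration.

```python
def probe_sequence(start, skip, table_size):
    """Slots visited by repeatedly rehashing with (pos + skip) % table_size."""
    pos, visited = start, []
    for _ in range(table_size):
        visited.append(pos)
        pos = (pos + skip) % table_size
    return visited

# Prime table size: a skip of 3 eventually reaches all 11 slots
print(sorted(set(probe_sequence(0, 3, 11))))
# Table size 12 shares the factor 3 with the skip: only 4 slots are ever probed
print(sorted(set(probe_sequence(0, 3, 12))))  # [0, 3, 6, 9]
```

With a prime size, no skip value smaller than the table size shares a factor with it, so empty slots are never hidden from the probe sequence.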

Quadratic probing
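
In quadratic probing the offset from the home slot grows as 1, 4, 9, 16 and so on, rather than by a fixed skip. A minimal sketch, assuming integer keys and the remainder method as the base hash; the function name and the sample keys are hypothetical.

```python
def quadratic_probe_insert(slots, key):
    """Insert key using offsets 0, 1, 4, 9, ... from its home slot."""
    size = len(slots)
    home = key % size
    for i in range(size):
        pos = (home + i * i) % size
        if slots[pos] is None or slots[pos] == key:
            slots[pos] = key
            return pos
    raise RuntimeError("no free slot found along the probe sequence")

slots = [None] * 11
# 44, 55 and 66 all hash to slot 0 and are spread to slots 0, 1 and 4
positions = [quadratic_probe_insert(slots, k) for k in (44, 55, 66)]
print(positions)  # [0, 1, 4]
```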

 

Chaining (data item chains): a compromise between hashing and lists, between O(1) and O(n)
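
With chaining, each slot holds a small collection of items instead of a single item, so colliding keys simply share a slot. The class below is an illustrative sketch using Python lists as the chains; the class and method names are assumptions.

```python
class ChainedTable:
    """Hash table where each slot holds a list (chain) of (key, value) pairs."""

    def __init__(self, size=11):
        self.size = size
        self.chains = [[] for _ in range(size)]

    def put(self, key, value):
        chain = self.chains[key % self.size]
        for i, (k, _) in enumerate(chain):
            if k == key:              # key already present: replace its value
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def get(self, key):
        # Only the chain of the key's own slot has to be searched sequentially
        for k, v in self.chains[key % self.size]:
            if k == key:
                return v
        return None

t = ChainedTable()
t.put(44, "a")
t.put(55, "b")   # collides with 44 in slot 0 and joins the same chain
t.put(20, "c")
print(t.get(55), t.get(33))  # b None
```

Finding the slot is O(1), and the sequential search is limited to one chain, which is why the cost sits between O(1) and O(n) depending on how long the chains get.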

Map abstract data type and its implementation

Abstract data type "map": ADT Map

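Code example

# Only the constructor is shown in this excerpt. The usage lines further down
# assume that __setitem__ and __getitem__ are also defined on the class.
class HashTable:
    def __init__(self):
        self.size = 11
        self.slots = [None]*self.size  # keys
        self.data = [None]*self.size   # values stored for the keys

H=HashTable()
H[54]="cat"
H[24]="dog"
print(H.slots)
print(H.data)
print(H[24])  # dog
print(H[20])  # None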
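
One way the missing methods could look is sketched below, assuming integer keys, the remainder method as the hash function and linear probing (a step of 1) for collisions. The method names `hashfunction`, `rehash`, `put` and `get` are assumptions rather than the author's code.

```python
class HashTable:
    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size   # keys
        self.data = [None] * size    # values stored for the keys

    def hashfunction(self, key):
        return key % self.size

    def rehash(self, oldhash):
        return (oldhash + 1) % self.size   # linear probing

    def put(self, key, value):
        pos = self.hashfunction(key)
        for _ in range(self.size):
            if self.slots[pos] is None or self.slots[pos] == key:
                self.slots[pos] = key
                self.data[pos] = value
                return
            pos = self.rehash(pos)
        raise RuntimeError("hash table is full")

    def get(self, key):
        pos = self.hashfunction(key)
        for _ in range(self.size):
            if self.slots[pos] is None:    # empty slot: the key is absent
                return None
            if self.slots[pos] == key:
                return self.data[pos]
            pos = self.rehash(pos)
        return None

    # Allow H[key] = value and H[key]
    __setitem__ = put
    __getitem__ = get

H = HashTable()
H[54] = "cat"
H[24] = "dog"
H[35] = "bird"   # 35 % 11 == 2 collides with 24 and moves on to slot 3
print(H[24], H[35], H[20])  # dog bird None
```

This version does not support deleting keys; removing an item from a probed table needs extra care so that later lookups do not stop early at the freed slot.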

Hash algorithm analysis
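
The cost of a lookup depends mainly on the load factor λ. The functions below evaluate the standard textbook approximations for the average number of comparisons; these formulas are not stated in this post and are given here as an assumption about what the analysis covers. The function names are hypothetical.

```python
def linear_probing_comparisons(load_factor):
    """Approximate average comparisons for open addressing with linear probing."""
    successful = 0.5 * (1 + 1 / (1 - load_factor))
    unsuccessful = 0.5 * (1 + (1 / (1 - load_factor)) ** 2)
    return successful, unsuccessful

def chaining_comparisons(load_factor):
    """Approximate average comparisons when collisions are resolved by chaining."""
    successful = 1 + load_factor / 2
    unsuccessful = load_factor
    return successful, unsuccessful

print(linear_probing_comparisons(0.5))  # (1.5, 2.5)
print(chaining_comparisons(0.5))        # (1.25, 0.5)
```

Both results stay near a small constant while λ is low, and the linear probing figures grow quickly as λ approaches 1, which is why the table should not be allowed to fill up.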

Sorting and searching summary

Origin blog.csdn.net/yiweixiaomiandui/article/details/115253584