Hash algorithms (application scenarios)

1. Introduction to hash algorithms

A hash algorithm is a mapping rule that maps a binary string of arbitrary length to a binary string of fixed length. The fixed-length result is called the hash value.

 

2. Conditions a hash algorithm should meet

1. The original data cannot be derived from the hash value (which is why a hash algorithm is also called a one-way hash algorithm).
2. It is very sensitive to the input data: even if only one bit of the original data is changed, the resulting hash value is completely different.
3. The probability of a hash collision is very small: two different pieces of original data should almost never produce the same hash value.
4. It is as efficient as possible: even for long texts, the hash value can be computed quickly.
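
The fixed output length and the sensitivity to a one-bit change can be observed directly. The following is a minimal sketch that assumes SHA-256 from Python's standard `hashlib` module as the hash algorithm; the sample input and the helper `bit_difference` are illustrative choices, not part of the original article.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"hello world"
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
modified = bytes([original[0] ^ 0x01]) + original[1:]

digest_a = hashlib.sha256(original).digest()
digest_b = hashlib.sha256(modified).digest()

# The digest length is fixed (32 bytes) no matter how long the input is.
long_digest = hashlib.sha256(b"x" * 1_000_000).digest()
print(len(digest_a), len(long_digest))

# A one-bit change in the input produces an unrelated-looking digest.
print(digest_a.hex())
print(digest_b.hex())
print(bit_difference(digest_a, digest_b), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to differ after a one-bit change in the input.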

 

3. Application scenarios of hash algorithms

3.1. Secure encryption

    The most commonly used cryptographic hash algorithms are MD5 (MD5 Message-Digest Algorithm) and SHA (Secure Hash Algorithm). For encryption, two properties matter most:
     1. It is very hard to derive the original data from the hash value (the original purpose of encryption is to prevent data leakage).
     2. The probability of a hash collision is kept as small as possible. In theory, collisions cannot be avoided entirely. This follows from the pigeonhole principle: if there are 10 pigeonholes and 11 pigeons, at least one pigeonhole must hold more than one pigeon. Likewise, a fixed-length hash value can take only finitely many values, while the number of possible inputs is unlimited.
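
The pigeonhole argument can be made concrete with a deliberately tiny hash. The sketch below is only an illustration: `tiny_hash` is a hypothetical 8-bit hash (the first byte of a SHA-256 digest), so it has just 256 possible values, and hashing 257 distinct inputs is guaranteed to produce at least one collision.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Hypothetical 8-bit hash: the first byte of SHA-256 (256 possible values)."""
    return hashlib.sha256(data).digest()[0]

def find_collision(max_inputs: int = 257):
    """Hash distinct inputs until two of them share a hash value."""
    seen = {}  # hash value -> first input that produced it
    for i in range(max_inputs):
        data = str(i).encode()
        value = tiny_hash(data)
        if value in seen:
            return seen[value], data, value
        seen[value] = data
    return None  # unreachable with 257 distinct inputs and 256 possible values

first, second, value = find_collision()
print(first, second, value)
```

Real hash algorithms rely on the same principle in reverse: with 128 or 256 output bits, the number of "pigeonholes" is so large that finding two colliding inputs is meant to be impractical.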

    In real projects, to protect the data further, a salt is usually added: the original data and the salt are combined, and the hash value is computed over both together.
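
A minimal sketch of the "original data + salt" approach follows. It assumes a random 16-byte salt from `os.urandom` and SHA-256 as the hash algorithm (chosen here instead of MD5, for which practical collisions are known); the function names `hash_with_salt` and `verify` are hypothetical. For password storage specifically, a dedicated key-derivation function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` is generally preferred over a single hash pass.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def hash_with_salt(data: bytes, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Hash the salt and the data together; generate a random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)  # assumption: a 16-byte random salt
    digest = hashlib.sha256(salt + data).digest()
    return salt, digest

def verify(data: bytes, salt: bytes, expected: bytes) -> bool:
    """Recompute the salted hash and compare it in constant time."""
    _, digest = hash_with_salt(data, salt)
    return hmac.compare_digest(digest, expected)

# The salt is stored next to the hash value; it does not need to be secret.
salt, stored = hash_with_salt(b"my secret data")
print(verify(b"my secret data", salt, stored))
print(verify(b"wrong data", salt, stored))
```

Because every record gets its own salt, identical original data no longer produces identical hash values, which makes precomputed lookup tables far less useful to an attacker.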

 

3.2. Unique identification

    Suppose you want to find out whether a given image exists in a massive image library. Can you simply compare the images' meta-information (such as the image name)?
    The answer is no, because two images may have the same name but different content, or different names but the same content.
    So how can the search be implemented?
    
    Answer: read the binary data of the image and take 100 bytes from the beginning, 100 bytes from the middle, and 100 bytes from the end. Concatenate these 300 bytes and run them through a hash algorithm. The resulting hash value can serve as the image's unique identifier.
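
The sampling scheme described above might be sketched as follows. The function name `image_fingerprint`, the choice of SHA-256, and the handling of files shorter than 300 bytes (hash the whole file) are assumptions made for illustration; the article does not specify them.

```python
import hashlib
import os

SAMPLE_SIZE = 100  # bytes taken from the beginning, the middle, and the end

def image_fingerprint(path: str) -> str:
    """Hash 100 bytes from the start, middle, and end of a file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if size <= 3 * SAMPLE_SIZE:
            # Assumption: small files are hashed in full.
            sample = f.read()
        else:
            head = f.read(SAMPLE_SIZE)
            f.seek((size - SAMPLE_SIZE) // 2)
            middle = f.read(SAMPLE_SIZE)
            f.seek(size - SAMPLE_SIZE)
            tail = f.read(SAMPLE_SIZE)
            sample = head + middle + tail
    return hashlib.sha256(sample).hexdigest()
```

The fingerprints of all images in the library can then be kept in a dictionary or hash table mapping fingerprint to file path, so that looking up a new image is a single key lookup. Since only 300 bytes are sampled, two different images can in principle share a fingerprint; when a match is found, a full byte-by-byte comparison can confirm it.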

 

3.3. Data check

    Most of us have used download software such as eDonkey or BT. BT downloads are based on a P2P protocol: we download a 2GB movie from multiple machines in parallel, and the movie file may be split into many file blocks (for example, 100 blocks of about 20MB each). Once all the file blocks have been downloaded, they are assembled into the complete movie file.
    Network transmission is not secure. A downloaded file block may have been maliciously modified by the host machine it came from, or an error may have occurred during the download, so the block may be incomplete. If we cannot detect such malicious modifications or download errors, the merged movie may not play, and the computer may even be infected.
    The question now is: how do we check that the file blocks are safe, correct, and complete?
    We use a hash algorithm to compute the hash value of each of the 100 file blocks and store these values in the seed file (the torrent file). As noted earlier, a hash algorithm is very sensitive to its input: if the content of a file block changes even slightly, the computed hash value is completely different.
    So when a file block has been downloaded, we compute its hash value with the same hash algorithm and compare it with the hash value stored in the seed file. If the two differ, the file block has been tampered with or is incomplete, and it must be downloaded again from another host machine.
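
The block check can be sketched in a few lines. This is a simplified illustration, not the actual BitTorrent file format: the "seed file" is modeled as a plain list of hash values, SHA-256 is assumed as the hash algorithm, and the function names and the tiny 1KB block size are hypothetical.

```python
import hashlib
from typing import List

def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split the file content into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def build_seed(blocks: List[bytes]) -> List[str]:
    """Compute the per-block hash values that would be stored in the seed file."""
    return [hashlib.sha256(block).hexdigest() for block in blocks]

def corrupted_blocks(downloaded: List[bytes], seed_hashes: List[str]) -> List[int]:
    """Return the indexes of blocks whose hash does not match the seed file."""
    return [
        index
        for index, (block, expected) in enumerate(zip(downloaded, seed_hashes))
        if hashlib.sha256(block).hexdigest() != expected
    ]

# Publisher side: split the file and record one hash value per block.
movie = bytes(range(256)) * 40            # stand-in for the movie file (10,240 bytes)
blocks = split_blocks(movie, 1024)        # 10 blocks of 1KB each
seed_hashes = build_seed(blocks)

# Downloader side: block 3 arrives with one byte altered.
downloaded = list(blocks)
downloaded[3] = bytes([downloaded[3][0] ^ 0xFF]) + downloaded[3][1:]

print(corrupted_blocks(downloaded, seed_hashes))  # [3]
```

Only the blocks reported as corrupted need to be fetched again from another host; the blocks that passed the check are kept.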

 

Origin www.cnblogs.com/itall/p/11220620.html