Hash (hash function)

A hash function (the term is sometimes translated by meaning and sometimes transliterated directly from the English "hash") transforms an input of any length (also known as the pre-image) into a fixed-length output by means of a hash algorithm. That output is the hash value. The transformation is a compressing map: the space of hash values is usually much smaller than the space of inputs, so different inputs may hash to the same output, and the input cannot be uniquely determined from the hash value. Put simply, a hash function compresses a message of any length into a message digest of fixed length.
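As a small illustration of the two properties described above (fixed-length output, and different inputs sharing one output), here is a minimal Python sketch. SHA-256 from the standard hashlib module and the toy function `toy_hash` are chosen only for illustration; the text does not name a particular algorithm at this point.

```python
import hashlib


def digest_length(message: bytes) -> int:
    """Return the length in bytes of the SHA-256 digest of message."""
    return len(hashlib.sha256(message).digest())


def toy_hash(x: int, m: int = 7) -> int:
    """A deliberately tiny hash (hypothetical): maps any integer into 0..m-1."""
    return x % m


if __name__ == "__main__":
    # Inputs of very different lengths give digests of the same length.
    print(digest_length(b""), digest_length(b"a" * 10_000))
    # With only 7 possible outputs, distinct inputs must eventually collide.
    print(toy_hash(3), toy_hash(10))
```

Because `toy_hash` has only seven possible outputs, knowing the output alone cannot tell you which input produced it.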

Common HASH functions

·Direct remainder method: f(x) := x mod maxM, where maxM is generally a prime number that is not too close to a power of two (2^t).
·Multiplication rounding method: f(x) := trunc((x/maxX)*maxlongint) mod maxM; mainly used for real-valued keys.
·Mid-square method: f(x) := (x*x div 1000) mod 1000000; square the key and take the middle digits, so that each digit of the result carries more information about the key.
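The three formulas above can be sketched in Python as follows. The table size `MAX_M = 97` and the default `scale` (standing in for the largest long integer) are assumptions made for this example, not values taken from the text.

```python
import math

MAX_M = 97  # assumed table size: a prime that is not a power of two


def remainder_hash(x: int, max_m: int = MAX_M) -> int:
    """Direct remainder method: f(x) = x mod maxM."""
    return x % max_m


def multiplication_hash(x: float, max_x: float,
                        max_m: int = MAX_M, scale: int = 2**31 - 1) -> int:
    """Multiplication rounding method for real keys in the range 0..max_x.

    scale plays the role of the largest long integer (an assumed value).
    """
    return math.trunc((x / max_x) * scale) % max_m


def mid_square_hash(x: int) -> int:
    """Mid-square method: square the key, drop the low 3 digits, keep 6 digits."""
    return (x * x // 1000) % 1_000_000


if __name__ == "__main__":
    print(remainder_hash(1234))            # 1234 mod 97
    print(multiplication_hash(0.5, 1.0))   # a real key scaled into 0..96
    print(mid_square_hash(123456))         # middle digits of 123456 squared
```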
 

Construction method

Hash functions make access to a sequence of data more efficient: with a hash function, data elements can be located faster.
(For details of each construction method, see [Hash table construction method] under the hash function entry.)
1. Direct addressing method: take the key, or a linear function of the key, as the hash address. That is, H(key) = key or H(key) = a·key + b, where a and b are constants (such a hash function is called a self function).
2. Digit analysis method
3. Mid-square method
4. Folding method
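Two of the methods listed above can be sketched briefly in Python. Direct addressing follows the formula given in item 1. The folding function is the common textbook "shift folding" variant (split the key's decimal digits into fixed-size parts and add them); the text does not spell the method out, so the details here, including the name `fold_hash` and the part size, are assumptions.

```python
def direct_address(key: int, a: int = 1, b: int = 0) -> int:
    """Direct addressing: H(key) = a*key + b (H(key) = key when a=1, b=0)."""
    return a * key + b


def fold_hash(key: int, part_digits: int = 3) -> int:
    """Shift folding (assumed variant) for a non-negative integer key.

    Split the decimal digits into parts of part_digits digits, add the
    parts, and discard any carry beyond part_digits digits.
    """
    digits = str(key)
    parts = [int(digits[i:i + part_digits])
             for i in range(0, len(digits), part_digits)]
    return sum(parts) % (10 ** part_digits)


if __name__ == "__main__":
    print(direct_address(5, a=2, b=3))  # 2*5 + 3
    print(fold_hash(123456789))         # (123 + 456 + 789) with the carry dropped
```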

How to handle collisions

1. Open addressing method: Hi = (H(key) + di) MOD m, i = 1, 2, …, k (k <= m-1), where H(key) is the hash function, m is the length of the hash table, and di is an increment sequence, which can be chosen in the following three ways:
1). di = 1, 2, 3, …, m-1, called linear probing and rehashing;
2). di = 1^2, -(1^2), 2^2, -(2^2), 3^2, …, ±(k^2) (k <= m/2), called quadratic probing and rehashing;
3). di = a pseudo-random number sequence, called pseudo-random probing and rehashing.
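The probe formula Hi = (H(key) + di) MOD m can be sketched in Python as below. This is a minimal illustration, assuming integer keys and H(key) = key mod m (the remainder method); the function names are hypothetical, and only the linear and quadratic increment sequences are shown.

```python
from typing import Callable, Iterable, Iterator, List, Optional


def linear_increments(m: int) -> Iterator[int]:
    """di = 1, 2, 3, ..., m-1 (linear probing)."""
    yield from range(1, m)


def quadratic_increments(m: int) -> Iterator[int]:
    """di = 1^2, -(1^2), 2^2, -(2^2), ..., up to k <= m/2 (quadratic probing)."""
    for k in range(1, m // 2 + 1):
        yield k * k
        yield -(k * k)


def insert(table: List[Optional[int]], key: int,
           increments: Callable[[int], Iterable[int]]) -> int:
    """Store key in the first free slot of its probe sequence; return the slot."""
    m = len(table)
    home = key % m  # H(key): remainder method, assumed for this sketch
    for d in [0, *increments(m)]:
        slot = (home + d) % m  # Python's % keeps the result in 0..m-1
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found along the probe sequence")


if __name__ == "__main__":
    table: List[Optional[int]] = [None] * 7
    for key, probe in [(10, linear_increments),
                       (17, linear_increments),
                       (24, quadratic_increments)]:
        print(key, "->", insert(table, key, probe))
```

All three keys have the same home slot (3), so the second and third insertions show how each increment sequence moves a colliding key to a different free slot.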
 

Hash function application

Because hash functions are used in so many different ways, they are often designed for a specific application. For example, cryptographic hash functions assume an adversary who deliberately tries to find another input with the same hash value. A well-designed cryptographic hash function is a "one-way" operation: given a hash value, there is no practical way to compute an input that produces it, which makes forgery difficult. Functions designed for cryptographic purposes, such as MD5, are widely used as checksum functions. For example, when software is downloaded, the verification code published with the file is compared against the downloaded copy to confirm that the correct file was received, which helps protect the source files. Such a code may also change with environmental factors, such as changes in machine configuration or IP address.
Error detection and correction functions are primarily used to identify cases where data has been perturbed by random processes. When hash functions are used as checksums, relatively short hash values can verify that data of any length has not been altered.
Error correction
A hash function gives a straightforward way to detect errors in data transmission. On the sending side, a hash function is applied to the data to be sent, and the result is sent along with the original data. On the receiving side, the same hash function is applied to the received data. If the two results differ, an error occurred somewhere during transmission. This is called a redundancy check.
For error correction, the distribution of likely perturbations is assumed to be known, at least approximately. Perturbations to a message string fall into two categories: large (improbable) errors and small (probable) errors. For the second category we require the following: given H(x) and x+s, x can be computed efficiently as long as s is small enough. Such a hash function is called an error-correcting code. Two important classes of these codes are cyclic redundancy checks and Reed-Solomon codes.
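The sender/receiver procedure just described can be sketched in Python. CRC-32 from the standard zlib module is used here only as one example of a checksum function; the function names `send` and `receive` are hypothetical, and no real network transfer is involved.

```python
import zlib
from typing import Tuple


def send(data: bytes) -> Tuple[bytes, int]:
    """Sender: compute the checksum and transmit it along with the data."""
    return data, zlib.crc32(data)


def receive(data: bytes, checksum: int) -> bool:
    """Receiver: recompute the checksum and compare it with the one received."""
    return zlib.crc32(data) == checksum


if __name__ == "__main__":
    payload, checksum = send(b"hello")
    print(receive(payload, checksum))   # data arrived unchanged
    print(receive(b"jello", checksum))  # one byte was corrupted in transit
```

A mismatch only says that an error occurred somewhere; it does not say where, which is why this is detection rather than correction.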
Speech recognition
For an application such as matching an MP3 file against a known list, one possible solution is a traditional hash function such as MD5, but this approach is very sensitive to the compression algorithm used and to how volume adjustment is implemented. MD5-like methods are good for quickly finding audio files that are exactly identical (in terms of the file's binary data), but finding all files that contain the same audio (in terms of content) requires other, more advanced algorithms.
Perhaps contrary to expectation, hash functions that are robust to such small differences do exist. Most existing hashing algorithms are not robust enough, but a few are robust enough to identify music played through speakers in a noisy room. A practical example is the Shazam[1] service: the user dials a specific number and holds the phone's microphone close to the speaker playing the music. The service analyzes the music and compares it against known hash values stored in a database, and the user then receives the title of the identified music (for a fee).
Information security
The application of Hash algorithm in information security is mainly reflected in the following three aspects:
(1) File verification
The check algorithms we are most familiar with are the parity check and the CRC check. Neither can resist deliberate tampering: they can detect, and to some extent correct, channel errors in data transmission, but they cannot prevent malicious modification of the data.
The "digital fingerprint" property of the MD5 hash algorithm has made it one of the most widely used file integrity checksum algorithms, although MD5 is no longer considered collision-resistant. Many Unix systems provide a command for computing MD5 checksums.
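A file checksum of this kind can be computed with Python's standard hashlib module. The sketch below is a minimal example; the function name `file_md5` and the chunk size are choices made here, and the published checksum to compare against would come from whoever distributes the file.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 checksum of a file as a hexadecimal string.

    The file is read in chunks so that large files need not fit in memory.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()
```

To verify a download, compare the returned string with the checksum published alongside the file; any difference means the contents are not identical.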
(2) Digital signature
Hash algorithms are also an important part of modern cryptography. Because asymmetric algorithms are slow, one-way hash functions play an important role in digital signature protocols. Digitally signing the hash value of a file, also known as its "digital digest", can be considered statistically equivalent to digitally signing the file itself. Such a protocol has other advantages as well.
(3) Authentication Protocol
This authentication protocol, also called the challenge-authentication mode, is a simple and secure method when the transmission channel can be eavesdropped on but not tampered with. The above covers some basic background knowledge about hashing and related topics.