What Is Hash Calculation

A cryptographic hash function generally needs to have the following properties:

· The input to the function can be a string of arbitrary length;

· The output of the function has a fixed length;

· The function can be computed efficiently.

This definition is rather academic. Put simply, a hash function computes a fixed-length value from an arbitrary input string, much like assigning the input an ID number. The original data cannot practically be recovered from the hash value, that is, the function is one-way, which makes it suitable for some authentication scenarios. Because the hash value acts like an ID card for the data, it can also be used to check data integrity: even if the data changes only slightly, the recalculated hash value will differ from the previous one.
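
The fixed-length output and the sensitivity to small changes can be checked with a few lines of Python. This is a minimal sketch that assumes SHA-256 from the standard `hashlib` module as the hash function; the helper names `sha256_hex` and `differing_bits` are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

short_digest = sha256_hex(b"a")               # 1-byte input
long_digest = sha256_hex(b"a" * 1_000_000)    # 1,000,000-byte input
original = sha256_hex(b"hello world")
modified = sha256_hex(b"hello worle")         # only the last character changed

print(len(short_digest), len(long_digest))    # both are 64 hex characters (256 bits)
print(original == modified)                   # False
print(differing_bits(original, modified))     # how many of the 256 bits differ
```

Both digests have the same length regardless of input size, and a one-character change produces a different digest.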

Generally speaking, a hash function must satisfy the following three conditions to be cryptographically secure.

1) Collision resistance. In simple terms, collision resistance means that two different inputs should not produce the same output. It is like buying tickets at the cinema: people who paid real money for their tickets cannot be given the same seat number. Note, however, that collision resistance does not mean collisions do not exist; it means that the cost of finding two colliding inputs is prohibitively high. It is like brute-forcing a password that is valid for 20 years when the attack itself takes 30 years: the password is eventually cracked, but it has already expired, so the result is meaningless.

2) Information hiding. This property means that, given the output of the hash function, it is computationally infeasible to deduce the input, provided the input comes from a sufficiently large and unpredictable set. This is easy to understand in cryptographic terms: even if an adversary eavesdrops on a public channel (such as radio waves) and obtains the transmitted hash value, the adversary cannot recover the plaintext from it.

3) Puzzle friendliness. If someone wants the output of the hash function to equal a specific value (that is, the target output is known in advance), then as long as part of the input is sufficiently random, no matching input can be found in a reasonable amount of time by any method much better than trial and error. This property mainly guards against counterfeiting and imitation. Suppose concert tickets for a popular singer are very expensive, 10,000 yuan each, which gives rise to a trade in fake tickets. The tickets are public: everyone knows what they look like and what materials they use, which corresponds to knowing the output of a hash function. Puzzle friendliness means that counterfeiters know exactly what the output looks like but do not know which "raw materials" and "techniques" would produce an identical ticket.
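
Two of these conditions can be explored on a deliberately weakened hash. The sketch below is an illustration only: it truncates SHA-256 to a handful of bits (an assumption made so that the searches finish quickly), and `truncated_hash`, `find_collision`, `solve_puzzle` and the `b"block-header:"` prefix are hypothetical names, not part of any library.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, bits: int) -> int:
    """Return the first `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits: int):
    """Try inputs "0", "1", "2", ... until two share a truncated hash."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1   # the two inputs and the number of tries
        seen[h] = msg

def solve_puzzle(prefix: bytes, zero_bits: int) -> int:
    """Find a nonce such that SHA-256(prefix + nonce) starts with zero_bits zero bits."""
    for nonce in count():
        if truncated_hash(prefix + str(nonce).encode(), zero_bits) == 0:
            return nonce

# With only 16 output bits a collision must appear within 2**16 + 1 tries.
a, b, tries = find_collision(16)
print(a, b, tries)

# Hitting a 12-bit target takes about 2**12 tries on average, by trial and error.
nonce = solve_puzzle(b"block-header:", 12)
print(nonce)
```

Each additional bit roughly doubles the work of the nonce search, and the collision search grows with the square root of the output space. At the full 256 bits both searches are far beyond practical reach, which is what the "prohibitively high cost" above refers to.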

Note

Since the output of a hash algorithm has a fixed length while the original data can have any length, different inputs can in theory produce the same hash value. Such collisions become more likely as the amount of data grows very large. For example, in an anti-spam algorithm we generally calculate a hash value for each email address and store it in a filter library, but there are a great many email addresses in the world, in various formats. To cope with this, each address is hashed several times with different hash functions, and the resulting values are combined to determine whether an address is present. This is also the basic principle of the Bloom filter. In Bitcoin, Bloom filters are used to enable SPV nodes to quickly retrieve and return relevant data.
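
The idea of hashing each address several times and combining the results can be sketched as a small Bloom filter. This is a minimal illustration under stated assumptions: it derives its hash functions by prefixing a counter to the input of SHA-256, the class name `BloomFilter`, its parameters and the example addresses are made up, and it is not the scheme Bitcoin uses.

```python
import hashlib

class BloomFilter:
    """A tiny Bloom filter: k hash positions per item in a fixed-size bit array."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes different positions by prefixing a counter to the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely not present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

spam = BloomFilter()
spam.add("spam@example.com")
print(spam.might_contain("spam@example.com"))    # True
print(spam.might_contain("friend@example.com"))  # almost certainly False
```

An added item is always reported as possibly present. An item that was never added can occasionally be reported as present too (a false positive), which is the price paid for the compact storage.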

 

Types of Hash Algorithms


Hash algorithms commonly used in cryptography include MD5, SHA1, SHA2 (which includes SHA256 and SHA512), SHA3 and RIPEMD160. Each is briefly introduced below.

·MD5 (Message Digest Algorithm 5). MD5 takes input of variable length and outputs a fixed-length 128-bit digest. The message is padded so that its length in bits is congruent to 448 modulo 512, a 64-bit length field is appended, and the result is processed in 512-bit blocks over several rounds. The algorithm maintains four 32-bit values, which are finally combined into the 128-bit hash. MD5 has been widely used, but it is now known to be insecure: Professor Wang Xiaoyun demonstrated collisions for MD5 in 2004.

·SHA1. SHA1 has been widely used in many security protocols, including TLS and SSL. In February 2017, Google, together with CWI Amsterdam, announced the first practical SHA1 collision. Google's Chrome browser gradually downgraded the security indicator for SHA1 certificates and stopped supporting certificates that use the SHA1 hash algorithm.

·SHA2. This is the second generation of the SHA algorithm family. It supports longer digests, mainly SHA224, SHA256, SHA384 and SHA512; the numeric suffix indicates the length in bits of the digest each one generates.

·SHA3. As the name suggests, this is the third generation of the SHA algorithm family. It was previously known as the Keccak algorithm. SHA3 is not intended to replace SHA2, because SHA2 currently has no known significant weaknesses.

·RIPEMD-160 (RACE Integrity Primitives Evaluation Message Digest 160). RIPEMD160 is a 160-bit cryptographic hash function. It was designed to replace the 128-bit hash functions MD4, MD5 and the original RIPEMD.
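
The digest lengths of the algorithms listed above can be checked with Python's standard `hashlib` module. This is a small sketch; the helper `digest_bits` is a made-up name, and it assumes that some algorithms (for example `ripemd160`, or `md5` on restricted builds) may be unavailable depending on how the underlying OpenSSL library was built, in which case it returns `None`.

```python
import hashlib

ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384",
              "sha512", "sha3_256", "ripemd160"]

def digest_bits(name: str, data: bytes = b"abc"):
    """Return the digest length in bits, or None if the algorithm is unavailable."""
    try:
        return hashlib.new(name, data).digest_size * 8
    except ValueError:
        # hashlib raises ValueError for algorithms this build does not support.
        return None

sizes = {name: digest_bits(name) for name in ALGORITHMS}
for name, bits in sizes.items():
    print(f"{name:10s} {bits}")
```

For the SHA2 members the printed length matches the numeric suffix of the name.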

In fact, there are many hash algorithms besides those above, and some of them do not need strong cryptographic properties. One example is the consistent hashing commonly used in load balancing: its purpose is to quickly compute a digest of the server address rather than to provide security, so a faster, non-cryptographic hash algorithm is used instead.
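
As a rough illustration of consistent hashing with a fast non-cryptographic hash, the sketch below places server addresses on a ring using CRC32 from the standard `zlib` module. The class name `HashRing`, the `replicas` parameter, the server addresses and the key names are all assumptions made for the example, not a reference implementation.

```python
import bisect
import zlib

class HashRing:
    """A minimal consistent-hash ring using CRC32 as a fast, non-cryptographic hash."""

    def __init__(self, servers, replicas: int = 100):
        # Each server is placed on the ring at `replicas` pseudo-random points.
        self.ring = sorted(
            (zlib.crc32(f"{server}#{i}".encode()), server)
            for server in servers
            for i in range(replicas)
        )
        self.points = [point for point, _ in self.ring]

    def lookup(self, key: str) -> str:
        """Return the server owning the first ring point after the key's hash."""
        h = zlib.crc32(key.encode())
        idx = bisect.bisect_right(self.points, h) % len(self.ring)
        return self.ring[idx][1]

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
ring = HashRing(servers)
print(ring.lookup("user:42"))

# Removing a server only remaps the keys that server owned.
smaller = HashRing(servers[:2])
keys = [f"user:{n}" for n in range(1000)]
moved = sum(1 for k in keys if ring.lookup(k) != smaller.lookup(k))
print(moved)
```

CRC32 gives no protection against deliberately crafted collisions. That is acceptable here because the goal is only to spread keys quickly and evenly across servers.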
