## First, what is a hash collision?

Hashing maps different inputs to unique, fixed-length values (also called "hash values"). It is one of the most common operations in software.

If different inputs produce the same hash value, a "hash collision" occurs. For example, many network services use a hash function to generate a token that identifies a user's identity and permissions.

``````
AFGG2piXh0ht6dmXUxqv4nA1PU120r0yMAQhuc13i8
``````

The string above is a hash value. If two different users get the same token, a hash collision has occurred. The server will treat the two users as the same person, which means user B can read and change user A's information. That is undoubtedly a major security risk.

One hacking technique is to deliberately manufacture a hash collision, then use it to break into a system and steal information.

## Second, how to prevent hash collisions?

The most effective way to prevent hash collisions is to enlarge the value space of the hash.

With a 16-bit hash value, the chance that two given inputs collide is 1 in 65,536. That is, with 65,537 users a collision is guaranteed. If the hash value is extended to 32 bits, the chance drops to 1 in 4,294,967,296.
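The guarantee for a 16-bit value space can be tried out with a toy hash. The sketch below is only an illustration: `toyHash16` (32-bit FNV-1a truncated to 16 bits) and the `user0`, `user1`, ... names are assumptions made for this example, not a hash function or data taken from a real service.

```javascript
// Toy 16-bit hash: 32-bit FNV-1a truncated to its low 16 bits.
// Illustration only; this is not a secure hash function.
const toyHash16 = (text) => {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i) & 0xff;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime
  }
  return hash & 0xffff; // keep 16 bits: 65,536 possible values
};

// Hash "user0", "user1", ... until two different names share a hash value.
const findFirstCollision = (limit) => {
  const seen = new Map(); // hash value -> first name that produced it
  for (let i = 0; i < limit; i++) {
    const name = `user${i}`;
    const hash = toyHash16(name);
    if (seen.has(hash)) {
      return { first: seen.get(hash), second: name, hash, tried: i + 1 };
    }
    seen.set(hash, name);
  }
  return null;
};

// 65,537 inputs cannot fit into 65,536 values without a collision.
const collision = findFirstCollision(2 ** 16 + 1);
console.log(collision);
```

Because the search covers one more input than there are possible values, it always finds a collision. In practice it tends to stop far earlier, which is the effect the birthday problem describes.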

A longer hash value requires more storage and more computation, which affects performance and cost. Developers must make a trade-off between security and cost.

This article describes how to find the shortest hash length that still meets the security requirements.

## Third, the birthday attack

The probability of a hash collision depends on two factors (assuming a reliable hash function in which every value is equally likely to be generated).

• The size of the value space (i.e., the length of the hash value)
• The number of hash values calculated over the whole life cycle

This question has long had a mathematical prototype called the "birthday problem": how many people can a class have if every student's birthday is to be different?

The answer is surprising. If the probability that at least two classmates share a birthday is to stay around 5%, the class can have only about seven people (seven already gives roughly 5.6%). In fact, a class of 23 has a 50% probability that at least two classmates share a birthday; a class of 50 has a 97% probability, and a class of 70 a 99.9% probability (see below for the calculation).

This means that if the value space of a hash is 365, calculating just 23 hash values gives a 50% chance of a collision. In other words, hash collisions are far more likely than one might imagine. In fact, there is an approximate formula.

$$
Q(N) \approx \sqrt{\frac{\pi}{2} N}
$$

This formula gives the number of calculations needed for a hash collision probability of about 50%, where N is the size of the hash's value space. For the birthday problem N is 365, which gives 23.9. The formula tells us that the number of calculations needed to produce a hash collision is on the order of the square root of the value space.
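As a quick numerical check, the sketch below evaluates the square-root estimate for a few value spaces. It assumes the approximation has the form sqrt(πN/2), which is consistent with the figure of 23.9 quoted for N = 365; the function name `estimateCollisionCount` is made up for this illustration.

```javascript
// Rough number of hash calculations before a collision becomes likely,
// assuming the estimate sqrt(pi / 2 * N) for a value space of size N.
const estimateCollisionCount = (valueSpace) =>
  Math.sqrt((Math.PI / 2) * valueSpace);

const birthdayEstimate = estimateCollisionCount(365);   // the birthday problem
const estimate16Bit = estimateCollisionCount(2 ** 16);  // 16-bit hash
const estimate32Bit = estimateCollisionCount(2 ** 32);  // 32-bit hash

console.log(birthdayEstimate.toFixed(1));
console.log(Math.round(estimate16Bit));
console.log(Math.round(estimate32Bit));
```

The point of the comparison is the scale: a 32-bit space has more than four billion values, yet the estimate is only in the tens of thousands of calculations.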

Exploiting a hash space that is not large enough in order to manufacture collisions is called a "birthday attack".

## Fourth, the mathematical derivation

This section presents the mathematical derivation of the birthday attack.

To find the probability that at least two people share a birthday, first calculate the probability that everyone's birthdays are all different, then subtract it from 1.

Imagine the problem this way: everyone lines up and enters a room one at a time. For the first person to enter, the probability of not sharing a birthday with anyone already in the room (0 people) is `365/365`; for the second person, the probability of having a unique birthday is `364/365`; for the third it is `363/365`, and so on.

So the probability that everyone's birthdays are all different is given by the following formula.

$$
\bar{p}(n) = 1 \times \left(1 - \frac{1}{365}\right) \times \left(1 - \frac{2}{365}\right) \times \cdots \times \left(1 - \frac{n-1}{365}\right)
$$

In this formula, n is the number of people entering the room. As can be seen, the more people enter the room, the smaller the probability that all birthdays differ.
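Multiplying the per-person probabilities can be done directly with a loop. The following is a minimal sketch; the function name `exactCollisionProbability` and the parameter `d` for the number of possible birthdays are assumptions made for this example.

```javascript
// Exact probability that at least two of n people share a birthday,
// assuming d equally likely birthdays.
const exactCollisionProbability = (d, n) => {
  let allDifferent = 1; // probability that every birthday so far is unique
  for (let i = 0; i < n; i++) {
    allDifferent *= (d - i) / d; // person i + 1 must avoid i taken days
  }
  return 1 - allDifferent;
};

console.log(exactCollisionProbability(365, 7));
console.log(exactCollisionProbability(365, 23));
console.log(exactCollisionProbability(365, 366)); // more people than days
```

With 366 people one factor in the product is zero, so the function returns exactly 1, matching the intuition that a shared birthday is then unavoidable.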

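The formula for $\bar{p}(n)$, the probability that all $n$ birthdays are different, can be rewritten in the following form.

$$
\bar{p}(n) = \frac{365!}{365^n \, (365 - n)!}
$$

The probability that at least two people share a birthday is then 1 minus the formula above.

$$
p(n) = 1 - \frac{365!}{365^n \, (365 - n)!}
$$

## Fifth, the hash collision formula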

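The formula above can be taken further into a general form that is easy to calculate.

According to Taylor's formula, the exponential function $e^x$ can be expanded as a polynomial.

$$
e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots
$$

If x is very small, the expansion is approximately equal to the following form.

$$
e^x \approx 1 + x
$$

Now substitute the $-\frac{1}{365}$ of the birthday problem.

$$
e^{-\frac{1}{365}} \approx 1 - \frac{1}{365}
$$

The probability that all birthdays are different, $\bar{p}(n)$, therefore becomes the following.

$$
\bar{p}(n) \approx 1 \times e^{-\frac{1}{365}} \times e^{-\frac{2}{365}} \times \cdots \times e^{-\frac{n-1}{365}} = e^{-\frac{1 + 2 + \cdots + (n-1)}{365}} = e^{-\frac{n(n-1)}{2 \times 365}}
$$

$$
p(n) = 1 - \bar{p}(n) \approx 1 - e^{-\frac{n(n-1)}{2 \times 365}}
$$

Letting d be the size of the value space (365 in the birthday problem) gives the general formula.

$$
p(n, d) \approx 1 - e^{-\frac{n(n-1)}{2d}}
$$

This is the hash collision probability formula.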
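The two steps of this derivation can be checked numerically. The sketch below is an illustration only, using the birthday problem's values; the variable names are assumptions made for this example. It measures the error of replacing `1 - 1/365` with an exponential, and confirms that multiplying the exponentials one by one matches the closed form with exponent `-n(n-1)/(2d)`.

```javascript
const days = 365;  // size of the value space in the birthday problem
const people = 23; // number of people entering the room

// Step 1: for a small x, e^x is close to 1 + x.
const taylorError = Math.abs(Math.exp(-1 / days) - (1 - 1 / days));

// Step 2: the product of e^(-k/d) for k = 1 .. n-1 equals e^(-n(n-1)/(2d)),
// because the exponents add up to 1 + 2 + ... + (n-1) = n(n-1)/2.
let productOfExponentials = 1;
for (let k = 1; k < people; k++) {
  productOfExponentials *= Math.exp(-k / days);
}
const closedForm = Math.exp(-(people * (people - 1)) / (2 * days));

console.log(taylorError);
console.log(productOfExponentials, closedForm);
```

The error in step 1 is on the order of x squared over 2, which is why the approximation works well as long as n stays small compared with d.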

## Sixth, the application

The formula above can be written as a function.

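``````
// d: size of the value space, n: number of hash values calculated
const calculate = (d, n) => {
  const exponent = (-n * (n - 1)) / (2 * d);
  return 1 - Math.E ** exponent;
};

calculate(365, 23) // 0.5000017521827107
calculate(365, 50) // 0.9651312540863107
calculate(365, 70) // 0.9986618113807388
``````

With a value space of `62 ** 3`, that is, a hash of only three characters drawn from a 62-character alphabet (presumably digits plus upper- and lower-case letters), 10,000 calculations make a collision practically certain.

``````
calculate(62 ** 3, 10000) // 1
``````

If the hash is lengthened to five characters, the collision probability drops to about 5.3%.

``````
calculate(62 ** 5, 10000) // 0.05310946204730993
``````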

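A 22-character hash value is enough to guarantee that, over 300 trillion calculations, the probability of a collision is only about one in 100 billion. The commonly used SHA256 hash function produces a 64-character hash value in which each character ranges over 0~9 and a~f, so its collision probability is far lower still.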
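The same formula can be turned around to search for the shortest hash length that meets a target. The sketch below is a hedged illustration: the names `collisionProbability` and `shortestLength`, the default 62-character alphabet (matching the `62 ** 5` used above), and the upper bound of 128 characters are all assumptions made for this example. It uses `Math.expm1` instead of `1 - Math.E ** exponent` because very small probabilities would otherwise be rounded to 0.

```javascript
// Collision probability 1 - e^(-n(n-1)/(2d)), written with Math.expm1 so
// that very small probabilities keep their precision.
const collisionProbability = (d, n) => -Math.expm1((-n * (n - 1)) / (2 * d));

// Smallest hash length (in characters) whose collision probability stays at
// or below maxProbability after n calculations.
const shortestLength = (n, maxProbability, alphabetSize = 62, maxLength = 128) => {
  for (let length = 1; length <= maxLength; length++) {
    const d = alphabetSize ** length; // size of the value space
    if (collisionProbability(d, n) <= maxProbability) return length;
  }
  return null; // no length up to maxLength is enough
};

console.log(shortestLength(10000, 0.06)); // 10,000 calculations, at most 6%
console.log(shortestLength(3e14, 1e-10)); // 300 trillion calculations
```

A linear search is enough here because the loop runs at most `maxLength` times; the interesting design choice is only the use of `Math.expm1` for numerical accuracy.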

(End)

Origin www.cnblogs.com/appcx/p/11014587.html