Rainbow tables: a summary of "Making a Faster Cryptanalytic Time-Memory Trade-Off"

Any discussion of rainbow tables has to mention three people: Martin Hellman (co-author of the Diffie-Hellman key agreement algorithm), Rivest (the R in RSA), and Philippe Oechslin, hereinafter referred to as M, R, and P. Although the concept of the rainbow table was proposed by P, it rests on the enormous contributions of his predecessors.

A rainbow table is a precomputed result table used to crack a hash, that is, to recover the plaintext that was fed to the hash algorithm. It is not a table of plaintext/hash-value pairs.

Hash algorithms

The hash function of a hash algorithm is a mapping from a large set of values P to a smaller set Q of fixed-length values. The process roughly consists of padding, splitting into blocks, and bit operations, and it finally yields a value with a fixed number of bits. Every value p is mapped to exactly one, uniquely determined value q. Assuming the hash algorithm is H, a qualified hash algorithm must meet the following conditions:

  1. Computing q from p should be fast.
  2. It is one-way: from a plaintext p the hash value q = H(p) can be computed quickly, but it is hard to obtain p from q by mathematical methods.
  3. Given q = H(p), it is hard to find a different p1 such that q = H(p1).

Because hash algorithms are irreversible, cracking a ciphertext produced by a hash algorithm basically comes down to brute force, and the rainbow table is no exception. The most primitive brute-force approaches are exhaustive search and the dictionary method (see the references and websites listed below for details). Exhaustive search takes a great deal of time, with cracking times measured in hundreds of millions of years. The dictionary method takes a great deal of storage: if the plaintext space consists of 14 characters drawn from uppercase letters, lowercase letters, and digits, storing it needs on the order of 10^14 TB, which is obviously impractical.
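The orders of magnitude above can be checked with a few lines of arithmetic. This is only a rough sketch: the exact plaintext length of 14, the 30-byte entry size (14-byte plaintext plus a 16-byte digest), and the rate of 10^9 hashes per second are assumptions, not figures from the article.

```python
# Assumptions (not from the article): plaintexts are exactly 14 characters,
# one dictionary entry is a 14-byte plaintext plus a 16-byte digest,
# and a brute-force machine tests 10**9 candidates per second.
ALPHABET_SIZE = 26 + 26 + 10
LENGTH = 14
ENTRY_BYTES = 14 + 16
HASHES_PER_SECOND = 10**9

keyspace = ALPHABET_SIZE ** LENGTH
storage_tb = keyspace * ENTRY_BYTES / 10**12      # taking 1 TB as 10**12 bytes
years = keyspace / HASHES_PER_SECOND / (365 * 24 * 3600)

print(f"keyspace:   {keyspace:.3e} plaintexts")
print(f"dictionary: {storage_tb:.3e} TB")
print(f"exhaustive: {years:.3e} years")
```

Under these assumptions both figures land in the ranges quoted above: a few times 10^14 TB of storage, or a few hundred million years of computation.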

Hash chains

In 1980, M proposed a method that trades time against memory, the time-memory trade-off (TMTO). This was the first hash chain scheme. Assume a plaintext $P_1$, a hash algorithm $H$, and the resulting ciphertext $H_1$, i.e.

$$H_1 = H(P_1)$$

In a dictionary table this would be stored as a plaintext/hash-value pair. A hash chain instead uses a reduction function R. R maps a hash value back into the plaintext space, so its domain and range are those of H reversed (R is not the inverse of H). Continuing the example above, R maps the hash value $H_1$ of $P_1$ to another plaintext $P_2$, where $P_1$ is not equal to $P_2$. Hashing $P_2$ with H then gives $H_2$. The structure of a hash chain is therefore:

$$P_1 \xrightarrow{H} H_1 \xrightarrow{R} P_2 \xrightarrow{H} H_2 \xrightarrow{R} \cdots \xrightarrow{H} H_n \xrightarrow{R} P_{n+1}$$

The chain covers the n plaintext/hash-value pairs $(P_1, H_1), (P_2, H_2), \ldots, (P_n, H_n)$, but only the plaintexts at the beginning and end of the chain, $P_1$ and $P_{n+1}$, are stored.
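A minimal sketch of building one such chain is shown below. The choice of MD5, the 6-digit plaintext space, and the names `H`, `R`, and `build_chain` are assumptions made for illustration only.

```python
import hashlib

DIGITS = 6  # assumed toy plaintext space: 6-digit strings


def H(plaintext):
    """Hash function; MD5 is only an example choice."""
    return hashlib.md5(plaintext.encode()).digest()


def R(digest):
    """Toy reduction: digest -> integer -> 6-digit string."""
    return str(int.from_bytes(digest, "big") % 10**DIGITS).zfill(DIGITS)


def build_chain(start, n):
    """Apply H then R n times and keep only the pair (P_1, P_{n+1})."""
    p = start
    for _ in range(n):
        p = R(H(p))
    return start, p


start, end = build_chain("000000", 1000)
print(start, "->", end)
```

The 1000 intermediate plaintexts are thrown away. Only the two stored values are needed to regenerate them later.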

R function

The success rate of a hash chain table depends on how much of the plaintext space it covers, which shows the importance of a good R function. A good R function should:
1. produce as few collisions as possible when mapping back to the plaintext space;
2. like the hash function, map as randomly as possible over the plaintext space;
3. limit its output to the desired plaintext character set.
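The three properties can be checked empirically on a tiny plaintext space. The sketch below assumes MD5, a space of two lowercase letters, and a hypothetical `reduce_digest` function. It measures how much of the space is reached after one application of H followed by R.

```python
import hashlib
from itertools import product

CHARSET = "abcdefghijklmnopqrstuvwxyz"  # assumed allowed characters
LENGTH = 2                              # assumed length: 26**2 = 676 plaintexts


def reduce_digest(digest):
    """Write the digest, read as an integer, in base len(CHARSET)."""
    value = int.from_bytes(digest, "big")
    chars = []
    for _ in range(LENGTH):
        value, idx = divmod(value, len(CHARSET))
        chars.append(CHARSET[idx])
    return "".join(chars)


plaintexts = ["".join(c) for c in product(CHARSET, repeat=LENGTH)]
images = {reduce_digest(hashlib.md5(p.encode()).digest()) for p in plaintexts}

coverage = len(images) / len(plaintexts)
print(f"{len(images)} of {len(plaintexts)} plaintexts reached ({coverage:.1%})")
```

Every output stays inside the character set (property 3). A mapping that behaves randomly (property 2) is expected to reach roughly 1 - 1/e, about 63%, of the space in one step, so some collisions are unavoidable even for a good R.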

But if the chains in a table are too long, too many plaintexts collide. Once two chains collide, all their remaining values are identical, which causes a lot of unnecessary computation. P's paper notes that the time lost to such collisions accounts for more than 50%. The recommendation for hash chains is therefore, for N plaintexts, to use $N^{1/3}$ tables, each with $N^{1/3}$ chains, each chain containing $N^{1/3}$ plaintext/hash values.

Crack process

After the hash chain tables are built, what remains is to crack the ciphertext s.

First, map the ciphertext s into the plaintext space with the R function to get a value s1, and check whether s1 equals the end point of any chain. If it does, then with high probability the plaintext of s is $P_n$ on that chain. If no match is found, apply H and then R to s1 to get s2 and compare s2 with the end points of all chains. If there is a match, then with high probability the plaintext of s is $P_{n-1}$ on that chain. Repeat until a value equal to some chain's end point is found. If none is found, the lookup fails.
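The lookup can be sketched as follows, under assumed parameters (MD5, a 4-digit plaintext space, chains of 20 steps). The text only says the plaintext is on the matching chain "with high probability", so this sketch adds the implied confirmation step: rebuild the chain from its stored start point and check each plaintext against the target hash.

```python
import hashlib

DIGITS = 4       # assumed toy plaintext space: 4-digit strings
CHAIN_LEN = 20   # assumed n: number of H/R steps per chain


def H(p):
    return hashlib.md5(p.encode()).digest()


def R(digest):
    return str(int.from_bytes(digest, "big") % 10**DIGITS).zfill(DIGITS)


def build_table(starts):
    """Map each chain end point to the start points that reach it."""
    table = {}
    for start in starts:
        p = start
        for _ in range(CHAIN_LEN):
            p = R(H(p))
        table.setdefault(p, []).append(start)
    return table


def rebuild(start, target):
    """Walk a chain from its start; return the plaintext that hashes to target."""
    p = start
    for _ in range(CHAIN_LEN):
        if H(p) == target:
            return p
        p = R(H(p))
    return None  # false alarm: the end point matched, the chain did not


def lookup(target, table):
    candidate = R(target)                    # s1
    for _ in range(CHAIN_LEN):
        for start in table.get(candidate, []):
            found = rebuild(start, target)
            if found is not None:
                return found
        candidate = R(H(candidate))          # s2, s3, ...
    return None


table = build_table([str(i).zfill(DIGITS) for i in range(0, 10**DIGITS, 50)])
secret = R(H(R(H("0050"))))   # third plaintext on the chain starting at "0050"
print(lookup(H(secret), table) == secret)
```

A matching end point is only a hint, because R can send different hashes to the same plaintext. That is why `rebuild` may return `None` and the search continues.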

Distinguished points (checkpoints)

The checkpoint concept is R's optimization of M's hash chains. When cracking passwords, the biggest bottleneck is not processor time but memory access. Every plaintext computed along the way has to be compared with the end points of all chains, which causes a large number of memory accesses. To relieve this bottleneck, R proposed the checkpoint concept: define a rule for chain end points, and require every chain to end at a point that satisfies the rule. It is then no longer necessary to access memory at every step. If the plaintext just computed does not satisfy the end-point rule, no comparison is needed and the next plaintext is computed directly.
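A small sketch of the idea follows. The end-point rule used here (the plaintext ends in "0"), the cap of 200 steps, and all function names are assumptions chosen for the example. The lookup touches the table only once, when the walk reaches a point that satisfies the rule.

```python
import hashlib

DIGITS = 4      # assumed toy plaintext space: 4-digit strings
MAX_LEN = 200   # assumed cap on chain length


def H(p):
    return hashlib.md5(p.encode()).digest()


def R(digest):
    return str(int.from_bytes(digest, "big") % 10**DIGITS).zfill(DIGITS)


def is_distinguished(p):
    """Assumed end-point rule: the plaintext ends in '0'."""
    return p.endswith("0")


def build_chain(start):
    """Walk until the rule is satisfied; give up after MAX_LEN steps."""
    p = start
    for _ in range(MAX_LEN):
        p = R(H(p))
        if is_distinguished(p):
            return p
    return None


def lookup(target, table):
    """Return (plaintext or None, number of table accesses)."""
    candidate = R(target)
    for _ in range(MAX_LEN):
        if is_distinguished(candidate):
            # Only now is the table consulted: a chain cannot end anywhere else.
            for start in table.get(candidate, []):
                p = start
                for _ in range(MAX_LEN):
                    if H(p) == target:
                        return p, 1
                    p = R(H(p))
            return None, 1
        candidate = R(H(candidate))
    return None, 0


table = {}
for i in range(100):
    start = str(i).zfill(DIGITS)
    end = build_chain(start)
    if end is not None:          # chains that never hit the rule are discarded
        table.setdefault(end, []).append(start)

some_start = next(iter(table.values()))[0]
print(lookup(H(some_start), table))
```

The price is that chains now have variable length, and a chain that never reaches a qualifying point within the cap has to be thrown away.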

Rainbow tables

In fact, a rainbow table is an optimized hash chain table. The basic idea is unchanged, and the difference lies in the R function.
The optimization introduced by the rainbow table is that, within a chain, every R function should be different. Namely:

$$P_1 \xrightarrow{H} H_1 \xrightarrow{R_1} P_2 \xrightarrow{H} H_2 \xrightarrow{R_2} P_3 \xrightarrow{H} \cdots \xrightarrow{R_{t-1}} P_t$$

This clearly reduces chain collisions. If two chains produce the same plaintext in different columns, the next R function differs, so the next plaintexts differ. Two chains merge only when the same plaintext appears in the same column, and the probability of that is only 1/t (t is the chain length).
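The effect of column-dependent reduction can be seen in a short sketch. The family used here, which adds the column index before reducing, is an assumed example of "a different R per column". The helper `tail` and the toy parameters are hypothetical as well.

```python
import hashlib

SPACE = 10**4   # assumed toy plaintext space: 4-digit strings
T = 6           # assumed number of plaintext columns per chain


def H(p):
    return hashlib.md5(p.encode()).digest()


def R(digest, column):
    """Assumed family of reductions: R_column adds the column index."""
    return str((int.from_bytes(digest, "big") + column) % SPACE).zfill(4)


def tail(p, column):
    """Continue a chain from plaintext p sitting in `column` up to column T."""
    out = [p]
    for c in range(column, T):
        out.append(R(H(out[-1]), c))
    return out


print(tail("1234", 2))   # "1234" appears in column 2
print(tail("1234", 3))   # same plaintext in column 3: the chains do not merge
print(tail("1234", 2) == tail("1234", 2))   # same column: identical from here on
```

With a single R for every column, the first two tails would share every value after "1234".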
In what follows, t is the chain length and m is the number of chains.
P's paper says that the success rate of one rainbow table of size $mt \times t$ is about equal to the success rate of t hash chain tables of size $m \times t$, but no explanation is given in the paper.
However, without insisting on a formal proof, one can reason as follows: the t hash chain tables each use a different R function, and each of them corresponds to one column of the rainbow table, so the total plaintext coverage of the t hash chain tables should be similar to that of a rainbow table using t R functions.
But if that were all, it would not be much of an innovation. In theory, compared with hash chains, a rainbow table halves the cracking time, and it also removes a lot of collisions, which saves a further large share of the time. Walking one hash chain table from 1 to n has time complexity O(t), but since there are t tables, $t^2$ time is needed. Walking a rainbow table from 1 to n takes $t(t-1)/2$, so the time is reduced by approximately half.
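The $t(t-1)/2$ figure can be reproduced by counting reduction steps in a toy rainbow lookup. The sketch below assumes the same additive family of R functions as an example, a 4-digit plaintext space, and t = 50. It counts only the reductions spent walking candidates to the last column, which is one reasonable way to measure the work.

```python
import hashlib

SPACE = 10**4   # assumed toy plaintext space: 4-digit strings
T = 50          # assumed chain length t (plaintext columns 1..T)


def H(p):
    return hashlib.md5(p.encode()).digest()


def R(digest, column):
    """Assumed family of reductions: R_column adds the column index."""
    return str((int.from_bytes(digest, "big") + column) % SPACE).zfill(4)


def build_table(starts):
    table = {}
    for start in starts:
        p = start
        for column in range(1, T):
            p = R(H(p), column)
        table.setdefault(p, []).append(start)
    return table


def lookup(target, table):
    """Return (plaintext or None, reductions spent walking candidates)."""
    steps = 0
    for j in range(T - 1, 0, -1):        # guess: target is the hash of column j
        p = R(target, j)
        steps += 1
        for column in range(j + 1, T):
            p = R(H(p), column)
            steps += 1
        for start in table.get(p, []):   # end point matched: rebuild the chain
            q = start
            for column in range(1, T):
                if H(q) == target:
                    return q, steps
                q = R(H(q), column)
    return None, steps


table = build_table([str(i).zfill(4) for i in range(0, SPACE, 25)])
_, worst = lookup(H("no such plaintext"), {})
print(worst, T * (T - 1) // 2, T * T)    # prints "1225 1225 2500"
```

A failed search over an empty table costs exactly $t(t-1)/2 = 1225$ reductions, against $t^2 = 2500$ for t hash chain tables of length t.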

Following P's paper, one can use the configuration $t = N^{1/3}$, $m = N^{2/3}$. This should be the compromise between time and memory at the current stage. If there is a significant breakthrough in either direction in the future, I think this ratio will change.

Summary

Although precomputed hash chains or rainbow tables greatly reduce the cracking time, the amount of computation needed to build a rainbow table is not reduced at all. It is even larger. For a rainbow table to reach an acceptable success rate of more than 50 percent, the number of plaintexts computed must be at least the size of the whole plaintext space, because the values produced by the R function are not necessarily new plaintexts. So we often need several rainbow tables to improve the cracking success rate.

Success rate calculation

The success rate of a single rainbow table can be computed column by column, treating each column as a classical table (i.e. the hash chain table proposed by Hellman). In the first column of the rainbow table we have $m_1 = m$ different plaintexts, where m is the number of chains. The $m_1$ plaintexts are mapped into the second column, randomly distributed over a plaintext space of size N, and produce $m_2$ different plaintexts, where:

$$m_2 = N\left(1 - \left(1 - \frac{1}{N}\right)^{m_1}\right) \approx N\left(1 - e^{-m_1/N}\right)$$

In the same way, column i has $m_i$ different plaintexts, with $m_{i+1} = N\left(1 - \left(1 - \frac{1}{N}\right)^{m_i}\right)$. The success rate of one rainbow table is then:

$$P = 1 - \prod_{i=1}^{t}\left(1 - \frac{m_i}{N}\right)$$

To understand this formula: the search in a rainbow table starts from the last column. If the plaintext is not found there, the search moves to the second-to-last column, and so on. If it has still not been found when the first column has been searched, the search fails. The search fails in only one situation, namely when it fails in every column, so the final success rate is one minus the probability of failure. The probability of failure in column i is $1 - m_i/N$, so the probability that the search fails in all columns is $(1 - m_1/N)(1 - m_2/N) \cdots (1 - m_t/N)$, and the success rate is one minus this product.
The following is a Python function I wrote that computes the success rate of a rainbow table with plaintext space N, chain length t, and m chains:

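import math

def success_rate(N, t, m):
    """Success rate of one rainbow table: plaintext space N, chain length t, m chains."""
    # keys[i] is the expected number of distinct plaintexts in column i + 1.
    keys = [m]
    # log(1 - 1/N), computed with log1p so precision is not lost for large N.
    log_miss = math.log1p(-1 / N)
    for i in range(1, t):
        # m_{i+1} = N * (1 - (1 - 1/N) ** m_i)
        key_i = -N * math.expm1(keys[i - 1] * log_miss)
        keys.append(key_i)
    # The search fails only if the plaintext is missing from every column.
    result_fail = 1.0
    for key in keys:
        result_fail *= 1 - key / N
    return 1 - result_fail

if __name__ == "__main__":
    result_success = success_rate(7555858447479, 15200, 805306368)  # lm_ascii-32-65-123-4#1-7
    # result_success = success_rate(6704780954517120, 422000, 67108864 * 360)  # md5_ascii-32-95#1-8
    print(result_success)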

Success rate of multiple rainbow tables:
In fact, this is computed in the same way as for a single rainbow table: there are many ways to succeed but only one way to fail, namely that the query fails in every rainbow table. Because different rainbow tables use different sets of R functions, the tables do not influence one another. Let the success rates of the tables be $S_1, S_2, \ldots, S_n$. Since N, t, and m are the same for every table, every table has the same success rate S and the same failure probability 1 - S. The probability that the lookup fails in all n tables is $(1 - S)^n$, so the probability of success is $1 - (1 - S)^n$.
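The formula is easy to evaluate directly. In the sketch below, `combined_success` and `tables_needed` are hypothetical helper names, and the single-table rate of 0.558 is the figure quoted for the first table in this article.

```python
import math


def combined_success(s, n):
    """Success rate of n independent rainbow tables, each with success rate s."""
    return 1 - (1 - s) ** n


def tables_needed(s, target):
    """Smallest n with 1 - (1 - s)**n >= target, for 0 < s < 1 and 0 < target < 1."""
    return math.ceil(math.log(1 - target) / math.log(1 - s))


print(f"{combined_success(0.558, 8):.4f}")
print(tables_needed(0.558, 0.99))
```

Each extra table multiplies the failure probability by the same factor 1 - S, so the gain from every additional table shrinks quickly.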

Open questions

1. I checked this function against a lot of data (taken from RainbowCrack). For example, for the first table above, with a plaintext space of 7555858447479, it gives a success rate of 55.8%, and with eight such rainbow tables the computed success rate does indeed reach 99.9%. But for the second table, md5 over a plaintext space of 95 characters, the computed success rate differs widely from the expected one. If any expert knows why, please let me know.
2. The formula proposed by P rests on a premise: if a value produced by the R function is not in the plaintext space, then none of the values after it should be counted as valid plaintexts. But why would a binary string not be in the plaintext space, and why would it not re-enter the plaintext space after being hashed and mapped again?

References:
http://www.h-online.com/security/features/Hellman-and-Rivest-746294.html
https://blog.csdn.net/Saintyyu/article/details/102583941
Philippe Oechslin, Making a Faster Cryptanalytic Time-Memory Trade-Off


Origin blog.csdn.net/biziwaiwai/article/details/105374501