The Story of Hashtable and Hash

Background:

  The hashtable is a very important data structure. I first heard of it a long time ago and have run into it repeatedly since. I looked up material on it in the past and gained some understanding, but the concept never became really clear to me. Today I watched Screen's explanation of hashtables and learned their basic characteristics. Combining that with what I had learned earlier about hash algorithms such as MD5, I finally formed some understanding of my own of this data structure.

Some concepts I picked up while learning about hashtables earlier:

  Container: a hashtable is a container for storing data. The data can be arbitrary, for example a pile of scattered integers (1, 333, 566, 7342, 6565, 3342, 25, 34242, 436, 657, 879, 6589, 6, 8, 654, 69, 9045, ...). Each value is placed in the container after a modulo (remainder) operation. Modulo is a very important concept here, and it is also exactly what kept me puzzled about hashtables for so long.

  Bucket: a very important concept of the hashtable. A hashtable is made up of buckets, and it does not store the same data twice; this is a defining feature of the hashtable.

  Handling data with the same remainder: since the position of a value inside the hashtable is obtained by taking the remainder, two different values can easily end up with the same result, e.g. 100000001 % 10 == 10001 % 10 (10 here is bucket_size). The hashtable handles this by storing a chain (a linked list) in each bucket; values that map to the same bucket are appended to that bucket's list.
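  The three concepts above (container, bucket, chain) can be sketched in a few lines of C++. This is only a minimal illustration, not how any real library implements it; the class name IntHashTable and its member functions are made up for this example, and the keys are assumed to be non-negative integers.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical minimal hashtable for non-negative integers:
// a vector of buckets, where each bucket is a linked list (the "chain").
class IntHashTable {
public:
    explicit IntHashTable(std::size_t bucket_size) : buckets_(bucket_size) {
        assert(bucket_size > 0);
    }

    // The hash function here is simply "take the remainder".
    std::size_t bucket_index(unsigned long long key) const {
        return static_cast<std::size_t>(key % buckets_.size());
    }

    // Returns false if the key is already stored (no duplicates are kept).
    bool insert(unsigned long long key) {
        std::list<unsigned long long>& chain = buckets_[bucket_index(key)];
        for (unsigned long long stored : chain) {
            if (stored == key) return false;
        }
        chain.push_back(key);
        return true;
    }

    // Looks only inside the one bucket the key maps to.
    bool contains(unsigned long long key) const {
        const std::list<unsigned long long>& chain = buckets_[bucket_index(key)];
        for (unsigned long long stored : chain) {
            if (stored == key) return true;
        }
        return false;
    }

    // Number of keys chained in one bucket.
    std::size_t chain_length(std::size_t index) const {
        return buckets_[index].size();
    }

private:
    std::vector<std::list<unsigned long long>> buckets_;
};
```

  With bucket_size 10, the keys 100000001 and 10001 both map to bucket 1, so after inserting both, chain_length(1) is 2. Inserting 10001 a second time returns false.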

Something I came to understand later, after learning about the hash algorithms used in cryptography (what they have in common with the above is the hash algorithm, which is today's main insight):

  MD5: MD5 turns a sequence of 0s and 1s of any length into a 128-bit sequence, i.e. 16 bytes, usually written as 32 hexadecimal characters. Strictly speaking this is hashing rather than encryption, because the result cannot be decrypted back into the input. The important point is that the process uses a hash algorithm, so inputs of different lengths all become 128 bits. Even the single character 'a' becomes 128 bits after hashing, and the string "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" also becomes 128 bits, although the string is obviously much larger than 128 bits (16 bytes). I find this very interesting.
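  The "any length in, fixed length out" property can be tried out without an MD5 library. The sketch below uses FNV-1a (64-bit), a simple non-cryptographic hash, purely as a stand-in: it is not MD5, its output is 64 bits rather than 128, and it must not be used for security. The function name fnv1a_64 is my own choice for this example.

```cpp
#include <cstdint>
#include <string>

// FNV-1a (64-bit): a simple NON-cryptographic hash, used here only as a
// stand-in for MD5 to show that the output size does not depend on the input.
std::uint64_t fnv1a_64(const std::string& input) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : input) {
        hash ^= byte;               // mix in one input byte
        hash *= 1099511628211ULL;   // multiply by the FNV prime (wraps mod 2^64)
    }
    return hash;
}
```

  Whether the input is "a" or a long run of 'a' characters such as std::string(85, 'a'), the return value always occupies exactly 8 bytes.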

With this understanding, I then came back to the hashtable:

  ① As described above, MD5 produces 128 bits of data. What does 128 bits have to do with a hashtable? In effect, 2^128 == bucket_size. It is a bit amazing that bucket_size can be that large. Seen through the concept of bucket_size, we can also ask why MD5 uses 128 bits; my own guess is that it is for security reasons. If the output were small (e.g. 8 bits, 2^8 = 256 possible values), then hashing 257 different strings would be bound to produce two identical results, and an algorithm that unsafe would not deserve to be called MD5. (Of course, MD5 does not become relatively safe through its 128-bit length alone; the mathematics behind the algorithm matters more.) Consider the extreme case: given 2^128 + 1 different strings, MD5 must produce the same result for at least two of them.
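  The 8-bit case can be checked directly. The sketch below assumes a deliberately weak, made-up hash (tiny_hash, the sum of the bytes modulo 256) and feeds it the 257 distinct strings "0" through "256". With only 256 possible outputs, at least two inputs must share a value.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical 8-bit hash: sum of the bytes modulo 256,
// so there are only 2^8 = 256 possible results.
std::uint8_t tiny_hash(const std::string& input) {
    unsigned sum = 0;
    for (unsigned char byte : input) sum += byte;
    return static_cast<std::uint8_t>(sum % 256);
}

// Hashes the 257 distinct strings "0", "1", ..., "256" and returns the first
// pair of inputs that share a hash value. With 257 inputs and 256 possible
// outputs, such a pair must exist (pigeonhole principle).
std::pair<std::string, std::string> first_collision() {
    std::array<std::string, 256> seen;  // seen[h] = first input with hash h
    std::array<bool, 256> used{};       // all false initially
    for (int i = 0; i <= 256; ++i) {
        const std::string input = std::to_string(i);
        const std::uint8_t h = tiny_hash(input);
        if (used[h]) return {seen[h], input};
        used[h] = true;
        seen[h] = input;
    }
    return {"", ""};  // unreachable: 257 inputs cannot fill 256 slots uniquely
}
```

  With this particular hash the first collision shows up long before the 257th input ("11" and "20" both give 98). The pigeonhole argument only guarantees that a collision can no longer be avoided once there are 257 inputs.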

  ② The bucket_size of MD5 gave me a new understanding of the hashtable: the core of a hashtable is actually its hash algorithm (called a hash function in C++). Taking the remainder, mentioned above, is itself a hash algorithm (it is so simple that it caused me a lot of confusion when I was teaching myself hashtables). In C++, we can use a very common container, vector, to represent the hashtable (vector_size == bucket_size), and that is already enough to express a hashtable. The point I am trying to make is that constructing a hashtable is very simple: it is nothing more than a container, and we could even use a list to represent it. What matters is this: when we have a pile of unorganized data (which may not even be a basic data type), how do we store it in the hashtable (i.e. the vector)? This is where the hash function is especially important. For example, MD5, mentioned above, can turn a string of almost arbitrary length into a 128-bit value. Unorganized data becomes a fixed length of 128 bits after the hash function (in a C++ program the bucket count is normally far smaller than 2^128, because the vector behind the hashtable grows gradually, each time to a prime number near twice the current size). That is amazing, and it underlines once more the importance of the hash function (implementing a hash function may not be hard, but the concept is very important).
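  The two ideas in ② (hash function first, then remainder; bucket count growing to a prime near twice the old size) can be sketched as follows. The helper names is_prime, next_bucket_size and bucket_for are made up for this illustration, and the growth rule is only my reading of "a prime near twice the size", not the code of any particular standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Trial-division primality test; fine for the small sizes in this sketch.
bool is_prime(std::size_t n) {
    if (n < 2) return false;
    for (std::size_t d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Hypothetical growth policy: the smallest prime that is at least
// twice the current bucket count.
std::size_t next_bucket_size(std::size_t current) {
    std::size_t candidate = current * 2;
    while (!is_prime(candidate)) ++candidate;
    return candidate;
}

// Two-step mapping: the hash function turns a key of any length into a
// fixed-size number, then the remainder picks a slot in the vector.
std::size_t bucket_for(const std::string& key, std::size_t bucket_size) {
    assert(bucket_size > 0);
    return std::hash<std::string>{}(key) % bucket_size;
}
```

  For example, next_bucket_size(53) is 107 and next_bucket_size(97) is 197. bucket_for always returns an index smaller than bucket_size, but the exact index depends on the standard library's std::hash.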

  ③ Finally, C++ provides data structures such as unordered_set and unordered_map whose underlying implementation is a hashtable. The implementation uses a vector as the buckets, and the most important thing for these hashtable-backed containers is the hash function that is passed in (as a function object, also called a functor).
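  Passing in a hash function looks roughly like this. The key type Person and the functor PersonHash are hypothetical names for this example; std::unordered_set, std::hash, bucket() and bucket_count() are standard library facilities. The way the two hashes are combined is a simple choice for a sketch, not a recommended high-quality combiner.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical user-defined key type for illustration.
struct Person {
    std::string name;
    int age;
    // The container also needs equality to tell keys in one bucket apart.
    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// Hash functor: a class whose operator() computes the hash of a key.
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        const std::size_t h1 = std::hash<std::string>{}(p.name);
        const std::size_t h2 = std::hash<int>{}(p.age);
        return h1 ^ (h2 << 1);  // simple combination, adequate for a sketch
    }
};

// The functor is passed as the second template argument.
using PersonSet = std::unordered_set<Person, PersonHash>;
```

  After inserting Person{"Ann", 30} and Person{"Bob", 25} into a PersonSet, inserting Person{"Ann", 30} again has no effect, and bucket(key) reports which of the bucket_count() buckets a key landed in.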

 


Origin www.cnblogs.com/Ccluck-tian/p/11930365.html