HashMap (JDK 1.7): The Most Detailed Analysis of Its Principles (Part 1)

No public: large sub-trotters programmer

In JDK 1.7, HashMap is implemented with an array plus linked lists. The principle is as follows:

HashMap<String, String> map = new HashMap<>(); // pseudo-initialization
map.put("键", "值"); // real initialization ("键" means key, "值" means value)


When the HashMap constructor runs, it only sets up an empty table array of size 0.
When put runs, it first checks whether the table is empty; if it is, the real initialization happens at that point. This is also called lazy initialization.
During real initialization, the default array size is 16. You can of course call the HashMap constructor that takes a parameter to specify the initial capacity yourself, but note that you do not really have the final say. For example, if you ask for an initial capacity of 6, HashMap creates an array of size 8; if you ask for 20, HashMap creates an array of size 32. In other words, when you ask for an array of size n, HashMap creates an array whose size is the smallest power of 2 greater than or equal to n. We will get to the reason later.
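
The rounding described above can be sketched as follows. This is a minimal illustration rather than a copy of the JDK source: the class name CapacityDemo and the constant MAXIMUM_CAPACITY (assumed here to be 2^30) are chosen for the example, and only Integer.highestOneBit is a real JDK method.

```java
// Hypothetical demo class for illustration only.
class CapacityDemo {
    // Assumed upper bound on the table size for this sketch.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of two that is >= number (1 for number <= 1).
    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        // The highest set bit of (number - 1) << 1 is exactly the target power of two.
        return (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOf2(6));  // 8
        System.out.println(roundUpToPowerOf2(20)); // 32
        System.out.println(roundUpToPowerOf2(16)); // 16
    }
}
```

A requested capacity that is already a power of 2, such as 16, is kept as it is.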

For the put method, once the table has been initialized (or did not need to be), its main task is to store the key and value in the array or in a linked list. So how does a key-value pair get stored into the array?
We know that to store data in an array we first need an array index, yet put takes no parameter that could serve as one. So how does HashMap get the index? The answer is a hash algorithm, which is why the class is called HashMap rather than some other kind of Map.

Anyone who has looked into hashing knows that it relies on a hash function, which takes one argument and returns a hash code. A hash function has two characteristics: for the same argument it must always return the same hash code, and for different arguments it should return different hash codes as far as possible. The flip side is that different arguments may still end up with the same hash code; this is called a hash collision or hash conflict.
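
To make the idea of a collision concrete, here is a small sketch using String.hashCode. The class name CollisionDemo and the helper collide are made up for this example; the strings "Aa" and "BB" are a well-known pair that hash to the same value.

```java
// Hypothetical demo class for illustration only.
class CollisionDemo {
    // True when two different strings share one hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Same argument, same hash code, every time.
        System.out.println("Aa".hashCode() == "Aa".hashCode()); // true
        // Different arguments, yet the same hash code: a hash collision.
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println(collide("Aa", "BB")); // true
    }
}
```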

It seems we could then use this hash code directly as the array index. But there is another very important question: should we hash the key, hash the value, or hash the key and value together?

Here we need to consider the get method. get takes only a key as its argument, and its logic is to hash the key, quickly obtain the array index, and so quickly find the value that corresponds to the key. So although put receives two arguments, it can only hash the key to obtain the array index; otherwise get could not find the entry again quickly.

But there is still a problem: can the hash code be used directly as an array index?
A hash code is usually a fairly large number, for example:

System.out.println("键".hashCode()); // 38190
// As for why this is the result, see the hashCode method in the String class

We cannot use such a large number as an array index, so what can we do?
You would probably think of a modulo operation, but HashMap does not use modulo. Instead it uses:

static int indexFor(int h, int length) {
	// assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
	return h & (length-1);
}

This is the method that computes the array index in JDK 1.7 HashMap. (Do both put and get need to compute the index? Yes. If that is not clear at this point, go back and think about how get uses the key to find the value quickly, as described above.) In this method, h is the hash code and length is the length of the array. We see that it uses a bitwise AND, which raises the question: can a bitwise AND really produce a valid array index? Let's work it out. Assume the hash code is 0101 0101 (in binary) and length is 0001 0000 (16 in binary). Then h & (length-1) is:

h:  0101 0101
15: 0000 1111
  &
    0000 0101

Let's discuss the range of values this operation can produce. The high four bits of 15 are 0 and the low four bits are 1, and a bitwise AND yields 1 only where both bits are 1. So the high four bits of the result are always 0, and the low four bits of the result are the same as the low four bits of h. The result therefore ranges over the possible values of the low four bits of h, 0000 to 1111, that is 0 to 15, which matches the valid index range of the array.
Now suppose length is 17. Then h & (length-1) is:

h:  0101 0101
16: 0001 0000
  &
    0001 0000

When length is 17, the operation can produce only two values, either 0000 0000 or 0001 0000, which is not good: only two of the 17 slots would ever be used.

So whether the hash code can be converted into an index covering the full range of array indices depends heavily on length. If length is 16, then subtracting one gives 15 (0000 1111). It is exactly this kind of number, with all high bits 0 and all low bits 1, that lets any hash code pass through a bitwise AND and come out as the array index we want. That is why, at real initialization, the length of the HashMap array must be a power of 2: powers of 2 are closely tied to the index calculation, and this bitwise operation is faster than modulo.
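
The two cases above can be checked with a short experiment. This is only a sketch: the class IndexDemo and the helper reachableIndexes are invented for the example, and the range of hash codes tried (0 to 9999) is arbitrary.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class for illustration only.
class IndexDemo {
    // Same logic as the indexFor method quoted earlier in the article.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Collects every distinct index produced for hash codes 0..9999.
    static Set<Integer> reachableIndexes(int length) {
        Set<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 10_000; h++) {
            seen.add(indexFor(h, length));
        }
        return seen;
    }

    public static void main(String[] args) {
        // With a power-of-two length, the mask agrees with modulo for non-negative h.
        for (int h = 0; h < 10_000; h++) {
            if (indexFor(h, 16) != h % 16) {
                throw new AssertionError("mismatch at " + h);
            }
        }
        System.out.println(reachableIndexes(16).size()); // 16
        System.out.println(reachableIndexes(17));        // [0, 16]
    }
}
```

With length 16 every slot is reachable, while with length 17 only indexes 0 and 16 ever appear.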

So we can now summarize: when put is called, the key is hashed to get a hash code, a bitwise AND turns it into an array index, and finally the key-value pair is stored at that index.
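
As a rough sketch of that flow, here is a toy map. TinyMap is a made-up class, not the JDK implementation: it has a fixed length of 16, does not accept null keys, and ignores collisions entirely (a later put to the same slot simply overwrites the earlier one). The real class also mixes the bits of the hash code further before masking, a step left out here.

```java
// Hypothetical toy class for illustration; not the JDK implementation.
class TinyMap {
    private final String[] keys = new String[16];
    private final String[] values = new String[16];

    // Hash the key, then mask the hash code down to an array index.
    private int indexOf(String key) {
        return key.hashCode() & (keys.length - 1);
    }

    void put(String key, String value) {
        int i = indexOf(key);
        keys[i] = key;     // collisions are not handled: the slot is overwritten
        values[i] = value;
    }

    String get(String key) {
        // Only the key is needed to find the slot again.
        int i = indexOf(key);
        return key.equals(keys[i]) ? values[i] : null;
    }

    public static void main(String[] args) {
        TinyMap map = new TinyMap();
        map.put("key", "value");
        System.out.println(map.get("key"));     // value
        System.out.println(map.get("missing")); // null
    }
}
```

Because put and get derive the index from the key in the same way, get lands on the right slot without scanning the array.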

Once the storage location of a key-value pair is determined, recall what we said above: different arguments may yield the same hash code, which is a hash collision. In HashMap terms, two different keys passed to put may have the same hash code and therefore the same array index. In fact, even if two keys have different hash codes, they can still end up with the same array index after the bitwise AND. How does HashMap handle this? Right, with a linked list. How exactly is that implemented? We will continue in the next article.
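
Both situations can be seen in a few lines. This is a sketch with made-up names (SameBucketDemo); the hash code values in the second case are picked by hand so that they differ only above the low four bits.

```java
// Hypothetical demo class for illustration only.
class SameBucketDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // Case 1: different keys, same hash code, so necessarily the same index.
        System.out.println(indexFor("Aa".hashCode(), length)); // 0
        System.out.println(indexFor("BB".hashCode(), length)); // 0
        // Case 2: different hash codes that differ only above the low four bits.
        System.out.println(indexFor(0b0000_0101, length)); // 5
        System.out.println(indexFor(0b0001_0101, length)); // 5
    }
}
```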

I suspect nobody enjoys reading big chunks of code on a small phone screen, so my writing style leans more toward plain text.


Origin juejin.im/post/5cf3a11851882566477b7a02