HashMap explained in plain language

HashMap data structure

  A HashMap is essentially an array plus linked lists. The array has a default length of 16, and once the number of stored elements reaches 75% of that length, the table is resized to double its capacity. map.put(key, val) hashes the key and takes that hash modulo the array length, so entries land as uniformly as possible across the array's indexes. But things are not that perfect: after the hash-and-modulo step, two elements can end up at the same index. To handle such hash collisions, HashMap maintains a linked list at each index, and elements that map to the same bucket are stored in that list. This creates a problem for get(key): an element sitting directly in the array is found quickly, but an element on a list takes time to reach; if the list has depth n, the get costs O(n). JDK 1.8 therefore added an optimization to the data structure: a red-black tree. When a list's length reaches 8 or more, it is converted into a red-black tree.
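To make that layout concrete, here is a minimal sketch of the array-plus-linked-list idea. This is illustrative code of my own, not the JDK's implementation (which also tracks size, resizing, and treeification):

class SimpleChainedMap {
    static class Node {
        final int hash;
        final String key;
        String value;
        Node next; // next entry in the collision chain

        Node(int hash, String key, String value, Node next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    Node[] table = new Node[16]; // the array: 16 buckets by default, as in HashMap

    void put(String key, String value) {
        int h = key.hashCode();
        int i = h & (table.length - 1); // bucket index derived from the hash
        for (Node n = table[i]; n != null; n = n.next) {
            if (n.hash == h && n.key.equals(key)) { n.value = value; return; } // overwrite
        }
        table[i] = new Node(h, key, value, table[i]); // collision: chain onto the list
    }

    String get(String key) {
        int h = key.hashCode();
        for (Node n = table[h & (table.length - 1)]; n != null; n = n.next) {
            if (n.hash == h && n.key.equals(key)) return n.value; // walking the chain is O(n)
        }
        return null;
    }
}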

  

Why must the initial capacity be a power of two?

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

  This is the default initial capacity in JDK 8, and the comment tells us it MUST be a power of two. Yet writing Map<String, String> map = new HashMap<String, String>(7); throws no error. That is because during initialization, the following code rounds whatever value we pass up to the nearest power of two greater than or equal to it. For example, if we specify 7 the initial capacity becomes 8, and if we specify 17 it becomes 32.

/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;  // subtract 1 so an exact power of two maps to itself
    n |= n >>> 1;     // copy the highest set bit into the bit below it
    n |= n >>> 2;     // ... then into the next 2 bits
    n |= n >>> 4;     // ... the next 4
    n |= n >>> 8;     // ... the next 8
    n |= n >>> 16;    // ... the next 16; now every bit below the top bit is 1
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
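A quick sanity check of that rounding, as a throwaway sketch: since tableSizeFor is package-private, this copies its logic (with MAXIMUM_CAPACITY = 1 << 30, as defined in HashMap); the class and method names are mine.

public class TableSizeForDemo {
    static int roundUp(int cap) {
        int n = cap - 1;
        n |= n >>> 1; n |= n >>> 2; n |= n >>> 4; n |= n >>> 8; n |= n >>> 16;
        return (n < 0) ? 1 : (n >= (1 << 30)) ? (1 << 30) : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(roundUp(7));  // 8
        System.out.println(roundUp(16)); // 16
        System.out.println(roundUp(17)); // 32
    }
}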
    
So the question becomes: why does JDK 8 insist on rounding the capacity? Wouldn't the value we pass in work just as well?
    In fact, if you step into the source of put(key, value), you will see that the bucket index is computed with a bitwise AND rather than the % operator, which is more efficient than taking a modulus. This is exactly where the power of two matters: h & (length - 1) produces the same result as h % length only when length is a power of two. If the length were not a power of two, the bitwise result would diverge from the true modulo, indexes would come out wrong, and some buckets would never be used. That is the cleverness of the JDK authors.
static final int hash(Object key) {
    int h;
    // XOR the high 16 bits into the low 16 so they take part in the index mask
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

/**
 * Returns index for hash code h. (This helper is from JDK 7; JDK 8 inlines
 * the equivalent expression (length - 1) & hash in putVal/getNode.)
 */
static int indexFor(int h, int length) {
    return h & (length - 1);
}
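To see concretely why the length must be a power of two, here is a small throwaway sketch (not JDK code; the class name is mine). With length 16 the mask 15 (binary 1111) matches % 16 exactly; with length 15 the mask would be 14 (binary 1110), which disagrees with % 15 and leaves every odd bucket unreachable:

public class PowerOfTwoDemo {
    public static void main(String[] args) {
        for (int h = 0; h < 8; h++) {
            // length 16: mask 15 agrees with % 16 and reaches all buckets
            // length 15: mask 14 disagrees with % 15 and skips odd indexes
            System.out.printf("h=%d  h&15=%d  h%%16=%d  |  h&14=%d  h%%15=%d%n",
                    h, h & 15, h % 16, h & 14, h % 15);
        }
    }
}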

 HashMap load factor

    Why is the load factor 0.75? It is a trade-off between time complexity and space complexity. If the load factor were 1, the table would not resize until every slot was filled; in practice it is hard to fill every slot, so long before that point there would be many hash collisions and the lists would grow too long. If the load factor were 0.5, the table would resize when only half full, which wastes space. So 0.75 was chosen as the compromise.
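In numbers (a minimal sketch using HashMap's default constants; the class name is mine): capacity 16 times load factor 0.75 gives a resize threshold of 12, so the 13th entry triggers the doubling to 32.

public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;        // DEFAULT_INITIAL_CAPACITY
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        // The table resizes once it holds more than this many entries.
        System.out.println((int) (capacity * loadFactor)); // 12
    }
}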

How JDK 8 optimized HashMap's data structure

   /*
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
     */
  When a chain's length reaches 8, it is converted into a red-black tree. As the source comment above explains, this threshold rests on a probabilistic result, the Poisson distribution: by the formula (exp(-0.5) * pow(0.5, k) / factorial(k)), the probability of a list ever reaching a depth of 8 is only 0.00000006. Because that is so rare, the red-black tree brings no major improvement to HashMap's everyday performance; it is a safeguard for the worst case. Many blogs claim the load factor of 0.75 was chosen to satisfy the Poisson distribution; that is wrong, the two have nothing to do with each other.
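You can reproduce the quoted probability yourself; this quick sketch (class name mine) simply evaluates the comment's formula for k = 8 with parameter 0.5:

public class PoissonCheck {
    public static void main(String[] args) {
        double lambda = 0.5;
        int k = 8;
        double factorial = 1;
        for (int i = 2; i <= k; i++) factorial *= i; // 8! = 40320
        double p = Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
        System.out.println(p); // ~5.9e-8, which rounds to the 0.00000006 above
    }
}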

 HashMap thread-safety issues

[1.] Concurrent puts can lose data.
1. Thread 1 computes the bucket index for the key-value pair it wants to insert into the HashMap, but its CPU time slice runs out and it blocks.
2. Thread 2 then performs its own put. Its bucket index happens to be the same as thread 1's, and its insertion succeeds.
3. Thread 1 is later woken up and resumes running. It still holds the old head of the list and inserts its data at the position it had already computed.
4. Thread 1's insert thereby overwrites the record thread 2 inserted, so thread 2's record silently disappears.

[2.] Data can be lost during resizing.
Resizing actually creates a new array and re-hashes the old entries into it. If multiple threads detect at the same time that a resize is needed, each builds its own new table, but only one thread's assignment of the new array wins; the entries migrated by all the other threads are lost.
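The lost-update race is easy to reproduce. Below is a minimal, non-deterministic sketch (the class name is mine): two threads put disjoint key ranges into one unsynchronized HashMap, and the final size usually falls short of 20000. In real concurrent code, use ConcurrentHashMap instead.

import java.util.HashMap;
import java.util.Map;

public class HashMapRaceDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Integer, Integer> map = new HashMap<>();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) map.put(i, i); });
        Thread t2 = new Thread(() -> { for (int i = 10_000; i < 20_000; i++) map.put(i, i); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        // With no synchronization, the size is usually less than 20000:
        // some inserts were overwritten exactly as described above.
        System.out.println("expected 20000, got " + map.size());
    }
}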


 

Origin www.cnblogs.com/wlwl/p/11954343.html