Analysis of the working principle of HashMap in JDK8


In the Java language, HashMap is undoubtedly a very frequently used class, and understanding its internal implementation will help you use it better.


HashMap in JDK 8 is built from three data structures: an array, plus a linked list or a red-black tree in each bucket,

as shown below:






Before JDK 8 there were only two data structures, array + linked list. Here is a brief comparison of the two:


Array

Advantages: contiguous memory, so random access by index is an efficient O(1).

Disadvantages: inserting or deleting in the middle is expensive, because the elements after that position have to be shifted.

Linked list

Advantages: nodes do not need contiguous memory, and inserting or deleting at a known node is cheap.

Disadvantages: access is inefficient, O(n), because the list has to be traversed.

The hash table (Hash data structure) combines the advantages of the two into an efficient data storage structure. In essence it trades space for time to speed up reads and writes. The inheritance structure of HashMap is as follows:










`````
public class HashMap<K,V>
extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable
`````


Here we can find that K and V in HashMap are generic, so any type can be supported as key or value, but in actual development, strings of type String are used the most as keys.
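
Because any type can act as a key, a user-defined key class has to override equals and hashCode consistently, otherwise a lookup with an equal but distinct instance will miss. The following is only a minimal sketch: the Point and KeyDemo classes are hypothetical names invented for this example, not part of the JDK.

````java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyDemo {
    // Stores a value under one Point and reads it back with a different, equal Point.
    static Integer lookup() {
        Map<Point, Integer> map = new HashMap<>();
        map.put(new Point(1, 2), 42);
        return map.get(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // 42
    }
}

// Hypothetical key type used only for illustration.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two points are equal when both coordinates match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    // Equal points must produce the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
````

If hashCode were left at the default inherited from Object, the two Point instances would most likely hash differently and the lookup would return null.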


The hashCode and equals methods of the key type are very important here, because they determine how the data stored in HashMap is distributed and how it is found again on reads and writes.


As mentioned above, HashMap can be regarded as a large array in which every element is of type Node. The source code defines it as follows:
````
    transient Node<K,V>[] table;
````
Note that TreeNode is an indirect subclass of Node: it extends LinkedHashMap.Entry, which in turn extends HashMap.Node:
````
HashMap.TreeNode<K,V> extends LinkedHashMap.Entry<K,V> extends HashMap.Node<K,V>
````


The linked list in the above figure is the Node class, and the red-black tree is the TreeNode class.
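
To make the array + linked list layout concrete, here is a deliberately simplified sketch of a chained hash table. It is not the JDK code: the class name TinyMap, the fixed capacity of 16, inserting new nodes at the head of the bucket, and the absence of resizing and red-black trees are all simplifying assumptions made for this example.

````java
import java.util.Objects;

class TinyMap<K, V> {
    // One entry in a bucket's singly linked list (plays the role of HashMap.Node).
    static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed power-of-two bucket count; a real HashMap grows this array.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value; // existing key: overwrite the value
                return old;
            }
        }
        table[i] = new Node<>(h, key, value, table[i]); // new key: link in at the head
        return null;
    }

    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & h]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyMap<String, Integer> m = new TinyMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("a", 3);
        System.out.println(m.get("a") + " " + m.get("b") + " " + m.get("c")); // 3 2 null
    }
}
````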


Next, let's look at some member variables of HashMap:

````
    //The default number of buckets in the table array; it must be a power of 2, the default value is 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

    //The maximum length of the table, i.e. the maximum number of buckets: 2^30 = 1073741824 (about 1 billion)
    static final int MAXIMUM_CAPACITY = 1 << 30;

    //default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    //When a single linked list grows beyond 8 nodes, it is converted into a red-black tree
    static final int TREEIFY_THRESHOLD = 8;

    //When a red-black tree bin shrinks to 6 nodes or fewer (checked when the bin is split during resize),
    //it is converted back to a linked list
    static final int UNTREEIFY_THRESHOLD = 6;

    //A bucket is only treeified when the length of the array (note: table.length, not the size of the map)
    //is at least 64; for smaller tables the table is resized instead
    static final int MIN_TREEIFY_CAPACITY = 64;
    
    //The set used to traverse the map, in turn traverse the node or treeNode in all buckets in the table
    transient Set<Map.Entry<K,V>> entrySet;

    //The actual amount of data currently stored = map.size instead of table.length
    transient int size;

    //The number of structural modifications. Iterators remember the value they saw when they were created
    //and throw ConcurrentModificationException (fail-fast) if the map is changed underneath them
    transient int modCount;

    //Resize threshold = table.length * loadFactor; once size exceeds
    //this threshold, the table has to be expanded (resize)
    int threshold;

    //load factor
    final float loadFactor;
    
    
````
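
The role of modCount can be observed without any threads at all. The following sketch, with the hypothetical class name FailFastDemo, removes a key from the map while iterating over its key set; the iterator notices that modCount has changed and throws ConcurrentModificationException. The Javadoc describes this fail-fast behavior as best-effort, so treat it as a bug detector rather than something to rely on.

````java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    // Returns true if modifying the map during iteration tripped the fail-fast check.
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Removing through the map (not the iterator) increments modCount.
                map.remove(key);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // The supported way: let the collection view perform the removal.
    static int removeSafely() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.keySet().removeIf(k -> k.equals("a"));
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // true
        System.out.println(removeSafely());         // 2
    }
}
````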



The member variables fall into two groups: constants used during processing, and fields that change at runtime. Note also that HashMap itself is not thread-safe, so avoid sharing one instance across threads without synchronization. If you need to, use a thread-safe Map instead, for example:
````
    // (1)
    Map m = Collections.synchronizedMap(new HashMap(...));
    // (2)
    ConcurrentHashMap map = new ConcurrentHashMap();
````
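
As a rough illustration of option (2), the sketch below lets several threads increment one counter stored in a ConcurrentHashMap. The class and method names (SafeCounterDemo, countConcurrently) and the key "hits" are made up for this example.

````java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SafeCounterDemo {
    // Starts `threads` workers that each increment the same key `perThread` times.
    static int countConcurrently(int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge is atomic on ConcurrentHashMap, so no increment is lost.
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000)); // 4000
    }
}
````

With a plain HashMap in place of ConcurrentHashMap the same code could lose updates or corrupt the table, and the result would no longer be predictable.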


In addition, HashMap has several constructors:

````
    //1
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
    }

    //2
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    //3
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }

    //4
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
    
````



(1) The first is the default constructor. It only initializes the load factor that governs expansion, which defaults to 0.75.

(2) The second delegates to the fourth: you pass in the desired size of the table array, and the default load factor is used.

(3) The third takes an existing Map and copies its contents into the new map through the putMapEntries method, which can be understood as iterating over the incoming Map and putting each entry into the new one.

(4) The fourth specifies both the initial size of the table array and the load factor, after some validation of the arguments. The rounded capacity is kept in threshold until the table is first allocated by resize. The tableSizeFor method deserves a mention here.

Its source code is as follows:

````
    /**
     * Returns a power of two size for the given target capacity.
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
````


This method ensures that the length of the table array is always a power of 2. For example, if you pass in 5 at initialization, the table array ends up with a length of 8. Power-of-2 lengths are a great advantage when the array is expanded and its entries are redistributed. So if the value you pass in is not a power of 2, this method returns the smallest power of 2 that is greater than or equal to it.
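
To see the rounding in action, the sketch below copies tableSizeFor from the listing above into a small standalone class and prints the result for a few inputs. The wrapper class TableSizeDemo and the chosen inputs are illustrative only.

````java
class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as the JDK 8 method quoted above.
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // so that an exact power of two maps to itself
        n |= n >>> 1;      // each step copies the highest set bit further right...
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // ...until every bit below it is 1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {0, 1, 5, 8, 9, 17, 1000}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
        // 0 -> 1, 1 -> 1, 5 -> 8, 8 -> 8, 9 -> 16, 17 -> 32, 1000 -> 1024
    }
}
````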



Let's take a look at how HashMap stores data.

The put method in the source code is as follows:

````
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
````



Here we see that the put method calls the hash method to compute a hash for the key. Let's see how it is implemented internally:
````
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
````
As you can see, the hash method does not use the hashCode() value directly. Instead it XORs the high 16 bits of hashCode() into the low 16 bits, so that both the high and the low bits take part in choosing the bucket. In short, this reduces the chance of hash collisions.
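
A small worked example shows why the XOR matters. The two hash codes below differ only in their high 16 bits, so in a table of length 16 they would share bucket 0 if used directly; after spreading they land in different buckets. The class HashSpreadDemo and the sample values are chosen purely for illustration.

````java
class HashSpreadDemo {
    // The mixing step used by HashMap.hash for a non-null key.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int h1 = 0x10000;
        int h2 = 0x20000;
        // Without spreading, only the low 4 bits are used: both fall into bucket 0.
        System.out.println(index(h1, 16) + " " + index(h2, 16));                 // 0 0
        // With spreading, the high bits influence the index.
        System.out.println(index(spread(h1), 16) + " " + index(spread(h2), 16)); // 1 2
    }
}
````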


The actual insertion is implemented in the putVal method. Note how the array index is calculated:
````
i = (table.length - 1) & hash
````
For a power-of-2 table length and a non-negative hash, this is equivalent to taking the converted hash value modulo the length of the array:
````
h % table.length
````
However, the bit operation is more efficient than the modulo operation.


In putVal, the expansion method is called when data is inserted for the first time, because the table has not been allocated yet. During insertion it also checks whether the bucket holds a linked list or a red-black tree, since the two are updated in different ways. If the number of nodes in a single bucket grows beyond 8, the linked list is converted into a red-black tree (provided the table length is at least 64; otherwise the table is resized instead). After the insertion completes, it checks whether size now exceeds the threshold and, if so, expands the table.
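
The index calculation described above can be checked with a few values. The sketch compares the bit mask with Math.floorMod and with the % operator; the class name IndexDemo and the sample hashes are assumptions of this example. Note that Java's % can return a negative result for a negative hash, which is one more reason the mask is used.

````java
class IndexDemo {
    // Bucket index as computed in putVal; length must be a power of two.
    static int indexByMask(int hash, int length) {
        return (length - 1) & hash;
    }

    public static void main(String[] args) {
        int length = 16;
        for (int hash : new int[] {17, 33, -5, Integer.MIN_VALUE}) {
            System.out.println(hash
                    + ": mask=" + indexByMask(hash, length)
                    + " floorMod=" + Math.floorMod(hash, length)
                    + " remainder=" + (hash % length));
        }
        // For -5 the mask and floorMod both give 11, while -5 % 16 is -5.
    }
}
````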

The expansion method, resize, is the key part:
````
    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        //Length of the old table (0 if it has not been allocated yet)
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        //The old threshold is the current value of the member variable
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            //Otherwise double the capacity, and also double the threshold if the old capacity is at least 16
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1;
        }
        else if (oldThr > 0) //The table is not allocated yet, but the initial capacity was stored in threshold: use it as the table length
            newCap = oldThr;
        else { //No capacity was passed to the constructor: on first allocation, table.length=16
            newCap = DEFAULT_INITIAL_CAPACITY;
            //Threshold=16*0.75=12
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            //If the new threshold is still 0, compute it from the new table.length and the load factor,
            //and if it would exceed the maximum, use Integer.MAX_VALUE instead
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        //Reassign the newly calculated threshold to the member variable
        threshold = newThr;
        //Allocate an array with the new capacity
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab; //Assign to the member variable
        if (oldTab != null) {
            //Move the data from the old table into the new table array
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    //If the bin is a red-black tree, split it with the tree-specific logic
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        //Each node of the original list either stays at index j or moves to index j + oldCap
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
         //Return the new table array after expansion
        return newTab;
    }
````



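Note that the length of the expanded table array is always a power of 2. That is why each node either keeps its index j or moves to j + oldCap: doubling the length adds exactly one bit to the index mask, and (e.hash & oldCap) tests that bit.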
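
The lo/hi split in resize can be verified with a few hashes that share a bucket in a table of length 16. The sketch below predicts the new index from the single bit (hash & oldCap) and compares it with the index computed directly against the doubled table; the class SplitDemo and the sample hashes are invented for this example.

````java
class SplitDemo {
    // Index of a hash in a table of the given power-of-two capacity.
    static int index(int hash, int cap) {
        return hash & (cap - 1);
    }

    // Index after doubling, derived only from the old index and one extra bit.
    static int indexAfterDoubling(int hash, int oldCap) {
        int j = index(hash, oldCap);
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes sit in bucket 5 of the old table.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println(hash
                    + ": old=" + index(hash, oldCap)
                    + " new=" + index(hash, oldCap * 2)
                    + " predicted=" + indexAfterDoubling(hash, oldCap));
        }
        // 5 and 37 stay at index 5; 21 and 53 move to index 21 (5 + 16).
    }
}
````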


Finally, let's look at the get method:

````
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
````


You can see that the get method also calls the hash method to compute the key's hash, and then calls the getNode method:


````
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        //Check that the table is not null and that table.length is greater than 0,
        //then compute the array index with the same bit operation and fetch the first node of that bucket
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            //If the hash values are equal and the key is either the same reference or equal according to equals,
            //the first node is the one we are looking for, so return it
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                //If the bucket is a red-black tree, search the tree
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do { //Otherwise walk the linked list until a matching node is found
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        //If the final result is not found, return null
        return null;
    }
````


The read efficiency of HashMap, where n is the number of nodes in the bucket:

(1) If the lookup hits the first node, it is O(1).

(2) If it searches a red-black tree, it is O(log n).

(3) If it searches a linked list, it is O(n).

Here, we will find that the introduction of the red-black tree structure is actually to improve the retrieval efficiency.

Note one more small detail in the lookup above: the check for whether the key is null. Both keys and values in HashMap are allowed to be null, so when get returns null there are several possible reasons: the key is not in the map, the key is present but its value is null, or you passed in a null key whose corresponding value is also null.

The demo code is as follows:
````
        HashMap<String, Integer> map=new HashMap<String, Integer>();
        map.put(null, null);//K and v are allowed to be null
        map.put("5", null);
       
        //Check whether a null key exists
        System.out.println(map.containsKey(null));//true
        
        System.out.println(map.get(null));//null
        System.out.println(map.get("5"));//null
````


Here you can use the containsKey method to determine whether there is a null key
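
The same check works for any key, not just null. The sketch below tells an absent key apart from a key that is present with a null value; the class NullLookupDemo and the helper describe are hypothetical names used only here.

````java
import java.util.HashMap;
import java.util.Map;

class NullLookupDemo {
    // Distinguishes "no such key" from "key present, value is null".
    static String describe(Map<String, Integer> map, String key) {
        if (!map.containsKey(key)) {
            return "absent";
        }
        Integer value = map.get(key);
        return value == null ? "present, null value" : "present, value " + value;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("5", null);
        map.put("6", 6);
        System.out.println(describe(map, "5"));       // present, null value
        System.out.println(describe(map, "6"));       // present, value 6
        System.out.println(describe(map, "missing")); // absent
    }
}
````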




Summary:

This article analyzed the working principle of HashMap in JDK 8 and introduced some of its core methods and caveats. Understanding its internal mechanics helps us use it more sensibly in actual development.



