HashMap Principles (II): Resize Mechanism and Access Principles

The previous chapter, "HashMap Principles (I): Concepts and Underlying Architecture", explained the data structure that HashMap stores its entries in, along with common concepts and variables, including capacity, the threshold variable, and the loadFactor variable. This chapter explains the resize (expansion) mechanism and how HashMap stores and retrieves entries.

First, a review of the basic concepts:

table variable: the underlying data structure of HashMap, an array of Node objects used to store key-value pairs;

capacity: not a member variable, but a concept you must know; it is the length of the table array;

size variable: the number of key-value pairs stored in the HashMap;

loadFactor variable: the load factor, which measures how full the table is allowed to get;

threshold variable: the resize threshold, equal to capacity * loadFactor; when size exceeds this value, the table is expanded;
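
The relationship between capacity, loadFactor, and threshold can be sketched in a few lines. This is only an illustration of the arithmetic; the class and method names are hypothetical and are not part of HashMap.

```java
// Illustrative sketch only: ThresholdSketch and threshold() are hypothetical names.
final class ThresholdSketch {
    // threshold = capacity * loadFactor, truncated to an int
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Default capacity 16 and default load factor 0.75
        System.out.println(threshold(16, 0.75f)); // 12
        // After one doubling the capacity is 32, so the threshold doubles too
        System.out.println(threshold(32, 0.75f)); // 24
    }
}
```

With the defaults, size may reach 12 without triggering an expansion; the insertion that pushes size past 12 does.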

I. The put method

HashMap uses a hash algorithm to compute a position in the array, and the put method then stores the key-value pair in the table variable. Let's walk through how an entry is stored.

In brief:

1) compute the hash value of the key with hash(Object key);

2) check whether table is null or has length 0; if so, call resize() to initialize it;

3) compute the array index i from the hash value and the table length, and check whether table[i] is null;
4) if table[i] == null, add a new node directly and go to 8); if table[i] is not empty, go to 5);
5) check whether the first node of table[i] has the same key; if so, overwrite its value directly. "Same" here means that the hash values are equal and the keys are == or equals(); otherwise go to 6);
6) check whether table[i] is a TreeNode, i.e. whether the bin is a red-black tree; if so, insert the key-value pair into the tree; otherwise go to 7);
7) traverse the linked list at table[i]; if a node with the same key is found during the traversal, overwrite its value; otherwise append a new node at the tail, and if the bin already held 8 nodes (TREEIFY_THRESHOLD) before the append, convert the list to a red-black tree;
8) after a successful insertion, check whether the number of stored key-value pairs, size, exceeds threshold; if it does, expand the table.

Let's look at the three most important methods involved: hash(), putVal(), and resize().

1. The hash method

The array index at which an entry is stored is derived from the value returned by the hash method. Here is the source code:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

As we can see, HashMap's hash algorithm XORs the key's hashCode with the same value shifted right by 16 bits. Why not use key.hashCode() directly? The purpose of the hash is to compute an array index, and collisions are possible: different keys can map to the same index. Because the index is taken from the low bits of the hash only, keys whose hashCodes differ only in the high bits would always collide. hash(Object key) reduces this by XORing the high 16 bits of the hashCode into the low 16 bits, so the final hash value mixes high-bit and low-bit information and the probability of a collision drops. An analogy:

Imagine a steamer with four tiers. The first tier holds pork buns, beef buns, and chicken buns; the second holds cabbage buns; the third, red bean paste buns; the fourth, mushroom buns. You come to buy breakfast, point at the first tier, and say you want any bun except pork. Since the buns look identical from the outside, the probability of being handed a pork bun is one third. If the buns from tiers two, three, and four are mixed into the first tier, the probability of getting a pork bun becomes much smaller.

The hash(Object key) algorithm works on the same principle: the final hash value mixes high-bit and low-bit information, and the more information is mixed in, the more random the final hash value becomes. The table index is computed as hash & (table.length - 1); this & operation is like picking a bun, so collisions naturally become much rarer. The calculation goes as follows:

Original hashCode:                 1111 1111 1111 1111 0100 1100 0000 1010

hashCode shifted right by 16 bits: 0000 0000 0000 0000 1111 1111 1111 1111

hash value after XOR:              1111 1111 1111 1111 1011 0011 1111 0101
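
The effect of the mixing step can be checked with a small sketch. It assumes a table of length 16 and two made-up hashCode values that differ only in their high 16 bits; the class and method names are hypothetical, and spread() simply repeats the XOR step from hash().

```java
// Illustrative sketch only: HashSpreadSketch and its methods are hypothetical names.
final class HashSpreadSketch {
    // Same mixing step as HashMap.hash(), applied to a raw hashCode value
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index into a table whose length n is a power of two
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int a = 0x00010000; // these two values differ only in the high 16 bits
        int b = 0x00020000;

        // Without mixing, both land in bucket 0
        System.out.println(indexFor(a, n) + " " + indexFor(b, n));
        // With mixing, they land in buckets 1 and 2
        System.out.println(indexFor(spread(a), n) + " " + indexFor(spread(b), n));

        // The worked example from the text: 0xFFFF4C0A becomes 0xFFFFB3F5
        System.out.println(Integer.toHexString(spread(0xFFFF4C0A)));
    }
}
```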

2. The putVal method

The putVal method adds the given key-value pair to the table array.

/**
 * Implements Map.put and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    /**
     * If the table array is null or its length is 0, it has not been
     * initialized yet: call resize() first and keep the length of the
     * new array in n.
     */
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // AND the hash value with (length - 1) to get the array index; if that slot is empty, insert a new Node directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // otherwise the slot is occupied and some additional work is needed
    else {
        Node<K,V> e; K k;
        // the first node in the bin has the same key: remember it so its value can be replaced
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        /**
         * The key differs: if the current Node is a TreeNode, call
         * putTreeVal to insert the new element into the red-black tree.
         */
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // not a TreeNode: traverse the linked list
        else {
            for (int binCount = 0; ; ++binCount) {
                /**
                 * Reached the last node without finding the same key:
                 * append a new Node, then check whether the bin must be
                 * converted to a red-black tree.
                 */
                if ((e = p.next) == null) {
                    // append the new Node at the tail
                    p.next = newNode(hash, key, value, null);
                    /**
                     * TREEIFY_THRESHOLD = 8. binCount starts at 0 and counts the
                     * nodes after the head, so this fires when the bin already
                     * held 8 nodes before the append. treeifyBin resizes instead
                     * when the table is shorter than MIN_TREEIFY_CAPACITY (64).
                     */
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                /**
                 * The same key was found in the list before reaching the last
                 * node (this does not overlap with the check above, which only
                 * compared the head node): stop here so the value is replaced below.
                 */
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            // when onlyIfAbsent is true, an existing non-null value is not overwritten
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // finally, compare size with threshold to decide whether to expand
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
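
The return value and the onlyIfAbsent flag can be observed through the public API. The sketch below assumes the JDK 8 behavior in which put overwrites an existing value while putIfAbsent leaves an existing non-null value alone; the class name and the demo() helper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: PutReturnSketch and demo() are hypothetical names.
final class PutReturnSketch {
    static String demo() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("a", 1);         // no previous mapping: returns null
        Integer second = map.put("a", 2);        // overwrites 1 and returns it
        Integer third = map.putIfAbsent("a", 3); // keeps 2 and returns it
        return first + "," + second + "," + third + "," + map.get("a");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // null,1,2,2
    }
}
```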

3. The resize method

HashMap expands via the resize() method. The capacity is always a power of 2, and each expansion doubles it.
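
Why a power of 2? One way to see it is that for such a length n, (n - 1) & hash gives the same result as a non-negative modulo, only cheaper. The sketch below checks that, and also rounds a requested capacity up to a power of 2. Note that roundUpToPowerOfTwo is a simplified stand-in written for this illustration, valid for 1 <= cap <= 2^30; it is not the JDK's tableSizeFor, and all names here are hypothetical.

```java
// Illustrative sketch only: PowerOfTwoSketch and its methods are hypothetical names.
final class PowerOfTwoSketch {
    // Smallest power of two >= cap, assuming 1 <= cap <= 2^30
    static int roundUpToPowerOfTwo(int cap) {
        int highest = Integer.highestOneBit(cap);
        return (highest == cap) ? cap : highest << 1;
    }

    // Index into a table whose length n is a power of two
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32

        // For a power-of-two length, masking equals floorMod, even for negative hashes
        int n = 16;
        for (int hash : new int[] {13, 18, 12345, -7}) {
            System.out.println(indexFor(hash, n) == Math.floorMod(hash, n)); // true
        }
    }
}
```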

/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    // old capacity greater than 0: the table has already been initialized
    if (oldCap > 0) {
        // if the old capacity has reached the maximum capacity 1 << 30, set the threshold to the maximum int value 2^31 - 1 and stop expanding
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        /**
         * If twice the old capacity is below the maximum capacity and the old
         * capacity is at least the default capacity 16, the new threshold is
         * twice the old threshold: threshold = loadFactor * capacity, so when
         * capacity doubles and loadFactor is unchanged, threshold doubles too.
         */
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    /**
     * The constructor HashMap(int initialCapacity, float loadFactor) contains
     * this.threshold = tableSizeFor(initialCapacity), i.e. the initial capacity
     * is temporarily stored in threshold, so here that value becomes the new
     * capacity. When does this branch run? On the first insertion after the map
     * was created with the HashMap(int initialCapacity) constructor and no
     * element has been added yet.
     */
    else if (oldThr > 0)
        newCap = oldThr;
    /**
     * The default constructor was used and no initial capacity was given, so
     * the capacity is DEFAULT_INITIAL_CAPACITY (16) and the threshold is
     * 0.75 * 16 = 12.
     */
    else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // make sure the threshold is not 0: for example, the second branch above (oldThr > 0) does not compute newThr
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;

    /** construct the new table with the new capacity */
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    // assign the newly created table to table
    table = newTab;
    if (oldTab != null) {
        // iterate over the old table and move its data into the new table
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                // a single node with no list behind it goes directly to index [e.hash & (newCap - 1)] of the new table
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // a TreeNode: split the tree into newTab
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                // e is the head of a linked list: traverse the list
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        // remember the next node
                        next = e.next;
                        /**
                         * newTab has twice the capacity of the old table. Indexes are
                         * not assigned sequentially but computed as
                         * (table.length - 1) & hash, so after expansion a node's
                         * position may change. (e.hash & oldCap) tells which case
                         * applies: 0 means the index stays the same, non-zero means
                         * it moves.
                         *
                         * Example with oldCap = 16 and newCap = 32:
                         *   e.hash = 13:  0000 1101
                         *   oldCap = 16:  0001 0000
                         *   & result:     0000 0000
                         * Conclusion: the element does not change position.
                         *
                         *   e.hash = 18:  0001 0010
                         *   oldCap = 16:  0001 0000
                         *   & result:     0001 0000
                         * Conclusion: the element changes position. To where?
                         *   (newCap - 1) & hash = 0001 1111 & 0001 0010
                         *                       = 0001 0010 = 18,
                         * i.e. old index 2 + oldCap 16.
                         */
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        /**
                         * (e.hash & oldCap) == 0: the index is unchanged, so these
                         * nodes go to the same index in the expanded table.
                         */
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        /**
                         * (e.hash & oldCap) != 0: these nodes go to position
                         * [old index + old capacity] in the expanded table.
                         */
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
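
The lo/hi split in the loop above relies on one property: after doubling, a node's new index is either its old index or its old index plus oldCap, and the single bit (hash & oldCap) decides which. The sketch below checks this against a full recomputation with (newCap - 1) & hash. It assumes oldCap = 16 and a handful of made-up hash values; the class and method names are hypothetical.

```java
// Illustrative sketch only: ResizeSplitSketch and its methods are hypothetical names.
final class ResizeSplitSketch {
    // New index computed the way the lo/hi split does it, from one bit of the hash
    static int newIndexBySplit(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // New index computed from scratch against the doubled table
    static int newIndexByRehash(int hash, int oldCap) {
        int newCap = oldCap << 1;
        return hash & (newCap - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {13, 18, 29, 2}) {
            System.out.println(hash + " -> " + newIndexBySplit(hash, oldCap)
                    + " (rehash gives " + newIndexByRehash(hash, oldCap) + ")");
        }
    }
}
```

Both computations agree for every hash, which is why resize() never needs to recompute a full index for list nodes.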

II. The get method

Briefly, the flow of get(Object key) is as follows: compute the hash value of the given key with hash(), then locate the array index with (n - 1) & hash. If the first node at that index has the same key, return it. Otherwise, if there is a next node, check whether the first node is a TreeNode: if so, search the red-black tree for the matching node; if not, traverse the linked list to find it. Here is the source code:

public V get(Object key) {
    Node<K,V> e;
    // first compute the hash value with hash(key), then call getNode(hash, key) to find the node
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
/**
 * Implements Map.get and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @return the node, or null if none
 */
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // use (n - 1) & hash to find the first node at the corresponding array position
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // if the first node has the same key, return it directly
        if (first.hash == hash &&
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // otherwise keep searching further down the bin
        if ((e = first.next) != null) {
            // a TreeNode: search the red-black tree for the matching Node
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // a linked list: traverse the list to find the matching Node
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
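
The hash comparison followed by equals() in getNode is what lets lookups work even when two keys share a hash. As a small illustration, the strings "Aa" and "BB" have the same hashCode (2112), so they land in the same bin, yet each is found by its own key. The class name and the lookup() helper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: CollidingGetSketch and lookup() are hypothetical names.
final class CollidingGetSketch {
    static String lookup() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa", so it is chained in the same bin
        // "CC" was never inserted, so get returns null for it
        return map.get("Aa") + "," + map.get("BB") + "," + map.get("CC");
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(lookup()); // 1,2,null
    }
}
```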

These are the core methods. HashMap has many other commonly used methods, but most of them either call the methods above or implement similar logic, so they are not covered further here.

III. Summary

Building on the basic concepts and underlying structure from the previous chapter, this chapter explained the resize mechanism and the access principles from the perspective of the source code, and analyzed the put and get methods. The core of put is hash(), putVal(), and resize(); the core of get is getNode(). If anything is incomplete or incorrect, corrections are welcome, and I hope we can make progress together. Thank you!

Origin www.linuxidc.com/Linux/2019-08/160022.htm