Rewriting the JDK source: thoughts on optimizing HashMap's resize method

Tips: This reasoning is not accurate, because in most cases the elements that share an array position in a HashMap have different hash values. The thought process itself is complete, though, so read on if you are interested.


Without further ado, let's go straight to the source of HashMap's resize method:

Focus on lines 715-744. I will state the conclusion up front: I would replace those nearly 30 lines with this single line of code:

newTab[e.hash & (newCap - 1)] = e;

As you may have noticed, this is the same operation that the source performs after the e.next == null check above, so the two branches can be merged. The rewritten resize method is as follows:

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
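                    else { // proposed simplification: move the whole chain by the head's hash
                        newTab[e.hash & (newCap - 1)] = e; // valid only if every node in this bucket has the same hash (see Tips)
                    }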
                }
            }
        }
        return newTab;
    }

  

Analysis:

Let me walk through my thought process. My first idea was actually the following version:

if ((e.hash & oldCap) == 0) {
    newTab[j] = e;
} else {
    newTab[j + oldCap] = e;
}

This can be tightened further into a second version, a single line:

newTab[(e.hash & oldCap) == 0 ? j : j + oldCap] = e;

I could have stopped here, but then I noticed it could be more concise still, which led to the line of code shown at the beginning. It is also identical to line 712 of the source, so the two can be merged directly.

After finding this optimization, I considered writing a class that extends HashMap to test it. But resize is not public and is declared final, so I would have had to copy the entire HashMap source and modify its resize method in order to verify it. That seemed like too much trouble, and not really necessary.
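A lighter-weight check is possible without copying the HashMap source. The following is only a minimal sketch under stated assumptions: ResizeModel, Node and both migrate methods are hypothetical stand-ins that model a bucket array of chained nodes, not the JDK's real internals. It compares the proposed one-line migration with a per-node migration on a bucket whose two nodes have different hashes (1 and 17 both land in bucket 1 when the capacity is 16).

```java
// Simplified model for illustration only; Node and both migrate methods are
// hypothetical stand-ins, not the JDK's HashMap internals.
final class ResizeModel {

    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Proposed one-line migration: move the whole chain by its head's hash.
    static Node[] migrateByHead(Node[] oldTab) {
        int newCap = oldTab.length << 1;
        Node[] newTab = new Node[newCap];
        for (Node e : oldTab) {
            if (e != null) newTab[e.hash & (newCap - 1)] = e;
        }
        return newTab;
    }

    // Per-node migration: each node goes to the bucket its own hash selects,
    // appended at the tail so that relative order is kept.
    static Node[] migratePerNode(Node[] oldTab) {
        int newCap = oldTab.length << 1;
        Node[] newTab = new Node[newCap];
        Node[] tails = new Node[newCap];
        for (Node e : oldTab) {
            while (e != null) {
                Node next = e.next;
                e.next = null;
                int i = e.hash & (newCap - 1);
                if (tails[i] == null) newTab[i] = e; else tails[i].next = e;
                tails[i] = e;
                e = next;
            }
        }
        return newTab;
    }

    // Looks up a hash the way a map would: only in the bucket the hash selects.
    static boolean contains(Node[] tab, int hash) {
        for (Node e = tab[hash & (tab.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash) return true;
        }
        return false;
    }

    // Builds a 16-slot table whose bucket 1 holds the chain 1 -> 17.
    static Node[] sampleTable() {
        Node[] tab = new Node[16];
        tab[1] = new Node(1, new Node(17, null));
        return tab;
    }

    public static void main(String[] args) {
        System.out.println(contains(migrateByHead(sampleTable()), 17));  // false
        System.out.println(contains(migratePerNode(sampleTable()), 17)); // true
    }
}
```

In this model the one-line version leaves the node with hash 17 in bucket 1 of the 32-slot table, while a lookup for 17 searches bucket 17, so the node can no longer be found. This matches the caveat in the Tips.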

Instead, I will verify my optimization by working backward from the original author's approach. Let's continue analyzing lines 715-744 of the resize method. The code first defines four variables, loHead, loTail, hiHead and hiTail, to separate the case (e.hash & oldCap) == 0 from the case (e.hash & oldCap) != 0. This works because HashMap capacities are powers of two and each expansion doubles the capacity, so the new index e.hash & (newCap - 1), with newCap = 2 * oldCap, can only be the original array index j or the original index plus the original array length, j + oldCap. But in my view there is no need for four variables; two, head and tail, are enough. Why? The answer lies in the do-while loop of that code.
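The claim that the new index is always either j or j + oldCap can be checked with a few lines of Java. This is a minimal sketch, assuming a power-of-two oldCap; the class IndexSplitCheck and its method names are made up for illustration and are not part of the JDK.

```java
// Hypothetical helper class for illustration; not part of the JDK.
final class IndexSplitCheck {

    // Index in the doubled table, computed directly from the hash.
    static int maskedIndex(int hash, int oldCap) {
        int newCap = oldCap << 1;
        return hash & (newCap - 1);
    }

    // Index in the doubled table, computed from the old index j and the oldCap bit.
    static int splitIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    // True if both formulas agree for every hash in [0, limit).
    static boolean agreeUpTo(int limit, int oldCap) {
        for (int h = 0; h < limit; h++) {
            if (maskedIndex(h, oldCap) != splitIndex(h, oldCap)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(1 << 16, 16)); // true
        System.out.println(splitIndex(5, 16));      // 5  (stays at j)
        System.out.println(splitIndex(21, 16));     // 21 (moves to j + oldCap)
    }
}
```

Note that 5 and 21 both sit at index 5 while the capacity is 16, yet they land at different indexes after doubling, because the result depends on each node's own hash.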

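The key is line 721. As I read it, e.hash and oldCap do not change inside the do-while loop: since the nodes reached through e all belong to the list at the same position of the original array, I concluded that their hash values must be equal. (This is the inaccurate step mentioned in the Tips: nodes in the same bucket are only guaranteed to share the low bits e.hash & (oldCap - 1), not the whole hash.) Based on that conclusion, the original author's lines 715-744 can be simplified as follows: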

                        Node<K,V> head = null, tail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if (tail == null)
                                head = e;
                            else
                                tail.next = e;
                            tail = e;
                        } while ((e = next) != null);
                        if (tail != null) {
                            tail.next = null;
                            newTab[(head.hash & oldCap) == 0 ? j : j + oldCap] = head;
                        }

Two variables, head and tail, are enough to do the original job. By now you have probably realized from the analysis above that head and tail are not of much practical use either. After the do-while loop ends, the code first checks whether tail is null. That check is unnecessary, because e was already found to be non-null at line 588.

So tail is certainly non-null once the loop has run. Next, the code applies the bit operation to head's hash value and oldCap to get the index in the new array and stores head there. The loop itself is also superfluous: head = e is assigned only on the first iteration, and the remaining iterations merely relink the nodes in their existing order, so the whole do-while loop can be dropped and head replaced with e. After that replacement we arrive at the line of code from my first optimization:

newTab[(e.hash & oldCap) == 0 ? j : j + oldCap] = e;

Since e.hash is then treated as a fixed value, the ternary expression is not needed either; e.hash & (newCap - 1) gives the index directly:

 newTab[e.hash & (newCap - 1)] = e;

In this way we arrive at the optimized code starting from the original author's approach.

Summary:

When I first saw lines 715-744 I found the code hard to follow, and it kept feeling a little awkward. After reading it again, my conclusion was that it boils down to newTab[j] = e;

After further thought and analysis, that became this optimized line of code:

 newTab[e.hash & (newCap - 1)] = e;

The source code may also strike you as a bit elaborate and overly thorough. Looking back, the whole purpose of the original lines 715-744 is to copy each list into the new array in its original order, which avoids the JDK 1.7 infinite loop during data migration on expansion. That goal is actually quite simple to achieve.
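To make the point about ordering concrete, here is a small sketch contrasting head insertion, the style JDK 1.7's migration used, which reverses a chain, with tail insertion, which keeps it in order. It is a single-threaded model only and does not reproduce the concurrent infinite loop itself; InsertionOrderDemo, Node and the method names are hypothetical and exist purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-threaded model; not the JDK's actual transfer/resize code.
final class InsertionOrderDemo {

    static final class Node {
        final int key;
        Node next;
        Node(int key) { this.key = key; }
    }

    // Builds a chain that holds the given keys in the given order.
    static Node chainOf(int... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            Node n = new Node(keys[i]);
            n.next = head;
            head = n;
        }
        return head;
    }

    // Head insertion: each moved node becomes the new head, so order is reversed.
    static Node moveWithHeadInsertion(Node e) {
        Node head = null;
        while (e != null) {
            Node next = e.next;
            e.next = head;
            head = e;
            e = next;
        }
        return head;
    }

    // Tail insertion: each moved node is appended, so order is preserved.
    static Node moveWithTailInsertion(Node e) {
        Node head = null, tail = null;
        while (e != null) {
            Node next = e.next;
            e.next = null;
            if (tail == null) head = e; else tail.next = e;
            tail = e;
            e = next;
        }
        return head;
    }

    // Collects the keys of a chain from head to tail.
    static List<Integer> keys(Node e) {
        List<Integer> out = new ArrayList<>();
        for (; e != null; e = e.next) out.add(e.key);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(keys(moveWithHeadInsertion(chainOf(1, 2, 3)))); // [3, 2, 1]
        System.out.println(keys(moveWithTailInsertion(chainOf(1, 2, 3)))); // [1, 2, 3]
    }
}
```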

If you have any questions, feel free to raise them in the comments.

Origin blog.csdn.net/Royal_lr/article/details/103196010