文章目录

一、hashMap的jdk1.7和jdk1.8区别

1、实现方式：
2、新节点插入到链表是的插入顺序不同
3、jdk8的hash算法有所简化
扩容机制有所优化
其他区别

节点表示

关于jdk7HashMap的一些细节
关于jdk8HashMap的一些细节
关于HashMap的一些其他细节

一、hashMap的jdk1.7和jdk1.8区别

1、实现方式：

jdk7中使用数组+链表来实现，jdk8使用的数组+链表+红黑树

2、新节点插入到链表是的插入顺序不同

jdk7插入在头部，jdk8插入在尾部
jdk7中，如何在头部插入？看addEntry的createEntry方法：

    /**
     * Like addEntry except that this version is used when creating entries
     * as part of Map construction or "pseudo-construction" (cloning,
     * deserialization).  This version needn't worry about resizing the table.
     *
     * Subclass overrides this to alter the behavior of HashMap(Map),
     * clone, and readObject.
     */
    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

其实很简单的道理，就是创建一个new Entry，这个newEntry的next是e（参照new Entry<>(hash,key,value,e)的实现），这个e就是之前的头结点，然后把当前table的位置放上新建的Entry，也就是插入在链表的头部了。为什么jdk7要插入在头部呢？因为插入头部效率高，不需要遍历整个链表。
jdk8中，代码放在尾部，代码实现是putVal的一段代码：

                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if

可以很容易看出是放在尾部。
jdk8中为什么新来的元素是放在节点的后面？我们知道jdk8链表大于8的时候就要树化，本身就要算这个链表的长度有多大，链表算大小就要一个个去遍历了，遍历到了才能知道数量，也可以直接把数据放到后面了，这样也是很方便，减少了把数据移动到头数组位置这一步，效率也提高了。而且，jdk7中出现的那个HashMap死循环bug不会有了，因为这里是只是平移，没有调换顺序。

3、jdk8的hash算法有所简化

jdk7默认初始化大小16，加载因子0.75。如果传入了size，会变为大于等于当前值的2的n次方的最小的数。

        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);

为什么是2次方数？因为indexFor方法的时候，h&(length-1),length是2的次方，那么length-1总是00011111等后面都是1的数，h&它之后，其实就相当于取余，与的效率比取余高，所以用了这种方式达到高效率。
下面是jdk7的indexfor的实现：

    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

另外在hash函数里面方法里面，数会经过右移，为什么要右移？因为取余操作都是操作低位，hash碰撞的概率会提高，为了减少hash碰撞，右移就可以将高位也参与运算，减少了hash碰撞。
哈希函数如下：

    /**
     * Retrieve object hash code and applies a supplemental hash function to the
     * result hash, which defends against poor quality hash functions.  This is
     * critical because HashMap uses power-of-two length hash tables, that
     * otherwise encounter collisions for hashCodes that do not differ
     * in lower bits. Note: Null keys always map to hash 0, thus index 0.
     */
    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

所以总体来说jdk7的hash算法复杂
jdk8的右移比较简单，没有jdk7那么复杂。
jdk8的哈希函数如下：

    /* ---------------- Static utilities -------------- */

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

为什么jdk8的hash函数会变简单？jdk8中我们知道用的是链表过度到红黑树，效率会提高，所以jdk8提高查询效率的地方由红黑树去实现，没必要像jdk那样右移那么复杂。

扩容机制有所优化

在jdk8中，实际上也是通过取余找到元素所在的数组的位置的，jdk8里面取余的方式在putVal里面：

i = (n - 1) & hash

上面那段代码就是。
我们假设，在扩容之前，key取余之后留下了n位。扩容之后，容量变为2倍，所以key取余得到的就有n+1位。在这n+1位里面，如果第1位是0，那么扩容前后这个key的位置还是在相同的位置（因为hash相同，并且余数的第1位是0，和之前n位的时候一样，所以余数还是一样，位置就一样了）；如果这n+1位的第一位是1，那么就和之前的不同，那么这个key就应该放在之前的位置再加上之前整个数组的长度的位置，也就是这段代码：

                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;

这样子就减少了移动所有数据带来的消耗，jdk为了优化，真是“费尽心思”。

其他区别

节点表示

在jdk7中，节点是Entry，在jdk8中节点是Node，其实是一样的。

关于jdk7HashMap的一些细节

1、jdk7里面addEntry方法扩容的条件size>threshold，还有一个很容易忽略的，就是null!=table[bucketIndex],这个是什么意思？意思是如果当前放进来的值的那个位置也被占用了，才进行扩容，否则还是放到那个空的位置就行了，反正不存在hash冲突。（但是在jdk8里面取消了这个限制，只要达到了threshold，不管原来的位置空不空，还是要扩容）

2、jdk7resize方法多线程死循环的bug：http://www.importnew.com/22011.html 在jdk8的这个bug已经解决了。

关于jdk8HashMap的一些细节

1、jdk8 默认初始化大小16，加载因子0.75。还有一个默认的树化的大小8。非树化大小为6，也就是红黑树的元素小于6的时候，又会变回一个链表。为什么是8和6？
参考：
HashMap在JDK1.8及以后的版本中引入了红黑树结构，若桶中链表元素个数大于等于8时，链表转换成树结构；若桶中链表元素个数小于等于6时，树结构还原成链表。因为红黑树的平均查找长度是log(n)，长度为8的时候，平均查找长度为3，如果继续使用链表，平均查找长度为8/2=4，这才有转换为树的必要。链表长度如果是小于等于6，6/2=3，虽然速度也很快的，但是转化为树结构和生成树的时间并不会太短。
还有选择6和8，中间有个差值7可以有效防止链表和树频繁转换。假设一下，如果设计成链表个数超过8则链表转换成树结构，链表个数小于8则树结构转换成链表，如果一个HashMap不停的插入、删除元素，链表个数在8左右徘徊，就会频繁的发生树转链表、链表转树，效率会很低。
2、jdk8的putVal方法：i = (n - 1) & hash这个其实也就是取余操作了。

关于HashMap的一些其他细节

1、为什么重写对象equals方法的时候，要重写hashcode，跟hashmap有关系吗？
java object类默认有一个hashcode方法，返回的是对象的地址。
hashmap是根据hashcode返回value的，如果没有重写hashcode，即使重写了equals方法之后，两个对象equals之后相同，但是hash之后却不同，在使用hashMap的时候通过一个对象作为key去获取另外一个对象为key存进去的value的时候，很有可能返回为空，这就是原因。所以和hashmap有关系，不如说和hashcode有关系。

2、ConcurrentModificationException为什么会有这个错误？
首先我们看一段代码：

  public static void main(String[] args) {
        HashMap<Integer,Integer> hashMap = new HashMap<>();
        hashMap.put(1,1);
        hashMap.put(2,2);
        for(Integer key:hashMap.keySet()){
            if(key.equals(1)){
                hashMap.remove(1);
            }
        }
    }

上面这个会报错
下面这个不会：

public static void main(String[] args) {
        HashMap<Integer,Integer> hashMap = new HashMap<>();
        hashMap.put(1,1);
        hashMap.put(2,2);
        for(Integer key:hashMap.keySet()){
            if(key.equals(2)){
                hashMap.remove(1);
            }
        }
    }

为什么呢？我们把上面的代码转换一下，我们把foreach变为下面这个也是一样的原理：

 Iterator<Integer> iterator = hashMap.keySet().iterator();
        while(iterator.hasNext()){
            Integer key = iterator.next();
            if(key.equals(1)){
                hashMap.remove(1);
            }
        }

看源码expectedModCount的位置，HashIterator.nextNode的位置会出现这个错误，看看源码。

        final Node<K,V> nextNode() {
            Node<K,V>[] t;
            Node<K,V> e = next;
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            if (e == null)
                throw new NoSuchElementException();
            if ((next = (current = e).next) == null && (t = table) != null) {
                do {} while (index < t.length && (next = t[index++]) == null);
            }
            return e;
        }

在上面的例子中，到了一定的时间，expectedModCount是2，但是modCount变成3，就会报错。
在多线程情况下，一个线程在遍历，一个线程在remove就会出现这种情况。
其实也不一定在最后面添加的判断就不会，具体看hash实现情况。

3、设计hashmap需要解决的问题，1、找到hash函数 2、解决冲突

HashMap原理jdk7和jdk8的区别