HashMap Source Code Reading Notes (JDK 1.8)

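HashMap is one of the classic containers shipped with the JDK. I recently had some spare time, so I read through its source code myself. The implementation differs between JDK versions; this article looks only at the JDK 1.8 source and does not discuss how the versions differ. Let's get started.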

1. The hash function

The first thing the map's put method does is call the hash function, which returns the hash value for the key. The function looks like this:

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

The function is short, but I suspect many readers are puzzled at this point about what it is for, so I wrote a small demo to help explain it.

public class Test {

    public static void main(String[] args) {
        // An int is 4 bytes = 4 * 8 = 32 bits.
        // Prints 110101001101100011010011000, the significant bits only. Padded with leading zeros,
        // the hashCode of "uu888" in binary is:  00000110101001101100011010011000
        System.out.println(Integer.toBinaryString("uu888".hashCode()));
        // Unsigned right shift by 16 gives 11010100110 (significant bits only).
        // Padded with leading zeros:             00000000000000000000011010100110
        // In other words: drop the low 16 bits of the hashCode and fill the high bits with zeros.
        System.out.println(Integer.toBinaryString("uu888".hashCode() >>> 16));
        // XOR the two values (equal bits give 0, different bits give 1): 110101001101100000000111110
        // Padded with leading zeros:             00000110101001101100000000111110
        System.out.println(Integer.toBinaryString("uu888".hashCode() ^ ("uu888".hashCode() >>> 16)));
        // Convert the binary result to a decimal number
        int hashCode = Integer.parseInt("00000110101001101100000000111110", 2);
        System.out.println(hashCode);
        // Must print the same number as the line above
        System.out.println(getHash("uu888"));
    }

    // Copy of HashMap.hash(): ^ is bitwise XOR, >>> is unsigned right shift
    private static int getHash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

}

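The output is as follows:

    110101001101100011010011000
    11010100110
    110101001101100000000111110
    111591486
    111591486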

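The output above only serves to verify my derivation. Integer.toBinaryString() returns the significant binary digits of an int without leading zeros; since bitwise operations work on all 32 bits, I padded the values in the comments. In one sentence: the hash function returns the key's hashCode XORed with that same hashCode shifted right by 16 bits (unsigned). The purpose is to mix the high 16 bits of the 32-bit value into its low 16 bits. The slot index is later computed as (n - 1) & hash, which for a small table only looks at the low bits, so without this step hashCodes that differ only in their high bits would always collide. Mixing the two halves spreads the entries over the slots as evenly as possible.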

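Note: an unsigned right shift by 16 moves the high 16 bits of the 32-bit value into the low 16 bits and fills the high 16 bits with zeros, as the example above shows.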
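To see why this mixing matters, here is a small sketch of my own (not JDK code). The class SpreadDemo, its helper names and the two sample hashCode values are made up for illustration; spread() copies the XOR step of hash(), and indexFor() uses the (n - 1) & hash formula that HashMap uses to pick a slot.

```java
final class SpreadDemo {
    // Same mixing step as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Slot index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;          // default table length
        int a = 0x00010000;  // two hypothetical hashCodes that differ
        int b = 0x00020000;  // only in their high 16 bits

        // Without mixing, only the low 4 bits count: prints "0 0"
        System.out.println(indexFor(a, n) + " " + indexFor(b, n));
        // With mixing, the high bits influence the index: prints "1 2"
        System.out.println(indexFor(spread(a), n) + " " + indexFor(spread(b), n));
    }
}
```

Without the mixing step both values fall into slot 0 of a 16-slot table; with it they land in slots 1 and 2.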

2. Annotated source of HashMap's put() method

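    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key                                   -> hash value of the key being stored
     * @param key the key                                         -> the key being stored
     * @param value the value to put                              -> the value being stored
     * @param onlyIfAbsent if true, don't change existing value   -> if true, do not overwrite an existing non-null value
     * @param evict if false, the table is in creation mode.      -> false only while the map is being built (constructor, clone, deserialization);
     *                                                               it is passed on to afterNodeInsertion, the hook LinkedHashMap uses to decide
     *                                                               whether to evict its eldest entry
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab;
        Node<K,V> p;
        int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // The table (the Node array) is not initialized or has length 0:
            // call resize() to initialize it. resize() is covered separately below.
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            // Because n is a power of two, (n - 1) & hash is equivalent to hash modulo the
            // table length; it selects the slot for this entry.
            // The node p in that slot is null, so nothing is stored there yet:
            // put the new entry directly into the slot.
            tab[i] = newNode(hash, key, value, null);
        else {
            // The slot already holds a node, p is not null
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                // The key of p equals the key being put: remember p in e
                e = p;
            else if (p instanceof TreeNode)
                // p is a red-black tree node: insert the new key/value into the tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // p is the head of a singly linked list: append the new key/value at the tail
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        // p.next is null, so p is the current tail: append after it
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // The list already held TREEIFY_THRESHOLD (8) nodes before this append:
                            // convert it to a red-black tree (treeifyBin resizes instead while the
                            // table is shorter than MIN_TREEIFY_CAPACITY = 64)
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        // Found a node with the same key while walking to the tail: stop searching
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // The key already exists; onlyIfAbsent decides whether the new value replaces the old one.
                // The old value is replaced when onlyIfAbsent == false or oldValue == null.
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                // Callback after a node has been accessed: an extension point
                // (Callbacks to allow LinkedHashMap post-actions)
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            // The number of entries after the insertion exceeds threshold: grow the table
            resize();
        // Callback after a node has been inserted: an extension point
        afterNodeInsertion(evict);
        return null;
    }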
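The onlyIfAbsent branch can be observed from outside through the public API. In JDK 1.8, put() calls putVal with onlyIfAbsent = false and putIfAbsent() calls it with onlyIfAbsent = true. The following is only a small sketch; the class name PutSemanticsDemo and the helper run() are my own and exist purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PutSemanticsDemo {
    // Collects what each call returns, in order.
    static List<Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();

        results.add(map.put("a", 1));          // null: no previous mapping
        results.add(map.put("a", 2));          // 1: old value returned, value replaced
        results.add(map.putIfAbsent("a", 3));  // 2: key present, value kept
        results.add(map.get("a"));             // 2

        map.put("b", null);                    // key present, but mapped to null
        results.add(map.putIfAbsent("b", 4));  // null: oldValue == null, so 4 is stored
        results.add(map.get("b"));             // 4
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [null, 1, 2, 2, null, 4]
    }
}
```

The last two results show the oldValue == null part of the condition: an existing key that maps to null is still overwritten by putIfAbsent().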

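The code above calls resize() in two places: once when the table has not been initialized, and once after a new entry has been added, when the total number of entries (size) exceeds threshold. The next section looks at this function in detail.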

3. Annotated source of HashMap's resize() method

    /**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     */
    final Node<K,V>[] resize() {
        // Reference to the old table
        Node<K,V>[] oldTab = table;
        // Length of the old table; 0 if it has not been initialized
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // Threshold of the old table
        int oldThr = threshold;
        // Derive the new table length and threshold from the old ones
        int newCap, newThr = 0;
        if (oldCap > 0) {
            // The old table has a length > 0
            if (oldCap >= MAXIMUM_CAPACITY) {
                // The old length is already >= MAXIMUM_CAPACITY (1 << 30 = 1073741824,
                // roughly half of the largest int): set threshold to
                // Integer.MAX_VALUE (2147483647) and return the old table unchanged
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // Otherwise the new length is twice the old one (oldCap << 1 = oldCap * 2).
            // If the doubled length is below MAXIMUM_CAPACITY and the old length is
            // >= DEFAULT_INITIAL_CAPACITY (1 << 4 = 16), the threshold is doubled as well.
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            // The old threshold is > 0: use it as the length of the new table
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // A map created with the no-argument constructor takes this branch on its first put.
            // New length newCap = DEFAULT_INITIAL_CAPACITY (1 << 4 = 16)
            // New threshold newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY) = 0.75 * 16 = 12
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            // The new threshold has not been set yet. If both the new length and
            // new length * load factor are below MAXIMUM_CAPACITY (1073741824), the new
            // threshold is (int)(newCap * loadFactor); otherwise it is Integer.MAX_VALUE (2147483647)
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        // Allocate the new table with the length computed above and point table at it
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            // The old table is not null (this is a real resize): move its entries to the new table
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    // Clear the slot in the old table
                    oldTab[j] = null;
                    if (e.next == null)
                        // The slot holds a single node: compute its slot in the new table
                        // (e.hash & (newCap - 1), i.e. the modulo operation) and store it there
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // The slot holds a red-black tree: split and rebuild it
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // The slot holds an ordinary singly linked list: walk it and redistribute its nodes
                        // Nodes whose index stays j: loHead marks the list, loTail is used to append
                        Node<K,V> loHead = null, loTail = null;
                        // Nodes whose index becomes j + oldCap: hiHead marks the list, hiTail is used to append
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                // (e.hash & oldCap) == 0 means the new index equals the old index
                                if (loTail == null)
                                    // The list is still empty: the head points at this first node
                                    loHead = e;
                                else
                                    // The tail exists: append the current node after it
                                    loTail.next = e;
                                // Move the tail to the current node, now the last node of the list
                                loTail = e;
                            }
                            else {
                                // Same steps for the list that moves to j + oldCap
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            // The tail exists, so the list is not empty: terminate it
                            // and put it into bucket j
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
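The lo/hi split of a linked list can be tried in isolation. The sketch below is a simplified stand-in and not the JDK code: SplitDemo, its stripped-down Node class (hash and next only) and the sample hashes 5, 21, 37 and 53 are my own assumptions, chosen because all four land in bucket 5 of a 16-slot table.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitDemo {
    // Minimal stand-in for HashMap.Node: only the fields the split needs.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Splits one bucket's list into {lo, hi} heads, preserving the original order.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if ((e.hash & oldCap) == 0) {   // bit is 0: index stays the same
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                        // bit is 1: index moves by oldCap
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        if (loTail != null) loTail.next = null;
        if (hiTail != null) hiTail.next = null;
        return new Node[] { loHead, hiHead };
    }

    // Lists the hashes of a chain, for printing.
    static List<Integer> hashes(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.next) out.add(e.hash);
        return out;
    }

    public static void main(String[] args) {
        int oldCap = 16, j = 5;
        Node chain = new Node(5, new Node(21, new Node(37, new Node(53, null))));
        Node[] parts = split(chain, oldCap);
        System.out.println("stay at " + j + ": " + hashes(parts[0]));            // [5, 37]
        System.out.println("move to " + (j + oldCap) + ": " + hashes(parts[1])); // [21, 53]
    }
}
```

The single bit tested by e.hash & oldCap is exactly the extra bit that the mask newCap - 1 gains over oldCap - 1, which is why no node needs its full index recomputed: 5 & 31 and 37 & 31 are both 5, while 21 & 31 and 53 & 31 are both 21.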

4. Annotated source of HashMap's get() method

    /**
     * Implements Map.get and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab;
        Node<K,V> first, e;
        int n; K k;
        // First take the hash modulo the table length n, (n - 1) & hash,
        // and read the first node of that slot into first
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                // The first node is the one we are looking for: return it directly
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    // The slot holds a red-black tree: look the key up in the tree
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // The slot holds an ordinary linked list: walk it to find the key
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
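Both putVal and getNode locate the slot with (n - 1) & hash, which the comments describe as taking the hash modulo the table length. The short check below is my own sketch (the class name IndexMaskDemo and the sample hashes are arbitrary); it compares the mask with Math.floorMod for a table length that is a power of two.

```java
final class IndexMaskDemo {
    // Slot index as computed in putVal and getNode; n must be a power of two.
    static int maskIndex(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = { 0, 1, 17, 111591486, -1, Integer.MIN_VALUE };
        for (int h : hashes) {
            // The two numbers after the arrow are the mask result and the floor modulus.
            System.out.println(h + " -> " + maskIndex(h, n) + " / " + Math.floorMod(h, n));
        }
    }
}
```

The two results agree for every sample, including the negative hashes, where the plain % operator would return a negative number and could not be used as an array index. The equivalence only holds because the table length is always a power of two.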

That covers the source of HashMap's most important functions. The trickier red-black tree handling is left for a separate article.
