Notes on the Java Source Code: HashMap

Questions

  1. HashMap usage scenarios
  2. How HashMap works
  3. Differences between the JDK 7 and JDK 8 implementations of HashMap
  4. Is HashMap thread-safe? If not, what thread-safe alternatives are there?

Answers

1. HashMap usage scenarios

When a program needs to store a set of key-value pairs, such as a data dictionary, global variables, or other kinds of parameters, a HashMap can serve as the data structure that holds them.
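
As a minimal sketch of the data-dictionary case, the class below keeps status codes and their display labels in a HashMap. The class name StatusDictionary and its two entries are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical data dictionary: maps a status code to a display label.
final class StatusDictionary {
    private final Map<Integer, String> labels = new HashMap<>();

    StatusDictionary() {
        labels.put(0, "DISABLED");
        labels.put(1, "ENABLED");
    }

    // Returns the label for a code, or "UNKNOWN" when the code is not in the dictionary.
    String labelOf(int code) {
        return labels.getOrDefault(code, "UNKNOWN");
    }

    public static void main(String[] args) {
        StatusDictionary dict = new StatusDictionary();
        System.out.println(dict.labelOf(1)); // ENABLED
        System.out.println(dict.labelOf(7)); // UNKNOWN
    }
}
```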

2. How HashMap works

HashMap uses the key's hashCode to locate an array index quickly. If a collision occurs, it walks the bucket's linked list node by node, so the time complexity is O(n) in the length of the list. In Java 8, once a list grows beyond 8 nodes it is converted to a red-black tree (provided the table has at least 64 buckets; otherwise the table is resized instead), which brings the lookup cost down to O(log n) and improves search efficiency.

First, look at a few of the key member variables in HashMap.

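/**
 * Default array length is 16. The length is always a power of two (2^n);
 * the array can grow, and each resize doubles its length.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * Maximum array length: 2^30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * Default load factor: 0.75.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * Threshold used to decide whether a linked list should be converted to a red-black tree.
 */
static final int TREEIFY_THRESHOLD = 8;

/**
 * The array that holds the elements; its length is always a power of two and it can grow.
 */
transient Node<K,V>[] table;

/**
 * Number of key-value pairs in the HashMap.
 */
transient int size;

/**
 * threshold = capacity * load factor. The table is resized once size exceeds this threshold.
 */
int threshold;

/**
 * The load factor.
 */
final float loadFactor;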

Two parameters have the greatest effect on performance:

  1. Capacity: the number of buckets in the hash table. The initial capacity is the capacity at the time the hash table is created.
  2. Load factor: a measure of how full the hash table is allowed to get before its capacity is automatically increased.

When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, its internal data structures are rebuilt) so that it has approximately twice the number of buckets.
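
The arithmetic behind this rule can be sketched as follows. This is an illustration, not JDK code: the class ResizeMath and its method names are made up for this example, and the sketch ignores the MAXIMUM_CAPACITY limit.

```java
// Sketch of the resize arithmetic; ResizeMath is a hypothetical helper, not part of the JDK.
final class ResizeMath {
    // Number of entries a table of the given capacity may hold before it is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Capacity after one resize: the table doubles (the MAXIMUM_CAPACITY cap is ignored here).
    static int nextCapacity(int capacity) {
        return capacity << 1;
    }

    public static void main(String[] args) {
        int capacity = 16; // DEFAULT_INITIAL_CAPACITY
        for (int i = 0; i < 3; i++) {
            // Prints thresholds 12, 24 and 48 for capacities 16, 32 and 64.
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, 0.75f));
            capacity = nextCapacity(capacity);
        }
    }
}
```

With the defaults, the 13th insertion pushes size past the threshold of 12 and triggers the first resize, from 16 to 32 buckets.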

A bucket node defines the hash value, the key, the value, and a reference to the next node:

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

Now look at the two most commonly used methods, get and put.

Both get and put need the hash method to compute the array index.

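/*
 * First take the key's hashCode: h = key.hashCode().
 * Then unsigned-right-shift h by 16 bits.
 * XOR the two results to get the key's final hash value.
 * The bucket index is then computed as i = (n - 1) & hash.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}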

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

/**
 * Implements Map.get and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @return the node, or null if none
 */
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // If the bucket is not empty, first check whether the first node is the one we want
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // On a collision, move on to the following nodes and compare keys
        if ((e = first.next) != null) {
            // If the bin is a red-black tree, search the tree: O(log n)
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                // Linked-list node: O(n)
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

The get method is fairly simple. The general idea is:

  1. Check the first node in the bucket; if its hash and key both match, it is a hit and is returned.
  2. Otherwise there is a collision. If the bucket holds a red-black tree, search the tree for a node with an equal hash and key; the time complexity is O(log n).
  3. If the bucket holds a linked list, traverse it to find a node with an equal hash and key; the time complexity is O(n).
  4. If no matching node is found in the bucket, return null.
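
To make the index calculation concrete, here is a small sketch that reuses the hash() logic shown above and applies the (n - 1) & hash mask. The class BucketIndex and the method names spread and indexFor are hypothetical; only the two expressions inside them come from the HashMap source.

```java
// Sketch of how a key is mapped to a bucket index; BucketIndex is a hypothetical helper.
final class BucketIndex {
    // Same expression as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // tableLength must be a power of two so that (tableLength - 1) is an all-ones bit mask.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // Integer.hashCode() returns the int value itself, which keeps the arithmetic visible.
        System.out.println(indexFor(1, 16));     // 1
        System.out.println(indexFor(17, 16));    // 1: collides with key 1
        System.out.println(indexFor(65536, 16)); // 1: the high bits decide; plain masking would give 0
        System.out.println(indexFor(null, 16));  // 0: the null key always maps to bucket 0
    }
}
```

Keys 1 and 17 share a bucket in a 16-slot table, which is exactly the situation in which getNode has to walk past the first node.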

Let's look at the put method:

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the table is null or empty, initialize it via resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // Use the hash to find the bucket; if it is empty (no collision), create a new node holding key and value
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // Otherwise there is a collision with an existing node; store the data in the bucket's chain
    else {
        Node<K,V> e; K k;
        // If the first node has the same hash and key as the new entry, remember that node
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // If the bin is a red-black tree, insert the entry the red-black tree way
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // Otherwise write the data by appending a new linked-list node
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // The list has grown too long: convert it to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Leave the loop when a node with the same key is found
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // Overwrite the existing node's value
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

The general idea of the put method is:

  1. Locate the bucket from the key's hash value; if there is no collision, write the key and value into a new node.
  2. If there is a collision, store the entry in the bucket's linked list.
  3. If the list grows longer than TREEIFY_THRESHOLD (default 8), convert the list to a red-black tree.
  4. If the key already exists, replace the old value.
  5. If the number of entries now exceeds threshold (load factor * current capacity), expand the table with resize().
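
The collision and replacement steps can be observed from the outside with a small experiment. The sketch below assumes a hypothetical key class, CollidingKey, whose hashCode is deliberately constant so that every key lands in the same bucket; it is meant only to show the return values of put, not how real keys should be written.

```java
import java.util.HashMap;
import java.util.Map;

final class PutDemo {
    // Hypothetical key with a constant hashCode, so all instances collide in one bucket.
    static final class CollidingKey {
        final String name;

        CollidingKey(String name) { this.name = name; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof CollidingKey && ((CollidingKey) o).name.equals(name);
        }
    }

    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        // New key in an empty bucket: put returns null.
        System.out.println(map.put(new CollidingKey("a"), 1)); // null
        // Same hash, different key: chained in the same bucket, put still returns null.
        System.out.println(map.put(new CollidingKey("b"), 2)); // null
        // Key already present: the value is replaced and the old value is returned.
        System.out.println(map.put(new CollidingKey("a"), 3)); // 1
        System.out.println(map.size());                        // 2
        System.out.println(map.get(new CollidingKey("b")));    // 2
    }
}
```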

3. Differences between the JDK 7 and JDK 8 implementations of HashMap

The main difference is that the JDK 7 HashMap is implemented as an array plus linked lists, whereas the JDK 8 HashMap uses an array plus linked lists plus red-black trees, which improves query efficiency.

[Figure: Java 8 HashMap structure]

4. HashMap thread safety

HashMap is not thread-safe: its code does nothing to handle concurrency, so using it from multiple threads may leave the data inconsistent. To get a thread-safe map, wrap the HashMap with Collections.synchronizedMap, or use ConcurrentHashMap. ConcurrentHashMap will be studied in a follow-up.
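
A rough sketch of both options follows. The class ThreadSafeMaps, the method countConcurrently, and the key "hits" are hypothetical names for this example. It relies on merge being performed atomically by both ConcurrentHashMap and the wrapper returned by Collections.synchronizedMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ThreadSafeMaps {
    // Starts `threads` workers that each increment the counter stored under "hits" `perThread` times.
    static int countConcurrently(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge is atomic on both map types used in main below.
                    map.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        Map<String, Integer> concurrent = new ConcurrentHashMap<String, Integer>();
        Map<String, Integer> synced = Collections.synchronizedMap(new HashMap<String, Integer>());
        System.out.println(countConcurrently(concurrent, 4, 10_000)); // 40000
        System.out.println(countConcurrently(synced, 4, 10_000));     // 40000
    }
}
```

Passing a plain HashMap to the same method is not safe: updates can be lost or the map can be left in an inconsistent state, so the result is unpredictable.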

Origin www.cnblogs.com/universal/p/11128264.html