LinkedHashMap principle and source code analysis

Introduction to Design

First, look at its inheritance hierarchy

LinkedHashMap inherits from HashMap and has all of its functionality. In addition, it uses a doubly linked list to maintain either the insertion order or the access order of the data. To understand LinkedHashMap, you first need to understand how HashMap is implemented; see "HashMap principle and source code analysis".

The figure shows the data structure of LinkedHashMap. As in HashMap, the list in a bucket may be converted to a red-black tree; that is not shown here, but tree nodes are linked into the doubly linked list in the same way.
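To make the effect of the doubly linked list concrete, here is a minimal sketch using only the public java.util API. The class and method names (InsertionOrderDemo, insertionOrderKeys) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InsertionOrderDemo {
    // Returns the keys of a LinkedHashMap in iteration order.
    static List<String> insertionOrderKeys() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("banana", 1);
        map.put("apple", 2);
        map.put("cherry", 3);
        // Re-putting an existing key updates the value but, in the default
        // insertion-order mode, does not change the key's position.
        map.put("banana", 10);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(insertionOrderKeys()); // [banana, apple, cherry]
    }
}
```

A plain HashMap makes no promise about iteration order, so the same puts could come back in any order there.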

Questions to think about

  1. What is the difference between LinkedHashMap and HashMap?
  2. How does LinkedHashMap maintain insertion order and access order?
  3. What is the LRU cache eviction strategy?
  4. What does accessOrder mean in LinkedHashMap?
  5. Can LinkedHashMap implement an LRU cache eviction strategy?
  6. Other questions related to HashMap are not repeated here.

Question 3: What is the LRU cache eviction strategy?

Understanding this question helps us understand what "ordering" means in LinkedHashMap. LRU stands for Least Recently Used. When we use a cache, we usually set aside a limited amount of memory for it, so what happens when that memory runs out? Some of the less frequently used data has to be evicted. There are many eviction strategies, and LRU is one of them: it evicts the data that was used least recently. LinkedHashMap's approach is to move recently accessed entries to the end of the list, so the head of the list holds the entries that have gone unused the longest. When data needs to be evicted, it is removed from the head.

We look for the answers to the other questions in the source code analysis.

Variable definitions

Some of the fields are defined as follows. It does not matter if they are unclear for now; the code below explains them one by one.

    // Doubly linked list node that inherits the basic fields of HashMap.Node
    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }

    private static final long serialVersionUID = 3801124242820219131L;

    // Head of the doubly linked list
    transient LinkedHashMap.Entry<K,V> head;

    // Tail of the doubly linked list
    transient LinkedHashMap.Entry<K,V> tail;


    // Whether entries are ordered by access
    // Note: ordering here means LRU-like behavior: an accessed node is moved to the end of the list
    final boolean accessOrder;


This is the definition of HashMap's Node:

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

As we can see, Entry extends HashMap's Node and adds two fields, before and after, which point to the previous and next nodes of the current node in the doubly linked list.

Constructors

    /**
     * Creates a HashMap with the given initial capacity and load factor.
     * Ordering by access is disabled by default.
     * @param initialCapacity
     * @param loadFactor
     */
    public LinkedHashMap(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor);
        accessOrder = false;
    }

    /**
     * Creates a HashMap with the given initial capacity; ordering by access is disabled.
     * @param initialCapacity
     */
    public LinkedHashMap(int initialCapacity) {
        super(initialCapacity);
        accessOrder = false;
    }

    /**
     * Creates a HashMap with the default parameters; ordering by access is disabled.
     */
    public LinkedHashMap() {
        super();
        accessOrder = false;
    }


    /**
     * Creates a HashMap from an existing Map.
     * @param m
     */
    public LinkedHashMap(Map<? extends K, ? extends V> m) {
        // First create a HashMap with the default parameters
        super();
        // Ordering by access is disabled
        accessOrder = false;
        // Call HashMap's putMapEntries method to copy the data into the map
        putMapEntries(m, false);
    }

    /**
     * Creates a HashMap with the given initial capacity and load factor, and lets the caller choose whether to order by access.
     * @param initialCapacity
     * @param loadFactor
     * @param accessOrder
     */
    public LinkedHashMap(int initialCapacity,
                         float loadFactor,
                         boolean accessOrder) {
        super(initialCapacity, loadFactor);
        this.accessOrder = accessOrder;
    }


LinkedHashMap's constructors fall into two kinds:

  • Those that call the corresponding HashMap constructor and leave accessOrder set to false (insertion order)
  • The one that takes an accessOrder argument, which can enable ordering by access
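As a small illustration of the second kind of constructor, the sketch below passes accessOrder = true and then reads one key. The class name AccessOrderDemo is hypothetical; 16 and 0.75f are simply HashMap's documented defaults.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessOrderDemo {
    // Builds an access-ordered map, reads "B", and returns the resulting key order.
    static List<String> keysAfterGet() {
        // The third argument, true, switches the map from insertion order to access order.
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("A", 1);
        map.put("B", 2);
        map.put("C", 3);
        map.put("D", 4);
        map.get("B"); // accessing B moves it to the end of the list
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet()); // [A, C, D, B]
    }
}
```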

Insert data

When we call put(key, val), we are actually calling HashMap's put method. To maintain LinkedHashMap's doubly linked list, HashMap's put logic calls hook methods that the subclass overrides. See the code below.

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // If the table is null or empty, initialize it first
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // If the slot selected by the hash is null, put the data there directly
        if ((p = tab[i = (n - 1) & hash]) == null)
            // Create the node
            tab[i] = newNode(hash, key, value, null);
        else {
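            // The slot already holds data, so check whether the key exists in its list
            // If it does, update or keep the old value depending on the policy
            // If it does not, insert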
            Node<K,V> e; K k;
            // The key already exists in the slot's first node; keep a reference and let onlyIfAbsent decide later whether to update
            if (p.hash == hash &&
                    ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // If the current node is a tree node, put the data into the red-black tree
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // binCount temporarily counts the nodes in the list
                for (int binCount = 0; ; ++binCount) {
                    // If no matching key exists, insert directly
                    if ((e = p.next) == null) {
                        // Create a new node
                        p.next = newNode(hash, key, value, null);
                        // When the list holds 8 (TREEIFY_THRESHOLD) or more nodes
                        // the bin needs to be treeified
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // If a matching node already exists, stop; onlyIfAbsent decides later whether to update
                    if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            // If there is an existing value to update
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // Whether to update the value
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                // Hook implemented by the subclass: what to do after an access
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // Modification count + 1
        // This field is used later by iterators for fail-fast
        ++modCount;
        // If the size exceeds threshold, resize the table
        if (++size > threshold)
            resize();
        // Hook implemented by the subclass: what to do after an insertion
        afterNodeInsertion(evict);
        return null;
    }

The code above is HashMap's putVal method; readers of the HashMap source code analysis will already be familiar with it. For LinkedHashMap, we focus on these three pieces of code:

// Creating a node also links it into LinkedHashMap's doubly linked list
tab[i] = newNode(hash, key, value, null);

// Called when the key being inserted already exists
// Meaning: the entry was accessed, so its position in the order may need to be adjusted
afterNodeAccess(e);

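// Called after a new entry is inserted successfully
// Meaning: a subclass that implements a cache can do post-insertion work here, such as removing the head entry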
afterNodeInsertion(evict);

We analyze them one by one, starting with

newNode()

    
    Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p =
            new LinkedHashMap.Entry<K,V>(hash, key, value, e);
        linkNodeLast(p);
        return p;
    }
    
    // Link the inserted node at the end of the list
    private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
        LinkedHashMap.Entry<K,V> last = tail;
        tail = p;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
    }

As the code shows, when data is inserted into LinkedHashMap, the nodes are ordered by insertion: each newly inserted node is appended to the end of the list.
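The tail-append step can be tried in isolation. The following is a simplified sketch, not the JDK code: LinkLastSketch and its Node class are hypothetical stand-ins that keep only the before/after pointers and a key, and linkLast follows the same steps as linkNodeLast.

```java
import java.util.ArrayList;
import java.util.List;

class LinkLastSketch {
    // Simplified stand-in for LinkedHashMap.Entry: list pointers plus a key.
    static final class Node {
        final String key;
        Node before, after;
        Node(String key) { this.key = key; }
    }

    Node head, tail;

    // The new node becomes tail; if the list was empty it is also head,
    // otherwise it is linked after the old tail.
    void linkLast(Node p) {
        Node last = tail;
        tail = p;
        if (last == null) {
            head = p;
        } else {
            p.before = last;
            last.after = p;
        }
    }

    // Walks the list from head to tail and collects the keys.
    List<String> keys() {
        List<String> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.after) {
            out.add(e.key);
        }
        return out;
    }

    static List<String> demo() {
        LinkLastSketch list = new LinkLastSketch();
        list.linkLast(new Node("A"));
        list.linkLast(new Node("B"));
        list.linkLast(new Node("C"));
        return list.keys();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [A, B, C]
    }
}
```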

afterNodeAccess(e)

HashMap calls LinkedHashMap's implementation of this method. For the entry that was found, it decides, based on a condition, whether to move it to the tail, as shown in the figure.

As the figure shows, the doubly linked list is A -> B -> C -> D before get(B) is called, and becomes A -> C -> D -> B afterwards. The source code is as follows:

    // Make the given node the tail node
    void afterNodeAccess(Node<K,V> e) { // move node to last
        LinkedHashMap.Entry<K,V> last;
        // If ordering by access is enabled and the current node is not the tail
        if (accessOrder && (last = tail) != e) {
            // Assign the current node to p, and its previous and next nodes to b and a
            LinkedHashMap.Entry<K,V> p =
                (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
            // The current node becomes the tail, so its next node must be null
            p.after = null;
            // If there is no previous node, the current node's next node becomes head
            if (b == null)
                head = a;
            // Otherwise, the previous node's after pointer skips to the current node's next node, which unlinks e in the forward direction
            else
                b.after = a;
            // If the current node's next node is not null
            // point its before pointer at the current node's previous node, which fully unlinks e
            if (a != null)
                a.before = b;
            // Otherwise the current node's previous node is the last node
            else
                last = b;
            // If last is null, the current node is the only node, so it is also head
            if (last == null)
                head = p;
            else {
                // Otherwise link p after the last node of the list
                p.before = last;
                last.after = p;
            }
            // Finally make the current node the tail
            tail = p;
            // Modification count + 1, used by iterators for fail-fast
            ++modCount;
        }
    }

When this method is called, it first checks whether accessOrder is true, that is, whether ordering by access is enabled. If so, it:

  • Moves the current node to the end of the list
  • Relinks the nodes that were before and after it at its original position
  • Handles the edge cases where the list holds only one or two nodes
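Since putVal calls afterNodeAccess for an existing key, a put that overwrites a value counts as an access too. The sketch below (hypothetical class AccessTriggersDemo, public API only) shows which calls reorder an access-ordered map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessTriggersDemo {
    static List<String> demo() {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("A", 1);
        map.put("B", 2);
        map.put("C", 3);        // order: A, B, C
        map.put("A", 10);       // existing key: treated as an access, order: B, C, A
        map.get("B");           // order: C, A, B
        map.containsKey("C");   // containsKey is not an access, order unchanged
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [C, A, B]
    }
}
```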

afterNodeInsertion

Finally, whenever put succeeds in adding a new entry, this method is called. The afterNodeInsertion invoked here is the implementation in LinkedHashMap, the subclass of HashMap. The code is as follows:

    // Callback after HashMap inserts data
    void afterNodeInsertion(boolean evict) { // possibly remove eldest
        LinkedHashMap.Entry<K,V> first;
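        // If evict is true
        // and removeEldestEntry returns true
        // (a subclass implementing a cache policy overrides it to decide whether the first node should be evicted)
        // By default it returns false, so nothing is removed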
        if (evict && (first = head) != null && removeEldestEntry(first)) {
            K key = first.key;
            // Remove the corresponding HashMap node
            removeNode(hash(key), key, null, false, true);
        }
    }

As we can see, this method gives us the ability to evict the entry at the head of the list when data is inserted. The removeNode() it calls is HashMap's removeNode() method.

Delete data
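This also answers question 5. A minimal LRU cache sketch follows, assuming a fixed maximum number of entries; the class name LruCache and its capacity field are made up for this example, while removeEldestEntry is the protected hook that afterNodeInsertion calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    // Hypothetical limit on the number of cached entries.
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: least recently used entry sits at head
        this.capacity = capacity;
    }

    // Called by afterNodeInsertion with the head entry; returning true evicts it.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    static List<String> demo() {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("A", 1);
        cache.put("B", 2);
        cache.get("A");    // A is now the most recently used, order: B, A
        cache.put("C", 3); // size 3 exceeds capacity 2, so head entry B is evicted
        return new ArrayList<>(cache.keySet());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [A, C]
    }
}
```

This sketch is not thread-safe; concurrent use would need external synchronization, for example by wrapping it with Collections.synchronizedMap.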

    public V remove(Object key) {
        Node<K,V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
                null : e.value;
    }
    
    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        // If the table has a non-empty slot for the key's hash
        if ((tab = table) != null && (n = tab.length) > 0 &&
                (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            // If the key matches the first node in the slot, the data is found
            if (p.hash == hash &&
                    ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            // Otherwise search the slot's list
            else if ((e = p.next) != null) {
                // If it is a tree, search the tree nodes
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        // Walk the slot's list looking for the matching data
                        if (e.hash == hash &&
                                ((k = e.key) == key ||
                                        (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            // If the node for the key was found
            // the following checks decide whether it should be deleted
            // For a plain remove, matchValue is false, so the node is deleted
            if (node != null && (!matchValue || (v = node.value) == value ||
                    (value != null && value.equals(v)))) {
                // If it is a tree node, remove it from the tree
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                // If it is the first node in the table slot, put its next node into the slot
                else if (node == p)
                    tab[index] = node.next;
                // If it is further down the slot's list, unlink it from the list
                else
                    p.next = node.next;
                // Modification count + 1, used by iterators for fail-fast
                ++modCount;
                // Size - 1
                --size;
                // Hook left for the subclass: what to do after a removal
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

This code was covered in the HashMap analysis. What matters here is the afterNodeRemoval(node) call. LinkedHashMap uses a doubly linked list to maintain the order of its nodes, so when an entry is deleted, its node must also be unlinked from that list. This method does that, as follows:

    // Called after a HashMap node has been removed
    void afterNodeRemoval(Node<K,V> e) { // unlink
        LinkedHashMap.Entry<K,V> p =
            (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        // Clear the current node's pointers
        p.before = p.after = null;
        // If there is no previous node, the node after e becomes head
        if (b == null)
            head = a;
        // Otherwise the previous node's after pointer is set to e's next node, which unlinks e in the forward direction
        else
            b.after = a;
        // If the current node's next node is null, the current node was the tail
        // so its previous node becomes the tail
        if (a == null)
            tail = b;
        // Otherwise e's next node points back to e's previous node, which fully unlinks e
        else
            a.before = b;
    }

This code deletes a node from a linked list. Unlinking a node involves:

  • Connecting its previous and next nodes to each other
  • Clearing the pointers of the node being deleted
  • Handling the special positions (head, tail)
  • Handling the case where it is the only node

Retrieve data
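The observable effect of these unlinking steps can be checked through the public API. In this sketch (hypothetical class RemovalOrderDemo), the head, a middle node, and the tail are removed in turn, and the remaining keys keep their relative order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RemovalOrderDemo {
    static List<String> demo() {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (String k : new String[] {"A", "B", "C", "D", "E"}) {
            map.put(k, 0);
        }
        map.remove("A"); // head removed: B becomes head
        map.remove("C"); // middle removed: B and D are linked to each other
        map.remove("E"); // tail removed: D becomes tail
        map.put("F", 0); // a new entry is appended after the new tail D
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [B, D, F]
    }
}
```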

    /**
     * Retrieves the value mapped to a key
     * @param key
     * @return
     */
    public V get(Object key) {
        Node<K,V> e;
        // Get the corresponding node through HashMap's getNode() method
        if ((e = getNode(hash(key), key)) == null)
            return null;
        // If ordering by access is enabled
        if (accessOrder)
            // Adjust the node's position to maintain access order
            // by moving the current node to the end
            afterNodeAccess(e);
        return e.value;
    }

This code calls HashMap's getNode to fetch the node. If ordering by access is enabled, it then calls afterNodeAccess to move the accessed entry to the end of the list.

fail-fast

fail-fast is a strategy of failing quickly: if the map is modified while it is being traversed, for example by another thread, a ConcurrentModificationException is thrown. We look at key traversal as an example (LinkedValues and the other views work the same way).

    final class LinkedKeySet extends AbstractSet<K> {
        // The basic operations simply delegate to the enclosing map
        public final int size()                 { return size; }
        public final void clear()               { LinkedHashMap.this.clear(); }
        public final Iterator<K> iterator() {
            return new LinkedKeyIterator();
        }
        public final boolean contains(Object o) { return containsKey(o); }
        public final boolean remove(Object key) {
            return removeNode(hash(key), key, null, false, true) != null;
        }
        public final Spliterator<K> spliterator()  {
            return Spliterators.spliterator(this, Spliterator.SIZED |
                Spliterator.ORDERED |
                Spliterator.DISTINCT);
        }
        // Visits every key in linked-list order
        public final void forEach(Consumer<? super K> action) {
            if (action == null)
                throw new NullPointerException();
            int mc = modCount;
            // Traverse starting from head
            for (LinkedHashMap.Entry<K,V> e = head; e != null; e = e.after)
                action.accept(e.key);
            // If LinkedHashMap was modified during the traversal, throw ConcurrentModificationException
            // This is fail-fast
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }

As we can see, this is guaranteed mainly by the modCount field. In the put analysis above, for example, there is a ++modCount after a successful insertion. Suppose we call forEach at time T1 and record modCount = 5; if modCount no longer has that value when the traversal finishes, the map has been modified and the exception is thrown.
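The check does not need a second thread to trigger. The sketch below (hypothetical class FailFastDemo) modifies the map from inside forEach on a single thread, and also shows removal through the iterator, which keeps the iterator's expected modCount in sync.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FailFastDemo {
    private static Map<String, Integer> sample() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Removes an entry while keySet().forEach is running; returns true if
    // the modCount check reported the modification.
    static boolean forEachDetectsModification() {
        Map<String, Integer> map = sample();
        try {
            map.keySet().forEach(k -> {
                if (k.equals("a")) {
                    map.remove("c"); // structural change bumps modCount
                }
            });
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator itself does not trip the check.
    static List<String> safeRemoval() {
        Map<String, Integer> map = sample();
        for (Iterator<String> it = map.keySet().iterator(); it.hasNext(); ) {
            if (it.next().equals("b")) {
                it.remove();
            }
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(forEachDetectsModification()); // true
        System.out.println(safeRemoval());                // [a, c]
    }
}
```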

Origin juejin.im/post/5d7ca5da5188251ecc40dad3