TreeMap and LinkedHashMap core source code analysis

Having covered HashMap, we now turn to TreeMap and LinkedHashMap: how TreeMap keeps its entries sorted by key, and how LinkedHashMap supports two iteration orders (insertion order and access order).

1: Basic knowledge

Before looking at TreeMap, let's review the two ways of sorting objects that we use in everyday work, since TreeMap builds on both:

  1. Implement the Comparable interface on the class itself;
  2. Pass an external Comparator to the sort call.

Here is how the two approaches look in code:

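@Data
class DTO implements Comparable<DTO> {

    private Integer id;

    public DTO(Integer id) {
        this.id = id;
    }

    public Integer getId() {
        return id;
    }

    @Override
    public int compareTo(DTO o) {
        // Natural order: ascending by id.
        // Integer.compare avoids the overflow that "id - o.getId()" can produce.
        return Integer.compare(id, o.getId());
    }
}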

@Test
public void testComparable1() {
    // First approach: ascending order, using the compareTo method from Comparable
    List<DTO> list = new ArrayList<>();
    for (int i = 5; i > 0; i--) {
        list.add(new DTO(i));
    }
    Collections.sort(list);
    log.info(JSON.toJSONString(list));
}

@Test
public void testComparable2() {
    // Second approach: descending order, using an external Comparator
    Comparator<DTO> comparator = (o1, o2) -> Integer.compare(o2.getId(), o1.getId());
    List<DTO> list2 = new ArrayList<>();
    for (int i = 5; i > 0; i--) {
        list2.add(new DTO(i));
    }
    Collections.sort(list2, comparator);
    log.info(JSON.toJSONString(list2));
}

The first test prints the elements in ascending order: [{"id":1},{"id":2},{"id":3},{"id":4},{"id":5}].
The second prints them in the opposite order: [{"id":5},{"id":4},{"id":3},{"id":2},{"id":1}].

These are the Comparable and Comparator ways of sorting. TreeMap relies on the same two mechanisms to order its keys, as we will see next.
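
As a compact variation on the two tests, here is a self-contained sketch that uses only the JDK. The names `SortDemo` and `Item` are made up for illustration; `Item` stands in for the `DTO` class, and the input ids are arbitrary.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the DTO example; not part of the original article.
class SortDemo {

    static final class Item implements Comparable<Item> {
        final int id;

        Item(int id) {
            this.id = id;
        }

        @Override
        public int compareTo(Item other) {
            // Natural order: ascending by id, without subtraction overflow.
            return Integer.compare(id, other.id);
        }
    }

    // Sorts the same unsorted data either way and returns the resulting ids.
    static List<Integer> sortedIds(boolean descending) {
        List<Item> items = new ArrayList<>();
        for (int id : new int[] {3, 1, 5, 2, 4}) {
            items.add(new Item(id));
        }
        if (descending) {
            // External Comparator: the reverse of the natural order.
            items.sort(Comparator.reverseOrder());
        } else {
            // A null comparator means "use Comparable#compareTo".
            items.sort(null);
        }
        List<Integer> ids = new ArrayList<>();
        for (Item item : items) {
            ids.add(item.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(sortedIds(false)); // [1, 2, 3, 4, 5]
        System.out.println(sortedIds(true));  // [5, 4, 3, 2, 1]
    }
}
```

The "null means natural ordering" convention of `List.sort` is the same fallback that TreeMap applies to its own comparator field.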

2: The overall architecture of TreeMap

TreeMap is backed by a red-black tree, the same kind of tree HashMap uses for its treeified bins.

The difference is that TreeMap orders the whole tree by key. It relies on the search-tree property that everything in a node's left subtree is smaller and everything in its right subtree is larger, so each entry is inserted at the position that keeps the keys sorted. This makes TreeMap a good fit for scenarios where keys must stay in sorted order.

Because the red-black tree is balanced, containsKey, get, put, and remove all run in O(log n) time.
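
To make the key ordering concrete, the following sketch inserts the same keys in arbitrary order and reads them back, once with natural ordering and once with a reversed Comparator. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Illustrative demo; the class name is not from the article.
class TreeMapOrderDemo {

    // Inserts keys out of order and returns the order in which TreeMap iterates them.
    static List<Integer> keysInOrder(Comparator<Integer> comparator) {
        // A null comparator makes TreeMap fall back to Comparable#compareTo.
        TreeMap<Integer, String> map = new TreeMap<>(comparator);
        map.put(3, "three");
        map.put(1, "one");
        map.put(2, "two");
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInOrder(null));                      // [1, 2, 3]
        System.out.println(keysInOrder(Comparator.reverseOrder())); // [3, 2, 1]
    }
}
```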

2.1: TreeMap properties

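The main fields of TreeMap are:

// Comparator used to order the keys. If one was passed in from outside, it is used first;
// if it is null, the key's own Comparable#compareTo is used instead.
// These are the same two mechanisms as in the sorting demo above.
private final Comparator<? super K> comparator;

// Root node of the red-black tree
private transient Entry<K,V> root;

// Number of entries in the tree
private transient int size = 0;

// Count of structural modifications, used for fail-fast iteration
private transient int modCount = 0;

// A node of the red-black tree
static final class Entry<K,V> implements Map.Entry<K,V> {
    // body omitted
}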

2.2: Adding a node

TreeMap adds a node in the following steps:

  1. Check whether the root of the red-black tree is null. If it is, the new node simply becomes the root:
    Entry<K,V> t = root;
    // The tree is empty, so create the root directly
    if (t == null) {
        // With natural ordering, this compare call rejects a null key
        compare(key, key); // type (and possibly null) check
        // The new entry becomes the root
        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    
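  2. Otherwise, walk down from the root, going left for smaller keys and right for larger ones, to find the node that will become the parent of the new node. The excerpt shows the branch taken when a Comparator was supplied:
    // cmp and parent are declared earlier in put
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        // Loop until we find where the key belongs, i.e. which node it will hang under
        do {
            // When the loop ends, parent is the last node we compared against
            parent = t;
            // Compare the new key with the current node's key
            cmp = cpr.compare(key, t.key);
            // key is smaller than t: move to t's left child and compare again
            if (cmp < 0)
                t = t.left;
            // key is larger than t: move to t's right child and compare again
            else if (cmp > 0)
                t = t.right;
            // Keys are equal: overwrite the existing value and return the old one
            else
                return t.setValue(value);
            // t == null means we have walked past a leaf
        } while (t != null);
    }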
    
  3. Attach the new node e to the left or right of that parent:
    // cmp holds the result of the last comparison; negative means e goes to the parent's left
    if (cmp < 0)
        parent.left = e;
    // Otherwise e goes to the parent's right; the equal case was already handled in step 2
    else
        parent.right = e;

  4. Recolor and rotate the tree to restore the red-black balance, then finish.
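
The first three steps can be condensed into a small sketch. It is a plain, unbalanced binary search tree with int keys. `SimpleTree`, `Node`, and every other name here are hypothetical, and step 4 (recoloring and rotation) is deliberately left out, so this only shows how the insertion point is found, not how TreeMap stays balanced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, unbalanced sketch of the "find parent, then attach" logic.
class SimpleTree {

    static final class Node {
        final int key;
        String value;
        Node left, right;

        Node(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private Node root;
    private int size;

    // Returns the previous value for the key, or null if the key is new.
    String put(int key, String value) {
        // Step 1: an empty tree takes the new node as its root.
        if (root == null) {
            root = new Node(key, value);
            size = 1;
            return null;
        }
        // Step 2: walk down, remembering the last node compared against.
        Node t = root;
        Node parent;
        int cmp;
        do {
            parent = t;
            cmp = Integer.compare(key, t.key);
            if (cmp < 0) {
                t = t.left;
            } else if (cmp > 0) {
                t = t.right;
            } else {
                // Equal key: overwrite in place and stop.
                String old = t.value;
                t.value = value;
                return old;
            }
        } while (t != null);
        // Step 3: attach the new node on the side chosen by the last comparison.
        Node e = new Node(key, value);
        if (cmp < 0) {
            parent.left = e;
        } else {
            parent.right = e;
        }
        size++;
        // Step 4 (rebalancing) is intentionally omitted in this sketch.
        return null;
    }

    int size() {
        return size;
    }

    // In-order traversal yields the keys in ascending order.
    List<Integer> keys() {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<Integer> out) {
        if (n == null) {
            return;
        }
        collect(n.left, out);
        out.add(n.key);
        collect(n.right, out);
    }

    public static void main(String[] args) {
        SimpleTree tree = new SimpleTree();
        tree.put(5, "five");
        tree.put(2, "two");
        tree.put(8, "eight");
        System.out.println(tree.put(2, "TWO")); // two
        System.out.println(tree.keys());        // [2, 5, 8]
    }
}
```

Without rebalancing, inserting already-sorted keys would degrade this tree into a linked list; the red-black fix-up in step 4 is what keeps the real TreeMap at O(log n).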

From the source code above we can see that:

  1. A new node is placed by using the ordering of the tree: the search starts at the root and keeps descending left or right until it reaches a null child, which means it has gone past a leaf;
  2. If the key is found to already exist during the search, its value is overwritten directly;
  3. TreeMap rejects a null key when it uses natural ordering; with a custom Comparator, whether null is allowed depends on that Comparator.
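
These points can be checked with a short demo against the real TreeMap. The class and method names are illustrative. The last method shows the one nuance around null keys: the rejection applies to natural ordering, while a Comparator that tolerates null (here `Comparator.nullsFirst`) lets a null key in.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Illustrative demo; the class name is not from the article.
class TreeMapPutDemo {

    // put returns the previous value when the key already exists.
    static Integer overwrite() {
        TreeMap<String, Integer> map = new TreeMap<>();
        map.put("a", 1);
        return map.put("a", 2);
    }

    // With natural ordering, a null key is rejected with a NullPointerException.
    static boolean nullKeyRejected() {
        TreeMap<String, Integer> map = new TreeMap<>();
        try {
            map.put(null, 1);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // A comparator that handles null allows a null key, sorted first here.
    static boolean nullKeyAcceptedWithComparator() {
        TreeMap<String, Integer> map =
            new TreeMap<>(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        map.put("a", 1);
        map.put(null, 0);
        return map.containsKey(null) && map.firstKey() == null;
    }

    public static void main(String[] args) {
        System.out.println(overwrite());                     // 1
        System.out.println(nullKeyRejected());               // true
        System.out.println(nullKeyAcceptedWithComparator()); // true
    }
}
```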

2.3: TreeMap summary

TreeMap is relatively simple. Its red-black tree works like the one in HashMap; the key point is that it compares keys through compare and then uses the ordering of the tree to find the right position for each key, which keeps the keys sorted.

3: LinkedHashMap overall architecture

LinkedHashMap extends HashMap, so it has all of HashMap's features. On top of that it provides two more:

  1. Iteration in insertion order;
  2. A least-recently-used eviction policy, which can automatically remove keys that have not been accessed for a long time.

3.1: Access in order of insertion

3.1.1: LinkedHashMap linked list structure

Let's look at the fields LinkedHashMap adds to build its linked list:

// Head of the linked list
transient LinkedHashMap.Entry<K,V> head;

// Tail of the linked list
transient LinkedHashMap.Entry<K,V> tail;

// Extends Node, adding before and after references to every entry
static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}

// Selects between the two iteration orders; false by default
// true: access order, accessed keys are moved to the tail
// false: insertion order
final boolean accessOrder;

As these fields show, LinkedHashMap looks like a LinkedList whose elements have been replaced by HashMap nodes, a combination of the two. These extra references chain the entries of the map into a linked list, and because a linked list preserves order, the order in which entries were inserted is maintained.

3.1.2: How entries are added in order

By default accessOrder is false when a LinkedHashMap is created, which means iteration follows insertion order. Insertion uses the put method of the parent class HashMap, but LinkedHashMap overrides the newNode/newTreeNode and afterNodeAccess methods that put calls along the way.

newNode/newTreeNode append each new node to the tail of the linked list, so insertion order is preserved. Take the newNode source as an example:

// Create a new node and append it to the tail of the linked list
Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    // Create the node
    LinkedHashMap.Entry<K,V> p =
        new LinkedHashMap.Entry<K,V>(hash, key, value, e);
    // Append it to the tail of the linked list
    linkNodeLast(p);
    return p;
}
// link at the end of list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
    LinkedHashMap.Entry<K,V> last = tail;
    // The new node becomes the tail
    tail = p;
    // last == null means the list was empty, so head and tail are the same node
    if (last == null)
        head = p;
    // Otherwise link the new node and the previous tail to each other
    else {
        p.before = last;
        last.after = p;
    }
}

By adding head and tail references to the map, and before and after references to every node, LinkedHashMap can append each new node to the tail at the moment it is created. The linked list therefore already reflects insertion order by the time the put completes.
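
The effect of this tail-append logic is easy to observe from the outside. The following sketch, with illustrative names and arbitrary keys, also shows that updating an existing key does not move it, because no new node is created for it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative demo; the class name is not from the article.
class InsertionOrderDemo {

    static List<String> keysAfterReinsert() {
        // Default constructor: accessOrder is false, so insertion order is kept.
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("banana", 1);
        map.put("apple", 2);
        map.put("cherry", 3);
        // Updating an existing key changes the value but not its position.
        map.put("banana", 10);
        // Reading does not reorder anything in insertion-order mode.
        map.get("apple");
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterReinsert()); // [banana, apple, cherry]
    }
}
```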

3.1.3: Access in order

In the source shown here, LinkedHashMap only exposes iteration in one direction, from head to tail in insertion order; unlike LinkedList, it offers no way to walk backwards. (Java 21 later added a reversed() view through the SequencedMap interface.)

Access goes mainly through iterators. An iterator starts at the head node when it is created, and during iteration it simply keeps moving to the after node of the current one.

Map offers iteration over keys, values, and entries (nodes). To iterate over entries, for example, we call entrySet().iterator() on the LinkedHashMap. The iterator it returns is built on LinkedHashIterator, and its next() obtains the following node through nextNode. The iterator source is as follows:

// On creation, iteration starts from the head node
LinkedHashIterator() {
    // The head node is the first node to be visited
    next = head;
    expectedModCount = modCount;
    current = null;
}

final LinkedHashMap.Entry<K,V> nextNode() {
    LinkedHashMap.Entry<K,V> e = next;
    if (modCount != expectedModCount) // fail-fast check
        throw new ConcurrentModificationException();
    if (e == null)
        throw new NoSuchElementException();
    current = e;
    next = e.after; // follow the after reference to find the next node to visit
    return e;
}

Because the insertion order was already recorded when the nodes were added, iteration is very simple: it only needs to keep following the after reference of the current node.
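
Seen from the caller's side, the iterator behaves as sketched below. The first method walks the entries from head to tail; the second triggers the modCount check from nextNode by adding a key behind the iterator's back. Class, method, and key names are illustrative.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative demo; the class name is not from the article.
class IterationDemo {

    static Map<String, Integer> sample() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("first", 1);
        map.put("second", 2);
        map.put("third", 3);
        return map;
    }

    // Walks the entry set with an explicit iterator, head to tail.
    static List<String> walk() {
        List<String> seen = new ArrayList<>();
        Iterator<Map.Entry<String, Integer>> it = sample().entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            seen.add(entry.getKey() + "=" + entry.getValue());
        }
        return seen;
    }

    // A structural change made outside the iterator trips the modCount check.
    static boolean failsFast() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        it.next();
        map.put("fourth", 4); // new key: modCount changes
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(walk());      // [first=1, second=2, third=3]
        System.out.println(failsFast()); // true
    }
}
```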

3.2: Least-recently-used eviction strategy

3.2.1: Example

This strategy is known as LRU (least recently used). The idea is that every accessed entry is moved to the tail of the list, so entries that are rarely accessed drift toward the head. We can then define an eviction rule, for example: once the map holds more than a given number of entries, remove the head node. Here is a demo:

@Test
public void testAccessOrder() {
    // The third constructor argument (true) turns on access order
    LinkedHashMap<Integer, Integer> map = new LinkedHashMap<Integer, Integer>(4, 0.75f, true) {
        {
            put(10, 10);
            put(9, 9);
            put(20, 20);
            put(1, 1);
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
            // Evict the eldest entry once the map holds more than 3 entries
            return size() > 3;
        }
    };

    log.info("initial: {}", JSON.toJSONString(map));
    Assert.assertNotNull(map.get(9));
    log.info("map.get(9): {}", JSON.toJSONString(map));
    Assert.assertNotNull(map.get(20));
    log.info("map.get(20): {}", JSON.toJSONString(map));
}

In the printed result, we put four entries into the map during initialization but only three remain: 10 is missing. This is because we overrode removeEldestEntry so that the head entry is removed whenever the map holds more than 3 entries. When put(1, 1) runs, the size reaches 4 and the 10 at the head is removed. This shows that the head node is deleted automatically once the eviction rule we set is triggered.

When we call map.get(9), entry 9 moves to the tail, and when we call map.get(20), entry 20 moves to the tail. This shows that accessed nodes are moved to the tail of the list.

This example illustrates the LRU strategy well. Next, let's look at how it works.

3.2.2: Moving an element to the tail

Let's first see why get moves an entry to the tail:

public V get(Object key) {
    Node<K,V> e;
    // Look the node up with HashMap's getNode
    if ((e = getNode(hash(key), key)) == null)
        return null;
    // If access order (the LRU strategy) is enabled,
    // move the accessed node to the tail of the list
    if (accessOrder)
        afterNodeAccess(e);
    return e.value;
}

As the source shows, afterNodeAccess moves the node being accessed to the tail of the list. This happens not only in get but also in getOrDefault, compute, computeIfAbsent, computeIfPresent, and merge, as well as in put when the key already exists. Since every accessed node moves to the tail, the nodes near the head are naturally the ones that have gone unaccessed the longest.
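
A quick way to watch the reordering is to snapshot the key order after each call. The sketch below uses illustrative names and keys, and records the order after a get, a getOrDefault, and a put on an existing key (which the LinkedHashMap documentation also counts as an access).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

// Illustrative demo; the class name is not from the article.
class AccessOrderDemo {

    // Returns the key order after each access, as printed by keySet().toString().
    static List<String> snapshots() {
        // The third constructor argument (true) selects access order.
        LinkedHashMap<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        List<String> out = new ArrayList<>();
        out.add(map.keySet().toString()); // [a, b, c]

        map.get("a");                     // "a" moves to the tail
        out.add(map.keySet().toString()); // [b, c, a]

        map.getOrDefault("b", 0);         // "b" moves to the tail
        out.add(map.keySet().toString()); // [c, a, b]

        map.put("c", 30);                 // updating an existing key is an access too
        out.add(map.keySet().toString()); // [a, b, c]
        return out;
    }

    public static void main(String[] args) {
        for (String snapshot : snapshots()) {
            System.out.println(snapshot);
        }
    }
}
```

Iterating over keySet() itself does not count as an access, which is why taking the snapshots does not disturb the order.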

3.2.3: Delete strategy

In the example above, the head entry was removed while put was executing. LinkedHashMap does not implement put itself; it uses HashMap's put. What it does implement is afterNodeInsertion, a hook that HashMap's put calls after inserting a node, and that is where the removal happens. Here is the source:

// Removes the least recently accessed entry; called by HashMap's put
void afterNodeInsertion(boolean evict) {
    // The head node of the linked list
    LinkedHashMap.Entry<K,V> first;
    // removeEldestEntry decides the eviction policy: if the list is not empty
    // and the policy allows it, remove the head node
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        K key = first.key;
        // removeNode deletes the head node
        removeNode(hash(key), key, null, false, true);
    }
}
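
Putting access order and removeEldestEntry together gives a small reusable LRU cache. This is a minimal sketch, not a production cache: the class name `LruCache` and its `capacity` field are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache built on the two hooks discussed above.
class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: every access moves the entry to the tail.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Consulted after each insertion; returning true evicts the head entry.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

Like LinkedHashMap itself, this sketch is not thread-safe. In access-order mode even get modifies the list, so a cache shared between threads needs external synchronization, for example by wrapping it with Collections.synchronizedMap.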

4: Summary

We covered the data structures behind TreeMap and LinkedHashMap and walked through the core source of both. Each makes full use of its underlying structure: TreeMap uses the ordering of the red-black tree to keep keys sorted, while LinkedHashMap simply layers a linked list on top of HashMap to record the order of nodes, which is a neat design.

Origin blog.csdn.net/weixin_38478780/article/details/107904319