[Logic of Java programming] Map and Set

HashMap

A Map works with keys and values: each key maps to one value, the Map stores and retrieves values by key, and keys cannot be repeated.

HashMap implements the Map interface.

Fundamental

The basic implementation principle of HashMap: internally there is a hash table, that is, an array named table, and each element table[i] points to a singly linked list. To access a value by key, HashMap computes a hash value from the key and reduces it modulo the array length to get the array index, then operates on the singly linked list that table[index] points to.
An access only touches the linked list selected by the key's hash value and never visits the other lists. Within that list, the hash values are compared first, and only if they match is the equals method called. This requires that objects which are equal according to equals return the same hashCode value. If the key is a custom class, it must override hashCode and equals consistently.
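The lookup steps above can be sketched as follows. GridPoint and BucketIndexDemo are hypothetical names used only for this illustration, and indexFor is a simplification: the real HashMap additionally mixes the hash bits and uses a bit mask instead of a remainder.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class used only for this illustration.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two points with the same coordinates are equal...
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    // ...so they must also return the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

final class BucketIndexDemo {
    // Simplified index calculation: non-negative remainder of the hash value.
    static int indexFor(Object key, int tableLength) {
        int hash = (key == null) ? 0 : key.hashCode();
        return Math.floorMod(hash, tableLength);
    }

    // Stores a value under one key instance and reads it back with another, equal instance.
    static String lookupWithEqualKey() {
        Map<GridPoint, String> labels = new HashMap<>();
        labels.put(new GridPoint(1, 2), "A");
        return labels.get(new GridPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey());
        System.out.println(indexFor(new GridPoint(1, 2), 16));
    }
}
```

If GridPoint overrode equals but not hashCode, the two equal instances would usually land in different linked lists and the lookup would return null.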

HashMap expansion strategy:

  1. When is it expanded? When the first element is added, a table of the default length 16 is allocated. However, expansion does not wait until the size exceeds 16; it is governed by the threshold field. When the number of key-value pairs is greater than or equal to the threshold, expansion is considered. The threshold is generally table.length multiplied by loadFactor (the load factor, default 0.75), which gives 12 for the default table.
  2. How is it expanded? The length of the table is always a power of 2, so expansion first doubles the table length and then transfers the existing entries. The main job of the transfer is to recalculate the array index of each key-value pair.
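The arithmetic behind the expansion strategy can be worked through in a few lines. This is a sketch, not the JDK source: ResizeThresholdDemo and its method names are hypothetical, and indexFor assumes mask-based indexing, which is only valid because the table length is a power of two.

```java
final class ResizeThresholdDemo {
    // threshold = table length * load factor
    static int threshold(int tableLength, float loadFactor) {
        return (int) (tableLength * loadFactor);
    }

    // Doubling keeps the table length a power of two.
    static int nextLength(int tableLength) {
        return tableLength << 1;
    }

    // Mask-based index; equivalent to a remainder when tableLength is a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int length = 16;          // default initial table length
        float loadFactor = 0.75f; // default load factor
        for (int i = 0; i < 3; i++) {
            System.out.println("length=" + length + ", threshold=" + threshold(length, loadFactor));
            length = nextLength(length);
        }
        // After doubling, an entry either keeps its index or moves up by the old length.
        int hash = 53;
        System.out.println(indexFor(hash, 16) + " -> " + indexFor(hash, 32));
    }
}
```

With the defaults, the thresholds for table lengths 16, 32 and 64 are 12, 24 and 48. The hash value 53 sits at index 5 in a table of length 16 and at index 21 (5 + 16) after doubling.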

Note: Java 8 optimized the implementation of HashMap. Under serious hash collisions, that is, when many elements map to the same linked list (specifically, when a list already holds at least 8 nodes and the table length is at least 64), Java 8 converts the linked list into a balanced sorted binary tree (a red-black tree).

summary

HashMap implements the Map interface, makes it easy to access values by key, and is implemented internally with an array of linked lists plus hashing.
1. Saving and retrieving values by key is very efficient, O(1) on average: each singly linked list usually holds only one or a few nodes, which can be located quickly from the hash value.
2. HashMap supports a null key; a null key is placed in table[0].
3. The key-value pairs in a HashMap have no guaranteed order, because the position of an entry depends on the hash value of its key, not on insertion order or key order.
4. HashMap is not thread-safe; in a multi-threaded environment, use Hashtable or ConcurrentHashMap instead.
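Points 2 and 4 can be checked with a short sketch. HashMapNotesDemo is a hypothetical class name. One difference worth knowing when switching to the thread-safe alternative: ConcurrentHashMap rejects null keys with a NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class HashMapNotesDemo {
    // HashMap accepts null as a key.
    static String lookupNullKey() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for null key");
        return map.get(null);
    }

    // ConcurrentHashMap is thread-safe but does not accept null keys.
    static boolean concurrentMapRejectsNullKey() {
        Map<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put(null, "x");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupNullKey());
        System.out.println(concurrentMapRejectsNullKey());
    }
}
```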

HashSet

HashSet implements the Set interface. Set represents a container interface with no duplicate elements and no guarantee of order. It extends Collection but does not define any new methods.

principle

HashSet is implemented with HashMap internally, and it has a HashMap instance variable inside it

private transient HashMap<E,Object> map;

Map has keys and values. HashSet effectively has only keys; every key is associated with the same fixed value, which is:

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

The add, remove, contains and other methods all work by delegating to the corresponding HashMap methods.
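The delegation can be sketched with a small set built on a HashMap. SimpleHashSet is a hypothetical, simplified class written for this illustration; it is not the JDK source and omits iteration, serialization and the Set interface.

```java
import java.util.HashMap;

// Hypothetical, simplified set backed by a HashMap.
final class SimpleHashSet<E> {
    // Shared dummy value stored for every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the element was not present.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // remove returns the stored value, so PRESENT means the element existed.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Because elements are stored as map keys, uniqueness and O(1) average cost come directly from HashMap.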

summary

  1. no repeating elements
  2. It can efficiently add, delete elements, and determine whether an element exists, and the efficiency is O(1)
  3. no order
  4. Not thread safe

TreeMap

In TreeMap, key-value pairs are ordered by key, and TreeMap is implemented on top of a sorted binary tree.

Basic usage

TreeMap has two common constructors

public TreeMap();
public TreeMap(Comparator<? super K> comparator)

The first is the default constructor, which requires the keys in the Map to implement the Comparable interface. When TreeMap compares keys internally, it calls the compareTo method of the key's Comparable interface.
The second accepts a Comparator object, comparator. If comparator is not null, TreeMap calls the comparator's compare method for internal comparisons and does not call the key's compareTo method.
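A short sketch of both constructors, using String keys. TreeMapConstructorDemo is a hypothetical class name, and the reverse-order comparator is just one example of a Comparator that could be passed in.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

final class TreeMapConstructorDemo {
    // Default constructor: keys are ordered by String.compareTo.
    static String naturalOrderKeys() {
        Map<String, Integer> map = new TreeMap<>();
        map.put("b", 2);
        map.put("a", 1);
        map.put("c", 3);
        return map.keySet().toString();
    }

    // Comparator constructor: keys are ordered by the supplied comparator.
    static String reverseOrderKeys() {
        Map<String, Integer> map = new TreeMap<>(Comparator.<String>reverseOrder());
        map.put("b", 2);
        map.put("a", 1);
        map.put("c", 3);
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(naturalOrderKeys()); // [a, b, c]
        System.out.println(reverseOrderKeys()); // [c, b, a]
    }
}
```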

principle

TreeMap is internally implemented with a red-black tree, which is a roughly balanced sorted binary tree.
The main members are as follows:

// Comparator
private final Comparator<? super K> comparator;
// Root node of the tree; Entry is the node type
private transient Entry<K,V> root;
// Current number of key-value pairs
private transient int size = 0;

static final class Entry<K,V> implements Map.Entry<K,V> {
    // Key
    K key;
    // Value
    V value;
    // Left child
    Entry<K,V> left;
    // Right child
    Entry<K,V> right;
    // Parent node
    Entry<K,V> parent;
    // Node color, either black or red
    boolean color = BLACK;
}

save key-value pairs

public V put(K key, V value) {
    Entry<K,V> t = root;
    // 1. First insertion
    if (t == null) {
        // Check whether key is null
        compare(key, key); // type (and possibly null) check
        // Point root directly at the new node
        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    // ...
}
// When root is null, this call mainly checks whether the key is null
final int compare(Object k1, Object k2) {
    return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)
        : comparator.compare((K)k1, (K)k2);
}
  1. When the first node is added, root is null. The main work is to create a new node and point root at it, after checking whether the key is null.
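public V put(K key, V value) {
    // ...
    // Not the first insertion
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    // A comparator was supplied
    if (cpr != null) {
        do {
            // t starts at the root node
            parent = t;
            cmp = cpr.compare(key, t.key);
            // If the key is smaller than the current node's key, move to the left child and keep comparing
            if (cmp < 0)
                t = t.left;
            // If it is larger, move to the right child and keep comparing
            else if (cmp > 0)
                t = t.right;
            else
                // The key already exists, so just replace its value
                return t.setValue(value);
        // When t becomes null the loop ends, and parent is the parent of the node to insert
        } while (t != null);
    }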
    else {
        if (key == null)
            throw new NullPointerException();
        @SuppressWarnings("unchecked")
            Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}
  2. If it is not the first insertion, find the parent node first. There are two cases, depending on whether a comparator is set. The search logic is basically the same in both, except that without a comparator the key is assumed to implement the Comparable interface and cannot be null. With a comparator, whether a null key is allowed depends on the comparator itself.
  3. fixAfterInsertion(e): adjusts the structure of the tree so that it satisfies the red-black tree constraints and stays roughly balanced.

summary

TreeMap also implements the Map interface, but internally uses a red-black tree. A red-black tree is a roughly balanced sorted binary tree, so its operations are relatively efficient.
1. The keys are ordered, which supports order-based lookups such as the first key, the last key, or a range of keys.
2. To order the keys, TreeMap requires the keys to implement the Comparable interface, or a Comparator object to be provided through the constructor.
3. Saving, searching, and deleting by key are relatively efficient at O(h), where h is the height of the tree; when the tree is balanced, h is about log2(N), where N is the number of nodes.
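The order-based lookups mentioned in point 1 could look like this. TreeMapNavigationDemo and the sample keys are made up for the illustration; the methods called are standard TreeMap methods.

```java
import java.util.TreeMap;

final class TreeMapNavigationDemo {
    // Builds a small map with keys 10, 20, 30, 40 (inserted out of order on purpose).
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(30, "thirty");
        map.put(10, "ten");
        map.put(40, "forty");
        map.put(20, "twenty");
        return map;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.firstKey());              // smallest key: 10
        System.out.println(map.lastKey());               // largest key: 40
        System.out.println(map.floorKey(25));            // largest key <= 25: 20
        System.out.println(map.ceilingKey(25));          // smallest key >= 25: 30
        System.out.println(map.headMap(30).keySet());    // keys < 30: [10, 20]
        System.out.println(map.subMap(20, 40).keySet()); // keys in [20, 40): [20, 30]
    }
}
```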

TreeSet

TreeSet implements the Set interface, and internally, it is based on TreeMap.
1. There are no repeated elements
2. Adding and deleting elements and checking whether an element exists are relatively efficient at O(log2(N)), where N is the number of elements
3. The elements are ordered; they must implement the Comparable interface, or a Comparator object must be provided through the constructor
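A brief sketch of these properties. TreeSetDemo is a hypothetical class name, and the by-length comparator is chosen only to show that TreeSet treats elements as duplicates when the comparison returns 0, not when equals returns true.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

final class TreeSetDemo {
    // Natural ordering: duplicates are dropped and elements come out sorted.
    static String sortedDistinct() {
        Set<Integer> set = new TreeSet<>();
        set.add(3);
        set.add(1);
        set.add(2);
        set.add(3);
        return set.toString();
    }

    // Custom comparator: "plum" compares equal to "pear" (same length), so it is not added.
    static String distinctByLength() {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        Set<String> set = new TreeSet<>(byLength);
        set.add("pear");
        set.add("fig");
        set.add("plum");
        return set.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortedDistinct());   // [1, 2, 3]
        System.out.println(distinctByLength()); // [fig, pear]
    }
}
```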

LinkedHashMap

LinkedHashMap is a subclass of HashMap that additionally maintains a doubly linked list to keep the key-value pairs in order. LinkedHashMap can keep elements in insertion order or in access order, unlike TreeMap, which sorts by key.

  • Insertion order is easy to understand: entries added first come first, entries added later come later, and modifying an existing entry does not affect the order.
  • Access order: an access here means a get or put operation. After a get or put on a key, the corresponding key-value pair moves to the end of the linked list. Therefore the last entry is the most recently accessed, and the first entry is the one that has gone unaccessed the longest.

basic use

By default, LinkedHashMap keeps insertion order. To order by access, use the following constructor:

public LinkedHashMap(int initialCapacity,
                     float loadFactor,
                     boolean accessOrder) {
    super(initialCapacity, loadFactor);
    this.accessOrder = accessOrder;
}

The accessOrder parameter specifies the ordering mode: true means access order, false means insertion order.
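The difference between the two modes shows up as soon as existing entries are read or updated. The following sketch runs the same operations in both modes; LinkedHashMapOrderDemo is a hypothetical class name, and 16 and 0.75f are simply the usual default capacity and load factor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LinkedHashMapOrderDemo {
    // Inserts a, b, c, then reads a and updates b, and reports the resulting key order.
    static String keysAfterAccess(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a");     // counts as an access in access-order mode
        map.put("b", 20); // updating an existing key also counts as an access
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(keysAfterAccess(false)); // insertion order: [a, b, c]
        System.out.println(keysAfterAccess(true));  // access order:    [c, a, b]
    }
}
```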

When would you want to keep the insertion order?
When receiving some key-value pairs as input, processing, and outputting, you want to keep the original order when outputting.

When do you want order by access?
A typical application is an LRU cache. Generally, cache capacity is limited and cannot hold all data indefinitely. If the cache is full when new data needs to be stored, some strategy is needed to evict old data. This strategy is generally called a replacement algorithm. LRU is a popular replacement algorithm; its full name is Least Recently Used. The idea is that recently used entries are the most likely to be used again soon, and entries that have gone unaccessed the longest are the least likely to be accessed again soon, so they are evicted first.
With LinkedHashMap, an LRU cache is very easy to implement. By default, LinkedHashMap has no capacity limit, but one is easy to add by overriding this method:

protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}

After an element is added to the LinkedHashMap, this method is called, and the argument is the eldest key-value pair: in access order, the one that has gone unaccessed the longest. If the method returns true, that eldest pair is deleted. Because LinkedHashMap has no capacity limit by default, the default implementation always returns false.
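Putting the pieces together, a minimal LRU cache might look like the sketch below. LruCache and maxEntries are hypothetical names chosen for this illustration; the sketch is not thread-safe and is meant to show the mechanism rather than serve as a production cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal LRU cache built on an access-ordered LinkedHashMap.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // true selects access order, so the eldest entry is the least recently used one.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

For example, with a capacity of 2, putting a and b, reading a, and then putting c evicts b, because b is the entry that has gone unaccessed the longest; the remaining keys are [a, c].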

Implementation principle

LinkedHashMap is a subclass of HashMap, and the following variables are added internally:

// Head of the doubly linked list
private transient Entry<K,V> header;
// Whether ordering is by access (true) or by insertion (false)
private final boolean accessOrder;

LinkedHashSet

LinkedHashSet is a subclass of HashSet, and the internal Map implementation class is LinkedHashMap
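Because it is backed by a LinkedHashMap, LinkedHashSet iterates in insertion order, as this small sketch illustrates (LinkedHashSetDemo is a hypothetical class name).

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class LinkedHashSetDemo {
    // Elements come back in the order they were first added.
    static String insertionOrder() {
        Set<String> set = new LinkedHashSet<>();
        set.add("c");
        set.add("a");
        set.add("b");
        set.add("a"); // duplicate: ignored, and the position of "a" does not change
        return set.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder()); // [c, a, b]
    }
}
```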
