A Brief Look at HashMap

While organizing the basics of Java collections, I found that HashMap cannot be properly covered in just a few words, so this post is dedicated to sorting out HashMap on its own.

HashMap storage structure

HashMap is the hash-table implementation of the Map interface. Internally it maintains an array of Entry objects; an Entry is what is commonly called a key-value pair.

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

In the HashMap source, the table is initialized with a default length of 16; if you specify the table length yourself, it must also be a power of two.

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;    // cached hash value of the key
    final K key;
    V value;
    Node<K,V> next;    // next node in the same bucket's list
}

In HashMap, Node is the class that implements Entry. Besides the key-value pair, it also holds the hash value and a next pointer. It is easy to see that each position (bucket) of the hash table can hold a linked list of Nodes.

This structure keeps the hash table short: with a limited amount of data it gives good query performance while also reducing wasted space.

However, if a large amount of data is stored while the table keeps its default length, the lists grow long and query performance drops. Fortunately, HashMap supports dynamic resizing.

How HashMap's insert operation works

Having seen the storage structure, two questions face us. First, how does HashMap determine which position in the table an element is inserted at? Second, where in that position's list does it go?

To answer the first question: HashMap first obtains the object's hashCode, then takes it modulo the table length, i.e. hashCode % tableLength; the result is the table index at which the element is inserted. (Because the table length is a power of two, the source actually computes this with the equivalent but faster hash & (tableLength - 1).)

As for the second question, HashMap uses head insertion: the new element is placed at the head of the list. (This describes JDK 7; since JDK 8, new elements are appended to the tail instead.)

There is one special case: HashMap supports null as a key, but hashCode cannot be called on null, so entries with a null key are always placed at index 0.

// Implementation of HashMap's hash function
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Readers curious about how HashMap's hash function works (in short, XOR-ing the high 16 bits into the low 16 bits lets the high bits influence the index, since indexing only uses the low bits) and why the table length must be a power of two can refer to the following answer.

How does the hash method in the JDK HashMap source code work? - Fat Jun's answer - Zhihu

Accordingly, HashMap finds an element in two steps:

  1. Compute the element's position in the hash table
  2. Traverse the list at that position to find the element
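
The two steps can be sketched as follows. This is my own illustration reusing the Node class shown earlier, not code from the JDK source:

// Sketch only: combines the index computation with the list traversal
static <K,V> V find(Node<K,V>[] table, Object key) {
    int h = (key == null) ? 0 : key.hashCode() ^ (key.hashCode() >>> 16);
    int index = h & (table.length - 1); // step 1: equals h % table.length because the length is a power of two
    for (Node<K,V> e = table[index]; e != null; e = e.next) { // step 2: walk the bucket's list
        if (e.hash == h && (key == e.key || (key != null && key.equals(e.key))))
            return e.value;
    }
    return null;
}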

HashMap expansion

As mentioned above, if a huge amount of data is stored while the hash table keeps its default length, the lists become too long and lookup efficiency declines; in the worst case the structure degenerates into a single linked list. Fortunately, HashMap supports dynamic resizing: it adjusts the table length based on the amount of data, so that both its lookup efficiency and its space efficiency are preserved.

A few important parameters:

  • capacity: the length of the hash table; defaults to 16. Note that it must always be a power of two
  • size: the number of key-value pairs
  • threshold: when size exceeds this threshold, a resize must take place
  • loadFactor: the load factor, i.e. the proportion of the table allowed to be used; threshold = (int) (capacity * loadFactor)

The loadFactor and the initial table length are set when the HashMap is constructed; loadFactor defaults to 0.75f. The trigger condition is checked against size and threshold whenever a new element is inserted: if size exceeds threshold, a resize takes place. For example, with the default capacity of 16 and load factor of 0.75, threshold is 12, so the 13th insertion triggers a resize.

/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE; // if the table length already exceeds the maximum, set threshold to a huge value to stop HashMap from resizing any further
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    // Transfer the data from the old table to the new one. Because the capacity doubles, each node either
    // stays at index j or moves to j + oldCap, depending on the bit tested by (e.hash & oldCap); the nested
    // loops below make this the expensive part of resizing.
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

Calculating the array capacity

If the capacity passed in when creating a HashMap is not a power of two, HashMap automatically rounds it up to the next power of two.
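
In the JDK 8 source this rounding is done by the tableSizeFor method, reproduced below from memory with my own comments; it smears the highest set bit into every lower bit and then adds one:

static final int tableSizeFor(int cap) {
    int n = cap - 1;  // subtract 1 so that an exact power of two is not doubled
    n |= n >>> 1;     // the shifts below copy the highest set bit
    n |= n >>> 2;     // into all lower bit positions...
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1; // ...so n + 1 is the next power of two
}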

Converting lists to red-black trees

Starting with JDK 8, if a bucket's list grows longer than 8 nodes, the list is converted into a red-black tree (provided the table is large enough; otherwise a resize is performed instead).
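
The thresholds involved are defined as constants in the JDK 8 HashMap source (annotations mine):

static final int TREEIFY_THRESHOLD = 8;     // treeify a bin once it holds this many nodes
static final int UNTREEIFY_THRESHOLD = 6;   // turn a tree back into a list when it shrinks to this size
static final int MIN_TREEIFY_CAPACITY = 64; // below this table capacity, resize instead of treeifying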

Comparison with Hashtable

  1. Hashtable uses synchronized for synchronization
  2. HashMap allows Entries whose key is null
  3. HashMap's iterator is fail-fast: you cannot modify the map while iterating over it
  4. HashMap does not guarantee that the order of its elements stays the same over time

ConcurrentHashMap

Storage structure

ConcurrentHashMap's storage structure is similar to HashMap's, but ConcurrentHashMap (in JDK 7) introduces segment locks: each Segment maintains its own lock over its own portion of the table, so different threads can access different segments of the hash table at the same time, which greatly improves concurrency. The default number of segments is 16.
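
For orientation, the JDK 7 Segment looks roughly like this; it is an abridged sketch, not the full source:

static final class Segment<K,V> extends ReentrantLock implements Serializable {
    transient volatile HashEntry<K,V>[] table; // this segment's own portion of the map
    transient int count;                       // number of key-value pairs in this segment
    transient int modCount;                    // number of structural modifications, used by size()
    // ... locked put/remove/replace operations omitted
}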

The size operation

Each Segment maintains a count variable that tracks the number of key-value pairs within that Segment.

/**
 * The number of elements. Accessed only either within locks
 * or among other volatile reads that maintain visibility.
 */
transient int count;

When size is called, all Segments have to be traversed and their counts summed.

ConcurrentHashMap tries to perform the size operation without locking first: if two consecutive unlocked passes return the same result, that result is taken to be correct.

The number of attempts is defined by RETRIES_BEFORE_LOCK, whose value is 2. Since retries starts at -1, there are in fact up to 3 unlocked attempts.

If the number of attempts exceeds that, every Segment has to be locked.


/**
 * Number of unsynchronized retries in size and containsValue
 * methods before resorting to locking. This is used to avoid
 * unbounded retries if tables undergo continuous modification
 * which would make it impossible to obtain an accurate result.
 */
static final int RETRIES_BEFORE_LOCK = 2;

public int size() {
    // Try a few times to get accurate count. On failure due to
    // continuous async changes in table, resort to locking.
    final Segment<K,V>[] segments = this.segments;
    int size;
    boolean overflow; // true if size overflows 32 bits
    long sum;         // sum of modCounts
    long last = 0L;   // previous sum
    int retries = -1; // first iteration isn't retry
    try {
        for (;;) {
            // retry limit exceeded: lock every Segment
            if (retries++ == RETRIES_BEFORE_LOCK) {
                for (int j = 0; j < segments.length; ++j)
                    ensureSegment(j).lock(); // force creation
            }
            sum = 0L;
            size = 0;
            overflow = false;
            for (int j = 0; j < segments.length; ++j) {
                Segment<K,V> seg = segmentAt(segments, j);
                if (seg != null) {
                    sum += seg.modCount;
                    int c = seg.count;
                    if (c < 0 || (size += c) < 0)
                        overflow = true;
                }
            }
            // two consecutive passes produced the same result, so it is considered correct
            if (sum == last)
                break;
            last = sum;
        }
    } finally {
        if (retries > RETRIES_BEFORE_LOCK) {
            for (int j = 0; j < segments.length; ++j)
                segmentAt(segments, j).unlock();
        }
    }
    return overflow ? Integer.MAX_VALUE : size;
}

Changes in JDK 8

JDK 1.7 implements concurrent updates with the segmented-locking mechanism described above. The core class is Segment, which extends ReentrantLock; the concurrency level equals the number of Segments.

JDK 1.8 instead uses CAS operations to support a higher degree of concurrency, falling back to the built-in synchronized lock when a CAS operation fails.

Like HashMap, the JDK 1.8 implementation also converts a list into a red-black tree when it grows too long.
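
The JDK 1.8 put pattern can be illustrated with the sketch below. It is my own simplification (the class and method names are mine, not the JDK's): an empty bin is claimed with a lock-free CAS, and an occupied bin is updated under synchronized on its head node:

import java.util.concurrent.atomic.AtomicReferenceArray;

class CasPutSketch<K,V> {
    static final class Node<K,V> {
        final K key; volatile V value; volatile Node<K,V> next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final AtomicReferenceArray<Node<K,V>> table = new AtomicReferenceArray<>(16);

    void put(K key, V value) {
        int i = (table.length() - 1) & key.hashCode();
        for (;;) {
            Node<K,V> head = table.get(i);
            if (head == null) {
                // empty bin: try a lock-free CAS insert
                if (table.compareAndSet(i, null, new Node<>(key, value)))
                    return;
                // CAS failed: another thread won the race, loop and retry
            } else {
                // occupied bin: lock only this bin's head node
                synchronized (head) {
                    if (table.get(i) == head) { // re-check that the head has not changed
                        // ... walk the list and replace or append (omitted)
                        return;
                    }
                }
            }
        }
    }
}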

LinkedHashMap

Storage structure

LinkedHashMap inherits from HashMap, so it has the same fast-lookup characteristics as HashMap.

In addition, LinkedHashMap internally maintains a doubly linked list that records either the insertion order of the data or its least-recently-used (LRU) access order.

Which order is recorded is determined by accessOrder; it defaults to false, which records insertion order.

/**
 * The iteration ordering method for this linked hash map: {@code true}
 * for access-order, {@code false} for insertion-order.
 *
 * @serial
 */
final boolean accessOrder;

LinkedHashMap maintains the list with the following two methods:

  • afterNodeInsertion
  • afterNodeAccess

afterNodeInsertion

void afterNodeInsertion(boolean evict) { // possibly remove eldest
    LinkedHashMap.Entry<K,V> first;
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        K key = first.key;
        removeNode(hash(key), key, null, false, true);
    }
}

It runs after operations such as put have executed. When removeEldestEntry() returns true, it removes the eldest node, i.e. the node at the head of the list, first.

evict is false only while the Map is being constructed; here it is true.

protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}

removeEldestEntry() returns false by default. If you need it to return true, you have to extend LinkedHashMap and override this method. This is especially useful for implementing an LRU cache: by removing the least recently used nodes it keeps enough room in the cache and ensures the cached data stays hot.

afterNodeAccess

void afterNodeAccess(Node<K,V> e) { // move node to last
    LinkedHashMap.Entry<K,V> last;
    if (accessOrder && (last = tail) != e) {
        LinkedHashMap.Entry<K,V> p =
            (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        p.after = null;
        if (b == null)
            head = a;
        else
            b.after = a;
        if (a != null)
            a.before = b;
        else
            last = b;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
        tail = p;
        ++modCount;
    }
}

When accessOrder is true, every access moves the accessed node to the tail of the list. This guarantees that the tail of the list is the most recently accessed node, while the head of the list is the least recently used node.

LRU cache

By extending LinkedHashMap we can quickly implement an LRU cache; the following is an example:

class LRUCache<K, V> extends LinkedHashMap<K, V> {
    private static final int MAX_ENTRIES = 3;

    // Override removeEldestEntry so that once the size exceeds 3, the least recently used element is removed
    @Override
    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > MAX_ENTRIES;
    }

    LRUCache() {
        super(MAX_ENTRIES, 0.75f, true);
    }
}
public static void main(String[] args) {
    LRUCache<Integer, String> cache = new LRUCache<>();
    cache.put(1, "a");
    cache.put(2, "b");
    cache.put(3, "c");
    cache.get(1);      // accessing 1 moves (1, "a") back to the tail of the list
    cache.put(4, "d"); // adding 4 pushes the size past 3, so the least recently used key 2 is removed
    System.out.println(cache.keySet());
}
[3, 1, 4]

WeakHashMap

Storage structure

WeakHashMap's Entry extends WeakReference: once a key is no longer strongly referenced elsewhere, it becomes eligible for reclamation at the next garbage collection, and its entry disappears from the map.

WeakHashMap is mainly used to implement caches: the cached objects are referenced through the WeakHashMap, and the JVM reclaims this part of the cache on its own.
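
A small demonstration of this behavior (my own example; note that System.gc() is only a request, so the final size is typically, but not necessarily, 0):

import java.util.Map;
import java.util.WeakHashMap;

public class WeakHashMapDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "cached value");
        System.out.println(cache.size()); // 1
        key = null;        // drop the only strong reference to the key
        System.gc();       // request a collection (not guaranteed to run)
        Thread.sleep(100); // give the collector a moment
        System.out.println(cache.size()); // typically 0: the entry has been reclaimed
    }
}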

ConcurrentCache

Tomcat's ConcurrentCache uses a WeakHashMap to implement its caching feature.

ConcurrentCache takes a generational approach to caching:

  • Frequently used objects go into eden, which is implemented with a ConcurrentHashMap, so they need not fear being reclaimed (hence the name, Garden of Eden);
  • Less frequently used objects go into longterm, which is implemented with a WeakHashMap, so these older objects can be garbage-collected.
  • When get() is called, it first looks in eden; if the key is not found there, it falls back to longterm, and an object fetched from longterm is put back into eden so that frequently accessed entries are not easily reclaimed.
  • When put() is called, if eden has reached the size limit, all of eden's objects are moved into longterm, letting the JVM reclaim those that are not frequently used.
public final class ConcurrentCache<K, V> {

    private final int size;

    private final Map<K, V> eden;

    private final Map<K, V> longterm;

    public ConcurrentCache(int size) {
        this.size = size;
        this.eden = new ConcurrentHashMap<>(size);
        this.longterm = new WeakHashMap<>(size);
    }

    public V get(K k) {
        V v = this.eden.get(k);
        if (v == null) {
            v = this.longterm.get(k);
            if (v != null)
                this.eden.put(k, v);
        }
        return v;
    }

    public void put(K k, V v) {
        if (this.eden.size() >= size) {
            this.longterm.putAll(this.eden);
            this.eden.clear();
        }
        this.eden.put(k, v);
    }
}
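
A brief usage sketch of the class above (my own illustration of the generational behavior):

ConcurrentCache<String, byte[]> cache = new ConcurrentCache<>(2);
cache.put("a", new byte[16]);
cache.put("b", new byte[16]);
cache.put("c", new byte[16]); // eden hit its size limit: "a" and "b" move into longterm
cache.get("a");               // found in longterm, so "a" is promoted back into eden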
