Detailed explanation of Hashtable and ConcurrentHashMap interview questions

Hashtable

Features: the underlying data structure is an array plus linked lists (an array of the internal class Entry).

Thread safety: the public methods are declared synchronized, so every call locks the whole table (a coarse-grained JVM monitor lock); throughput under contention is low.

Initialization: the default initial capacity is 11 and the default load factor is 0.75.

Null handling: storing a null key or a null value throws NullPointerException.

Capacity expansion: when the number of entries reaches the threshold (capacity * load factor), the table is rehashed to a new capacity of old capacity * 2 + 1.
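
The growth rule can be checked with a little arithmetic. The following is a minimal sketch, not the JDK code: the class and method names are made up for illustration, and it ignores the overflow and maximum-array-size checks that the real rehash() performs.

```java
class HashtableGrowthSketch {
    // Load factor used by the no-argument Hashtable constructor.
    static final float LOAD_FACTOR = 0.75f;

    // Growth rule described above: new capacity = old capacity * 2 + 1.
    static int nextCapacity(int oldCapacity) {
        return (oldCapacity << 1) + 1;
    }

    // Entry count at which the next insertion triggers a rehash.
    static int threshold(int capacity) {
        return (int) (capacity * LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int capacity = 11; // default initial capacity
        for (int i = 0; i < 4; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity));
            capacity = nextCapacity(capacity);
        }
        // Capacities visited: 11, 23, 47, 95
        // Thresholds:          8, 17, 35, 71
    }
}
```

Starting from 11, every capacity produced by this rule is odd, which fits the modulo-based index calculation that Hashtable uses.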
 

Source code analysis:

Data structure: array + linked list

/**
 * The hash table data.
 */
private transient Entry<?,?>[] table;
...
private static class Entry<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Entry<K,V> next;
    ...

 

Thread safety: relies on synchronized

public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) { // the value must not be null
        throw new NullPointerException();
    }

    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    int hash = key.hashCode(); // a null key causes a NullPointerException here
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }

    addEntry(hash, key, value, index);
    return null;
}

Taking put as an example: a null value is rejected explicitly, and a null key fails at key.hashCode(), so in both cases put throws NullPointerException.
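
The null behaviour is easy to confirm. This small demo (the class and helper names are made up for illustration) compares Hashtable with HashMap, which accepts both a null key and null values:

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullHandlingDemo {
    // Returns true if map.put(key, value) throws NullPointerException.
    static boolean putThrowsNpe(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Hashtable rejects a null key and a null value.
        System.out.println(putThrowsNpe(new Hashtable<>(), null, "v")); // true
        System.out.println(putThrowsNpe(new Hashtable<>(), "k", null)); // true
        // HashMap accepts both.
        System.out.println(putThrowsNpe(new HashMap<>(), null, "v"));   // false
        System.out.println(putThrowsNpe(new HashMap<>(), "k", null));   // false
    }
}
```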

Hash calculation: the key's hashCode() is used directly; it is ANDed with 0x7FFFFFFF (Integer.MAX_VALUE) to clear the sign bit, and the result is taken modulo the array length (11 by default).
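
A short sketch shows why the sign bit has to be cleared first. The helper name indexFor is an assumption for illustration; the expression inside it is the one used in put above.

```java
class BucketIndexSketch {
    // Same expression as in Hashtable.put: clear the sign bit, then take the modulus.
    static int indexFor(Object key, int tableLength) {
        int hash = key.hashCode();
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        // "a".hashCode() is 97, and 97 % 11 is 9.
        System.out.println(indexFor("a", 11));                 // 9
        // Integer.valueOf(-1).hashCode() is -1; without the mask the
        // remainder would be negative and unusable as an array index.
        System.out.println(-1 % 11);                           // -1
        System.out.println(indexFor(Integer.valueOf(-1), 11)); // 1
    }
}
```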

private void addEntry(int hash, K key, V value, int index) {
    modCount++;

    Entry<?,?> tab[] = table;
    if (count >= threshold) {
        // Rehash the table if the threshold is exceeded
        rehash();

        tab = table;
        hash = key.hashCode(); // use the key's hashCode directly
        index = (hash & 0x7FFFFFFF) % tab.length; // AND with Integer.MAX_VALUE, then take the modulus of the array length
    }

    // Creates the new entry.
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>) tab[index];
    tab[index] = new Entry<>(hash, key, value, e);
    count++;
}

Summary:

Hashtable is worth knowing for interviews, but it is rarely used in production code. When thread safety is not required, use HashMap (which was also optimized in JDK 1.8); when thread safety is required, use ConcurrentHashMap, which is more efficient than Hashtable and was likewise optimized in JDK 1.8.
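
As a minimal illustration of the thread-safe case, the sketch below has several threads increment one counter in a ConcurrentHashMap through merge, which that class performs atomically. The class name, the method name and the "hits" key are made up for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Starts `threads` workers that each increment the same key `perThread`
    // times, then returns the final count stored in the map.
    static int countHits(int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 10_000)); // 40000
    }
}
```

With a plain HashMap the same loop is a data race: updates can be lost and the result is not guaranteed.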

 

ConcurrentHashMap

 

Features:

Data structure: JDK1.7 uses an array + linked lists, with segment locks to ensure thread safety. Inner classes: HashEntry, Segment (the segment lock).

JDK1.8 uses an array + linked lists + red-black trees, with the CAS mechanism plus synchronized to guarantee thread safety. Inner class: Node.

In other respects ConcurrentHashMap is very similar to HashMap

 

Source code analysis

JDK1.7

/**
 * Segments are specialized versions of hash tables.  This
 * subclasses from ReentrantLock opportunistically, just to
 * simplify some locking and avoid separate construction.
 */
static final class Segment<K,V> extends ReentrantLock 
                                        implements Serializable {
                                        ...
transient volatile HashEntry<K,V>[] table;
...           

 

Note: each Segment extends ReentrantLock, so writes are synchronized (atomic) per segment; the volatile HashEntry table field ensures visibility and ordering of the table reference across threads.
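
The idea behind segment locking can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the JDK 1.7 implementation: the names StripedMapSketch and concurrencyLevel are chosen for the example, each segment simply wraps a HashMap, and get also takes the lock because HashMap is not safe for concurrent reads (the real class reads without locking).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class StripedMapSketch<K, V> {
    // Each segment is a lock plus its own private table.
    private static final class Segment<K, V> extends ReentrantLock {
        final Map<K, V> table = new HashMap<>();
    }

    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    StripedMapSketch(int concurrencyLevel) {
        segments = (Segment<K, V>[]) new Segment[concurrencyLevel];
        for (int i = 0; i < segments.length; i++) {
            segments[i] = new Segment<>();
        }
    }

    // Picks the segment responsible for a key.
    private Segment<K, V> segmentFor(Object key) {
        return segments[(key.hashCode() & 0x7FFFFFFF) % segments.length];
    }

    V put(K key, V value) {
        Segment<K, V> segment = segmentFor(key);
        segment.lock(); // only this segment is blocked; others stay available
        try {
            return segment.table.put(key, value);
        } finally {
            segment.unlock();
        }
    }

    V get(Object key) {
        Segment<K, V> segment = segmentFor(key);
        segment.lock();
        try {
            return segment.table.get(key);
        } finally {
            segment.unlock();
        }
    }
}
```

Two threads writing keys that map to different segments never wait for each other, which is the advantage over Hashtable's single lock.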

JDK1.8

// Unsafe mechanics: the Unsafe class that provides the CAS operations
private static final sun.misc.Unsafe U;

/**
 * Initializes table, using the size recorded in sizeCtl.
 */
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        if ((sc = sizeCtl) < 0)
            Thread.yield(); // lost initialization race; just spin
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            // CAS mechanism: only the thread that swaps sizeCtl to -1 initializes the table
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2);
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}
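
The same claim-by-CAS pattern can be reproduced with the public AtomicInteger API instead of Unsafe. The sketch below is an illustration under assumptions: the class name, the allocations counter and the raceAllocations helper are added only to make the effect observable, and the field values follow the convention in initTable (negative means another thread is initializing).

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyTableSketch {
    private static final int DEFAULT_CAPACITY = 16;

    // 0 = not initialized, -1 = initialization in progress, > 0 = next resize threshold
    private final AtomicInteger sizeCtl = new AtomicInteger(0);
    private volatile Object[] table;
    // Demonstration only: counts how often the array was actually allocated.
    final AtomicInteger allocations = new AtomicInteger();

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            int sc = sizeCtl.get();
            if (sc < 0) {
                Thread.yield(); // lost the race; wait for the winner to finish
            } else if (sizeCtl.compareAndSet(sc, -1)) {
                try {
                    if ((tab = table) == null) { // re-check after winning the CAS
                        table = tab = new Object[DEFAULT_CAPACITY];
                        allocations.incrementAndGet();
                        sc = DEFAULT_CAPACITY - (DEFAULT_CAPACITY >>> 2); // 0.75 * capacity
                    }
                } finally {
                    sizeCtl.set(sc);
                }
                break;
            }
        }
        return tab;
    }

    // Lets several threads race on initTable and reports the allocation count.
    static int raceAllocations(int threads) {
        LazyTableSketch sketch = new LazyTableSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(sketch::initTable);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return sketch.allocations.get();
    }

    public static void main(String[] args) {
        System.out.println(raceAllocations(8)); // 1
    }
}
```

However many threads call initTable at once, only the one whose compareAndSet succeeds allocates the array; the rest spin until the table becomes visible.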

 

1: The Unsafe class (CAS mechanism) makes individual updates atomic, for example claiming sizeCtl in initTable above. 2: Write methods such as put use the synchronized keyword, locking the head node of the bin being modified, to achieve thread safety; get does not use synchronized.

Why is the synchronized keyword used when the CAS mechanism is also used to achieve thread safety?

My understanding is that the two cover different cases. synchronized protects modifications to a bin that already contains nodes, while CAS handles single-variable updates: initializing the table, inserting the first node into an empty bin, and tracking the collection size. In JDK1.8 a volatile variable, baseCount, records the number of elements; when data is inserted or deleted, baseCount is updated through the addCount() method.
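
To make the division of labour concrete, here is a heavily simplified sketch of a put that uses CAS for an empty bin and synchronized on the head node otherwise. It is an illustration under assumptions, not the JDK 1.8 code: the class name is made up, and there is no resizing, no removal and no red-black tree.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class BinLockSketch<K, V> {
    static final class Node<K, V> {
        final K key;
        volatile V value;
        volatile Node<K, V> next;

        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table;

    BinLockSketch(int capacity) {
        table = new AtomicReferenceArray<>(capacity);
    }

    private int indexFor(Object key) {
        return (key.hashCode() & 0x7FFFFFFF) % table.length();
    }

    V put(K key, V value) {
        int i = indexFor(key);
        while (true) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bin: publish the first node with CAS, no lock needed.
                if (table.compareAndSet(i, null, new Node<>(key, value))) {
                    return null;
                }
                // Another thread filled the bin first; retry.
            } else {
                synchronized (head) { // lock only this bin
                    if (table.get(i) != head) {
                        continue; // head changed while waiting for the lock; retry
                    }
                    for (Node<K, V> e = head; ; e = e.next) {
                        if (e.key.equals(key)) {
                            V old = e.value;
                            e.value = value;
                            return old;
                        }
                        if (e.next == null) {
                            e.next = new Node<>(key, value);
                            return null;
                        }
                    }
                }
            }
        }
    }

    // Reads traverse the bin without locking; the volatile fields give visibility.
    V get(Object key) {
        for (Node<K, V> e = table.get(indexFor(key)); e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

Threads that touch different bins never contend for the same lock, and a thread inserting into an empty bin takes no lock at all.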

private final void addCount(long x, int check) {
    CounterCell[] as; long b, s;
    // U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)
    // The CAS succeeds only if baseCount still equals the value just read (b);
    // only then is it updated to b + x
    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
            !(uncontended =
              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        s = sumCount();
    }
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
               (n = tab.length) < MAXIMUM_CAPACITY) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
            s = sumCount();
        }
    }
}
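
The base-plus-cells counting in the first half of addCount follows the same design as java.util.concurrent.atomic.LongAdder: try a CAS on a base value first, and spread updates over per-thread cells when that CAS is contended. The demo below uses LongAdder itself to show the behaviour; the class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

class StripedCounterDemo {
    // Starts `threads` workers that each add 1 `perThread` times and
    // returns the combined total.
    static long countWithLongAdder(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    adder.increment(); // CAS on base, or on a cell under contention
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return adder.sum(); // base + all cells, like sumCount()
    }

    public static void main(String[] args) {
        System.out.println(countWithLongAdder(4, 10_000)); // 40000
    }
}
```

As with sumCount(), the sum is exact once all writers have finished; while updates are still in flight it is only a snapshot.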

 

Summary:

From JDK1.7 to JDK1.8, the lock granularity of ConcurrentHashMap was refined: the original segment locks (several segments, with writes synchronized within each segment) were replaced by locking individual bins of the table array, which greatly improves concurrent performance.

Note: ReentrantLock and AQS will be covered in detail in later posts on thread safety.

The Unsafe class and the CAS mechanism built on it will also be covered in those posts.

Origin blog.csdn.net/weixin_38246518/article/details/105542976