HashMap for Java programmer interview

Introduction

HashMap is one of the most frequently used classes in everyday Java work, and interviewers love to grill candidates on it until their heads spin. Today, let's walk through the most common HashMap interview questions!

Common interview questions 

1. What optimizations were made to HashMap between JDK 1.7 and JDK 1.8?

In JDK 1.7, HashMap is composed of an array plus linked lists. JDK 1.8 added red-black trees: when a bucket's linked list grows to length 8 (TREEIFY_THRESHOLD) and the table capacity is at least 64 (MIN_TREEIFY_CAPACITY), the list is converted into a red-black tree.

The elements of the array are what we usually call hash buckets, and each node in a bucket looks like this:

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }
    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }
    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }
    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

Each bucket node contains four fields: hash, key, value, and next, where next points to the next node in the chain. JDK 1.8 introduced the red-black tree because an overly long linked list seriously degrades HashMap performance: a tree supports fast insertion, deletion, and lookup (O(log n)), which effectively solves the slow-traversal problem of a long list.
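To make the difference concrete, here is a minimal sketch (the BadKey class is my own illustration, not JDK code) that forces every key into the same bucket. On JDK 1.8 the bin treeifies once it passes TREEIFY_THRESHOLD, so the final lookup walks a red-black tree instead of a 10,000-node list:

import java.util.HashMap;
import java.util.Map;

// Hypothetical key type whose hashCode always collides.
final class BadKey implements Comparable<BadKey> {
    final int id;
    BadKey(int id) { this.id = id; }
    @Override public int hashCode() { return 42; }  // every key lands in one bin
    @Override public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).id == id;
    }
    // Comparable lets the tree bin order keys whose hashes tie
    @Override public int compareTo(BadKey o) { return Integer.compare(id, o.id); }
}

public class TreeifyDemo {
    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++)
            map.put(new BadKey(i), i);
        // O(log n) on JDK 1.8+ thanks to the tree bin; O(n) on JDK 1.7
        System.out.println(map.get(new BadKey(9_999)));
    }
}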

2. Is HashMap thread-safe? Why is Hashtable thread-safe?

HashMap is not thread-safe, while Hashtable is: every public method of Hashtable is marked synchronized, which makes it safe but slow. Hashtable does not allow null keys or values, whereas HashMap does (one null key and any number of null values).
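To see the null-key difference in action, a quick sketch (class name is my own):

import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "ok");               // HashMap allows one null key
        System.out.println(hashMap.get(null)); // prints "ok"

        Map<String, String> hashtable = new Hashtable<>();
        hashtable.put(null, "boom");           // throws NullPointerException
    }
}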

Why is HashMap thread-unsafe? The typical situation looks like this (a runnable sketch follows the fix below):

1. Symptom: a ConcurrentModificationException is thrown.

2. Cause: HashMap performs no synchronization, so concurrent modification corrupts its state or interrupts iteration.

3. Solution: use ConcurrentHashMap from the java.util.concurrent (JUC) package.

Map<String,String> map = new ConcurrentHashMap<>();
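For reference, a minimal sketch that usually reproduces the exception (the thread count and values are arbitrary). Printing the map iterates over it, and a concurrent put during that iteration makes the fail-fast iterator throw:

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class HashMapUnsafeDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i < 30; i++) {
            new Thread(() -> {
                map.put(Thread.currentThread().getName(),
                        UUID.randomUUID().toString().substring(0, 8));
                // toString() iterates the map; a concurrent put typically
                // triggers ConcurrentModificationException here
                System.out.println(map);
            }, String.valueOf(i)).start();
        }
    }
}

Replacing new HashMap<>() with new ConcurrentHashMap<>() makes the same program run cleanly.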

3. HashMap source code analysis

HashMap defines the following important attributes:

// default initial capacity of HashMap
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
// maximum capacity of HashMap
static final int MAXIMUM_CAPACITY = 1 << 30; // 1073741824
// default load factor (expansion factor)
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// a bin is treeified when its list exceeds this length and capacity is at least 64
static final int TREEIFY_THRESHOLD = 8;
// a tree bin is converted back into a list when it shrinks below this size
static final int UNTREEIFY_THRESHOLD = 6;
// the smallest table capacity at which bins may be treeified
static final int MIN_TREEIFY_CAPACITY = 64;

What is the load factor? Why is the default load factor 0.75?

The load factor (also called the expansion factor) determines when the table grows. For example, with a load factor of 0.6 and a capacity of 20, once the HashMap holds 20 × 0.6 = 12 elements it will resize.

Why 0.75? Per the JDK documentation, it is a good trade-off between time and space costs: a higher factor saves memory but raises the collision rate and thus the lookup cost, while a lower one wastes space, so 0.75 was chosen as the default compromise.
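A quick illustration using the real defaults:

// capacity 16, load factor 0.75 -> resize threshold = 16 * 0.75 = 12
Map<Integer, Integer> map = new HashMap<>(16, 0.75f);
for (int i = 1; i <= 13; i++)
    map.put(i, i);   // the 13th put pushes size past 12 and triggers resize()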

There are three important methods in HashMap: insert (put), query (get), and expand (resize).

 

The insert source code is as follows:

public V put(K key, V value) {
    // hash the key first
    return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // if the table is empty, create it
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // compute the array index i from the key's hash
    if ((p = tab[i = (n - 1) & hash]) == null)
        // table[i] is null, so insert directly
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // if the key already exists, remember the node so its value is overwritten
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // otherwise check whether the bin is a red-black tree
        else if (p instanceof TreeNode)
            // tree bin: insert the key-value pair into the tree
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // linked-list bin: walk the list looking for an insertion point
            for (int binCount = 0; ; ++binCount) {
                // reached the tail of the list: append the new node
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // list is long enough: convert the bin to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // the key already exists: stop and overwrite its value below
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // size exceeds the threshold, so grow the table
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

In short: put hashes the key and computes the bucket index; an empty bucket gets the node directly, a matching key has its value overwritten, a tree bin is handled by putTreeVal, and otherwise the node is appended to the list, which is treeified once it passes TREEIFY_THRESHOLD; finally, if size exceeds the threshold, the table is resized.

The query source code is as follows:

public V get(Object key) {
    Node<K,V> e;
    // hash the key first
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // make sure the table and the target bucket are non-empty
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // check whether the first node is the one being queried
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // check whether there is a next node
        if ((e = first.next) != null) {
            // if the first node is a tree node, look the key up via getTreeNode
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do { // otherwise walk the linked list node by node
                // hash matches and key is equal: return this node
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
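Both put and get begin by calling hash(key). For reference, JDK 1.8's hash() XORs the high 16 bits of hashCode into the low 16 bits, so that the index mask (n - 1), which only keeps the low bits, still benefits from the high bits:

static final int hash(Object key) {
    int h;
    // null keys go to bucket 0; otherwise spread the high bits downward
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}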

The expansion source code is as follows:

final Node<K,V>[] resize() {
    // the table before resizing
    Node<K,V>[] oldTab = table;
    // the old table's capacity and threshold
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    // the new capacity and threshold, to be computed
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // already at the maximum capacity: stop growing
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // double the capacity, as long as it stays below MAXIMUM_CAPACITY
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    // the table is empty but a capacity was requested via the constructor
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {
        // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // if the new threshold is still 0, compute it from the load factor
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    // install the new table
    table = newTab;
    // if there was old data, copy it into the new table
    if (oldTab != null) {
        // walk the old array and rehash every non-empty bucket
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                // a single node: place it directly at its new index
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    // tree bin: split the tree between the two new buckets
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    // list bin: the JDK 1.8 resize optimization
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // node stays at the original index
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        // node moves to original index + oldCap
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // put the "low" list back at the original index
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // put the "high" list at original index + oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
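To see why this split works: doubling the capacity adds one bit to the index mask (n - 1), so each node's new index is either its old index j or j + oldCap, depending on whether that extra hash bit is set. A tiny worked example (the hash value is arbitrary):

int oldCap = 16, newCap = oldCap << 1;   // 16 -> 32
int hash   = 0b1_0101;                   // 21, an arbitrary example hash
int oldIdx = hash & (oldCap - 1);        // 0b0101  = 5
int newIdx = hash & (newCap - 1);        // 0b10101 = 21 = 5 + 16
boolean stays = (hash & oldCap) == 0;    // false: the oldCap bit is set, so the node moves to j + oldCap

This is why JDK 1.8 does not need to recompute the index for every node as 1.7 did: a single bit test decides between the two possible destinations.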

4. HashMap dead loop analysis

Take JDK 1.7 as an example and suppose the HashMap's capacity is only 2. The map already holds one element, key(5), and two threads insert concurrently: t1 adds key(3) and t2 adds key(7). After both keys are added the map resizes, and thread t1 gives up the CPU just as it executes Entry<K,V> next = e.next; inside transfer(). The source code is as follows:

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next; // thread t1 is suspended here
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}


At this moment, t1's e points to key(3) and next points to key(7). Thread t2 then finishes its rehash; because transfer inserts at the head, the list order is reversed, and the bucket becomes key(5) → key(7) → key(3), where "→" denotes the next pointer.

When t1 regains the CPU it resumes its own transfer: executing e.next = newTable[i]; newTable[i] = e sets key(3)'s next to key(7), while key(7)'s next already points back to key(3) from t2's pass. The result is a circular reference between key(3) and key(7), so the next lookup that walks this bucket loops forever.

The root cause is JDK 1.7's head insertion during transfer, which reverses the list; JDK 1.8 changed this to tail insertion, which preserves the original order and eliminates the cycle.

Someone once reported this problem to Sun, but Sun did not treat it as a bug, because HashMap was never meant to be thread-safe: for multithreaded use the recommendation is ConcurrentHashMap. Even so, the odds of being asked about it in an interview are still very high, hence the special note here.

Summary

The above are common HashMap interview questions.


Origin: blog.csdn.net/qq_39507327/article/details/111503256