Explore HashMap and ConcurrentHashMap step by step

This article works through how HashMap is used and where it falls short, introducing the underlying principles and design ideas step by step. The main contents are:

  • Explore HashMap

    • What is HashMap
    • How to expand HashMap
    • Why the initial capacity of HashMap is 16 by default
    • Why is the load factor 0.75
    • Specific implementation of hash method
    • LinkedHashMap implements LRU caching
    • Problems and solutions in HashMap
  • Explore ConcurrentHashMap

    • Segment lock
    • CAS+Synchronized
    • Initial construction principle

Explore HashMap

  • 1. [What is HashMap]

    • Data structure: HashMap is an array whose slots each hold a linked list (the chain address method).
    • Use: a key is hashed to compute its position in the array, and the entry is stored there. If that position already holds a value, a hash conflict occurs. There are four common ways to handle hash conflicts:
      • Chain address method: each position stores a linked list, and the new entry is appended to the end of that list.
      • Open addressing method: probe onward from the conflicting position until a free one is found.
      • Rehash: compute the position again with another hash function.
      • Common overflow area: split the hash table into a basic table and an overflow table; every element that conflicts in the basic table is placed in the overflow table.
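The chain address method can be sketched in a few lines. This is a minimal illustration under stated assumptions, not HashMap's actual implementation: `ChainedTable` and its methods are hypothetical names, the bucket count is fixed, and nothing is ever resized.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical, simplified table illustrating the chain address method.
// It has a fixed number of buckets and never resizes, unlike HashMap.
class ChainedTable {
    private final List<LinkedList<String[]>> buckets = new ArrayList<>();

    ChainedTable(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Map a key to a bucket index; floorMod keeps the index non-negative.
    int indexOf(String key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    void put(String key, String value) {
        LinkedList<String[]> chain = buckets.get(indexOf(key));
        for (String[] entry : chain) {
            if (entry[0].equals(key)) { // key already present: overwrite the value
                entry[1] = value;
                return;
            }
        }
        // Empty bucket or hash conflict: append to the end of the chain.
        chain.addLast(new String[] {key, value});
    }

    String get(String key) {
        for (String[] entry : buckets.get(indexOf(key))) {
            if (entry[0].equals(key)) {
                return entry[1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedTable table = new ChainedTable(4);
        // "Aa" and "BB" have the same hashCode (2112), so they land in the same bucket.
        table.put("Aa", "first");
        table.put("BB", "second");
        System.out.println(table.get("Aa") + ", " + table.get("BB"));
    }
}
```

Both keys remain retrievable even though they conflict, because the bucket stores a chain rather than a single entry.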
  • 2. [How to expand HashMap]

    • A HashMap holds an array, and a linked list hangs off each array slot. The array is expanded whenever the number of entries exceeds capacity × load factor, and each linked list has a threshold of its own: when a list grows longer than 8 nodes (TREEIFY_THRESHOLD), the array length is checked first.
    • If the array length is less than MIN_TREEIFY_CAPACITY (64), the array is simply expanded and the stored entries are redistributed by hash.
      • With each expansion, the length of the array doubles.
      • If you know in advance that you will store 1000 entries, it is more reasonable to size the HashMap up front: solving 1000 < 0.75x gives x = 1334, so use new HashMap(1334). The table size is then rounded up automatically to the smallest power of 2 that is at least 1334, which is 2048.
    • If the array length is at least 64, the linked list is converted into a red-black tree.
    • Reference: When will the linked list in HashMap be transformed into a red-black tree?
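The sizing arithmetic for 1000 entries can be checked with a short sketch. `capacityFor` and `tableSize` are hypothetical helpers written for this example (they are not JDK methods); `capacityFor` follows the strict inequality expected < x × loadFactor used above.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeExample {
    // Smallest integer x with expected < x * loadFactor (hypothetical helper).
    static int capacityFor(int expected, double loadFactor) {
        return (int) (expected / loadFactor) + 1;
    }

    // Smallest power of 2 that is >= initialCapacity, mirroring how the table is rounded up.
    static int tableSize(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        int initialCapacity = capacityFor(1000, 0.75);
        int table = tableSize(initialCapacity);
        int threshold = (int) (table * 0.75);
        System.out.println("initial capacity = " + initialCapacity
                + ", table size = " + table
                + ", resize threshold = " + threshold);

        // With this initial capacity, inserting 1000 entries should not trigger an expansion.
        Map<Integer, String> map = new HashMap<>(initialCapacity);
        for (int i = 0; i < 1000; i++) {
            map.put(i, "v" + i);
        }
        System.out.println("entries stored: " + map.size());
    }
}
```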
  • 2.1. [Why the initial capacity of HashMap is 16 by default]

    • Why can a bit operation (&) replace the modulo operation (%)? For a non-negative X and a power of 2, the two give the same result:
      • X % 2^n = X & (2^n - 1)
    • 16 itself is an empirical value: it is a power of 2, as the identity above requires, and a reasonable default size.
    • How to find the smallest power of 2 that is greater than or equal to the requested capacity: a simple shift loop is enough.
// Find the first power of 2 that is greater than or equal to initialCapacity
int capacity = 1;
while (capacity < initialCapacity)
    capacity <<= 1;
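The identity X % 2^n = X & (2^n - 1) can be checked directly. The sketch below uses a hypothetical helper `sameAsModulo`; it also shows that the identity only holds for non-negative X, since Java's % keeps the sign of the dividend.

```java
class MaskVsModulo {
    // True when the modulo and the mask give the same result (hypothetical helper).
    static boolean sameAsModulo(int x, int powerOfTwo) {
        return (x % powerOfTwo) == (x & (powerOfTwo - 1));
    }

    public static void main(String[] args) {
        boolean allEqual = true;
        for (int x = 0; x <= 1000; x++) {
            allEqual &= sameAsModulo(x, 16);
        }
        System.out.println("non-negative x agree: " + allEqual);

        // For a negative x the two differ: -7 % 16 is -7, but -7 & 15 is 9.
        System.out.println(-7 % 16);
        System.out.println(-7 & 15);
        // Math.floorMod matches the mask result for negative x as well.
        System.out.println(Math.floorMod(-7, 16));
    }
}
```

The mask has the extra benefit of always producing a valid, non-negative array index, whatever the sign of the hash.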
  • 2.2. [Why the load factor is 0.75]

    • 0.75 is a compromise between time and space: a higher load factor saves memory but causes more hash conflicts and slower lookups, while a lower one wastes space and triggers more frequent expansion.
  • 2.3. [The specific implementation of the hash method]

// JDK 1.7: maps the hash value h to an index in an array of the given length
static int indexFor(int h, int length) {
    return h & (length - 1);
}
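  • Why length - 1

    • Because length is a power of 2, length - 1 is all 1 bits in binary (for example, 16 - 1 = 1111), so h & (length - 1) keeps the low bits of h and can reach every index from 0 to length - 1. Masking with length itself, which has only a single 1 bit, could produce only two values (0 or length), so most slots would never be used.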
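To see the effect of the mask, the sketch below collects the distinct results of h & mask for a range of hash values. `MaskDemo` and `indexesWithMask` are hypothetical names used only for this illustration; `indexFor` is the method shown above.

```java
import java.util.Set;
import java.util.TreeSet;

class MaskDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Collect every distinct value of h & mask for h in [0, 1000).
    static Set<Integer> indexesWithMask(int mask) {
        Set<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 1000; h++) {
            seen.add(h & mask);
        }
        return seen;
    }

    public static void main(String[] args) {
        int length = 16;
        // Mask length - 1 = 15 (binary 1111): all 16 slots are reachable.
        System.out.println(indexesWithMask(length - 1).size());
        // Mask length = 16 (binary 10000): only two distinct results are possible.
        System.out.println(indexesWithMask(length));
    }
}
```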
  • 3. [LinkedHashMap implements LRU caching]

    • LinkedHashMap supports two orderings: insertion order and access order.
      • Insertion order: entries added first come first and entries added later come last. Updating an existing key does not change the order.
      • Access order: an access is a get or put operation. After a get or put on a key, its key-value pair moves to the end of the linked list, so the last entry is the most recently accessed and the first is the one that has gone unaccessed the longest. This is the ordering an LRU cache needs.
      • Reference: LinkedHashMap basic usage & use to realize simple caching
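A minimal LRU cache built on access order might look like the following. It is a sketch, assuming a fixed maximum size chosen by the caller; `LruCache` and `maxEntries` are names made up for this example. It relies on two real LinkedHashMap features: the three-argument constructor whose last parameter enables access order, and the `removeEldestEntry` hook, which is consulted after each insertion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: evicts the least recently accessed entry once full.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // Third argument true = access order instead of insertion order.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true removes the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently accessed entry
        cache.put("c", 3); // the cache is over capacity, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Note that this class is not thread-safe, like LinkedHashMap itself.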
  • 4. [Problems and solutions of HashMap in the case of multithreading]

    • 4.1. Existing problems: in a multi-threaded environment, HashMap is not thread-safe.
    • 4.2. Solution
      • Use Hashtable, which relies on synchronized to ensure thread safety.
      • But Hashtable is inefficient: every operation competes for the same lock.
      • So ConcurrentHashMap was introduced. Before JDK 1.8 it used segment locks: different segments have different locks, though threads that hit the same segment still block one another.
      • From JDK 1.8 on, ConcurrentHashMap uses CAS plus synchronized on individual buckets, so many operations complete without taking a lock.
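The thread-safe alternatives can be exercised with a small counting test. This is a sketch with hypothetical names (`ConcurrentCount`, `countWith`, the key "hits"): several threads increment one counter through `merge`, which is atomic in ConcurrentHashMap and synchronized in Hashtable. A plain HashMap is deliberately not run here, because under the same load its result is unpredictable.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCount {
    // Start `threads` workers that each increment the "hits" counter `perThread` times.
    static int countWith(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    map.merge("hits", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        // Both maps count correctly; they differ in how much the threads block each other.
        System.out.println(countWith(new ConcurrentHashMap<>(), 4, 10_000));
        System.out.println(countWith(new Hashtable<>(), 4, 10_000));
    }
}
```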

Explore ConcurrentHashMap

  • Segment lock (before 1.8)
    • ConcurrentHashMap is composed of an array of Segment objects, each holding an array of HashEntry objects. Segment extends ReentrantLock, so each segment is itself a reentrant lock.
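The idea behind segment locking can be illustrated with a heavily simplified sketch. This is not the JDK 1.7 source: `SegmentedMap` is a hypothetical class, each segment wraps an ordinary HashMap instead of a HashEntry array, and reads take the lock too, which is an assumption made for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified illustration of segment locking.
class SegmentedMap<K, V> {
    // Each segment is its own lock and guards its own small table.
    private static final class Segment<K, V> extends ReentrantLock {
        final Map<K, V> table = new HashMap<>();
    }

    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    SegmentedMap(int segmentCount) {
        segments = (Segment<K, V>[]) new Segment[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new Segment<>();
        }
    }

    // Pick the segment for a key; keys in different segments never contend.
    private Segment<K, V> segmentFor(Object key) {
        return segments[Math.floorMod(key.hashCode(), segments.length)];
    }

    V put(K key, V value) {
        Segment<K, V> segment = segmentFor(key);
        segment.lock(); // only this segment is locked, not the whole map
        try {
            return segment.table.put(key, value);
        } finally {
            segment.unlock();
        }
    }

    V get(K key) {
        Segment<K, V> segment = segmentFor(key);
        segment.lock();
        try {
            return segment.table.get(key);
        } finally {
            segment.unlock();
        }
    }

    public static void main(String[] args) {
        SegmentedMap<String, Integer> map = new SegmentedMap<>(16);
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(map.get("a") + ", " + map.get("b"));
    }
}
```

Two threads writing to keys in different segments proceed in parallel; they block each other only when both keys fall into the same segment.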
  • CAS+Synchronized(1.8)
  • Initial construction principle
// Returns the smallest power of 2 that is greater than or equal to the input.
private static final int tableSizeFor(int c) {
    int n = c - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
// How tableSizeFor(int c) works:
// the |= operations set every bit below the highest 1 bit of n = c - 1 to 1, and the method then returns n + 1.
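A worked trace makes the bit smearing concrete. The sketch below copies tableSizeFor into a hypothetical class `TableSizeDemo` and assumes MAXIMUM_CAPACITY is 1 << 30, the value used in the JDK.

```java
class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        // Trace for c = 17: n starts at 16 (binary 10000).
        int n = 17 - 1;
        n |= n >>> 1; // 11000 (24)
        n |= n >>> 2; // 11110 (30)
        n |= n >>> 4; // 11111 (31); the later shifts change nothing
        System.out.println(Integer.toBinaryString(n) + " -> " + (n + 1)); // prints 11111 -> 32

        for (int c : new int[] {0, 1, 16, 17, 1334}) {
            System.out.println(c + " -> " + tableSizeFor(c));
        }
    }
}
```

Starting from c - 1 rather than c is what keeps an exact power of 2 unchanged: tableSizeFor(16) returns 16, not 32.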

Origin blog.csdn.net/ljfirst/article/details/107813985