A compilation of commonly asked ConcurrentHashMap interview questions

Hello, I'm yes.

The previous article covered collection-class interview points, including HashMap; this article discusses ConcurrentHashMap.

We all know HashMap is not thread-safe. There is also Hashtable, which is thread-safe, but all of its methods share a single lock, so concurrency is low and performance suffers.

Therefore, if you need a Map in concurrent scenarios, ConcurrentHashMap is the recommended choice.

When ConcurrentHashMap comes up, interviewers often ask what optimizations JDK 1.8 made compared with 1.7, so let's first look at the 1.7 implementation.

ConcurrentHashMap 1.7

In fact, the general hash table implementation is not essentially different from HashMap: the key's hash locates an array index, the element is fetched there, and collisions are chained with a linked list.

The difference is that a Segment array is introduced. Let's take a look at the general structure.

The principle: first compute the key's hash to find the index into the Segment array, lock that Segment, then hash the key again to find the index into the Segment's HashEntry array. From there the steps are the same as in HashMap, which is why I say the difference is the introduction of the Segment array.

Therefore, it can be simplified like this: each slot of the Segment array stores a separate, small HashMap.

In the figure there are 6 Segments, which means six locks, so up to six threads can operate on this ConcurrentHashMap at the same time: the concurrency level is 6. Compared with locking the entire put method, that is a clear improvement. This is the segmented lock.

The locking itself comes from Segment. This class actually extends ReentrantLock, so it can act as a lock on its own.
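To make the structure concrete, here is a minimal sketch of the 1.7 idea. This is an illustration only, with invented names (SegmentSketch, simplified HashEntry fields), not the real JDK source: each Segment extends ReentrantLock and guards its own small hash table.

```java
import java.util.concurrent.locks.ReentrantLock;

// Simplified sketch of the JDK 1.7 ConcurrentHashMap structure (not the real source):
// each Segment is a ReentrantLock guarding its own small hash table.
public class SegmentSketch {

    static class HashEntry {
        final Object key;
        volatile Object value;
        HashEntry next;
        HashEntry(Object key, Object value, HashEntry next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    static class Segment extends ReentrantLock {
        HashEntry[] table = new HashEntry[4];

        Object put(Object key, Object value) {
            lock();  // only this segment is locked, not the whole map
            try {
                int index = (key.hashCode() & 0x7fffffff) % table.length;
                for (HashEntry e = table[index]; e != null; e = e.next) {
                    if (e.key.equals(key)) {
                        Object old = e.value;
                        e.value = value;  // key exists: replace the value
                        return old;
                    }
                }
                // key absent: prepend a new entry to this bucket's chain
                table[index] = new HashEntry(key, value, table[index]);
                return null;
            } finally {
                unlock();
            }
        }
    }

    final Segment[] segments;

    public SegmentSketch(int concurrencyLevel) {
        segments = new Segment[concurrencyLevel];
        for (int i = 0; i < segments.length; i++) segments[i] = new Segment();
    }

    // first hash picks the segment, second hash (inside Segment.put) picks the bucket
    public Object put(Object key, Object value) {
        int seg = (key.hashCode() & 0x7fffffff) % segments.length;
        return segments[seg].put(key, value);
    }
}
```

Two threads writing to different segments never contend, which is exactly where the improved concurrency comes from.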

So the segmented lock of 1.7 already embodies the idea of refining lock granularity. One defect: once initialized, the Segment array never grows; only the HashEntry array inside each Segment is resized. The concurrency level is therefore fixed and cannot grow as the data grows.

ConcurrentHashMap 1.8

The 1.8 ConcurrentHashMap has finer-grained lock control. You can think of each slot of the array as its own lock, so when the table is resized the number of locks grows too, and concurrency grows with it.

The shift in thinking is to refine the granularity further: no more segments; just lock each head node of the Node array directly, so concurrency is higher.

Also, 1.8 no longer relies on ReentrantLock but uses synchronized directly, which shows that by 1.8 the optimized synchronized was no slower than ReentrantLock.

The implementation idea is simple. On insert, compute the index from the key's hash. If that slot has no Node yet, insert a new Node via CAS. If a node already exists, lock that node with synchronized, so no other thread can modify that node or anything chained after it.

Then compare the keys: if equal, replace the value; otherwise append a new node. This part is the same as in HashMap.
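The CAS-then-synchronized flow can be sketched like this. Again an illustration, not the JDK source: AtomicReferenceArray stands in for the Unsafe table accesses the real code uses, and the class name PutSketch is invented.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Simplified sketch of the 1.8 put flow (not the JDK source):
// empty bucket -> CAS in a new node; occupied bucket -> synchronized on the head node.
public class PutSketch {

    static class Node {
        final Object key;
        volatile Object val;   // volatile, so get() sees the latest value without locking
        volatile Node next;
        Node(Object key, Object val) { this.key = key; this.val = val; }
    }

    final AtomicReferenceArray<Node> table = new AtomicReferenceArray<>(16);

    public Object put(Object key, Object value) {
        int i = (key.hashCode() & 0x7fffffff) % table.length();
        for (;;) {
            Node head = table.get(i);
            if (head == null) {
                // empty bucket: lock-free insert via CAS; loop and retry on failure
                if (table.compareAndSet(i, null, new Node(key, value)))
                    return null;
            } else {
                // occupied bucket: lock only this bucket's head node
                synchronized (head) {
                    if (table.get(i) != head)
                        continue;  // head changed under us, retry
                    for (Node e = head; ; e = e.next) {
                        if (e.key.equals(key)) {
                            Object old = e.val;
                            e.val = value;                  // equal key: replace value
                            return old;
                        }
                        if (e.next == null) {
                            e.next = new Node(key, value);  // otherwise append a new node
                            return null;
                        }
                    }
                }
            }
        }
    }

    public Object get(Object key) {
        int i = (key.hashCode() & 0x7fffffff) % table.length();
        for (Node e = table.get(i); e != null; e = e.next)
            if (e.key.equals(key)) return e.val;
        return null;
    }
}
```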

The resizing in 1.8 is also interesting: it allows other threads to assist with the resize, i.e. multi-threaded resizing.

When a put finds that a node's hash is -1, it means that bucket is currently being migrated for a resize, and the assisted-resize mechanism is triggered.


The details are tedious to walk through and the code is long, so I'll just give the rough idea.

In fact, it's enough to understand this much:

Resizing is just relocating Nodes. Suppose the current array length is 32; the work can be split up: thread A migrates buckets 31 down to 16, while thread B migrates buckets 15 down to 0.

Simply put, a transferIndex variable is maintained, and incoming threads CAS-compete for bucket ranges in a loop. If a range has already been claimed, there is nothing to do; if a thread's CAS wins a range, it helps migrate those buckets. That's it.
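The claiming step can be sketched with an AtomicInteger standing in for the volatile transferIndex field. The class name, the claimRange method, and the fixed stride of 16 are my own simplifications; the real logic lives inside ConcurrentHashMap's transfer method.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of how threads claim migration ranges via CAS on transferIndex
// (illustration only; the real code is in ConcurrentHashMap's transfer method).
public class TransferSketch {
    static final int STRIDE = 16;          // buckets claimed per successful CAS
    final AtomicInteger transferIndex;

    public TransferSketch(int tableLength) {
        transferIndex = new AtomicInteger(tableLength);
    }

    // Returns {lo, hi} of the claimed range [lo, hi), or null if nothing is left.
    public int[] claimRange() {
        for (;;) {
            int hi = transferIndex.get();
            if (hi <= 0) return null;      // all buckets already claimed by someone
            int lo = Math.max(hi - STRIDE, 0);
            // the thread that wins this CAS migrates buckets lo .. hi-1
            if (transferIndex.compareAndSet(hi, lo))
                return new int[] { lo, hi };
        }
    }
}
```

With a table of length 32, the first thread to arrive claims buckets 16 to 31, the next claims 0 to 15, and any later arrival finds nothing left to migrate.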

The size method also differs between 1.7 and 1.8.

1.7 uses a "try first" idea: when size is called, it does not lock immediately, but first tries up to three times to compute the sum without locking.

If consecutive attempts produce the same total, the count has not changed in between and the result is returned directly. If the totals differ, some thread is adding or removing entries, so every segment is locked before counting, and no other thread can operate on the map in the meantime.

   if (retries++ == RETRIES_BEFORE_LOCK) {
       // too many unlocked attempts: lock every segment before counting
       for (int j = 0; j < segments.length; ++j)
           ensureSegment(j).lock(); // force creation
   }
   // ...then sum the counts again

1.8 is different: it simply computes and returns the result, using an idea similar to LongAdder. The count is no longer accumulated in a single member variable but in an array: each thread accumulates at its own slot, and at the end the values in the array are summed to get the final count.

So it's a decomposition idea: divide and conquer.

On put, the addCount method maintains the count in the counterCells; of course, remove calls this method too.


All in all, ordinary operations maintain the map's node count like this: first try to modify baseCount via CAS; if that succeeds, return directly. If it fails, other threads are competing, so pick a CounterCell object by hash and modify that one instead. The final size is baseCount plus the sum of all CounterCells.

The counterCells array reduces contention on a single member variable in concurrent scenarios, improving concurrency and therefore performance. This is exactly the LongAdder idea.
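A minimal sketch of that baseCount-plus-cells idea, using AtomicLong and AtomicLongArray in place of the real Unsafe-based fields. The fixed cell count of 8 and the random cell choice are simplifications; the JDK sizes the cell array dynamically and hashes each thread to a stable slot.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch of the baseCount + counterCells counting idea (illustration only):
// try the shared base first; on CAS failure (contention), spill into a cell.
public class CounterSketch {
    final AtomicLong baseCount = new AtomicLong();
    final AtomicLongArray counterCells = new AtomicLongArray(8);

    public void add(long x) {
        long b = baseCount.get();
        // fast path: CAS the shared base; success means no contention
        if (!baseCount.compareAndSet(b, b + x)) {
            // contention: pick a cell by a per-thread hash and accumulate there
            int i = ThreadLocalRandom.current().nextInt(counterCells.length());
            counterCells.addAndGet(i, x);
        }
    }

    // final size = baseCount + sum of all cells
    public long sum() {
        long s = baseCount.get();
        for (int i = 0; i < counterCells.length(); i++)
            s += counterCells.get(i);
        return s;
    }
}
```

Note that sum() is not a snapshot: cells can change while it iterates, which is why size() in ConcurrentHashMap is only an estimate under concurrent updates.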

A likely follow-up question here is the @sun.misc.Contended annotation, which relates to false sharing. For details, see this article I wrote before:

Filling an earlier pit: an analysis of false sharing

Does ConcurrentHashMap#get need to be locked?

No need to lock.

Since put already guarantees thread safety, get only needs to guarantee visibility, and visibility does not require locking.

Concretely, this is implemented with Unsafe#getXXXVolatile reads of the table and by declaring the node's val and next fields volatile.

Why doesn't ConcurrentHashMap support a null key or value?

First, why can't the key be null? Honestly, I don't know and can't figure it out; maybe the author Doug Lea just dislikes null values.

Why can't value be null?

Because under multithreading a null value is ambiguous: you cannot tell whether the key is absent from the map or whether someone called put(key, null).

Some may say: doesn't HashMap have the same problem? With HashMap you can use containsKey to check whether the key exists, but with a ConcurrentHashMap shared by multiple threads you cannot do so reliably.

For example, you call get(key) and receive null. At that moment the key really is not in the map, but you don't know that, so you go to call containsKey to check. Just before your call, another thread puts that key, so your containsKey finds that the key exists. Isn't that a "misunderstanding"?
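The check-then-act gap is easy to show in code. Here the "other thread" step is simulated sequentially for determinism; in real code it would race with yours, and the key name "k" and value 42 are just placeholders.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Demonstrates why a null value would be ambiguous. The interleaving is
// simulated sequentially; in real code the put would come from another thread.
public class NullAmbiguityDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();

        // our thread: get returns null, so we conclude "key absent"... maybe
        Integer v = map.get("k");
        System.out.println(v);                    // null

        // meanwhile, another thread puts the key
        map.put("k", 42);

        // our follow-up containsKey now disagrees with what get told us
        System.out.println(map.containsKey("k")); // true

        // If null values were allowed, get(key) == null could mean either
        // "absent" or "mapped to null", and no follow-up check could resolve
        // it, because the map can change between the two calls.
    }
}
```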

Finally

That is about it for understanding ConcurrentHashMap. If you read the source code more, there are still many places worth learning from.

That wraps up the collection-class interview questions; I will write about Spring interview questions later.

Welcome to follow me, I am yes, see you in the next article~

Origin blog.csdn.net/yessimida/article/details/122763351