Java Multithreading Series VII (Thread-Safe Collection Class)


Preface

In our data structures studies we covered Java's built-in collections, but most of the collection classes we studied are not thread-safe. A few, such as Vector, Stack, and HashTable, are thread-safe (synchronized locks are added to all of their methods), but these relatively crude classes are generally not recommended.

So what should we do when we want to use collection classes in multi-threading?

1. Using lists in multithreading

Method 1: Manually lock the logic that may cause thread-safety issues.

For example, if multiple threads modify an ArrayList, problems may occur; we can lock the modification operations ourselves.
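A minimal sketch of this approach (the class and counts are made up for illustration): two threads append to a plain ArrayList, and every modification is wrapped in a `synchronized` block on a shared lock object, so no updates are lost.

```java
import java.util.ArrayList;
import java.util.List;

public class ManualLockDemo {
    // Fills a plain ArrayList from two threads, guarding add() with one shared lock.
    static int fillConcurrently() throws InterruptedException {
        List<Integer> list = new ArrayList<>();
        Object lock = new Object();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (lock) {   // manual lock around the unsafe modification
                    list.add(i);
                }
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fillConcurrently()); // 2000: no lost updates
    }
}
```

Without the `synchronized` block, the two unguarded `add()` calls could interleave and the final size would often be less than 2000.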

Method 2: Wrap the list with Collections.synchronizedList()

Wrapping the list you want to use with the method above is equivalent to adding a lock to the key methods of the collection.
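A small sketch of the wrapper approach (thread counts are arbitrary): the list returned by `Collections.synchronizedList()` synchronizes each method internally, so the two writer threads no longer need explicit locks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListDemo {
    static int fillConcurrently() throws InterruptedException {
        // Every method of the returned wrapper is synchronized internally.
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                list.add(i);   // no explicit lock needed for single calls
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fillConcurrently()); // 2000
    }
}
```

One caveat from the Javadoc: iterating over the wrapper is not atomic, so a loop over the returned list must still be wrapped in `synchronized (list) { ... }` manually.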

Method 3: Use CopyOnWriteArrayList

CopyOnWriteArrayList principle

  1. CopyOnWriteArrayList is also known as a copy-on-write container. When we add an element, we do not add it directly to the current container. Instead, we first copy the current container to create a new one, add the element to the new container, and then point the reference of the original container to the new container.

  2. The advantage of this is that we can perform concurrent reads on a CopyOnWriteArrayList without locking (only writes are locked), because no elements are ever added to the container currently being read.

  3. Therefore, CopyOnWriteArrayList adopts the idea of separating reads and writes: reads and writes operate on different containers.

advantage:

  • Suitable for read-heavy, write-light scenarios: performance is very high because reads involve no lock contention.

shortcoming:

  • Every modification requires copying the container again, which consumes a lot of memory.
  • Newly written data cannot be read immediately; that is, a "dirty read" problem may occur.
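The snapshot behavior described above can be seen directly: the iterator of a CopyOnWriteArrayList reads the old array, so writing while iterating neither throws ConcurrentModificationException nor is visible to that iterator. A minimal sketch:

```java
import java.util.concurrent.CopyOnWriteArrayList;

public class CowDemo {
    static String snapshotIterationResult() {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");
        // The for-each iterator works on a snapshot of the old array, so the
        // elements appended inside the loop are not seen by this iteration.
        for (String s : list) {
            list.add(s + "!");
        }
        return list.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshotIterationResult()); // [a, b, a!, b!]
    }
}
```

The loop runs exactly twice (over the snapshot "a", "b"), even though the list grows to four elements during iteration; with a plain ArrayList this loop would throw ConcurrentModificationException.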

2. Multi-threading using stacks and queues

When using stacks and queues under multithreading, we can directly use the blocking queues provided by the Java standard library. Thanks to their blocking behavior, these collections are thread-safe:

  1. ArrayBlockingQueue — a blocking queue implemented on top of an array
  2. LinkedBlockingQueue — a blocking queue implemented on top of a linked list
  3. PriorityBlockingQueue — a priority blocking queue implemented on top of a heap
  4. SynchronousQueue — a blocking queue with no internal capacity: each insert must wait for a matching remove
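The blocking behavior is what makes producer-consumer code trivial. A minimal sketch with ArrayBlockingQueue (the capacity and element counts are arbitrary): `put()` blocks when the queue is full and `take()` blocks when it is empty, so no manual synchronization is needed.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueDemo {
    static int produceAndConsume() throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2); // capacity 2
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    queue.put(i);   // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 5; i++) {
            sum += queue.take();    // blocks while the queue is empty
        }
        producer.join();
        return sum;                 // 1 + 2 + 3 + 4 + 5
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(produceAndConsume()); // 15
    }
}
```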

3. Using hash tables under multi-threading

Hash tables are also collection classes that we use often, and the standard library provides three of them. The differences between the three are a very important knowledge point, so let's take a moment to compare HashMap, HashTable, and ConcurrentHashMap.

(1)HashMap

There is not much to say about HashMap: it is a hash table meant for single-threaded use and is not thread-safe.

(2)HashTable

HashTable adds a synchronized lock to every public method, which is effectively equivalent to locking the entire hash table. If multiple threads access the same HashTable, lock contention occurs; and once resizing is triggered, the entire resize is completed by the triggering thread, which is very inefficient.

(3)ConcurrentHashMap

ConcurrentHashMap makes several optimizations on top of HashTable:

The main optimization in Java 1.7:

Segment locking: several hash buckets are grouped into a "segment", and each segment is locked separately. The goal is to reduce the probability of lock contention: contention is triggered only when the data accessed by two threads happens to fall in the same segment.

Optimizations in Java 1.8:

Optimization 1: Read and write operations (the most critical optimization)

In Java 1.8, segment locks were removed, and a lock is assigned directly to each hash bucket (each linked list): the head node of each linked list serves as the lock object. Read operations are not locked at all; volatile is used to ensure the result is read from memory.

For example:

Suppose two threads each insert an element. Thread 1's element hashes to the linked list at index 1; thread 2's element hashes to the linked list at index 2. This is equivalent to two threads modifying different variables, so there is obviously no thread-safety issue. In ConcurrentHashMap, each insertion locks only the corresponding linked list; operations on different linked lists acquire different locks, so no lock contention occurs. As a result, most lock operations are actually conflict-free, and their cost is minimal. With HashTable, since the entire hash table is locked (on `this`), these two insertions would still contend for the same lock and block.
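A minimal sketch of the scenario above (key ranges are arbitrary): two threads insert disjoint keys into one ConcurrentHashMap. Because the keys mostly land in different buckets, the per-bucket locks rarely collide, and all insertions survive.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ChmInsertDemo {
    static int insertFromTwoThreads() throws InterruptedException {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        // Disjoint key sets: even keys from t1, odd keys from t2. The keys
        // mostly hash to different buckets, so bucket-lock contention is rare.
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 10_000; i += 2) map.put(i, i);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 1; i < 10_000; i += 2) map.put(i, i);
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        return map.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(insertFromTwoThreads()); // 10000
    }
}
```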

Optimization 2: Make full use of CAS.
For example, updating and reading the element count can be done directly with CAS, without locking.
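To illustrate the CAS idea itself (this is a generic sketch using AtomicInteger, not ConcurrentHashMap's internal counter code): read the old value, attempt to swap in the new one, and retry if another thread changed it in between. No lock is ever taken.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    // Lock-free increment: read the current value, then try to CAS in
    // value + 1; if another thread raced us, the CAS fails and we retry.
    static int increment(AtomicInteger counter) {
        int old;
        do {
            old = counter.get();
        } while (!counter.compareAndSet(old, old + 1));
        return old + 1;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) increment(counter);
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.get()); // 2000, with no locks involved
    }
}
```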

Optimization 3: An incremental resizing mechanism (break the work into pieces).
Recall that a hash table has a parameter called the "load factor": when there are too many elements and the load factor is exceeded, the table considers resizing.

Resizing requires allocating new memory, removing every element from the old hash table, and inserting it into the new one. If the table contains many elements, moving them all at once would make that one put operation extremely slow.

ConcurrentHashMap's resizing strategy is to break the work into pieces. When a thread discovers that resizing is needed, it only allocates the new, larger array and moves a few elements over. During resizing, the old and new arrays coexist.

Every subsequent thread that operates on the ConcurrentHashMap participates in the migration, each operation moving a small number of elements; the old array is freed after the last element has been moved. During this period, insertions go only to the new array, while lookups must search both the new and the old array.

Optimization 4: Underlying implementation
The original "array + linked list" implementation was improved to "array + linked list / red-black tree": when a linked list grows long (8 or more elements), it is converted into a red-black tree.

Trivia about using HashMap, HashTable, and ConcurrentHashMap:

  • HashMap: key is allowed to be null
  • HashTable: key is not allowed to be null
  • ConcurrentHashMap: key is not allowed to be null
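The three rules above are easy to verify directly (note that the actual class name in the standard library is `java.util.Hashtable`, lowercase "t"):

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyDemo {
    // Returns true if the map accepts a null key, false if it throws NPE.
    static boolean allowsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(allowsNullKey(new HashMap<>()));           // true
        System.out.println(allowsNullKey(new Hashtable<>()));         // false
        System.out.println(allowsNullKey(new ConcurrentHashMap<>())); // false
    }
}
```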


Origin blog.csdn.net/LEE180501/article/details/130655967