Concurrent Programming Practice 11 - Concurrent Container CopyOnWriteArrayList

1. Synchronized containers and concurrent containers
Early JDK versions addressed thread safety with the synchronized containers Vector and Hashtable. JDK 1.2 added the synchronized wrapper classes, created through factory methods such as Collections.synchronizedXxx, which can wrap an ArrayList directly to make it synchronized. Synchronized containers have a problem, however: they are too strict. Every access is fully serialized, and even so, compound operations are not thread-safe unless the caller adds its own locking.
Common compound operations are as follows:

  • Iteration: visit elements repeatedly until all of them have been traversed;
  • Navigation: find the next element (or the next n elements) after the current one in a specified order;
  • Conditional operation: for example, put-if-absent (add an element only if it is not already present).

A synchronized container serializes all access to its state, which seriously reduces concurrency; when multiple threads compete for the lock, throughput drops sharply.
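As a hedged illustration of the conditional compound operation listed above: with a synchronized wrapper, the check and the add are each atomic, but the pair is not, so the caller has to hold the wrapper's lock around both. CopyOnWriteArrayList offers addIfAbsent, which performs the check and the add as one atomic step. The class and helper names below (PutIfAbsentDemo, addIfAbsentSync, runConcurrently) are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class PutIfAbsentDemo {

    // Client-side locking: the wrapper returned by Collections.synchronizedList
    // synchronizes on itself, so holding that lock makes check-then-add atomic.
    static <T> boolean addIfAbsentSync(List<T> syncList, T value) {
        synchronized (syncList) {
            if (syncList.contains(value)) {
                return false;
            }
            return syncList.add(value);
        }
    }

    // Runs the task on several threads and waits for all of them to finish.
    static void runConcurrently(int threads, Runnable task) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(task);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        List<String> syncList = Collections.synchronizedList(new ArrayList<>());
        runConcurrently(8, () -> addIfAbsentSync(syncList, "key"));

        CopyOnWriteArrayList<String> cowList = new CopyOnWriteArrayList<>();
        runConcurrently(8, () -> cowList.addIfAbsent("key"));

        // Both lists end up with exactly one "key".
        System.out.println(syncList.size() + " " + cowList.size());
    }
}
```

Without the synchronized block in addIfAbsentSync, two threads could both pass the contains check and both add the value.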

From Java 5 onward, the java.util.concurrent package provides a variety of concurrent containers that improve on the performance of synchronized containers, such as ConcurrentHashMap, CopyOnWriteArrayList, CopyOnWriteArraySet, and ConcurrentSkipListMap (the last one added in Java 6). Each of these containers is optimized for a different need, and none of them is fully serialized.

General container    Synchronized container    Concurrent container
ArrayList            Vector                    CopyOnWriteArrayList
HashMap              Hashtable                 ConcurrentHashMap

General containers are not thread-safe. Synchronized containers are thread-safe but too strict. Concurrent containers were added to fit real concurrent workloads better: synchronized containers aim only at safety, while concurrent containers aim at safety and high concurrent throughput at the same time.

2. Source code analysis: CopyOnWriteArrayList
From the earlier discussion of read-write locks we already know that real workloads usually read far more than they write, and that reads do not threaten data safety. Reads can therefore run fully concurrently, and only writes need strict locking.

  • On modification (add, remove): take the lock and copy the array.
  • On read: no lock; read the array directly.

/** The lock protecting all mutators */
// 1. Reentrant lock used by every write operation
final transient ReentrantLock lock = new ReentrantLock();
/** The array, accessed only via getArray/setArray. */
// 2. volatile array that stores the data
private transient volatile Object[] array;

3. Add elements

// 3. Add an element
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();                                  // acquire the exclusive lock
    try {
        Object[] elements = getArray();
        int len = elements.length;
        // Create a new array one slot longer and copy the original elements into it
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;                     // put the new element at the end of the new array
        setArray(newElements);                    // publish the new array as the underlying array
        return true;
    } finally {
        lock.unlock();
    }
}

Two things must be clear:

  • First, the add operation acquires the exclusive lock before it starts, so any other thread that wants the lock must wait; when the operation completes, it releases the lock and other threads can acquire it. The exclusive lock prevents multiple threads from modifying the data at the same time. Reads are still allowed during this time, but they read the original array. Does that cause a safety problem? No: the original array is never modified in place, so a reader sees a consistent, if slightly stale, view.
  • Second, when the operation completes, setArray() updates the volatile array reference. A read of a volatile variable by any thread always sees the last write to it, so each time an element is added, other threads see the new element on their next read.
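The first point can be checked with a small experiment, sketched here under the assumption of four writer threads and 500 elements each (both numbers, and the names ConcurrentAddDemo and fillConcurrently, are arbitrary choices for this example). Because every add runs under the exclusive lock, no update is lost.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ConcurrentAddDemo {

    // Starts `threads` writers that each append `perThread` elements, waits for
    // them to finish, and returns the final size of the list.
    static int fillConcurrently(List<Integer> list, int threads, int perThread) {
        Thread[] writers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            writers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            writers[t].start();
        }
        for (Thread writer : writers) {
            try {
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        int size = fillConcurrently(new CopyOnWriteArrayList<>(), 4, 500);
        System.out.println(size); // 4 * 500 = 2000
    }
}
```

Passing a plain ArrayList to the same method would not be safe: unsynchronized concurrent adds can lose elements or fail with an exception.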

4. Delete data
Removing data works like add: it also takes the lock and copies the array.

public E remove(int index) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        E oldValue = get(elements, index); // element at the given index of the volatile array
        int numMoved = len - index - 1;
        // If the last element is being removed, a single Arrays.copyOf() call
        // produces the shorter array; no two-part copy is needed
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, len - 1));
        else {
            Object[] newElements = new Object[len - 1];
            // Copy the part before the removed element into the new array
            System.arraycopy(elements, 0, newElements, 0, index);
            // Copy the part after the removed element into the new array
            System.arraycopy(elements, index + 1, newElements, index, numMoved);
            setArray(newElements); // update the volatile array
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}
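The index arithmetic in remove is easier to follow in isolation. The sketch below pulls the two arraycopy calls into a standalone helper, removeAt, which is a hypothetical name for this example and not part of CopyOnWriteArrayList, and runs it on a five-element array.

```java
import java.util.Arrays;

class RemoveCopyDemo {

    // Returns a new array without the element at `index`. The input array is
    // left untouched, as in copy-on-write removal.
    static Object[] removeAt(Object[] elements, int index) {
        int len = elements.length;
        if (index < 0 || index >= len) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        int numMoved = len - index - 1; // number of elements after the removed one
        if (numMoved == 0) {
            return Arrays.copyOf(elements, len - 1); // removing the tail
        }
        Object[] newElements = new Object[len - 1];
        System.arraycopy(elements, 0, newElements, 0, index);                // prefix
        System.arraycopy(elements, index + 1, newElements, index, numMoved); // suffix
        return newElements;
    }

    public static void main(String[] args) {
        Object[] original = {"a", "b", "c", "d", "e"};
        // index = 1: numMoved = 5 - 1 - 1 = 3, so "c", "d", "e" shift left by one.
        System.out.println(Arrays.toString(removeAt(original, 1))); // [a, c, d, e]
        // index = 4: numMoved = 0, the tail is simply dropped.
        System.out.println(Arrays.toString(removeAt(original, 4))); // [a, b, c, d]
        // The original array is unchanged, so readers holding it are unaffected.
        System.out.println(Arrays.toString(original));              // [a, b, c, d, e]
    }
}
```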

5. Read data
Return the element at the specified index of the underlying volatile array.

public E get(int index) {
    return get(getArray(), index);
}

@SuppressWarnings("unchecked")
private E get(Object[] a, int index) {
    return (E) a[index];
}

6. Traverse elements

    public Iterator<E> iterator() {
        return new COWIterator<E>(getArray(), 0);
    }
    public ListIterator<E> listIterator() {
        return new COWIterator<E>(getArray(), 0);
    }
    public ListIterator<E> listIterator(final int index) {
        Object[] elements = getArray();
        int len = elements.length;
        if (index<0 || index>len)
            throw new IndexOutOfBoundsException("Index: "+index);

        return new COWIterator<E>(elements, index);
    }

    private static class COWIterator<E> implements ListIterator<E> {
        private final Object[] snapshot; // snapshot of the array; it is never modified
        private int cursor;

        private COWIterator(Object[] elements, int initialCursor) {
            cursor = initialCursor;
            snapshot = elements;
        }

        public boolean hasNext() {
            return cursor < snapshot.length;
        }

        public boolean hasPrevious() {
            return cursor > 0;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            if (! hasNext())
                throw new NoSuchElementException();
            return (E) snapshot[cursor++];
        }

        @SuppressWarnings("unchecked")
        public E previous() {
            if (! hasPrevious())
                throw new NoSuchElementException();
            return (E) snapshot[--cursor];
        }

        public int nextIndex() {
            return cursor;
        }

        public int previousIndex() {
            return cursor-1;
        }
        public void remove() {
            throw new UnsupportedOperationException();
        }
        public void set(E e) {
            throw new UnsupportedOperationException();
        }
        public void add(E e) {
            throw new UnsupportedOperationException();
        }
    }

As shown above, the iterator holds a reference to an array that is never modified after it has been published, so traversing it needs no further synchronization. Every modification creates a new copy of the array and republishes it, which is how the list stays mutable. Because the iterator keeps a reference to the array that was current when the iterator was created, and that array will not change, multiple threads can iterate at the same time without interfering with each other or with threads that modify the container.
Unlike the ArrayList iterator, the iterator returned by CopyOnWriteArrayList never throws ConcurrentModificationException; that is, it is not fail-fast.
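A short sketch of this snapshot behaviour, using only the public CopyOnWriteArrayList API (the class name SnapshotIteratorDemo and the helper drain are made up for the example): an iterator created before an add does not see the new element, and the iterator's own mutating methods are unsupported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteratorDemo {

    // Drains an iterator into a list so the visited elements can be inspected.
    static <T> List<T> drain(Iterator<T> it) {
        List<T> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));

        Iterator<String> it = list.iterator(); // captures the current array
        list.add("c");                         // publishes a new array

        System.out.println(drain(it)); // [a, b]
        System.out.println(list);      // [a, b, c]

        Iterator<String> it2 = list.iterator();
        it2.next();
        try {
            it2.remove(); // the snapshot cannot be modified through the iterator
        } catch (UnsupportedOperationException e) {
            System.out.println("iterator.remove() is not supported");
        }
    }
}
```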

3. Application Scenarios of CopyOnWrite
The CopyOnWrite concurrent containers suit concurrent scenarios with many reads and few writes, for example caches, whitelists and blacklists, or reading and updating product categories. Suppose we run a search website where users type keywords into a search box, but some keywords are not allowed to be searched. These keywords are kept in a blacklist that is updated every night. On each search, the site checks whether the keyword is in the blacklist and, if it is, tells the user that the keyword cannot be searched.
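The blacklist scenario could be sketched as follows. KeywordBlacklist and its methods are hypothetical names for this example; the only library calls are contains, addAllAbsent and size on CopyOnWriteArrayList. Searches take the lock-free read path, while the nightly job takes the copying write path.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.CopyOnWriteArrayList;

class KeywordBlacklist {
    private final CopyOnWriteArrayList<String> blocked = new CopyOnWriteArrayList<>();

    // Read path, called on every search: no locking.
    boolean isBlocked(String keyword) {
        return blocked.contains(keyword);
    }

    // Write path, called by the nightly update: appends the keywords that are
    // not yet present, in a single call.
    void addAll(Collection<String> keywords) {
        blocked.addAllAbsent(keywords);
    }

    int size() {
        return blocked.size();
    }

    public static void main(String[] args) {
        KeywordBlacklist blacklist = new KeywordBlacklist();
        blacklist.addAll(Arrays.asList("foo", "bar"));

        System.out.println(blacklist.isBlocked("foo")); // true
        System.out.println(blacklist.isBlocked("baz")); // false
    }
}
```

The contains call scans the array linearly, so this layout fits a modest number of keywords; for a large blacklist a hash-based structure would be the more natural choice.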

There are two things to be aware of when using a copy-on-write container such as CopyOnWriteMap (a map built on the same copy-on-write idea; it is not a JDK class):

  • 1. Reduce the expansion cost. Initialize the CopyOnWriteMap with a size that matches actual needs, to avoid the overhead of the map growing during writes.
  • 2. Add in batches. Every add copies the container, so fewer add calls mean fewer copies, for example by passing the whole set of keywords to a single batch method such as addBlackList.
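Both notes can be illustrated with a minimal map sketch. The CopyOnWriteMap below is written for this example only and is an assumption about how such a class might look: a volatile reference to a HashMap that is replaced on every write. The copies counter exists purely to make the cost visible.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; the JDK does not ship a CopyOnWriteMap.
class CopyOnWriteMap<K, V> {
    private volatile Map<K, V> current = Collections.emptyMap();
    private int copies; // number of copies made so far; guarded by "this"

    // Read path: no lock, reads whichever map was published last.
    V get(Object key) {
        return current.get(key);
    }

    // Single write: one full copy per call.
    synchronized V put(K key, V value) {
        Map<K, V> next = new HashMap<>(current);
        V old = next.put(key, value);
        current = next;
        copies++;
        return old;
    }

    // Batch write: one pre-sized copy for the whole batch.
    synchronized void putAll(Map<? extends K, ? extends V> entries) {
        int expected = current.size() + entries.size();
        // Capacity chosen so `expected` entries fit under the default 0.75 load factor.
        Map<K, V> next = new HashMap<>((int) (expected / 0.75f) + 1);
        next.putAll(current);
        next.putAll(entries);
        current = next;
        copies++;
    }

    synchronized int copies() {
        return copies;
    }

    public static void main(String[] args) {
        Map<String, Boolean> batch = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            batch.put("keyword" + i, Boolean.TRUE);
        }

        CopyOnWriteMap<String, Boolean> oneByOne = new CopyOnWriteMap<>();
        for (Map.Entry<String, Boolean> e : batch.entrySet()) {
            oneByOne.put(e.getKey(), e.getValue());
        }

        CopyOnWriteMap<String, Boolean> batched = new CopyOnWriteMap<>();
        batched.putAll(batch);

        // 100 copies for the one-by-one writes, 1 copy for the batch.
        System.out.println(oneByOne.copies() + " copies vs " + batched.copies());
    }
}
```

CopyOnWriteArrayList itself has no capacity setting, so only the second note applies to it; its batch equivalent is addAll, which copies the array once for the whole collection.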

Disadvantages of CopyOnWrite: the CopyOnWrite containers have many advantages, but they also have two problems, memory usage and data consistency, that need attention during development.

  • Memory usage. Because of the copy-on-write mechanism, during a write the old container and the newly written one reside in memory at the same time. (Note: the copy duplicates only the references held by the container. New objects are created and added to the new container during the write, while the objects of the old container are still in use, so two sets of objects occupy memory.) If these objects are large, say about 200 MB, then writing another 100 MB of data brings memory usage to 300 MB, which is likely to trigger frequent Young GC and Full GC. One service in our system used the CopyOnWrite mechanism to update large objects every night, which caused a 15-second Full GC every night and longer application response times. To reduce memory usage, you can compress the elements in the container; for example, if the elements are all decimal numbers, consider encoding them in base 36 or base 64. Alternatively, do not use a CopyOnWrite container and use another concurrent container instead, such as ConcurrentHashMap.
  • Data consistency. A CopyOnWrite container guarantees only eventual consistency of the data, not real-time consistency, so do not use one if every reader must see written data immediately. A read that starts after setArray() has returned does see the new data, but readers that started earlier, such as an iteration in progress, keep seeing the old array. Eventual consistency also matters a great deal in distributed systems, where tolerating inconsistency for a bounded period improves availability and partition tolerance. It does not fit every scenario, though: systems such as train ticket sales have very strict real-time requirements and must be strongly consistent.
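The note in the memory-usage item, that only references are copied, can be verified directly. In this sketch the helper appendCopy is a hypothetical stand-in for the copy step of add: the old and new arrays are distinct objects, but they point at the same element objects, so the element payloads are not duplicated.

```java
import java.util.Arrays;

class ShallowCopyDemo {

    // Mirrors the copy step of add(): a new array one slot longer whose
    // existing slots hold the same references as the old array.
    static Object[] appendCopy(Object[] elements, Object e) {
        Object[] newElements = Arrays.copyOf(elements, elements.length + 1);
        newElements[elements.length] = e;
        return newElements;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1024];
        Object[] oldArray = {payload};
        Object[] newArray = appendCopy(oldArray, new byte[512]);

        System.out.println(oldArray != newArray);       // true: two arrays exist
        System.out.println(oldArray[0] == newArray[0]); // true: one shared payload
    }
}
```

The extra memory during a write therefore consists of the new reference array plus any newly created elements, which still adds up when the container or the new data is large.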

Link: https://www.nowcoder.com/questionTerminal/95e4f9fa513c4ef5bd6344cc3819d3f7
Source: Niuke.com

Supplement:
One: fail-fast

  • When a collection is traversed with an iterator and the collection is structurally modified (elements added or removed) during the traversal, a ConcurrentModificationException is thrown.
  • Principle: the iterator accesses the collection's content directly during traversal and relies on a modCount variable. Any structural change to the collection during traversal changes modCount. Each time the iterator moves to the next element through hasNext()/next(), it checks whether modCount still equals expectedModCount (in ArrayList the check happens in next()); if so, traversal continues, otherwise it throws the exception and traversal ends.
  • Note: the exception is thrown only when modCount != expectedModCount is detected. If modCount happens to equal expectedModCount again after the collection changes, the exception is not thrown. Concurrent code therefore must not depend on this exception being thrown; it is intended only for detecting concurrent-modification bugs.
  • Scenario: the general-purpose collection classes in the java.util package (such as ArrayList and HashMap) are fail-fast and must not be modified during iteration, including by other threads.
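A single-threaded sketch is enough to trigger the fail-fast check, and also to show that it is best-effort. The class and method names are made up for this example. The second call illustrates a gap different from the modCount coincidence described in the note: removing the second-to-last element makes hasNext() return false before next() gets a chance to run its check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {

    // Removes `target` from the list while a for-each loop is iterating over it.
    // Returns true if the iterator detected the modification and threw.
    static boolean removeWhileIterating(List<Integer> list, Integer target) {
        try {
            for (Integer x : list) {
                if (x.equals(target)) {
                    list.remove(target); // remove(Object): structural change, bumps modCount
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Removing the first element: the next call to next() sees the changed modCount.
        System.out.println(removeWhileIterating(new ArrayList<>(Arrays.asList(1, 2, 3)), 1)); // true

        // Removing the second-to-last element: size drops to 2, the cursor is
        // already 2, so hasNext() is false and the loop ends without a check.
        System.out.println(removeWhileIterating(new ArrayList<>(Arrays.asList(1, 2, 3)), 2)); // false
    }
}
```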

Two: fail-safe (iterating over a copy of the container)

  • A collection that uses the fail-safe mechanism does not access the live collection content during traversal; it traverses a copy (snapshot) of the original content instead.
  • Principle: because the iteration runs over a copy of the original collection, modifications made to the original during traversal are not visible to the iterator, so no ConcurrentModificationException is triggered.
  • Scenario: the containers in the java.util.concurrent package are fail-safe and can be used and modified concurrently by multiple threads.
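Not every java.util.concurrent container iterates over a snapshot. ConcurrentHashMap, for instance, documents its iterators as weakly consistent: they never throw ConcurrentModificationException and may or may not reflect changes made after the iterator was created. A minimal sketch with made-up names (WeaklyConsistentDemo, removeEvenKeys), modifying the map inside its own for-each loop:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WeaklyConsistentDemo {

    // Removes every entry with an even key while iterating over the same map.
    // With a ConcurrentHashMap this does not throw.
    static void removeEvenKeys(Map<Integer, String> map) {
        for (Integer key : map.keySet()) {
            if (key % 2 == 0) {
                map.remove(key);
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }

        removeEvenKeys(map);

        System.out.println(map.size()); // 5: the odd keys 1, 3, 5, 7, 9 remain
    }
}
```

Running removeEvenKeys on a plain HashMap instead would hit the fail-fast check described in the previous section.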
