Problems with ArrayList under multithreading

1. add(E e) source code

First, look at the implementation of the add(E e) method of ArrayList in JDK 1.8:

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

Next, let's go through this method line by line:

1.1 The ensureCapacityInternal(int minCapacity) method

The source code is:

private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

The purpose of this method is to make sure the backing Object[] elementData array is large enough, expanding it if necessary (note: ArrayList is implemented on top of an Object[] array). The parameter int minCapacity is the minimum capacity the array must have after the call; for add(E e) it is size + 1.

Take a closer look at how calculateCapacity(Object[] elementData, int minCapacity) and ensureExplicitCapacity(int minCapacity) work:

1.1.1 The calculateCapacity(Object[] elementData, int minCapacity) method

private static int calculateCapacity(Object[] elementData, int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}

The source code of DEFAULTCAPACITY_EMPTY_ELEMENTDATA is:

/**
 * Shared empty array instance used for default sized empty instances. We
 * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
 * first element is added.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

And DEFAULT_CAPACITY is:

/**
 * Default initial capacity.
 */
private static final int DEFAULT_CAPACITY = 10;

This also shows that a list created with the no-argument constructor does not start with a capacity of 10. It starts with the shared empty array, and the capacity only becomes 10 when the first element is added.

1.1.2 The ensureExplicitCapacity(int minCapacity) method

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

modCount records how many times the ArrayList has been structurally modified. This is why changing an ArrayList while traversing it with an iterator fails: the iterator compares modCount with the value it saw when it was created and throws ConcurrentModificationException if they differ.
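
The effect of modCount can be seen without any threads at all. The following is a minimal sketch (the class and method names are made up for this illustration) that adds an element in the middle of a for-each loop and reports whether the iterator noticed:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class ModCountDemo {

    // Returns true if changing the list during a for-each loop is detected.
    static boolean modificationDetected() {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.add(4); // structural change: modCount is incremented
                }
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            // The iterator's next() found a modCount it did not expect.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("modification detected: " + modificationDetected());
    }
}
```

This check is only a best-effort aid for finding bugs. It does not make ArrayList safe to use from several threads.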

When the minCapacity passed in (as mentioned above, the minimum capacity that is needed) is larger than the current length of elementData, the array is expanded. The comparison is written as the subtraction minCapacity - elementData.length > 0 so that it still behaves sensibly when minCapacity has overflowed int and become negative; in that special case no usable array is allocated, because hugeCapacity, shown further down, throws OutOfMemoryError.

In particular, take a look at the expansion method:

1.1.2.1 grow(int minCapacity) - expansion method
/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

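Clearly, the capacity after expansion is normally 1.5 times the original capacity. The source still checks if (newCapacity - minCapacity < 0), however. Why? Can the new capacity really be too small after expansion? Yes. The usual case is that 1.5 times the old capacity is still less than minCapacity, for example when oldCapacity is 0 or 1, or when the first add on a default-constructed list asks for 10; newCapacity is then set to minCapacity. There is also an overflow case (reference article: blog.csdn.net/anglehuap/a…): take int oldCapacity = Integer.MAX_VALUE-11111111; and apply the expansion formula int newCapacity = oldCapacity + (oldCapacity >> 1);. The result is oldCapacity = 2136372536 and newCapacity = -1090408492, a negative number. Because the checks are written as subtractions, this wrapped-around value is caught by the next test, if (newCapacity - MAX_ARRAY_SIZE > 0), and replaced by the result of hugeCapacity(minCapacity), so the array does not shrink.

Now look at the check if (newCapacity - MAX_ARRAY_SIZE > 0). First, what is MAX_ARRAY_SIZE? Its source code is: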

/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

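As for why it is Integer.MAX_VALUE - 8: as the comment says, some virtual machines reserve a few header words in an array, so asking for an array close to Integer.MAX_VALUE elements can fail with OutOfMemoryError; the margin of 8 guards against that. hugeCapacity(int minCapacity) handles the boundary of the int range. Its source code is: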

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
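
To check the arithmetic by hand, the capacity calculation can be copied into a small standalone class. This is only a sketch: GrowModel and newCapacity are names invented here, and the method mirrors the int arithmetic of grow(int) and hugeCapacity(int) quoted above without allocating any array.

```java
class GrowModel {

    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Mirrors the arithmetic of grow(int) without allocating an array.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may wrap to a negative value
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        return newCapacity;
    }

    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // the requested size itself overflowed
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(10, 11));        // 15: the normal 1.5x case
        System.out.println(newCapacity(1, 2));          // 2: 1.5x of 1 is still 1, so minCapacity wins
        int old = Integer.MAX_VALUE - 11111111;         // 2136372536
        System.out.println(old + (old >> 1));           // -1090408492: the sum wrapped around
        System.out.println(newCapacity(old, old + 1));  // 2147483639: clamped to MAX_ARRAY_SIZE
    }
}
```

The last two lines show why the comparisons are written as subtractions: a plain newCapacity > MAX_ARRAY_SIZE would be false for the negative value, and the overflow would go unnoticed.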

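2. Analysis of the add(E e) method

Putting this together, adding an element works as follows:

The underlying storage of ArrayList is a dynamic array. The constructor first checks the initialCapacity argument: if it is 0, the array is initialized to an empty array; if it is greater than 0, an array of exactly that capacity is allocated. With the no-argument constructor the list also starts with an empty array, which becomes an array of capacity 10 when the first element is added. As elements keep being added and the number of elements exceeds the current capacity (for example, the capacity is 10 and the 11th element is added), the array is expanded, and the new capacity is 1.5 times the old one. During expansion a copy of the original array is created with the new capacity, elementData is pointed at this new array, and the original array is abandoned and will be reclaimed by the GC.

(Reference: blog.csdn.net/FateRuler/a…)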
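
The growth pattern can be traced with a small model. The sketch below does not inspect a real ArrayList (its capacity is not exposed through the public API); CapacitySequence and capacitiesFor are hypothetical names, and the method simply replays the JDK 8 rules for a list created with the no-argument constructor, ignoring the overflow handling.

```java
import java.util.ArrayList;
import java.util.List;

class CapacitySequence {

    // Capacities a default-constructed list would pass through while
    // `count` elements are appended one at a time (a model, not a measurement).
    static List<Integer> capacitiesFor(int count) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 0; // the shared empty array has length 0
        for (int size = 0; size < count; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(10, minCapacity); // DEFAULT_CAPACITY on first add
            }
            if (minCapacity - capacity > 0) {
                int grown = capacity + (capacity >> 1); // 1.5 times the old capacity
                capacity = Math.max(grown, minCapacity);
                capacities.add(capacity);
            }
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesFor(34)); // [10, 15, 22, 33, 49]
    }
}
```

Each entry in the result corresponds to one Arrays.copyOf call, which is why passing a suitable initial capacity to the constructor can save copies when the final size is known in advance.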

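3. The situation under multithreading

Suppose the array has a capacity of 5 and already holds 4 elements, so size is 4. Thread A has just finished the first line of add, ensureCapacityInternal(size + 1);, which found that a capacity of 5 is enough, but its time slice ends before it executes elementData[size++] = e;. Thread B then runs the whole add method: it passes the same check, writes its element at index 4 and increases size to 5. When thread A runs again it does not repeat the capacity check. If it reads size now, it writes to index 5 of an array of length 5 and throws ArrayIndexOutOfBoundsException. If it had already read size as 4 before being paused (size++ is not atomic), it overwrites the value thread B wrote at index 4, and one of the two elements is lost.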
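
A real race depends on thread scheduling and is hard to reproduce on demand, so the sketch below replays two possible interleavings by hand in a single thread. Everything in it is an assumption made for illustration: TinyList is a stripped-down stand-in for ArrayList, and the comments mark which imaginary thread performs each step.

```java
class InterleavingSketch {

    // A stripped-down stand-in for ArrayList: fixed array, no locking.
    static final class TinyList {
        Object[] elementData = new Object[5];
        int size = 4; // four slots are already in use

        // Step 1 of add: the capacity check (growth omitted for brevity).
        boolean hasRoomForOneMore() {
            return size + 1 <= elementData.length;
        }
    }

    // Interleaving 1: both threads pass the check, then both write.
    static boolean secondWriteOverflows() {
        TinyList list = new TinyList();
        boolean aOk = list.hasRoomForOneMore();  // thread A checks, then is paused
        boolean bOk = list.hasRoomForOneMore();  // thread B checks
        list.elementData[list.size++] = "B";     // thread B writes index 4, size becomes 5
        try {
            list.elementData[list.size++] = "A"; // thread A resumes and targets index 5
            return false;
        } catch (ArrayIndexOutOfBoundsException ex) {
            return aOk && bOk; // both checks had passed, yet the write failed
        }
    }

    // Interleaving 2: elementData[size++] = e is really read, write, increment.
    static Object[] lostUpdate() {
        Object[] elementData = new Object[5];
        int size = 0;
        int indexSeenByA = size;         // thread A reads size (0), then is paused
        int indexSeenByB = size;         // thread B reads the same size (0)
        elementData[indexSeenByB] = "B"; // B writes slot 0
        size = indexSeenByB + 1;         // B stores size = 1
        elementData[indexSeenByA] = "A"; // A overwrites slot 0
        size = indexSeenByA + 1;         // A stores size = 1 again
        return new Object[] { size, elementData[0] };
    }

    public static void main(String[] args) {
        System.out.println(secondWriteOverflows()); // true
        Object[] result = lostUpdate();
        System.out.println(result[0] + " " + result[1]); // 1 A
    }
}
```

In the second interleaving two add calls completed, but the list holds one element and the value from thread B is gone.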

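4. Improvement

Use the thread-safe CopyOnWriteArrayList class for work under multithreading. This class performs well when reads are frequent and writes are rare, because it takes the ReentrantLock only when writing (for example in add(E e)), while reads access the array directly without locking. Here is the source code of the add(E e) method of CopyOnWriteArrayList: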

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}
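
As a usage sketch, the following starts several threads that all append to one CopyOnWriteArrayList and then checks the final size. The class name, the method name, and the thread and element counts are arbitrary choices for this example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowAddDemo {

    // Starts `threads` workers that each append `perThread` elements,
    // waits for all of them, and returns the final size of the list.
    static int addConcurrently(int threads, int perThread) {
        List<Integer> list = new CopyOnWriteArrayList<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each add locks, copies the array, and swaps it in
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ex);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(addConcurrently(4, 1000)); // 4000
    }
}
```

No element is lost because every add runs under the lock. The price is that each add copies the whole array, so this class is a poor fit for lists that are written to often.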

Origin juejin.im/post/7078499179097489439