[Java Collections] A detailed look at the ArrayList source code


Foreword

ArrayList is a List implemented by an array. Compared with an array, it has the ability to expand dynamically, so it can also be called a dynamic array.

An ArrayList can store objects of any reference type (primitive values are stored as their boxed wrappers), and it is a sequential container: elements keep the order in which they were added, and null elements are allowed.
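As a small illustration of the ordering and null behavior described above, here is a minimal sketch. The class name InsertionOrderDemo and the method build are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class InsertionOrderDemo {
    // Builds a small list that contains a null and a duplicate value.
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("b");
        list.add(null);   // null elements are allowed
        list.add("a");
        list.add("b");    // duplicates are allowed too
        return list;
    }

    public static void main(String[] args) {
        // Elements come back in the order they were added.
        System.out.println(build()); // prints [b, null, a, b]
    }
}
```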

Inheritance system

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    ...
}
  • ArrayList implements List and provides basic operations such as adding, deleting, and traversing.

  • ArrayList implements RandomAccess, a marker interface indicating that it supports fast (constant-time) random access.

  • ArrayList implements Cloneable and can be cloned.

  • ArrayList implements Serializable and can be serialized.
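The interfaces listed above can be checked directly. The sketch below assumes nothing beyond the standard library; MarkerInterfaceDemo and shallowCopy are hypothetical names. Note that clone() produces a shallow copy: a new list that references the same element objects.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.RandomAccess;

// Hypothetical demo class, not part of the JDK.
class MarkerInterfaceDemo {
    // Returns a shallow copy: a new list object holding the same element references.
    @SuppressWarnings("unchecked")
    static ArrayList<String> shallowCopy(ArrayList<String> source) {
        return (ArrayList<String>) source.clone();
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>();
        list.add("x");
        System.out.println(list instanceof RandomAccess); // prints true
        System.out.println(list instanceof Cloneable);    // prints true
        System.out.println(list instanceof Serializable); // prints true

        // Changing the copy does not change the original list.
        ArrayList<String> copy = shallowCopy(list);
        copy.add("y");
        System.out.println(list.size() + " " + copy.size()); // prints 1 2
    }
}
```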

Source code analysis

Attributes

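/**
 * Default capacity
 */
private static final int DEFAULT_CAPACITY = 10;

/**
 * Empty array, used when the requested capacity is 0
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

/**
 * Empty array, used when no capacity is passed in; it is re-initialized to the
 * default capacity when the first element is added
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

/**
 * The array that stores the elements
 */
transient Object[] elementData; // non-private to simplify nested class access

/**
 * The number of elements in the collection
 */
private int size;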

(1) DEFAULT_CAPACITY: the default capacity, 10. This is the capacity that a list created with new ArrayList() gets when its first element is added.

(2) EMPTY_ELEMENTDATA: an empty array, used when the list is created with new ArrayList(0).

(3) DEFAULTCAPACITY_EMPTY_ELEMENTDATA: also an empty array, used when the list is created with new ArrayList(). The difference from EMPTY_ELEMENTDATA is that when the first element is added, this empty array is replaced by an array with room for DEFAULT_CAPACITY (10) elements.

(4) elementData: where the elements are actually stored.

(5) size: the number of elements actually stored, not the length of the elementData array.
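To make the difference between the two empty arrays concrete, here is a simplified model of these fields. It is only a sketch: TinyList, EMPTY, DEFAULT_EMPTY and capacity() are hypothetical names, overflow handling is left out, and the real ArrayList does not expose its capacity.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the fields described above; not the JDK class.
class TinyList {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] EMPTY = {};          // stands in for EMPTY_ELEMENTDATA
    private static final Object[] DEFAULT_EMPTY = {};  // stands in for DEFAULTCAPACITY_EMPTY_ELEMENTDATA

    private Object[] elementData;
    private int size;

    TinyList() {
        elementData = DEFAULT_EMPTY;
    }

    TinyList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
        elementData = initialCapacity == 0 ? EMPTY : new Object[initialCapacity];
    }

    void add(Object e) {
        int minCapacity = size + 1;
        // Only the "default" sentinel jumps straight to the default capacity.
        if (elementData == DEFAULT_EMPTY) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity > elementData.length) {
            int grown = elementData.length + (elementData.length >> 1);
            elementData = Arrays.copyOf(elementData, Math.max(grown, minCapacity));
        }
        elementData[size++] = e;
    }

    int capacity() { return elementData.length; } // length of the backing array
    int size() { return size; }                   // number of stored elements

    public static void main(String[] args) {
        TinyList byDefault = new TinyList();
        TinyList zero = new TinyList(0);
        byDefault.add("x");
        zero.add("x");
        System.out.println(byDefault.capacity()); // prints 10
        System.out.println(zero.capacity());      // prints 1
    }
}
```

Both lists hold one element afterwards, but only the one created without an argument allocated room for ten.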

Why is the elementData array of ArrayList declared transient?

Because ArrayList expands automatically, the elementData array is often longer than the number of elements actually stored. If the array were serialized directly, without transient, the vacant slots would be serialized as well, which wastes space.

ArrayList therefore defines its own serialization and deserialization methods, writeObject and readObject. When traversing the array, size is used as the end mark, so only the elements that actually exist in the ArrayList are serialized.
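The same technique can be sketched on a small class of our own. CompactBuffer and roundTrip below are hypothetical and only illustrate the idea of a transient array plus custom writeObject and readObject; this is not the ArrayList code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical buffer class used only to illustrate the technique.
class CompactBuffer implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient Object[] slots = new Object[16]; // skipped by default serialization
    private int size;

    void add(Object o) {
        if (size == slots.length) {
            throw new IllegalStateException("full"); // no growth in this sketch
        }
        slots[size++] = o;
    }

    Object get(int i) { return slots[i]; }
    int size() { return size; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();            // writes the non-transient field: size
        for (int i = 0; i < size; i++) {     // writes only the used slots
            out.writeObject(slots[i]);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();              // restores size
        slots = new Object[Math.max(size, 16)];
        for (int i = 0; i < size; i++) {
            slots[i] = in.readObject();
        }
    }

    // Serializes the buffer to memory and reads it back.
    static CompactBuffer roundTrip(CompactBuffer buffer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(buffer);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CompactBuffer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip the copy contains the same elements, although the 14 unused slots were never written to the stream.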

ArrayList(int initialCapacity) constructor

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        // If the initial capacity is greater than 0, create a new array to store elements
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        // If the initial capacity is 0, use the empty array EMPTY_ELEMENTDATA
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        // If the initial capacity is less than 0, throw an exception
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
    }
}

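ArrayList() constructor

public ArrayList() {
    // If no initial capacity is passed in, use the empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA.
    // With this array, the list expands to the default capacity of 10 when the first element is added.
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}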

ArrayList(Collection<? extends E> c) constructor

/**
 * Initializes the ArrayList with the elements of the given collection
 */
public ArrayList(Collection<? extends E> c) {
    // Convert the collection to an array
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // Check whether c.toArray() returned an Object[]; if not, copy it into a new Object[]
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // If c is an empty collection, use the empty array EMPTY_ELEMENTDATA
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

add(E e) method

Appends an element to the end of the list; the amortized time complexity is O(1).

public boolean add(E e) {
    // Check whether expansion is needed
    ensureCapacityInternal(size + 1);
    // Put the element in the last position
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private static int calculateCapacity(Object[] elementData, int minCapacity) {
    // If this is the empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA, start from the default capacity of 10
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    if (minCapacity - elementData.length > 0)
        // Expand
        grow(minCapacity);
}

private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    // The new capacity is 1.5 times the old capacity
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // If the new capacity is still smaller than the required capacity, use the required capacity
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    // If the new capacity exceeds MAX_ARRAY_SIZE, let hugeCapacity pick the capacity
    // (MAX_ARRAY_SIZE and hugeCapacity are defined elsewhere in ArrayList)
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // Copy the elements into a new array with the new capacity
    elementData = Arrays.copyOf(elementData, newCapacity);
}

add(int index, E element) method

Inserts an element at the specified position; the average time complexity is O(n).

public void add(int index, E element) {
    // Check whether the index is out of bounds
    rangeCheckForAdd(index);
    // Check whether expansion is needed
    ensureCapacityInternal(size + 1);
    // Move the element at index and all elements after it back by one, which frees the slot at index
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    // Put the element at position index
    elementData[index] = element;
    // Increase the size by 1
    size++;
}

private void rangeCheckForAdd(int index) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

Why is adding to an ArrayList slow?

From the source code above, we can see that ArrayList can add an element either at a specified index or directly at the end. Before either kind of add, ensureCapacityInternal checks whether the array is long enough; if it is not, the array has to be expanded.

The expansion formula differs between old JDK versions and newer ones. Older versions (such as JDK 6) computed the new capacity as (oldCapacity * 3) / 2 + 1. Newer versions, including JDK 8, use a bit operation instead: shifting right by one bit is the same as dividing by 2, so with int newCapacity = oldCapacity + (oldCapacity >> 1); the new array has 1.5 times the capacity of the old one.
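To see what the 1.5x rule means in numbers, the following sketch applies the same formula repeatedly, starting from the default capacity of 10. GrowthDemo, nextCapacity and capacities are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class GrowthDemo {
    // Same formula as in grow(): old capacity plus half of it (rounded down).
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    // Lists the first `count` capacities, starting from `start`.
    static List<Integer> capacities(int start, int count) {
        List<Integer> result = new ArrayList<>();
        int capacity = start;
        for (int i = 0; i < count; i++) {
            result.add(capacity);
            capacity = nextCapacity(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 6)); // prints [10, 15, 22, 33, 49, 73]
    }
}
```

Because the shift rounds down, nextCapacity(1) is still 1. This is one reason grow() also compares the result with the minimum capacity that is actually required.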

When adding at a specified position, what happens after the checks is simple: an array copy, System.arraycopy(elementData, index, elementData, index + 1, size - index);. The figures below illustrate it.

For example, given an array like the following, suppose we need to add an element a at index 4:

[Figure: image-20220302112943491]

From the code, we can see that it copies the elements starting at index 4 and places them starting at index 4+1:

[Figure: image-20220302113056958]

This makes room for the element we want to add. Then element a is put at the index position, which completes the add operation:

[Figure: image-20220302113127354]

This is only a very small List. If we insert an element into a List with hundreds, thousands, or tens of thousands of elements, all the elements after the insertion point have to be copied, and it gets even slower if an expansion is involved.
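The shift can be reproduced on a plain array. The sketch below is only an illustration: ShiftInsertDemo and insert are hypothetical names, and the array is assumed to have at least one free slot, so no expansion is performed.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class ShiftInsertDemo {
    // Inserts value at index in an array whose first `size` slots are in use.
    // Assumes the array has at least one free slot (no growth in this sketch).
    static void insert(String[] data, int size, int index, String value) {
        // Move the elements at index..size-1 one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        String[] data = new String[8];
        String[] initial = {"0", "1", "2", "3", "4", "5"};
        System.arraycopy(initial, 0, data, 0, initial.length);

        insert(data, 6, 4, "a");
        System.out.println(Arrays.toString(data));
        // prints [0, 1, 2, 3, a, 4, 5, null]
    }
}
```

Only two elements move here, but the number of moved elements is size - index, so an insert near the front of a large list copies almost the whole array.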

addAll method

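Appends all elements of another collection to the end of the list, which corresponds to the union of two collections (duplicates are kept).

/**
 * Adds all elements of collection c to the current ArrayList
 */
public boolean addAll(Collection<? extends E> c) {
    // Convert collection c to an array
    Object[] a = c.toArray();
    int numNew = a.length;
    // Check whether expansion is needed
    ensureCapacityInternal(size + numNew);
    // Copy all elements of c to the end of the array
    System.arraycopy(a, 0, elementData, size, numNew);
    // Increase the size by the size of c
    size += numNew;
    // Return true if c is not empty, otherwise false
    return numNew != 0;
}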
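A short usage sketch of addAll follows; AddAllDemo and concat are hypothetical names. It shows that duplicates are kept and that the return value is false when nothing was added.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class AddAllDemo {
    // Returns a new list with the elements of a followed by the elements of b.
    static List<Integer> concat(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        // The shared value 3 appears twice: addAll appends, it does not deduplicate.
        System.out.println(concat(Arrays.asList(1, 2, 3), Arrays.asList(3, 4)));
        // prints [1, 2, 3, 3, 4]

        // Adding an empty collection changes nothing, so addAll returns false.
        List<Integer> list = new ArrayList<>();
        System.out.println(list.addAll(Collections.<Integer>emptyList())); // prints false
    }
}
```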

get(int index) method

Gets the element at the specified index; the time complexity is O(1).

public E get(int index) {
    // Check whether the index is out of bounds
    rangeCheck(index);
    // Return the element at position index of the array
    return elementData(index);
}

private void rangeCheck(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

@SuppressWarnings("unchecked")
E elementData(int index) {
    return (E) elementData[index];
}

(1) Check whether the index is out of bounds. Only the upper bound is checked here: if the index is greater than or equal to size, an IndexOutOfBoundsException is thrown. If the index is below the lower bound (negative), the array access itself throws an ArrayIndexOutOfBoundsException.

(2) Return the element at the index position.
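The bounds check can be observed from the outside. In the sketch below, BoundsDemo and throwsOutOfBounds are hypothetical names. Since ArrayIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException, catching the parent type covers both cases described above. The example also shows that the check is made against size, not against the capacity of the backing array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class BoundsDemo {
    // Returns true if list.get(index) throws IndexOutOfBoundsException or a subclass of it.
    static boolean throwsOutOfBounds(List<?> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Capacity 100, but size 0: index 0 is still out of bounds.
        List<String> list = new ArrayList<>(100);
        System.out.println(throwsOutOfBounds(list, 0));  // prints true

        list.add("a");
        System.out.println(throwsOutOfBounds(list, 0));  // prints false
        System.out.println(throwsOutOfBounds(list, 1));  // prints true
        System.out.println(throwsOutOfBounds(list, -1)); // prints true
    }
}
```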

remove(int index) method

Deletes the element at the specified index; the time complexity is O(n).

public E remove(int index) {
    // Check whether the index is out of bounds
    rangeCheck(index);

    modCount++;
    // Get the element at position index
    E oldValue = elementData(index);

    // If index is not the last position, move the elements after index forward by one
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);

    // Clear the last slot to help GC
    elementData[--size] = null; // clear to let GC do its work

    // Return the old value
    return oldValue;
}

remove(Object o) method

Deletes the first element equal to the specified value; the time complexity is O(n).

public boolean remove(Object o) {
    if (o == null) {
        // Traverse the array, find the first occurrence of the element, and remove it quickly
        for (int index = 0; index < size; index++)
            // If the element to delete is null, compare with null using ==
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        // Traverse the array, find the first occurrence of the element, and remove it quickly
        for (int index = 0; index < size; index++)
            // If the element to delete is not null, compare using equals()
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

private void fastRemove(int index) {
    // No bounds check here
    modCount++;
    // If index is not the last position, move the elements after index forward by one
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    // Clear the last slot to help GC
    elementData[--size] = null; // clear to let GC do its work
}

(1) Find the first element equal to the specified value;

(2) Delete it quickly with fastRemove(int index), which, compared with remove(int index), skips the index bounds check.
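Since there are two remove overloads, a List<Integer> is a common place to mix them up. The sketch below (RemoveOverloadDemo and demo are hypothetical names) shows that an int argument selects remove(int index), while an Integer argument selects remove(Object o).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class RemoveOverloadDemo {
    static List<Integer> demo() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30, 20));

        // int argument: remove(int index) removes the element at index 1.
        list.remove(1);                    // list is now [10, 30, 20]

        // Integer argument: remove(Object o) removes the first element equal to 20.
        list.remove(Integer.valueOf(20));  // list is now [10, 30]

        return list;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [10, 30]
    }
}
```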

retainAll method

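Computes the intersection of two collections: only the elements that are also contained in c are kept.

public boolean retainAll(Collection<?> c) {
    // c must not be null
    Objects.requireNonNull(c);
    // Call the batch removal method with complement = true, which removes the elements not contained in c
    return batchRemove(c, true);
}

/**
 * Removes elements in bulk
 * complement = true removes the elements that are not contained in c
 * complement = false removes the elements that are contained in c
 */
private boolean batchRemove(Collection<?> c, boolean complement) {
    final Object[] elementData = this.elementData;
    // Traverse the array with a read pointer and a write pointer.
    // The read pointer advances by 1 every step; the write pointer advances only when an element is written.
    // No extra space is needed, everything happens in the original array.
    int r = 0, w = 0;
    boolean modified = false;
    try {
        // Traverse the whole array; if c.contains(element) equals complement,
        // copy the element to the position of the write pointer
        for (; r < size; r++)
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];
    } finally {
        // Normally r equals size at the end, unless c.contains() threw an exception
        if (r != size) {
            // If c.contains() threw an exception, copy the unread elements to just after the write pointer
            System.arraycopy(elementData, r,
                             elementData, w,
                             size - r);
            w += size - r;
        }
        if (w != size) {
            // Set the slots from the write pointer onward to null to help GC
            for (int i = w; i < size; i++)
                elementData[i] = null;
            modCount += size - w;
            // The new size equals the write pointer (it advanced by 1 for every element written)
            size = w;
            modified = true;
        }
    }
    // Return true if anything was modified
    return modified;
}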

(1) Traverse the elementData array;

(2) If the element is in c, copy it to position w of the elementData array and advance w by one;

(3) After the traversal, the elements before w are those shared by both collections; the elements at w and after it are not shared;

(4) Set the elements at w and after it to null to help GC.
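The read/write pointer idea in these steps can be sketched on a plain array. This is a simplified illustration, not the JDK code: TwoPointerFilter and retainIf are hypothetical names, a Predicate stands in for c.contains(), and the exception handling of batchRemove is left out.

```java
import java.util.Arrays;
import java.util.function.Predicate;

// Hypothetical demo class, not part of the JDK.
class TwoPointerFilter {
    // Keeps only the elements accepted by `keep`, compacting them to the front in place.
    // Returns the new logical size; the slots after it are set to null.
    static int retainIf(Object[] data, int size, Predicate<Object> keep) {
        int w = 0;                          // write pointer
        for (int r = 0; r < size; r++) {    // read pointer
            if (keep.test(data[r])) {
                data[w++] = data[r];
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null;                 // clear leftover slots
        }
        return w;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5, 6};
        int newSize = retainIf(data, data.length, x -> ((Integer) x) % 2 == 0);
        System.out.println(newSize);               // prints 3
        System.out.println(Arrays.toString(data)); // prints [2, 4, 6, null, null, null]
    }
}
```

The write pointer never overtakes the read pointer, so no element is overwritten before it has been read and no second array is needed.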

removeAll method

Computes the one-way difference of two collections: only the elements of the current list that are not in c are kept; elements of c that are not in the current list are not added.

public boolean removeAll(Collection<?> c) {
    // c must not be null
    Objects.requireNonNull(c);
    // Call the same batch removal method with complement = false, which removes the elements contained in c
    return batchRemove(c, false);
}

This is similar to the retainAll(Collection<?> c) method, except that here the elements that are not in c are retained.
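The two operations can be compared side by side. In this sketch, SetOpsDemo, intersection and difference are hypothetical helper names; each one works on a copy so that the input lists stay unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class SetOpsDemo {
    // Elements of a that are also in b.
    static List<Integer> intersection(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of a that are not in b (one-way difference).
    static List<Integer> difference(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4);
        List<Integer> b = Arrays.asList(3, 4, 5);
        System.out.println(intersection(a, b)); // prints [3, 4]
        System.out.println(difference(a, b));   // prints [1, 2]
    }
}
```

Note that 5, which is only in b, appears in neither result.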

Summary

(1) ArrayList stores its elements in an array. Each expansion adds half of the current capacity, and ArrayList never shrinks automatically.

(2) ArrayList supports random access. Accessing an element by index is extremely fast, with a time complexity of O(1).

(3) Adding an element to the end of an ArrayList is extremely fast, with an amortized time complexity of O(1).

(4) Adding an element in the middle of an ArrayList is relatively slow, because elements have to be moved; the average time complexity is O(n).

(5) Deleting an element from the end of an ArrayList is extremely fast, with a time complexity of O(1).

(6) Deleting an element from the middle of an ArrayList is relatively slow, because elements have to be moved; the average time complexity is O(n).

(7) ArrayList supports union: just call addAll(Collection<? extends E> c).

(8) ArrayList supports intersection: just call retainAll(Collection<?> c).

(9) ArrayList supports one-way difference: just call removeAll(Collection<?> c).

Origin blog.csdn.net/jiang_wang01/article/details/131214767