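ArrayList: Characteristics and a Source Code Walkthrough

ArrayList is probably the most frequently used collection in day-to-day development, and it is popular because it offers fast index-based access.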

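1. Constructors

There are two commonly used constructors: the default no-argument constructor and the one that takes an initial capacity.

Default no-argument constructor

// Shared empty array instance used for empty instances, e.g. new ArrayList(0)
private static final Object[] EMPTY_ELEMENTDATA = {};

// Shared empty array instance used by the default constructor new ArrayList();
// the first add expands it to the default capacity
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

Why are there two separate instances, EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA? Their main purpose is to let the expansion logic tell which constructor created the current ArrayList.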

Constructor with an initial capacity

/**
 * @param  initialCapacity  the initial capacity
 * Throws an exception if initialCapacity < 0
 */
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException(
                "Illegal Capacity: " + initialCapacity);
    }
}

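To summarize, constructing an ArrayList falls into these cases:

new ArrayList(): uses DEFAULTCAPACITY_EMPTY_ELEMENTDATA. The backing array starts with length 0 and is expanded to the default capacity of 10 on the first add.
new ArrayList(0): uses EMPTY_ELEMENTDATA. The first add also triggers an expansion, but the result differs from the case above (capacity 1 instead of 10).
new ArrayList(21): creates a backing array of length 21.
Negative capacity: throws IllegalArgumentException.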
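
The cases above can be observed from the outside with a small sketch. The class and method names here (ConstructorDemo, rejects) are made up for illustration. Capacity is an internal detail, so the sketch only checks what the public API exposes: all three lists start with size 0, and a negative capacity is rejected.

```java
import java.util.ArrayList;
import java.util.List;

class ConstructorDemo {
    // Hypothetical helper: returns true if ArrayList refuses the given
    // initial capacity with an IllegalArgumentException.
    static boolean rejects(int initialCapacity) {
        try {
            new ArrayList<String>(initialCapacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> byDefault = new ArrayList<>();  // empty backing array, grows to 10 on first add
        List<String> zero = new ArrayList<>(0);      // empty backing array, grows to 1 on first add
        List<String> sized = new ArrayList<>(21);    // backing array of length 21

        // Capacity is not visible through the List interface: all sizes are 0.
        System.out.println(byDefault.size() + " " + zero.size() + " " + sized.size());
        System.out.println(rejects(-1)); // a negative capacity is rejected
    }
}
```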

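2. Fields

Before looking at the individual operations, here are the relevant fields.

// Capacity of a list created by new ArrayList() after its first expansion
private static final int DEFAULT_CAPACITY = 10;

// elementData.length is the current capacity of the ArrayList; it is used to decide whether to expand
transient Object[] elementData;

// Number of elements currently stored in the ArrayList
private int size;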

3. add()

add() accepts any element, including null.

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

// Decide whether the array needs to expand
private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private static int calculateCapacity(Object[] elementData, int minCapacity) {
    // On its first add, a list created by new ArrayList() returns max(10, 1)
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    // On its first add, a list created by new ArrayList(0) takes this path and returns 1
    return minCapacity;
}

/*
    new ArrayList() is expanded to 10 inside grow()
    new ArrayList(0) is expanded to 1 inside grow()
*/
private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // elementData.length is 0 for both new ArrayList() and new ArrayList(0),
    // because both start out with an empty array instance
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

The flow of add() is: first decide whether to expand, then place the new element at the end of the array.
Note that new ArrayList() and new ArrayList(0) expand differently. On the first add, calculateCapacity(Object[] elementData, int minCapacity) returns 10 for new ArrayList() and 1 for new ArrayList(0). That return value is passed on to grow(), where it determines the length after the first expansion.

private void grow(int minCapacity) {
    // elementData.length is 0 for both new ArrayList() and new ArrayList(0),
    // because both start out with an empty array instance
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

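grow() expands the array to roughly 1.5 times its old length, computing the increment with the shift (oldCapacity >> 1) rather than a multiplication. If the result is still smaller than minCapacity, minCapacity is used instead. If newCapacity exceeds MAX_ARRAY_SIZE (Integer.MAX_VALUE - 8), hugeCapacity(minCapacity) picks the final value: Integer.MAX_VALUE when minCapacity itself exceeds MAX_ARRAY_SIZE, otherwise MAX_ARRAY_SIZE. You may wonder why the limit is Integer.MAX_VALUE - 8 rather than Integer.MAX_VALUE: according to the comment in the JDK source, some VMs reserve header words in an array, so attempting to allocate a larger array may result in an OutOfMemoryError.
Once the new capacity is settled, Arrays.copyOf(elementData, newCapacity) creates a new array, copies the existing elements into it, and elementData is re-pointed at the new array.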
The detailed flow looks like this:
[Figure: flow of add() and expansion]
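
To make the growth rule concrete, here is a minimal sketch that replays the capacity arithmetic quoted above (calculateCapacity, ensureExplicitCapacity, grow) on plain integers. GrowthSketch and capacities are hypothetical names, no real ArrayList is inspected, and the overflow handling (MAX_ARRAY_SIZE, hugeCapacity) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSketch {
    static final int DEFAULT_CAPACITY = 10;

    // Returns every backing-array length reached during `adds` successive
    // add(e) calls. defaultCtor = true models new ArrayList(),
    // defaultCtor = false models new ArrayList(0).
    static List<Integer> capacities(boolean defaultCtor, int adds) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // both constructors start with an empty array
        for (int size = 0; size < adds; size++) {
            int minCapacity = size + 1;
            // calculateCapacity: the default sentinel is only in place
            // while the array is still empty
            if (defaultCtor && capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            // ensureExplicitCapacity: expand only when the array is full
            if (minCapacity > capacity) {
                // grow: 1.5x, but never less than minCapacity
                int newCapacity = capacity + (capacity >> 1);
                if (newCapacity < minCapacity) {
                    newCapacity = minCapacity;
                }
                capacity = newCapacity;
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacities(true, 34));  // [10, 15, 22, 33, 49]
        System.out.println(capacities(false, 10)); // [1, 2, 3, 4, 6, 9, 13]
    }
}
```

The second sequence shows why new ArrayList(0) is a poor choice for a list that will grow: the first few adds each trigger a reallocation.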

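add(int index, E element): insert at a given position

public void add(int index, E element) {
    // Check that the insertion index is within bounds
    rangeCheckForAdd(index);
    // Decide whether the array needs to expand
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1, size - index);
    elementData[index] = element;
    size++;
}

Why is ArrayList considered a poor fit for insertion in the middle (a common interview question)? Because on insertion, System.arraycopy(elementData, index, elementData, index + 1, size - index) copies every element from index onward and places them starting at index + 1, which amounts to shifting each of them back by one slot; only then is the new value assigned at index. The cost grows with the number of elements after index, and it gets worse if the insertion also triggers an expansion.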
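
The shift can be reproduced on a plain array. The following is a minimal sketch, assuming the array already has a free slot so no expansion is needed; InsertSketch and insertAt are hypothetical names, not part of ArrayList.

```java
import java.util.Arrays;

class InsertSketch {
    // Inserts `value` at `index` among the first `size` slots of `data`.
    // Assumes data.length > size, i.e. there is room for one more element.
    static void insertAt(int[] data, int size, int index, int value) {
        // Shift elements [index, size) one slot to the right.
        // System.arraycopy handles the overlapping ranges correctly.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 0}; // 4 elements, capacity 5
        insertAt(data, 4, 1, 15);
        System.out.println(Arrays.toString(data)); // [10, 15, 20, 30, 40]
    }
}
```

Inserting at index 1 moved three elements; inserting at index 0 of a large list moves all of them.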

addAll(): add the contents of a collection to the ArrayList

// Append the contents of the collection to the end
public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    // Decide whether the array needs to expand
    ensureCapacityInternal(size + numNew);
    // The destination offset `size` means source array a is copied in right after
    // the last existing element of elementData
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

// Insert the contents of the collection at a given position
public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        // First shift the tail of elementData back by numNew slots
        System.arraycopy(elementData, index, elementData, index + numNew, numMoved);

    // Then copy source array a into elementData at indices [index, index + numNew)
    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

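Although both are addAll(), the positional variant addAll(int index, Collection<? extends E> c) is a little more involved and takes two steps:

Shift the elements of the destination array from index onward back by numNew slots, where numNew is the length of the source array.
Copy all elements of source array a into destination array elementData at indices [index, index + numNew), i.e. up to but not including index + numNew.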
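
The two steps can be sketched on plain int arrays. InsertAllSketch and insertAll are hypothetical names, and Arrays.copyOf stands in for the expansion that ensureCapacityInternal would perform.

```java
import java.util.Arrays;

class InsertAllSketch {
    // Returns a new array holding the first `size` elements of `data`
    // with all of `src` inserted at position `index`.
    static int[] insertAll(int[] data, int size, int index, int[] src) {
        int numNew = src.length;
        // Stand-in for expansion: make room for exactly numNew more elements.
        int[] result = Arrays.copyOf(data, size + numNew);

        int numMoved = size - index;
        if (numMoved > 0) {
            // Step 1: shift the tail [index, size) back by numNew slots.
            System.arraycopy(result, index, result, index + numNew, numMoved);
        }
        // Step 2: copy the source into [index, index + numNew).
        System.arraycopy(src, 0, result, index, numNew);
        return result;
    }

    public static void main(String[] args) {
        int[] merged = insertAll(new int[]{1, 2, 5, 6}, 4, 2, new int[]{3, 4});
        System.out.println(Arrays.toString(merged)); // [1, 2, 3, 4, 5, 6]
    }
}
```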

4. remove()

remove() deletes an element by moving every element after it forward by one slot.

remove(int index): remove the element at a given position

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // clear to let GC do its work
    return oldValue;
}

When removing the element at a given position, remove(int index) does the following:

Check whether index is out of bounds.
Determine whether index is the last element. If it is, simply set the last slot to null with elementData[--size] = null.
If index is not the last element, copy elementData[index+1] and everything after it, place them starting at index, and only then set the last slot to null.

remove(Object o): remove a given element

public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // clear to let GC do its work
}

remove(Object o) is a bit more work, because it has to walk the array and compare each element in turn. On the first match it calls fastRemove(int index), which is essentially remove(int index) without the bounds check and return value: it moves the following elements forward by one slot. Only the first matching element is removed.
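
The two overloads are easy to confuse on a List<Integer>, where an int argument selects remove(int index) and an Integer argument selects remove(Object o). A short sketch (RemoveDemo and its method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveDemo {
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 1, 3));
        list.remove(1); // int argument: removes the element at index 1 (the value 2)
        return list;
    }

    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 1, 3));
        list.remove(Integer.valueOf(1)); // Object argument: removes the first 1 only
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // [1, 1, 3]
        System.out.println(removeByValue()); // [2, 1, 3]
    }
}
```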

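5. get(): retrieve the element at a given position

public E get(int index) {
    rangeCheck(index);
    return elementData(index);
}

ArrayList is backed by an array, and arrays support random access; their contiguous memory layout also makes traversal efficient. LinkedList, which is built on a doubly linked list, has to walk node by node to reach an element at a given index, and its nodes are not contiguous in memory, so it is a poor fit for index-based lookup.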
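
The standard library records this difference with the java.util.RandomAccess marker interface, which ArrayList implements and LinkedList does not. A minimal sketch that checks it (RandomAccessDemo and supportsFastIndexing are hypothetical names):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessDemo {
    // True if the list advertises fast (generally constant-time) get(index).
    static boolean supportsFastIndexing(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(supportsFastIndexing(new ArrayList<String>()));  // true
        System.out.println(supportsFastIndexing(new LinkedList<String>())); // false
    }
}
```

Generic code can use such a check to choose between an indexed for loop and an iterator.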

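6. Iterator

The iterator is an inner class of ArrayList. While iterating, the list must not be structurally modified (elements added or removed) through the list itself; only the iterator's own remove() is allowed. The check detects any such modification during iteration, whether it comes from another thread or from the same thread, so that the iterator fails immediately instead of continuing on inconsistent data.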

private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        // No more elements (for example, the list is empty)
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        // i is within size but beyond the backing array: the array was changed concurrently
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

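There are three fields:

cursor: index of the next element to visit.
lastRet: index of the last element returned. The value -1 means that no element has been returned yet or that it has already been removed; remove() uses it to reject an invalid call with IllegalStateException.
expectedModCount: the expected value of the ArrayList's modification count. If it differs from the actual modCount, the list was structurally modified (by add or remove) while the iterator was traversing it, and an exception is thrown. Every add or remove that changes the structure of the ArrayList increments modCount.

The iterator has three methods: hasNext(), next(), and remove().

Check for a next element, public boolean hasNext(): when cursor (the index of the next element to visit) equals size, every element has been visited and nothing is left.
Return the next element, public E next(): if no element is left, which includes an empty list such as one just created by new ArrayList(0), it throws NoSuchElementException. If cursor is below size but beyond the length of the backing array, which can only happen when the array was changed concurrently, it throws ConcurrentModificationException. Otherwise it returns the next element.
Remove an element, remove(): deletes elementData[lastRet]. Because the elements after it move forward by one slot, cursor has to step back by one as well, which is what cursor = lastRet does. expectedModCount is then resynchronized with modCount.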
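
A small sketch contrasts the two ways of removing during iteration. FailFastDemo and its method names are made up for illustration. Removing through the list changes modCount behind the iterator's back, while Iterator.remove() keeps expectedModCount in sync.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removes through the list inside a for-each loop.
    // Returns true if the iterator detected the modification.
    static boolean removeThroughListFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer n : list) {
                if (n == 1) {
                    list.remove(n); // remove(Object): modCount changes, iterator is not told
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to next()
        }
    }

    // Removes through the iterator, which is the supported way.
    static List<Integer> removeEvensThroughIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // resynchronizes expectedModCount with modCount
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeThroughListFails());     // true
        System.out.println(removeEvensThroughIterator()); // [1, 3]
    }
}
```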

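7. Not thread-safe

ArrayList is not thread-safe. If multiple threads traverse, add to, and remove from the same ArrayList at the same time, problems such as reading inconsistent data, lost updates, and redundant expansions that waste space can occur.
To limit the damage, ArrayList's iterators use a fail-fast mechanism: rather than risk continuing under concurrent modification, they throw an exception, as seen in if (modCount != expectedModCount) throw new ConcurrentModificationException(); This check is best-effort and does not replace proper synchronization.
If thread safety is required, you can use Vector, or wrap the ArrayList with Collections.synchronizedList. Both rely on the synchronized keyword: Vector declares its methods synchronized, while the synchronizedList wrapper runs each call inside a synchronized block on a shared lock object.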
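
As a minimal sketch of the wrapper approach, the following adds to one list from several threads through Collections.synchronizedList. SyncListDemo and concurrentAdds are hypothetical names, and the thread and iteration counts are arbitrary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncListDemo {
    // Starts `threads` workers that each add `addsPerThread` elements to a
    // shared synchronized list, then returns the final size.
    static int concurrentAdds(int threads, int addsPerThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    list.add(i); // each call holds the wrapper's lock
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait until every worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 1000)); // 4000
    }
}
```

With a bare ArrayList in place of the wrapper, the same code may lose elements or throw ArrayIndexOutOfBoundsException. Note that the wrapper only protects individual calls: iterating over the list still has to be done inside a synchronized (list) block.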