ArrayList source code analysis (expansion mechanism, JDK 8)

ArrayList Overview

(1) ArrayList is a variable-length collection class backed by a fixed-length array.

(2) ArrayList allows null values and duplicate elements. When more elements are added than the underlying array can hold, the expansion mechanism allocates a new, larger array.

(3) Because ArrayList is implemented on top of an array, random-access lookups complete in O(1) time.

(4) ArrayList is not thread-safe. In a concurrent environment, multiple threads operating on the same ArrayList may cause exceptions or unpredictable errors.
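
As a quick illustration of points (2) and (3), the sketch below stores a null and a duplicate and then reads an element by index. The class name OverviewSketch and the helper sample are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class OverviewSketch {
    // Hypothetical helper: builds a list containing a null and a duplicate.
    static List<String> sample() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add(null); // null values are accepted
        list.add("a");  // so are duplicates
        return list;
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list);        // [a, null, a]
        System.out.println(list.get(2)); // a (index-based access into the backing array)
    }
}
```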

ArrayList member fields

Before introducing the methods of ArrayList, let's look at its basic member fields. The difference between DEFAULTCAPACITY_EMPTY_ELEMENTDATA and EMPTY_ELEMENTDATA is that when the first element is added, DEFAULTCAPACITY_EMPTY_ELEMENTDATA lets the list know how much to expand the array.

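// Default initial capacity
private static final int DEFAULT_CAPACITY = 10;

// Shared empty array, used when a constructor initializes an empty list
private static final Object[] EMPTY_ELEMENTDATA = {};

// Shared empty array for default-sized empty instances. It is kept distinct from
// EMPTY_ELEMENTDATA so that we know how much to expand when the first element is added
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

// ArrayList stores its data in this array; the capacity of the ArrayList is the length of the array.
// An empty instance whose elementData is DEFAULTCAPACITY_EMPTY_ELEMENTDATA is expanded
// to the default capacity DEFAULT_CAPACITY when the first element is added
transient Object[] elementData; // non-private to simplify nested class access

// Number of elements the ArrayList contains
private int size;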

Note that EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA are both declared static, so each is a single array shared by all ArrayList instances.
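
To make the difference between the two empty arrays concrete, here is a minimal sketch of the capacity chosen on the first add, depending on which sentinel backs the list. It is not the JDK code itself: the class EmptySentinelSketch and the method capacityAfterFirstAdd are hypothetical, and the comparison is simplified to Math.max.

```java
final class EmptySentinelSketch {
    static final int DEFAULT_CAPACITY = 10;
    // Two distinct empty arrays, told apart by identity (==), as in ArrayList.
    static final Object[] EMPTY_ELEMENTDATA = {};
    static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    // Capacity picked when the first element is added to an empty backing array.
    static int capacityAfterFirstAdd(Object[] elementData) {
        int minCapacity = 1; // size + 1 with size == 0
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        int oldCapacity = elementData.length;               // 0 for both sentinels
        int newCapacity = oldCapacity + (oldCapacity >> 1); // still 0
        return Math.max(newCapacity, minCapacity);
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterFirstAdd(DEFAULTCAPACITY_EMPTY_ELEMENTDATA)); // 10
        System.out.println(capacityAfterFirstAdd(EMPTY_ELEMENTDATA));                 // 1
    }
}
```

Under these assumptions, a list created with new ArrayList<>() jumps straight to capacity 10, while one created with new ArrayList<>(0) starts at capacity 1 and grows from there.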

ArrayList Constructor

(1) Constructor with an initial capacity

  • If the argument is greater than 0, elementData is initialized to an array of size initialCapacity
  • If the argument is equal to 0, elementData is initialized to an empty array
  • If the argument is less than 0, an exception is thrown
// The argument is the initial capacity
public ArrayList(int initialCapacity) {
    // Validate the capacity
    if (initialCapacity > 0) {
        // elementData is the array that actually stores the elements
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        // If the capacity is 0, use the predefined member (an empty array) directly
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

(2) No-argument constructor

  • The constructor initializes elementData to the empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA
  • When add is called to add the first element, the array is expanded
  • It is expanded to a size of DEFAULT_CAPACITY = 10
// No-argument constructor: uses the empty array that stands for the default capacity of 10.
// The array length is not set here; expansion happens later, when add is called
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

(3) Constructor taking a Collection

// Converts the given Collection into an ArrayList (in effect, copies its elements into an array).
// If the collection passed in is null, a NullPointerException is thrown (when c.toArray() is called)
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray() might (incorrectly) not return an Object[]; in that case copy with Arrays.copyOf()
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // If the resulting array has length 0, initialize elementData with the shared empty array
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

The constructors above are fairly easy to understand. Note what the first two do: their goal is to initialize the underlying array elementData (this.elementData = XXX). The difference is that the no-argument constructor initializes elementData to an empty array, and when the first element is inserted, expansion re-initializes the array with the default capacity, whereas the constructor with an argument initializes elementData to an array of the given size (>= 0). Under normal circumstances the default constructor is fine. If you know in advance how many elements will be inserted into the ArrayList, you can use the constructor with an argument.
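
As a small illustration of that last point, the sketch below pre-sizes a list when the element count is known up front. The class PresizeSketch and the helper squares are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class PresizeSketch {
    // Hypothetical helper: builds the list [0, 1, 4, ..., (n-1)^2] for n >= 0.
    static List<Integer> squares(int n) {
        // n is known in advance, so the backing array is allocated once
        // and add() never has to grow it.
        List<Integer> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(5)); // [0, 1, 4, 9, 16]
    }
}
```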

As mentioned above, a list created with the no-argument constructor is expanded when add is called, so let's look at the details of the add method and of expansion.

ArrayList add method

General flow of the add method

// Appends the specified element to the end of the list
public boolean add(E e) {
    // Adding an element may exceed the current capacity, so check (and expand if needed) before adding
    ensureCapacityInternal(size + 1);  // Increments modCount!! (fail-fast is covered later)
    elementData[size++] = e;
    return true;
}

As we can see, before adding an element, add first checks whether the capacity can hold size + 1 elements, so let's look at the details of ensureCapacityInternal.

Analysis of the ensureCapacityInternal method

private void ensureCapacityInternal(int minCapacity) {
    // Check whether elementData is the default empty array
    // (with the no-argument constructor, elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA).
    // If so, compare size + 1 (which is 1 on the first call to add) with DEFAULT_CAPACITY;
    // the resulting capacity is clearly 10
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    ensureExplicitCapacity(minCapacity);
}

When the first element is added to a list created with the no-argument constructor, minCapacity is size + 1 = 0 + 1 = 1; after the Math.max() comparison, minCapacity is 10. ensureExplicitCapacity is then called to update modCount and determine whether expansion is needed.

Analysis of the ensureExplicitCapacity method

private void ensureExplicitCapacity(int minCapacity) {
    modCount++; // This is the "Increments modCount" noted in the comment in add
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity); // This is the method that performs the expansion
}

Now let's look at grow, the main expansion method.

Analysis of the grow method

private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private void grow(int minCapacity) {
    // oldCapacity is the capacity of the old array
    int oldCapacity = elementData.length;
    // newCapacity is the capacity of the new array (oldCapacity + oldCapacity / 2, i.e. 1.5 times the old capacity)
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // If the new capacity is smaller than the minimum required capacity, use the minimum capacity instead
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    // If the new capacity is greater than MAX_ARRAY_SIZE, let hugeCapacity decide
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    // copy the elements of the original array into the new one
    elementData = Arrays.copyOf(elementData, newCapacity);
}
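
To see the 1.5x rule in numbers, the following sketch reproduces only the capacity arithmetic of grow and prints the first few capacities of a list that keeps growing by one element at a time. The class GrowSketch and the method newCapacity are hypothetical names, and the MAX_ARRAY_SIZE clamp is deliberately left out.

```java
final class GrowSketch {
    // Mirrors the first half of grow(): 1.5x the old capacity, or
    // minCapacity if that is still too small. No array is reallocated
    // and the MAX_ARRAY_SIZE check is omitted in this sketch.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int capacity = 10;
        for (int i = 0; i < 5; i++) {
            System.out.println(capacity); // prints 10, 15, 22, 33, 49 (one per line)
            capacity = newCapacity(capacity, capacity + 1);
        }
    }
}
```

Because oldCapacity >> 1 rounds down, very small capacities do not grow by 1.5x at all: newCapacity(1, 2) falls back to minCapacity and returns 2.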

The hugeCapacity method

Here is a brief look at the hugeCapacity method.

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    // Compare minCapacity with MAX_ARRAY_SIZE:
    // if minCapacity is larger, use Integer.MAX_VALUE as the size of the new array;
    // otherwise, use MAX_ARRAY_SIZE as the size of the new array
    // MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
}
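
One detail worth spelling out is why grow compares with subtraction (newCapacity - minCapacity < 0) rather than directly (newCapacity < minCapacity): the 1.5x computation can overflow an int. The sketch below copies the capacity logic into a standalone class, hypothetically named HugeCapacitySketch, and feeds it an old capacity large enough to overflow. No array is allocated.

```java
final class HugeCapacitySketch {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) { // size + 1 itself overflowed
            throw new OutOfMemoryError();
        }
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    // Capacity computation as in grow(), without the array copy.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may wrap to a negative int
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = hugeCapacity(minCapacity);
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int oldCapacity = 1_500_000_000;
        // 1.5 * oldCapacity does not fit in an int and wraps around.
        System.out.println(oldCapacity + (oldCapacity >> 1)); // -2044967296
        // The subtraction-based checks still route this case to hugeCapacity.
        System.out.println(newCapacity(oldCapacity, oldCapacity + 1) == MAX_ARRAY_SIZE); // true
    }
}
```

In this example the wrapped value is negative, yet both subtractions wrap again in a way that keeps the comparisons meaningful, so the result is clamped to MAX_ARRAY_SIZE instead of becoming a negative array length.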

Summary of the add method's execution flow

Let's sort out the execution flow of the first call to add on a list created with the no-argument constructor.

After that first call to add, the capacity has been expanded to 10. From there:

  • A second element is added (note that the argument passed to ensureCapacityInternal is size + 1 = 1 + 1 = 2)

  • In ensureCapacityInternal, the condition elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA no longer holds, so ensureExplicitCapacity is called directly

  • ensureExplicitCapacity receives minCapacity = 2, so its if condition (2 - 10 = -8 > 0) does not hold and grow is not entered. The array capacity stays at 10, add returns true, and size increases to 2.

  • Suppose elements 3, 4, ..., 10 are then added (the process is similar, and grow is never executed)

  • When the 11th element is added, minCapacity is 10 + 1 = 11, which exceeds the array length of 10, so grow is entered. newCapacity is computed as 15, which is larger than minCapacity (11), so the first if condition is not satisfied. The new capacity does not exceed the maximum array size either, so hugeCapacity is not entered. The array capacity is expanded to 15, add returns true, and size increases to 11.
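
The walk-through above can be replayed with a stripped-down list that exposes its capacity. This is only a sketch: CapacityTrace is a hypothetical class, it folds ensureCapacityInternal, ensureExplicitCapacity and grow into add, and it leaves out modCount and the MAX_ARRAY_SIZE handling.

```java
import java.util.Arrays;

final class CapacityTrace {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    private Object[] elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    private int size;

    void add(Object e) {
        // ensureCapacityInternal: the default empty array asks for at least 10
        int minCapacity = size + 1;
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        // ensureExplicitCapacity + grow: expand only when the array is too small
        if (minCapacity - elementData.length > 0) {
            int oldCapacity = elementData.length;
            int newCapacity = oldCapacity + (oldCapacity >> 1);
            if (newCapacity - minCapacity < 0) {
                newCapacity = minCapacity;
            }
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = e;
    }

    int capacity() { return elementData.length; }
    int size() { return size; }

    public static void main(String[] args) {
        CapacityTrace list = new CapacityTrace();
        System.out.println("before any add: " + list.capacity()); // before any add: 0
        for (int i = 1; i <= 11; i++) {
            list.add(i);
            if (i == 1 || i == 10 || i == 11) {
                // prints "1 -> 10", "10 -> 10", "11 -> 15"
                System.out.println(i + " -> " + list.capacity());
            }
        }
    }
}
```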

The add(int index, E element) method

// Inserts the element at position index in the sequence
public void add(int index, E element) {
    rangeCheckForAdd(index); // Check that the index argument is valid
    // 1. Check whether expansion is needed
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // 2. Shift the element at index and all elements after it back by one position
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    // 3. Insert the new element at index
    elementData[index] = element;
    size++;
}
private void rangeCheckForAdd(int index) {
    if (index > size || index < 0) // Rejects index > size (keeps the array contiguous) and index < 0
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

The add(int index, E element) method inserts an element at a specified position in the sequence (assuming the position is valid). The procedure is roughly as follows:

  1. Check whether the array has enough space (implemented the same way as above)
  2. Shift the element at index and all elements after it back by one position
  3. Insert the new element at index

Inserting a new element at a specified position requires shifting the element at that position and every element after it back by one to make room. This operation has O(N) time complexity, and frequently moving elements can cause efficiency problems, particularly when the collection holds a large number of elements. In day-to-day development, unless it is really needed, try to avoid calling this second insertion method on large collections.
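
The shift in step 2 can be tried out on a plain array. The sketch below assumes the array already has a free slot at the end (that is, step 1 has been done); InsertShiftSketch and insertAt are hypothetical names.

```java
import java.util.Arrays;

final class InsertShiftSketch {
    // Inserts element at index among the first `size` slots of data.
    // Assumes data.length > size and 0 <= index <= size.
    static void insertAt(Object[] data, int size, int index, Object element) {
        // Move data[index .. size-1] one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = element;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", "e", null}; // 4 elements, capacity 5
        insertAt(data, 4, 2, "c");
        System.out.println(Arrays.toString(data)); // [a, b, c, d, e]
    }
}
```

The number of elements copied is size - index, which is why inserting near the front of a large list costs the most.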

ArrayList remove method

ArrayList supports two ways to remove elements

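1. remove(int index) removes the element at the given index

public E remove(int index) {
    rangeCheck(index); // Check that the index is valid (if index >= size, IndexOutOfBoundsException is thrown)
    modCount++; // A structural modification of the list must update this value
    E oldValue = elementData(index); // Read the value directly from the array

    int numMoved = size - index - 1; // Number of elements that have to be moved
    // If this is greater than 0, the elements after index must be shifted left
    // If it is 0 (size == index + 1), the removed element is the last one (nothing else needs to move)
    if (numMoved > 0)
        // Shift all elements after index left by one, overwriting the element at index
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    // After the shift, set the now-unused last slot to null
    elementData[--size] = null; // clear to let GC do its work
    // Return the old value
    return oldValue;
}
// src: the source array
// srcPos: starting position in the source array
// dest: the destination array
// destPos: starting position in the destination array, where the copied elements are written
// length: the number of array elements to copy
public static native void arraycopy(Object src,  int  srcPos,
                                    Object dest, int destPos,
                                    int length);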

In short, removal shifts every element after index one position to the left and then sets the vacated last slot to null.
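
The same shift can be reproduced on a plain array. In this sketch, RemoveShiftSketch and removeAt are hypothetical names, and the caller is assumed to pass a valid index.

```java
import java.util.Arrays;

final class RemoveShiftSketch {
    // Removes the element at index from the first `size` slots of data
    // and returns the new size. Assumes 0 <= index < size.
    static int removeAt(Object[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Move data[index+1 .. size-1] one slot to the left.
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // clear the stale reference in the last slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d"};
        int size = removeAt(data, 4, 1);
        System.out.println(Arrays.toString(data)); // [a, c, d, null]
        System.out.println(size);                  // 3
    }
}
```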

2. remove(Object o) removes by element: it finds the first element that matches the argument and deletes it

public boolean remove(Object o) {
    // If the element is null, scan the array and remove the first null
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                // Found the index of the first null element; remove it by index
                fastRemove(index);
                return true;
            }
    } else {
        // Find the index of the first matching element and remove it by index
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
// Removes the element at the given index (deletion is achieved by shifting array elements)
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
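
The two overloads are easy to mix up on a List<Integer>, where an int argument selects remove(int index) and a boxed Integer selects remove(Object o). The following sketch shows both, plus the fact that remove(Object o) only drops the first match. RemoveOverloadSketch and demo are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveOverloadSketch {
    static List<Integer> demo() {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(20, 10, 30, 20));
        numbers.remove(1);                   // remove(int index): drops the 10 at index 1
        numbers.remove(Integer.valueOf(20)); // remove(Object o): drops only the first 20
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [30, 20]
    }
}
```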

Other methods of ArrayList

ensureCapacity method

Before adding a large number of elements, it is best to call ensureCapacity to reduce the number of incremental reallocations.

public void ensureCapacity(int minCapacity) {
    int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
        // any size if not default element table
        ? 0
        // larger than default for default empty table. It's already
        // supposed to be at default size.
        : DEFAULT_CAPACITY;

    if (minCapacity > minExpand) {
        ensureExplicitCapacity(minCapacity);
    }
}
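
A typical use might look like the sketch below, which reserves room once before a bulk insert. EnsureCapacitySketch and fill are hypothetical names. Note that ensureCapacity is declared on ArrayList, not on the List interface, so the variable has to be typed as ArrayList.

```java
import java.util.ArrayList;

final class EnsureCapacitySketch {
    // Hypothetical helper: fills a list with 0 .. n-1.
    static ArrayList<Integer> fill(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        // One allocation up front instead of repeated 1.5x expansions.
        list.ensureCapacity(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(fill(100_000).size()); // 100000
    }
}
```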

ArrayList summary

(1) ArrayList is a variable-length collection class backed by a fixed-length array. A list created with the default constructor gets a capacity of 10 (since JDK 1.7 the initialization is delayed: elementData is sized to 10 only when add is first called to add an element).

(2) ArrayList allows null values and duplicate elements. When more elements are added than the underlying array can hold, the expansion mechanism allocates a new, larger array. The expanded length is 1.5 times the original length.

(3) Because ArrayList is implemented on top of an array, random-access lookups complete in O(1) time.

(4) ArrayList is not thread-safe. In a concurrent environment, multiple threads operating on the same ArrayList may cause exceptions or unpredictable errors.

(5) Appending elements in order is convenient.

(6) Deletion and insertion require copying array elements, so performance is poor (LinkedList can be used instead).

(7) Integer.MAX_VALUE - 8: this mainly accounts for differences between JVMs, some of which store extra header data in an array. When the capacity after expansion is greater than MAX_ARRAY_SIZE, the minimum required capacity is compared with MAX_ARRAY_SIZE: if it is larger, the only option is Integer.MAX_VALUE; otherwise the capacity is Integer.MAX_VALUE - 8. This has been the case since JDK 1.7.

fail-fast mechanism

fail-fast explanation:

In systems design, a fail-fast system is one which immediately reports at its interface any condition that is likely to indicate a failure. Fail-fast systems are usually designed to stop normal operation rather than attempt to continue a possibly flawed process. Such designs often check the system’s state at several points in an operation, so any failures can be detected early. The responsibility of a fail-fast module is detecting errors, then letting the next-highest level of the system handle them.

Roughly speaking: a fail-fast system reports any condition that may indicate a failure as soon as it occurs. It is usually designed to stop normal operation rather than try to continue a possibly flawed process. Such designs check the system's state at several points in an operation so that failures are detected early. A fail-fast module's job is to detect errors and let the next-highest level of the system handle them.

In practice this means considering abnormal conditions during system design: once one occurs, stop and report it directly, as in the following simple example.

// This method divides two integers. fast_fail_method does a simple check on the divisor:
// if it is 0, it throws an exception right away with a message stating the cause.
// This is a practical application of the fail-fast idea.
public int fast_fail_method(int arg1,int arg2){
    if(arg2 == 0){
        throw new RuntimeException("can't be zero");
    }
    return arg1/arg2;
}

Many Java collection classes are designed around this mechanism, and if they are used improperly, the fail-fast mechanism is triggered and unexpected exceptions occur. When we talk about fail-fast in Java, we usually mean an error-detection mechanism of the Java collections. When multiple threads make structural changes to a collection, the mechanism is likely to be triggered, and a ConcurrentModificationException is thrown. Even outside a multi-threaded environment, it can be thrown if the collection's own add/remove methods are used during a foreach traversal. With reference to an article on the fail-fast mechanism, here is a brief summary.

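The reason ConcurrentModificationException is thrown is that our code uses an enhanced for loop. In an enhanced for loop the collection is traversed through an iterator, but elements are added or removed directly through the collection class's own methods. As a result, while traversing, the iterator discovers that an element was removed or added without its knowledge, and it throws an exception to signal that a concurrent modification may have occurred. So when a ConcurrentModificationException occurs while using Java collection classes, consider fail-fast-related causes first. There may not have been any real concurrency: the Iterator simply uses fail-fast protection, and as soon as it finds a modification that did not go through itself, it throws.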
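
The single-threaded case can be reproduced in a few lines. The sketch below, with the made-up class FailFastSketch and helpers removeInForeach and removeWithIterator, first removes an element through the list inside an enhanced for loop, then does the same removal through the iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class FailFastSketch {
    // Removes through the list while an enhanced for loop is iterating.
    // Returns true if the iterator detected the modification.
    static boolean removeInForeach(List<String> list, String target) {
        try {
            for (String s : list) {
                if (s.equals(target)) {
                    list.remove(s); // bypasses the iterator, bumps modCount
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator, which keeps its own bookkeeping in sync.
    static void removeWithIterator(List<String> list, String target) {
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(removeInForeach(first, "a")); // true

        List<String> second = new ArrayList<>(Arrays.asList("a", "b", "c"));
        removeWithIterator(second, "a");
        System.out.println(second); // [b, c]
    }
}
```

The check is best-effort rather than guaranteed. For instance, removing the second-to-last element ("b" in this list) inside the enhanced for loop ends the loop early without an exception, because hasNext() already sees the cursor equal to the reduced size.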

Origin juejin.im/post/5d42ab5e5188255d691bc8d6