A detailed analysis of the JDK 8 ArrayList implementation and its underlying source code

Brief introduction

  • ArrayList is implemented on top of an array. It is a dynamic array that grows its capacity automatically, similar to dynamically allocating and reallocating memory in C.
  • ArrayList is not thread-safe and should only be used in a single-threaded environment. In a multi-threaded environment, consider Collections.synchronizedList(List l), which returns a thread-safe wrapper around the list, or use CopyOnWriteArrayList from the java.util.concurrent package.
  • ArrayList implements the Serializable interface, so it can be serialized. It implements the RandomAccess interface, so it supports fast random access (in practice, fast access by index). It implements the Cloneable interface, so it can be cloned.
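
As a rough illustration of the thread-safety options in the list above, the sketch below fills a list from several threads, once through Collections.synchronizedList and once with CopyOnWriteArrayList. The class name ThreadSafeListDemo, the helper fillConcurrently, and the thread counts are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ThreadSafeListDemo {
    // Hypothetical helper: every thread adds perThread items, then the final size is returned.
    static int fillConcurrently(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Wrapper whose individual method calls are synchronized.
        List<Integer> synced = Collections.synchronizedList(new ArrayList<>());
        System.out.println(fillConcurrently(synced, 4, 1000)); // 4000

        // Copies the backing array on every write; best suited to read-heavy use.
        List<Integer> copyOnWrite = new CopyOnWriteArrayList<>();
        System.out.println(fillConcurrently(copyOnWrite, 4, 200)); // 800
    }
}
```

Passing a plain ArrayList to the same helper may lose elements or throw, which is the behavior the bullet above warns about. Note also that iterating over a synchronizedList wrapper still requires the caller to synchronize on the list.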

Storage structure

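// Where the elements are stored; this field is excluded from default serialization.
// The main effect of the transient keyword is that the marked field is skipped during serialization.
transient Object[] elementData;
  • The backing store is an array of type Object[].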

Fields

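    // Serialization ID
    private static final long serialVersionUID = 8683452581122892189L;
    // Default initial capacity
    private static final int DEFAULT_CAPACITY = 10;
    // A shared empty array, used mainly by the constructors that take an argument
    // and when reading a serialized object.
    private static final Object[] EMPTY_ELEMENTDATA = {};
    /**
     * As the official documentation says, DEFAULTCAPACITY_EMPTY_ELEMENTDATA and
     * EMPTY_ELEMENTDATA differ only so that the lazily initialized array of the
     * default constructor can be told apart from that of a constructor called with 0.
     * After construction with an argument of 0, the first add grows the capacity to 1.
     * After default construction, the first add grows the capacity directly to
     * DEFAULT_CAPACITY (10).
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    // Where the elements are stored; this field is excluded from default serialization.
    // The main effect of the transient keyword is that the marked field is skipped
    // during serialization.
    transient Object[] elementData; // non-private to simplify nested class access
    // Number of elements currently in the array
    private int size;
    // Maximum array size that can be allocated
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    // Count of structural modifications (inherited from AbstractList; used by fail-fast)
    protected transient int modCount = 0;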
  • The no-argument constructor does not actually allocate space for 10 elements at construction time; the array is lazily initialized.
  • DEFAULTCAPACITY_EMPTY_ELEMENTDATA and EMPTY_ELEMENTDATA differ only so that the lazily initialized array of the default constructor can be distinguished from that of a constructor called with a capacity of 0.
  • modCount records the number of structural modifications made to the ArrayList. It is used by the fail-fast mechanism.
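
The role of the two empty sentinel arrays can be sketched with a small stand-alone model. This is a simplified illustration, not the JDK code itself: the class LazyCapacityModel and the method firstAddCapacity are hypothetical names, and only the identity check from the JDK logic is reproduced.

```java
class LazyCapacityModel {
    private static final int DEFAULT_CAPACITY = 10;
    // Two distinct empty arrays: equal in content, different in identity.
    private static final Object[] EMPTY_ELEMENTDATA = {};
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    // Hypothetical helper: the capacity requested by the first add,
    // depending on which sentinel the constructor installed.
    static int firstAddCapacity(boolean defaultConstructor) {
        Object[] elementData = defaultConstructor
                ? DEFAULTCAPACITY_EMPTY_ELEMENTDATA
                : EMPTY_ELEMENTDATA;
        int minCapacity = 0 + 1; // size + 1 on the first add
        // The sentinel is recognized by reference equality, not by length.
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        return minCapacity;
    }

    public static void main(String[] args) {
        System.out.println(firstAddCapacity(true));  // 10
        System.out.println(firstAddCapacity(false)); // 1
    }
}
```

Using two separate array instances lets the list remember how it was constructed without spending an extra field on it.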

Constructor

    public ArrayList() {
        // This is the only constructor that uses DEFAULTCAPACITY_EMPTY_ELEMENTDATA
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            // Use EMPTY_ELEMENTDATA, which may also be referenced in several other places
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }

    public ArrayList(Collection<? extends E> c) {
        // Convert the given collection to an array and assign it to elementData
        // (a shallow copy: only the element references are copied)
        elementData = c.toArray();
        // Store the array length in size and check whether it is 0
        if ((size = elementData.length) != 0) {
            // c.toArray might not return Object[]; see JDK bug 6260652
            if (elementData.getClass() != Object[].class)
                // If c.toArray() did not return an Object[], use Arrays.copyOf()
                // to build a new Object[] of length size. The new array holds the
                // same element references; the elements themselves are not cloned.
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // The given collection is empty, so use the shared empty array
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }
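  • As noted earlier, DEFAULTCAPACITY_EMPTY_ELEMENTDATA and EMPTY_ELEMENTDATA differ only so that the lazily initialized array of the default constructor can be distinguished from that of a constructor called with a capacity of 0.
  • Note the difference between a shallow copy and a deep copy: the collection constructor copies the element references into the list's own array, not the elements themselves.
  • A constructor called with a capacity of 0 is lazily initialized; a constructor called with a non-zero capacity allocates its array immediately.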
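
The shallow-copy behavior of the collection constructor can be observed from outside the class. The following is a minimal sketch; the class name ShallowCopyDemo and the method demo are chosen for this example only.

```java
import java.util.ArrayList;
import java.util.List;

class ShallowCopyDemo {
    static String demo() {
        List<StringBuilder> source = new ArrayList<>();
        source.add(new StringBuilder("a"));

        // The new list gets its own backing array but shares the element objects.
        List<StringBuilder> copy = new ArrayList<>(source);
        copy.get(0).append("!");          // visible through both lists
        copy.add(new StringBuilder("b")); // only changes the copy

        return source.get(0) + " " + source.size() + " " + copy.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // a! 1 2
    }
}
```

Mutating a shared element shows up in both lists, while adding to the copy leaves the source list's size unchanged.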

add(E e) source code analysis

public boolean add(E e) {
        // Make sure the array can hold one more element than the current size
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        // Store the new element at the next free index.
        elementData[size++] = e;
        // Always returns true.
        return true;
}
private void ensureCapacityInternal(int minCapacity) {
        ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}
private static int calculateCapacity(Object[] elementData, int minCapacity) {
        // This is the main difference between DEFAULTCAPACITY_EMPTY_ELEMENTDATA
        // and EMPTY_ELEMENTDATA.
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            // Default constructor: the first add returns 10.
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        // Constructor called with 0: the first add returns 1 (0 + 1).
        return minCapacity;
}
private void ensureExplicitCapacity(int minCapacity) {
        // Increment the modification counter
        modCount++;

        // overflow-conscious code
        // The current array capacity is smaller than the required minimum capacity
        if (minCapacity - elementData.length > 0)
            // Grow the array
            grow(minCapacity);
}
private void grow(int minCapacity) {
        // overflow-conscious code
        // Current array capacity
        int oldCapacity = elementData.length;
        // The new capacity is 1.5 times the old capacity
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            // newCapacity is still smaller than minCapacity
            newCapacity = minCapacity;
        // Check whether the required capacity exceeds the maximum array size.
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        // Arrays.copyOf() copies the whole original array into the enlarged array.
        elementData = Arrays.copyOf(elementData, newCapacity);
}
  • Growing the list requires a call to Arrays.copyOf() to copy the whole original array into a new array, which is expensive. It is therefore best to specify an approximate capacity when creating the ArrayList, which reduces the number of growth operations.
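
To get a feel for how often the backing array is reallocated, the arithmetic of grow() can be replayed in isolation. This is a sketch under simplifying assumptions: the MAX_ARRAY_SIZE branch is ignored, a capacity of 0 stands in for the default-constructor sentinel, and the names GrowthDemo, nextCapacity and growthsFor are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthDemo {
    // Same arithmetic as grow(): 1.5x the old capacity, or minCapacity if that is larger.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return Math.max(newCapacity, minCapacity);
    }

    // Counts array allocations while adding n elements one by one
    // to a list created with the no-argument constructor.
    static int growthsFor(int n) {
        int capacity = 0; // stand-in for DEFAULTCAPACITY_EMPTY_ELEMENTDATA
        int growths = 0;
        for (int size = 0; size < n; size++) {
            int minCapacity = (capacity == 0) ? Math.max(10, size + 1) : size + 1;
            if (minCapacity > capacity) {
                capacity = nextCapacity(capacity, minCapacity);
                growths++;
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        System.out.println(nextCapacity(10, 11)); // 15
        System.out.println(growthsFor(1000));     // 13

        // Pre-sizing allocates the array once, up front.
        List<Integer> presized = new ArrayList<>(1000);
        for (int i = 0; i < 1000; i++) {
            presized.add(i);
        }
        System.out.println(presized.size());      // 1000
    }
}
```

In this model, adding 1000 elements to a default-constructed list allocates 13 arrays (capacities 10, 15, 22, 33, and so on up to 1234), whereas the pre-sized list allocates one.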

add(int index, E element) source code analysis

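// System.arraycopy is a native method, implemented in native code rather than in Java.
public static native void arraycopy(Object src,  // source array
                                    int  srcPos, // starting position in the source array
                                    Object dest, // destination array (the source is copied into it)
                                    int destPos, // starting position in the destination array
                                    int length   // number of elements to copy
                                    );

public void add(int index, E element) {
        // Check that the index is within range
        rangeCheckForAdd(index);
        // Make sure the array can hold one more element than the current size
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        // At this point the capacity is sufficient.
        // Copy size - index elements starting at index (the element at index and
        // everything after it) to the position starting at index + 1.
        // After the copy, the slots at index and index + 1 hold the same element.
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        // Overwrite the slot at index with the new element.
        elementData[index] = element;
        // Increment the element count.
        size++;
}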
  • This requires a call to System.arraycopy() to shift the element at index and every element after it to the position starting at index + 1. The time complexity is O(N), so adding an element at the head of an ArrayList is very expensive.
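
The shifting step can be tried on a plain array. The sketch below assumes the array already has spare room, so the capacity check is left out; InsertShiftDemo and insert are names made up for this example.

```java
import java.util.Arrays;

class InsertShiftDemo {
    // Inserts value at index the way add(int, E) does: shift the tail right, then overwrite.
    // Assumes size < data.length, so no growth is needed.
    static int insert(int[] data, int size, int index, int value) {
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
        return size + 1;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 0, 0}; // four elements in use, two spare slots
        int size = insert(data, 4, 0, 9);
        System.out.println(Arrays.toString(data) + " size=" + size);
        // [9, 1, 2, 3, 4, 0] size=5
    }
}
```

Inserting at index 0 moves all four existing elements; inserting at index size moves none, which is why appending is cheap and inserting at the head is not.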

remove(int index) source code analysis

public E remove(int index) {
        // Check the index
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            // Same principle as add(int index, E element), in the other direction.
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        // Drop the stale reference so the removed element can be garbage collected
        // once nothing else refers to it.
        elementData[--size] = null; // clear to let GC do its work
        // Return the old element
        return oldValue;
}
  • This requires a call to System.arraycopy() to shift every element from index + 1 onward down to the position starting at index. The time complexity is O(N), so removing an element at the head of an ArrayList is very expensive.
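
The same shift, in the opposite direction, can be reproduced on a plain array. This is a minimal sketch; RemoveShiftDemo and removeAt are hypothetical names and no range check is performed.

```java
import java.util.Arrays;

class RemoveShiftDemo {
    // Removes the element at index the way remove(int) does:
    // shift the tail left, then clear the slot that is no longer in use.
    static int removeAt(String[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[size - 1] = null; // drop the stale reference
        return size - 1;
    }

    public static void main(String[] args) {
        String[] data = {"a", "b", "c", "d", null};
        int size = removeAt(data, 4, 0);
        System.out.println(Arrays.toString(data) + " size=" + size);
        // [b, c, d, null, null] size=3
    }
}
```

The array length stays at 5 after the removal, which matches the observation that removing elements does not shrink the capacity.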

Fail-Fast mechanism

The fail-fast mechanism is an error-detection mechanism in the Java collections framework. If a collection is structurally modified while it is being iterated, the iterator may fail fast, that is, throw ConcurrentModificationException. The mechanism does not guarantee that an exception is thrown under unsynchronized modification; it only does so on a best-effort basis, so it should be used only to detect bugs.

  • A structural modification is any operation that adds or removes at least one element or otherwise resizes the internal array. Merely setting the value of an element is not a structural modification.
  • During operations such as serialization or iteration, modCount is compared before and after the operation; if it has changed, ConcurrentModificationException is thrown.
private class Itr implements Iterator<E> {
        int cursor;
        int lastRet = -1;
        // The expected modification count starts out equal to the current modCount
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        public E next() {
            // Check that expectedModCount equals modCount; if not, throw ConcurrentModificationException
            checkForComodification();
            /** code omitted here */
        }

        public void remove() {
            if (this.lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
            /** code omitted here */
        }

        final void checkForComodification() {
            if (ArrayList.this.modCount == this.expectedModCount)
                return;
            throw new ConcurrentModificationException();
        }
}

An example of fail-fast in a single-threaded environment

     public static void main(String[] args) {
           List<String> list = new ArrayList<>();
           for (int i = 0 ; i < 10 ; i++ ) {
                list.add(i + "");
           }
           Iterator<String> iterator = list.iterator();
           int i = 0 ;
           while(iterator.hasNext()) {
                if (i == 3) {
                     list.remove(3);
                }
                System.out.println(iterator.next());
                i ++;
           }
     }
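
For comparison, the sketch below contrasts removal through the list, which triggers the check, with removal through the iterator, which does not. The class FailFastDemo and its methods are names invented for this example.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    static List<String> numbers() {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            list.add(i + "");
        }
        return list;
    }

    // Removing through the list changes modCount behind the iterator's back,
    // so the following call to next() throws.
    static boolean removeViaListThrows() {
        List<String> list = numbers();
        try {
            for (String s : list) {
                if (s.equals("3")) {
                    list.remove(s);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator keeps its expected count in step with modCount.
    static List<String> removeViaIterator() {
        List<String> list = numbers();
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals("3")) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeViaListThrows()); // true
        System.out.println(removeViaIterator());   // [0, 1, 2, 4, 5, 6, 7, 8, 9]
    }
}
```

Because detection is best effort, code should not rely on the exception being thrown; Iterator.remove() is the supported way to remove elements during iteration.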

Serialization

ArrayList implements the java.io.Serializable interface but defines its own serialization and deserialization. Because ArrayList is array-based and grows dynamically, not every slot of the backing array is necessarily in use, so there is no need to serialize the whole array. The elementData array is therefore marked transient, which prevents it from being serialized automatically.

private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException{
        // Write out element count, and any hidden stuff
        int expectedModCount = modCount;
        // Write the non-static, non-transient fields of this class to the stream.
        // The size field is written here as well.
        s.defaultWriteObject();

        // Write out size as capacity for behavioural compatibility with clone()
        // The element count is written for backward compatibility,
        // so size ends up in the stream twice.
        s.writeInt(size);

        // Write out all elements in the proper order.
        // Only the slots that hold elements are written, in order;
        // the unused capacity of the array is not written.
        for (int i=0; i<size; i++) {
            s.writeObject(elementData[i]);
        }
        // Check whether fail-fast should be triggered
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
}
private void readObject(java.io.ObjectInputStream s)
        throws java.io.IOException, ClassNotFoundException {
        // Point elementData at the shared empty array.
        elementData = EMPTY_ELEMENTDATA;

        // Read in size, and any hidden stuff
        // Read the non-static, non-transient fields from the stream into this object,
        // including size.
        s.defaultReadObject();

        // Read in capacity
        // Read the element count. It is not used, but because it was written on
        // output, it has to be read here to keep the stream in order.
        s.readInt(); // ignored

        if (size > 0) {
            // be like clone(), allocate array based upon size not capacity
            // Compute the capacity from size.
            int capacity = calculateCapacity(elementData, size);
            // SharedSecrets is a "shared secrets" repository: a mechanism for calling
            // implementation-private methods in another package without reflection. TODO
            SharedSecrets.getJavaOISAccess().checkArray(s, Object[].class, capacity);
            // Check whether the array needs to grow
            ensureCapacityInternal(size);

            Object[] a = elementData;
            // Read in all elements in the proper order.
            // Read the elements into the array one by one
            for (int i=0; i<size; i++) {
                a[i] = s.readObject();
            }
        }
}

Why is the ArrayList size serialized twice?

The call to s.defaultWriteObject() in the code above already serializes size, so why is size written again separately afterwards?
This is done for compatibility.
Older versions of the JDK implemented ArrayList differently and serialized a length field.
Newer versions of the JDK optimized the ArrayList implementation and no longer serialize that length field.
If s.writeInt(size) were removed, objects serialized by a newer JDK could not be read correctly by an older one,
because the length field would be missing.
So the write that looks redundant actually preserves compatibility.
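
The custom writeObject and readObject pair is invoked automatically by the object streams, so its effect can be checked with a simple round trip. The sketch below keeps everything in memory; SerializationRoundTrip and roundTrip are names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class SerializationRoundTrip {
    // Serializes the list to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static List<String> roundTrip(ArrayList<String> list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(list); // ends up in ArrayList.writeObject
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (List<String>) in.readObject(); // ends up in ArrayList.readObject
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>(1000); // large capacity, few elements
        list.add("a");
        list.add("b");
        List<String> copy = roundTrip(list);
        System.out.println(copy);              // [a, b]
        System.out.println(copy.equals(list)); // true
    }
}
```

Only the two stored elements travel through the stream; the 998 unused slots of the original backing array are not written.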

Summary

  • ArrayList is array-based and has no preset capacity limit (it grows as needed).
  • Adding elements may trigger growth, so it is best to estimate the capacity up front. Removing elements does not shrink the capacity; use trimToSize() if you want to shrink it. When an element is removed, the slot that is no longer in use is set to null, so a later garbage collection can reclaim the element once nothing else refers to it.
  • Not thread-safe.
  • add(int index, E element): when an element is added at a specified position, the element at that position and all elements after it must be shifted back by one.
  • get(int index): the element at the specified position is read directly by index (O(1)).
  • remove(Object o) needs to traverse the array to find the element.
  • remove(int index) does not need to traverse the array to find the element; it only checks that the index is valid, so it is more efficient than remove(Object o).
  • contains(Object o) needs to traverse the array.
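
One practical consequence of having both remove(int index) and remove(Object o) shows up with a List<Integer>, where an int argument selects the index overload. The following is a small sketch; RemoveOverloadDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloadDemo {
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, 30, 1));
    }

    // An int argument picks remove(int index): no search, only a range check and a shift.
    static List<Integer> removeByIndex() {
        List<Integer> list = sample();
        list.remove(1);
        return list;
    }

    // An Integer argument picks remove(Object o): the array is scanned for an equal element.
    static List<Integer> removeByValue() {
        List<Integer> list = sample();
        list.remove(Integer.valueOf(1));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // [10, 30, 1]
        System.out.println(removeByValue()); // [10, 20, 30]
    }
}
```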

Origin www.cnblogs.com/neverth/p/11786048.html