Chapter 6: Java Collections, ArrayList Source Code Analysis

Preface

Starting with this chapter, we move on to the Java collections framework. We will cover most of the collection family, including the commonly used ArrayList, LinkedList, HashSet, HashMap, TreeSet, and TreeMap, together with their iterators (Iterator, ListIterator) and the Collections utility class. After that there may be a brief look at the less commonly used collections, followed by a comparison of the strengths and weaknesses of each. The goal is a thorough understanding of the collections part of Java, so that we know which collection to pick when writing code.

ArrayList structure diagram

[Figure: ArrayList structure diagram]
ArrayList is used very often, and its source code is not long. As the diagram shows, it can be divided into the following parts.

(1) Properties

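	// Default capacity of an ArrayList
	private static final int DEFAULT_CAPACITY = 10;
	
	// Shared empty array maintained internally, used for empty instances
	private static final Object[] EMPTY_ELEMENTDATA = {};
	
	// Shared default empty array, used by the no-argument constructor
	private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
	
	// The growable backing array; the core field of ArrayList
	transient Object[] elementData;
	
	// Number of elements in the container; not the same as elementData.length
	private int size;
	
	// Number of times the container has been structurally modified (inherited from AbstractList)
	protected transient int modCount = 0;
	
	// Maximum array length that ArrayList will allocate
	private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;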

(2) Constructor

// No-argument constructor
ArrayList()	

// Constructor with a specified initial capacity
ArrayList(int initialCapacity)

// Constructs an ArrayList from the given collection
ArrayList(Collection<? extends E> c)

(3) Expansion mechanism

ensureCapacity(int minCapacity)
ensureCapacityInternal(int minCapacity)
ensureExplicitCapacity(int minCapacity)
grow(int minCapacity)
hugeCapacity(int minCapacity)

(4) Common API

a. Add

clone()
add(E e)
add(int index, E element)
addAll(Collection<? extends E> c)
addAll(int index, Collection<? extends E> c)

b. Delete

clear()
trimToSize()
remove(int index)
remove(Object o)
fastRemove(int index)
removeRange(int fromIndex, int toIndex)
removeAll(Collection<?> c)
retainAll(Collection<?> c)
batchRemove(Collection<?> c, boolean complement)

c. Change

set(int index, E element)

d. Check

size()
isEmpty()
get(int index)
indexOf(Object o)
lastIndexOf(Object o)

e. Judgment

contains(Object o)

f. Conversion

toArray()
toArray(T[] a)

(5) Inner class

 Iterators: Itr
		 ListItr
		 
 Collection: SubList

Source code analysis

ArrayList is one of the collections used most often in everyday coding. Internally it maintains an array, elementData, and every operation works on that array. In this respect it resembles StringBuffer and StringBuilder: the underlying storage is also an array, with an expansion mechanism and a set of APIs built on top. The difference is that ArrayList is more general and can hold objects of any single element type. The sections below focus on its expansion mechanism, its commonly used APIs, and its iterators: why an expansion mechanism is needed, how it is implemented, how the iterators work, and what they can do.
ArrayList is defined as a container, so it exposes many public APIs, classified above as [Add], [Delete], [Check], [Change], [Judgment], and [Conversion]. Let's start with the APIs and introduce the expansion mechanism along the way.

(1) Add and the expansion mechanism

clone()

The clone method calls the native clone method of the parent class (ultimately Object.clone()), which produces a shallow copy. Shallow versus deep copying will get an article of its own later, so I won't go into it here. The source code is posted below:

/**
 * Returns a shallow copy of this <tt>ArrayList</tt> instance.  (The
 * elements themselves are not copied.)
 *
 * @return a clone of this <tt>ArrayList</tt> instance
 */
public Object clone() {
    try {
        ArrayList<?> v = (ArrayList<?>) super.clone();   // calls the parent's native clone method
        v.elementData = Arrays.copyOf(elementData, size);
        v.modCount = 0;
        return v;
    } catch (CloneNotSupportedException e) {
        // this shouldn't happen, since we are Cloneable
        throw new InternalError(e);
    }
}
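
What "shallow" means in practice can be seen in a small sketch. The class name CloneDemo and the helper run() are made up for this illustration; only ArrayList.clone() itself comes from the source above. The clone gets its own backing array, so adding to it does not affect the original, but both lists still point at the same element objects.

```java
import java.util.ArrayList;

class CloneDemo {
    // Clones a list of mutable elements, then changes the copy in two ways.
    // Returns {original.size(), copy.size(), original.get(0).toString()}.
    static Object[] run() {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("a"));

        @SuppressWarnings("unchecked")
        ArrayList<StringBuilder> copy = (ArrayList<StringBuilder>) original.clone();

        copy.add(new StringBuilder("b"));  // structural change: only the copy grows
        copy.get(0).append("!");           // element change: the object is shared by both lists

        return new Object[] { original.size(), copy.size(), original.get(0).toString() };
    }

    public static void main(String[] args) {
        Object[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);  // prints: 1 2 a!
    }
}
```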

add(E e), add(int index, E element)

Both of these methods add an element. The difference is the position:
add(E e) appends the element at the end of the list.
add(int index, E element) inserts the element at position index, and the elements from that position onward shift back by one.
Both methods change the internal structure of the ArrayList and increase its length. The source code is analyzed below:

add(E e)

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!  // make sure the array is long enough
    elementData[size++] = e;			// store the element in the first free slot at the end
    return true;
 }

add(int index, E element)

public void add(int index, E element) {
    rangeCheckForAdd(index);  // check that the index is within bounds

    ensureCapacityInternal(size + 1);  // Increments modCount!!   // make sure the array is long enough
    System.arraycopy(elementData, index, elementData, index + 1,  // shift the elements from index onward back by one
                     size - index);
    elementData[index] = element;
    size++;
}	
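
The shifting step in add(int index, E element) can be tried on a plain array. This is only a sketch: InsertShiftDemo and insertAt are hypothetical names, an int[] stands in for elementData, and the array is assumed to have a free slot already (the job ensureCapacityInternal does in the real class).

```java
import java.util.Arrays;

class InsertShiftDemo {
    // Inserts value at index among the first `size` slots of data,
    // moving the tail one slot to the right. Assumes data.length > size.
    static void insertAt(int[] data, int size, int index, int value) {
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 4, 5, 0};  // 4 elements in use, capacity 5
        insertAt(data, 4, 2, 3);
        System.out.println(Arrays.toString(data));  // prints: [1, 2, 3, 4, 5]
    }
}
```

The cost of an insert is proportional to size - index, which is why inserting near the front of a long list is slower than appending.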

Both methods call the expansion method ensureCapacityInternal(size + 1). Let's now analyze the expansion mechanism of ArrayList.

Expansion mechanism

The expansion mechanism exists because of element addition, that is, the add methods. Expansion only happens when the current capacity is not enough. The other functions, [Change], [Delete], [Check], [Judgment], [Conversion] and so on, never trigger it.

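Public API, manual expansion:  ensureCapacity(int minCapacity)
Automatic expansion: ensureCapacityInternal(int minCapacity)

Expansion check: ensureExplicitCapacity(int minCapacity)
	     
Expansion core:  grow(int minCapacity)  // small and medium capacities
Expansion core:  hugeCapacity(int minCapacity)  // very large capacities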

Five methods take part in expansion, and the last two are the core of the implementation. Let's analyze them from the outside in:

ensureCapacity(int minCapacity)

This method is public, so the caller can use it to expand the capacity manually.

public void ensureCapacity(int minCapacity) {

// Compute the minimum expansion. If the list was created with the no-argument constructor
// and still uses the default empty array, the minimum expansion is 10.
// For a list created any other way, the minimum expansion is 0.

    int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
        // any size if not default element table
        ? 0
        // larger than default for default empty table. It's already
        // supposed to be at default size.
        : DEFAULT_CAPACITY;

    if (minCapacity > minExpand) {	// manual expansion only takes effect when the argument > the minimum expansion
        ensureExplicitCapacity(minCapacity);
    }
}

Demo
Use reflection to inspect the value of the private elementData field.

    // Written against JDK 8. On Java 9 and later, reflective access to java.util
    // internals is restricted, so setAccessible may fail unless the package is opened.
    ArrayList list = new ArrayList();
    Class c = list.getClass();
    ArrayList list2 = (ArrayList) c.newInstance();
    Field f = c.getDeclaredField("elementData");
    f.setAccessible(true);
    list2.add(1);
    list2.ensureCapacity(100);  // expand manually
    Object[] element = (Object[]) f.get(list2);
    System.out.println(list2.size());
    System.out.println(element.length);
    
    Output: 1
            100

ensureCapacityInternal(int minCapacity)

This method is called internally by ArrayList and amounts to automatic expansion. Whenever another method needs more capacity, it calls this method. The expansion rules can be read from its source code.

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {	    // the list still uses the default empty array of the no-argument constructor
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);	// the minimum capacity is then at least 10
    }

    ensureExplicitCapacity(minCapacity);
}

The method above computes the minimum capacity the container needs.

ensureExplicitCapacity(int minCapacity)

This method checks the condition that decides whether an expansion is triggered.

  // About modCount, the class comment at the top of ArrayList says:
  
  A structural modification is
 * any operation that adds or deletes one or more elements, or explicitly
 * resizes the backing array; merely setting the value of an element is not
 * a structural modification.

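private void ensureExplicitCapacity(int minCapacity) {
    modCount++;		// Counts the structural modifications of this ArrayList. Why is it incremented before the capacity check?
    				// Because modCount goes up by 1 even when no expansion happens: the array length
    				// may stay the same, but the number of elements has changed (add or delete).

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)  // trigger condition: required minimum capacity > current capacity
        grow(minCapacity);
}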

grow(int minCapacity)

The core implementation of the expansion:

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);   // the new capacity is about 1.5 times the old one
    if (newCapacity - minCapacity < 0)					  // if 1.5 times is still less than the required minimum
        newCapacity = minCapacity;				 		  // use the required minimum as the new capacity

    // As shown above, an expansion is not always 1.5 times the old capacity.

    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);			// if the new capacity exceeds MAX_ARRAY_SIZE (Integer.MAX_VALUE - 8), recompute it
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

hugeCapacity(int minCapacity)

The capacity calculation for very large lists:

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?   // only two possible results: Integer.MAX_VALUE (2^31 - 1) or MAX_ARRAY_SIZE (2^31 - 9)
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
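
To see the numbers that grow and hugeCapacity produce, the arithmetic can be copied into a standalone sketch. GrowthDemo and nextCapacity are names invented for this illustration, and nothing is allocated here; the method only computes the capacity that the JDK 8 code shown above would choose.

```java
class GrowthDemo {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Same logic as hugeCapacity in the source above.
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    // The capacity arithmetic of grow, without the array copy.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        return newCapacity;
    }

    public static void main(String[] args) {
        // Grow one element at a time, starting from the default capacity.
        int capacity = 10;
        for (int i = 0; i < 5; i++) {
            System.out.print(capacity + " ");  // prints: 10 15 22 33 49
            capacity = nextCapacity(capacity, capacity + 1);
        }
        System.out.println();
    }
}
```

The comparisons are written as subtractions on purpose. With oldCapacity = 2,000,000,000 the 1.5 times value overflows int and becomes negative, yet newCapacity - MAX_ARRAY_SIZE still comes out positive, so the request is routed to hugeCapacity and capped at MAX_ARRAY_SIZE.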

Going back to the [Add] function, there are four main APIs:
add(E e)
add(int index, E element)
addAll(Collection<? extends E> c)
addAll(int index, Collection<? extends E> c)
The first two add a single element, and the last two add many at once. The extra capacity needed can therefore be anything from 0 (an empty argument collection) up to a very large number.
With the expansion mechanism understood, the methods that add elements to the container are straightforward. add() was covered above, so let's look at addAll().

addAll(Collection<? extends E> c)
This method appends all elements of the collection c to the current container, starting at the end.

public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

addAll(int index, Collection<? extends E> c)
This method also adds all elements of the collection c to the current container, but the insertion starts at the specified index, and the existing elements from that index onward are shifted back.

public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}
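
A short usage sketch of the two addAll overloads follows. AddAllDemo and its method names are made up for the example; the calls themselves are the public ArrayList API. Note the return value: numNew != 0 means addAll reports false when the argument collection is empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AddAllDemo {
    static List<Integer> insertAndAppend() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 5));
        list.addAll(2, Arrays.asList(3, 4));  // insert before the element at index 2
        list.addAll(Arrays.asList(6, 7));     // append at the end
        return list;
    }

    static boolean addNothing() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2));
        return list.addAll(Collections.<Integer>emptyList());  // nothing to add
    }

    public static void main(String[] args) {
        System.out.println(insertAndAppend());  // prints: [1, 2, 3, 4, 5, 6, 7]
        System.out.println(addNothing());       // prints: false
    }
}
```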

(2) Delete and trim

From the add methods and the expansion mechanism we know that when elements are added and the capacity is insufficient, the capacity is expanded. For the opposite case, when elements have been deleted and the capacity is too large, ArrayList provides a way to trim the capacity. A capacity much larger than the number of elements occupies memory that is never used, which wastes space.
Let's look at deleting elements and trimming capacity, the counterparts of adding elements and expanding capacity.

trimToSize()    // trim the capacity to the number of elements currently stored
clear()			// remove all elements
remove(int index)   
remove(Object o)
fastRemove(int index)
removeRange(int fromIndex, int toIndex)
removeAll(Collection<?> c)
retainAll(Collection<?> c)
batchRemove(Collection<?> c, boolean complement)

trimToSize()

public void trimToSize() {
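    modCount++;   // changing the capacity is a structural modification, so modCount is incremented
    if (size < elementData.length) {   // trim only when size is smaller than the capacity; the new capacity equals size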
        elementData = (size == 0)
          ? EMPTY_ELEMENTDATA
          : Arrays.copyOf(elementData, size);
    }
}

clear()

public void clear() {
    modCount++;  // removing elements is a structural modification

    // clear to let GC do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;  // drop every reference held by the array so the objects become eligible for GC

    size = 0;
}

remove(int index)

public E remove(int index) {
    rangeCheck(index);  // check that the index is within bounds

    modCount++;
    E oldValue = elementData(index);

	// removing the element at index forces every later element to move forward by one
    int numMoved = size - index - 1; 
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;  // return the removed element
}

remove(Object o)

public boolean remove(Object o) {
// The list may contain null. Non-null objects are compared with equals(),
// but equals() cannot be called on null, so null is handled in a separate branch.
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
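
The two remove overloads are easy to mix up on a list of Integer, because an int argument selects remove(int index) while an Integer argument selects remove(Object o). The sketch below uses hypothetical names (RemoveOverloadDemo, removeByIndex, removeByValue) to show the difference; it also shows that remove(Object o) deletes only the first match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloadDemo {
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 1, 1));
        list.remove(1);                   // int argument: removes the element at index 1 (the 20)
        return list;
    }

    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 1, 1));
        list.remove(Integer.valueOf(1));  // Object argument: removes the first element equal to 1
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex());  // prints: [10, 1, 1]
        System.out.println(removeByValue());  // prints: [10, 20, 1]
    }
}
```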

removeAll(Collection<?> c), retainAll(Collection<?> c) and batchRemove(Collection<?> c, boolean complement)

removeAll(Collection<?> c)

 // removeAll(Collection<?> c) delegates to batchRemove(Collection<?> c, boolean complement),
 // so batchRemove is analyzed in detail below.
 public boolean removeAll(Collection<?> c) {
    Objects.requireNonNull(c);
    return batchRemove(c, false);
}

retainAll(Collection<?> c)

 public boolean retainAll(Collection<?> c) {
    Objects.requireNonNull(c);
    return batchRemove(c, true);
}

batchRemove(Collection<?> c, boolean complement)
It has two parameters. The first is of the Collection interface type, so a List, Set, Vector, Queue, and so on can be passed in.
The second parameter, complement, says which elements are kept: those contained in c (true) or those not contained in c (false).
Together, the two parameters decide whether to keep only the elements that are in the collection c (retainAll) or to remove all the elements that are in c (removeAll).

private boolean batchRemove(Collection<?> c, boolean complement) {
    final Object[] elementData = this.elementData;
    int r = 0, w = 0;
    boolean modified = false;
    
    try {
        for (; r < size; r++)
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];  // the array length stays the same; the elements to keep are packed from index 0
    } finally {
        // Preserve behavioral compatibility with AbstractCollection,
        // even if c.contains() throws.
        // If c.contains() threw, copy the unprocessed elements after the kept ones.
        if (r != size) {
            System.arraycopy(elementData, r,
                             elementData, w,
                             size - r);
            w += size - r;
        }

        // After packing, the number of kept elements <= the original number of elements,
        // so the leftover slots at the tail must be cleared.
        if (w != size) {
            // clear to let GC do its work
            for (int i = w; i < size; i++)
                elementData[i] = null;
            modCount += size - w;
            size = w;
            modified = true;
        }
    }
    return modified;  // false if no element was removed and the list is unchanged
}
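
The read index r and write index w in batchRemove form a two-pointer compaction. The sketch below reproduces only that part on a plain Object[]; CompactDemo and compact are hypothetical names, and the exception handling and modCount bookkeeping of the real method are left out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

class CompactDemo {
    // Keeps the elements whose membership in c equals `complement`,
    // packs them at the front of data, nulls the tail, and returns the new size.
    static int compact(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0;
        for (int r = 0; r < size; r++) {
            if (c.contains(data[r]) == complement) {
                data[w++] = data[r];
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null;
        }
        return w;
    }

    public static void main(String[] args) {
        Collection<Integer> c = new HashSet<>(Arrays.asList(2, 4));

        Object[] a = {1, 2, 3, 4, 5};
        int removed = compact(a, 5, c, false);  // like removeAll(c)
        System.out.println(removed + " " + Arrays.toString(a));   // prints: 3 [1, 3, 5, null, null]

        Object[] b = {1, 2, 3, 4, 5};
        int retained = compact(b, 5, c, true);  // like retainAll(c)
        System.out.println(retained + " " + Arrays.toString(b));  // prints: 2 [2, 4, null, null, null]
    }
}
```

Because each element is read once and written at most once, the whole pass is a single sweep over the array, instead of one shift per removed element.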

(3) Change


set()
Replacing an element does not change the structure of the list, so this method does not increment modCount.

 public E set(int index, E element) {
    rangeCheck(index);

    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;  // return the element that was replaced
}

(4) Check

For lookups, ArrayList provides query APIs along three dimensions: length, element, and index. The source code follows the same ideas as String and is relatively simple, so I won't go into details here.

size()
isEmpty()
get(int index)
indexOf(Object o)
lastIndexOf(Object o)

(5) Judgment


contains(Object o)
contains simply calls indexOf internally and checks that the result is not negative, so it is relatively simple.

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}

(6) Conversion

toArray()
toArray(T[] a)   // I have not fully understood this method yet

toArray()

public Object[] toArray() {
    return Arrays.copyOf(elementData, size);
}
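
For toArray(T[] a), the documented contract of the method can be checked with a small experiment. ToArrayDemo and copyInto are names made up for this sketch. If the array passed in is too small, a new array of the same runtime type is allocated and returned; if it is large enough, the elements are copied into it, and when there is room to spare the slot right after the last element is set to null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToArrayDemo {
    // Copies a fixed two-element list into the given target array.
    static String[] copyInto(String[] target) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        return list.toArray(target);
    }

    public static void main(String[] args) {
        String[] small = new String[0];
        String[] fromSmall = copyInto(small);  // too small: a new String[2] comes back
        System.out.println((fromSmall == small) + " " + Arrays.toString(fromSmall));  // prints: false [a, b]

        String[] big = {"x", "x", "x", "x"};
        String[] fromBig = copyInto(big);      // large enough: filled in place
        System.out.println((fromBig == big) + " " + Arrays.toString(big));            // prints: true [a, b, null, x]
    }
}
```

Compared with toArray(), which always returns Object[], this overload gives back an array of the caller's element type, so no cast of the individual elements is needed.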

(7) Iterator

The following focuses on the inner classes. There are three of them: two iterators and SubList, a class that represents a sub-collection of the ArrayList.

 Iterators: Itr
		 ListItr
		 
 Collection: SubList

ArrayList contains two iterators, plus the SubList class, which behaves like a sub-collection.
Itr implements Iterator and can check for and move to the next element. ListItr is a subclass of Itr that implements ListIterator. In addition to what it inherits from Itr, it can also check for and move to the previous element.
The details are as follows:
Itr

hasNext()
next()
remove()

ListItr

hasNext()
next()
remove()
hasPrevious()
previous()
set()
add()

Comparing the two iterators, we can see that ListItr adds more functions on top of its parent class to support the various operations on an ArrayList.
Itr maintains the following three fields, and ListItr inherits them:

	int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

expectedModCount = modCount means that when the caller creates an iterator, the collection's current modification count is copied into expectedModCount. From that point on the iterator has taken over the collection: only structural modifications made through the iterator itself are allowed, and modifications made from outside the iterator (including through the collection's own methods) are not.

Programmer: "I have a collection, and I am handing it over to you."
Iterator: "Don't worry, leave it to me."

Therefore, once a collection has been handed over to its own iterator, operations that change the structure of the collection should no longer be called on the collection directly. Operations that involve structural modification include:

add(), addAll(), remove() and so on. These operations change the value of modCount, but control over that value has already been handed to the iterator.

So if the collection is modified from outside while an iterator is traversing it, a ConcurrentModificationException is thrown. The iterator compares modCount with expectedModCount on calls such as next() and remove(); if the two values differ, it fails fast.
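
The fail-fast behavior can be reproduced in a few lines. FailFastDemo and its two methods are hypothetical names for this sketch. The first method removes through the list while a for-each loop (which uses an iterator behind the scenes) is running; the second removes through the iterator, which keeps expectedModCount in step with modCount.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Returns true if modifying the list during iteration was detected.
    static boolean removeThroughList() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer x : list) {
                if (x == 1) list.remove(x);  // remove(Object): modCount changes behind the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;                     // the following next() noticed the mismatch
        }
    }

    // Removes the even numbers safely through the iterator.
    static List<Integer> removeThroughIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) it.remove();
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeThroughList());      // prints: true
        System.out.println(removeThroughIterator());  // prints: [1, 3]
    }
}
```

The check is a best-effort bug detector, not a synchronization mechanism: it only runs when the iterator is used again, so code should not rely on the exception for correctness.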

(8) SubList class

The iterator exists to meet the need for traversing a collection, and it makes operating on the collection very convenient. The SubList class exists to cover a different need: working on only part of the collection.
As in the earlier analysis of the String source code, a String wraps an array and offers various operations on it, including operations on substrings. If an array is long but the data of interest covers only part of it, there is no need to traverse the whole array, so both String and ArrayList provide a way to limit a query or traversal to a range.
The String class provides substring; in the same way, SubList is an inner class that supports operations on a part of an ArrayList.

public List<E> subList(int fromIndex, int toIndex) {
    subListRangeCheck(fromIndex, toIndex, size);
    return new SubList(this, 0, fromIndex, toIndex);
}

subList is a public API of ArrayList that returns a sub-collection of the list. The parameters fromIndex (inclusive) and toIndex (exclusive) give the range to take.
Like ArrayList, SubList extends AbstractList and implements almost the same functions as ArrayList, including iterators.

Part of the source code

	private class SubList extends AbstractList<E> implements RandomAccess {
    private final AbstractList<E> parent;
    private final int parentOffset;
    private final int offset;
    int size;

    SubList(AbstractList<E> parent,
            int offset, int fromIndex, int toIndex) {
        this.parent = parent;
        this.parentOffset = fromIndex;
        this.offset = offset + fromIndex;
        this.size = toIndex - fromIndex;
        this.modCount = ArrayList.this.modCount;
    }

    public E set(int index, E e) {
        rangeCheck(index);
        checkForComodification();
        E oldValue = ArrayList.this.elementData(offset + index);
        ArrayList.this.elementData[offset + index] = e;
        return oldValue;
    }

    public E get(int index) {
        rangeCheck(index);
        checkForComodification();
        return ArrayList.this.elementData(offset + index);
    }
    ...
     // iterator of the SubList
     public Iterator<E> iterator() {
        return listIterator();
    }

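Each method of SubList performs the corresponding operation on the sub-range only. SubList does not copy elements; it reads and writes the parent's elementData through an offset, so it narrows the range of a traversal without extra copying.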
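
Since SubList works directly on the parent's array, it behaves as a view rather than a copy. The sketch below uses invented names (SubListDemo, writeThrough, parentChangeInvalidatesView) to show two consequences: changes made through the sub-list appear in the parent, and a structural change made directly on the parent makes the existing sub-list unusable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class SubListDemo {
    static List<Integer> writeThrough() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        List<Integer> sub = list.subList(1, 3);  // view of indexes 1 and 2
        sub.set(0, 10);                          // parent is now [0, 10, 2, 3, 4]
        sub.clear();                             // removes that range from the parent
        return list;
    }

    static boolean parentChangeInvalidatesView() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        List<Integer> sub = list.subList(1, 3);
        list.add(5);                             // structural change made on the parent
        try {
            sub.get(0);                          // checkForComodification() runs first
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThrough());                 // prints: [0, 3, 4]
        System.out.println(parentChangeInvalidatesView());  // prints: true
    }
}
```

If an independent copy is wanted, wrapping the view as new ArrayList<>(list.subList(from, to)) copies the range into a list of its own.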

Summary

That concludes the source code analysis of ArrayList. Please point out anything that is wrong. Some of the explanations above are my personal understanding and are meant only as a learning reference. ArrayList is one of the most common collections, and the collection family has many more members. The plan is to introduce the whole family: the commonly used collections will each be analyzed separately, and the less commonly used ones will be analyzed in groups.

Origin blog.csdn.net/weixin_43901067/article/details/104757254