Collection series---ArrayList

foreword

In this category, we will write about collections in Java. Collection is a very important and basic content in Java, because what is essential for any data is how the data is stored, and the function of collection is to organize and store data in a certain way. Some of the collections written here are relatively common, and some are not commonly used, but I have seen them personally. For some relatively similar collections (such as HashMap and Hashtable), only one is mentioned, and the difference between them can be highlighted.

Finally, I want to point out that for sets, I think there are four main points of concern:

1. Whether to allow empty

2. Whether to allow duplicate data

3. Is it orderly? Order means whether the order in which the data is read is consistent with the order in which the data is stored.

4. Is it thread safe?

 

ArrayList

ArrayList is the most common and familiar collection class for every Java developer. As the name suggests, ArrayList is a collection implemented in the form of an array . Let’s take a look at the basic elements in ArrayList in a table:

element Use
private transient Object[] elementData; ArrayList is an implementation based on arrays, and elementData is the underlying array
private int size; The number of elements in the ArrayList, it should be noted here that the size is incremented or decremented according to the number of calls to the add and remove methods, so if a null is added into the ArrayList, the size will also increase by 1 

 

Four Focus Answers on ArrayList

In the future, before explaining the code, each article will first give an answer in the form of a table to the four points of concern for a set:

focus point in conclusion
Does ArrayList allow null allow
Does ArrayList allow duplicate data allow
Is the ArrayList ordered? orderly
Is ArrayList thread safe Not thread safe

 

add element

There is this piece of code:

public static void main(String[] args)
{
    List<String> list = new ArrayList<String>();
    list.add("000");
    list.add("111");
}

To see what the bottom layer will do, enter the source code of the add method to see:

1 public boolean add(E e) {
2 ensureCapacity(size + 1);  // Increments modCount!!
3 elementData[size++] = e;
4 return true;
5 }

Ignore the ensureCapacity method in line 2. This method is used for capacity expansion. In fact, the bottom layer only adds a piece of data to a certain position of elementData when calling the add method. If it is represented by a picture, it looks like this:

One more thing, my drawing is somewhat misleading. The elements stored in elementData should be references to elements in the heap memory, not the actual elements. This drawing gives the impression that the actual elements are stored in the elementData array, which is not very rigorous. However, this drawing is mainly for the sake of convenience, as long as you know the problem.

 

Expansion

Let's take a look, when constructing an ArrayList, the default underlying array size is 10:

public ArrayList() {
this(10);
}

Then there is a problem, what should I do if the size of the underlying array is not enough? The answer is expansion, which is why it is always said that the bottom layer of ArrayList is implemented based on dynamic arrays . Dynamic array means that the size of the underlying array is not fixed, but a judgment is made according to the size of the added elements. Dynamic expansion, the expansion code is in ensureCapacity:

复制代码
public void ensureCapacity(int minCapacity) {
modCount++;
int oldCapacity = elementData.length;
if (minCapacity > oldCapacity) {
    Object oldData[] = elementData;
    int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
    newCapacity = minCapacity;
           // minCapacity is usually close to size, so this is a win:
           elementData = Arrays.copyOf(elementData, newCapacity);
}
}
复制代码

看到扩容的时候把元素组大小先乘以3,再除以2,最后加1。可能有些人要问为什么?我们可以想:

1、如果一次性扩容扩得太大,必然造成内存空间的浪费

2、如果一次性扩容扩得不够,那么下一次扩容的操作必然比较快地会到来,这会降低程序运行效率,要知道扩容还是比价耗费性能的一个操作

所以扩容扩多少,是JDK开发人员在时间、空间上做的一个权衡,提供出来的一个比较合理的数值。最后调用到的是Arrays的copyOf方法,将元素组里面的内容复制到新的数组里面去:

复制代码
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
       T[] copy = ((Object)newType == (Object)Object[].class)
           ? (T[]) new Object[newLength]
           : (T[]) Array.newInstance(newType.getComponentType(), newLength);
       System.arraycopy(original, 0, copy, 0,
                        Math.min(original.length, newLength));
       return copy;
}
复制代码

用一张图来表示就是这样的:

 

删除元素

接着我们看一下删除的操作。ArrayList支持两种删除方式:

1、按照下标删除

2、按照元素删除,这会删除ArrayList中与指定要删除的元素匹配的第一个元素

对于ArrayList来说,这两种删除的方法差不多,都是调用的下面一段代码:

int numMoved = size - index - 1;
if (numMoved > 0)
    System.arraycopy(elementData, index+1, elementData, index,
             numMoved);
elementData[--size] = null; // Let gc do its work

其实做的事情就是两件:

1、把指定元素后面位置的所有元素,利用System.arraycopy方法整体向前移动一个位置

2、最后一个位置的元素指定为null,这样让gc可以去回收它

比方说有这么一段代码:

复制代码
public static void main(String[] args)
{
    List<String> list = new ArrayList<String>();
    list.add("111");
    list.add("222");
    list.add("333");
    list.add("444");
    list.add("555");
    list.add("666");
    list.add("777");
    list.add("888");
    list.remove("333");
}
复制代码

用图表示是这样的:

 

插入元素

看一下ArrayList的插入操作,插入操作调用的也是add方法,比如:

复制代码
 1 public static void main(String[] args)
 2 {
 3     List<String> list = new ArrayList<String>();
 4     list.add("111");
 5     list.add("222");
 6     list.add("333");
 7     list.add("444");
 8     list.add("555");
 9     list.add("666");
10     list.add("777");
11     list.add("888");
12     list.add(2, "000");
13     System.out.println(list);
14 }
复制代码

有一个地方不要搞错了,第12行的add方法的意思是,往第几个元素后面插入一个元素,像第12行就是往第二个元素后面插入一个000。看一下运行结果也证明了这一点:

[111, 222, 000, 333, 444, 555, 666, 777, 888]

还是看一下插入的时候做了什么:

复制代码
 1 public void add(int index, E element) {
 2 if (index > size || index < 0)
 3     throw new IndexOutOfBoundsException(
 4     "Index: "+index+", Size: "+size);
 5     ensureCapacity(size+1);  // Increments modCount!!
 6 System.arraycopy(elementData, index, elementData, index + 1,
 7          size - index);
 8 elementData[index] = element;
 9 size++;
10 }
复制代码

看到插入的时候,按照指定位置,把从指定位置开始的所有元素利用System.arraycopy方法做一个整体的复制,向后移动一个位置(当然先要用ensureCapacity方法进行判断,加了一个元素之后数组会不会不够大),然后指定位置的元素设置为需要插入的元素,完成了一次插入的操作。用图表示这个过程是这样的:

 

ArrayList的优缺点

从上面的几个过程总结一下ArrayList的优缺点。ArrayList的优点如下:

1、ArrayList底层以数组实现,是一种随机访问模式,再加上它实现了RandomAccess接口,因此查找也就是get的时候非常快

2、ArrayList在顺序添加一个元素的时候非常方便,只是往数组里面添加了一个元素而已

不过ArrayList的缺点也十分明显:

1、删除元素的时候,涉及到一次元素复制,如果要复制的元素很多,那么就会比较耗费性能

2、插入元素的时候,涉及到一次元素复制,如果要复制的元素很多,那么就会比较耗费性能

因此,ArrayList比较适合顺序添加、随机访问的场景

 

ArrayList和Vector的区别

ArrayList是线程非安全的,这很明显,因为ArrayList中所有的方法都不是同步的,在并发下一定会出现线程安全问题。那么我们想要使用ArrayList并且让它线程安全怎么办?一个方法是用Collections.synchronizedList方法把你的ArrayList变成一个线程安全的List,比如:

复制代码
List<String> synchronizedList = Collections.synchronizedList(list);
synchronizedList.add("aaa");
synchronizedList.add("bbb");
for (int i = 0; i < synchronizedList.size(); i++)
{
    System.out.println(synchronizedList.get(i));
}
复制代码

另一个方法就是Vector,它是ArrayList的线程安全版本,其实现90%和ArrayList都完全一样,区别在于:

1、Vector是线程安全的,ArrayList是线程非安全的

2、Vector可以指定增长因子,如果该增长因子指定了,那么扩容的时候会每次新的数组大小会在原数组的大小基础上加上增长因子;如果不指定增长因子,那么就给原数组大小*2,源代码是这样的:

int newCapacity = oldCapacity + ((capacityIncrement > 0) ?
                                 capacityIncrement : oldCapacity);

 

为什么ArrayList的elementData是用transient修饰的?

最后一个问题,我们看一下ArrayList中的数组,是这么定义的:

private transient Object[] elementData;

不知道大家有没有想过,为什么elementData是使用transient修饰的呢?关于这个问题,说说我的看法。我们看一下ArrayList的定义:

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable

看到ArrayList实现了Serializable接口,这意味着ArrayList是可以被序列化的,用transient修饰elementData意味着我不希望elementData数组被序列化。这是为什么?因为序列化ArrayList的时候,ArrayList里面的elementData未必是满的,比方说elementData有10的大小,但是我只用了其中的3个,那么是否有必要序列化整个elementData呢?显然没有这个必要,因此ArrayList中重写了writeObject方法:

复制代码
private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException{
// Write out element count, and any hidden stuff
int expectedModCount = modCount;
s.defaultWriteObject();
        // Write out array length
       s.writeInt(elementData.length);
    // Write out all elements in the proper order.
for (int i=0; i<size; i++)
           s.writeObject(elementData[i]);
    if (modCount != expectedModCount) {
           throw new ConcurrentModificationException();
    }
}
复制代码

每次序列化的时候调用这个方法,先调用defaultWriteObject()方法序列化ArrayList中的非transient元素,elementData不去序列化它,然后遍历elementData,只序列化那些有的元素,这样:

1、加快了序列化的速度

2、减小了序列化之后的文件大小

不失为一种聪明的做法,如果以后开发过程中有遇到这种情况,也是值得学习、借鉴的一种思路。

 

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=325989305&siteId=291194637