Collections: How LinkedList Is Implemented

How LinkedList works

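ArrayList and LinkedList both implement the List interface and are used in much the same way, but the data structures underneath them are completely different. ArrayList is backed by an array. What about LinkedList?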

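As the name suggests, LinkedList is backed by a linked list, more precisely a doubly linked list. A linked list is made up of nodes, and each node has three parts: a prev reference to the previous node, an item that holds the element itself, and a next reference to the following node. The figure below shows the structure of such a list.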

In a linked list, the prev reference of the first node is null and the next reference of the last node is null.
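To make the three-part node concrete, here is a minimal sketch of a doubly linked node. The class name DemoNode, the helper twoNodeChain and the main driver are assumptions made for illustration; only the field names prev, item and next follow the JDK source quoted in this article.

```java
// Minimal sketch of a doubly linked node; DemoNode is a hypothetical name for illustration.
final class DemoNode<E> {
    E item;            // the element stored in this node
    DemoNode<E> next;  // reference to the following node, null for the last node
    DemoNode<E> prev;  // reference to the preceding node, null for the first node

    DemoNode(DemoNode<E> prev, E item, DemoNode<E> next) {
        this.prev = prev;
        this.item = item;
        this.next = next;
    }

    /** Builds the two-node chain a <-> b by hand and returns its first node. */
    static DemoNode<String> twoNodeChain() {
        DemoNode<String> a = new DemoNode<>(null, "a", null);
        DemoNode<String> b = new DemoNode<>(a, "b", null);
        a.next = b; // complete the forward link from a to b
        return a;
    }

    public static void main(String[] args) {
        DemoNode<String> head = twoNodeChain();
        System.out.println(head.prev == null);      // the first node has no predecessor
        System.out.println(head.next.item);         // element of the second node
        System.out.println(head.next.next == null); // the last node has no successor
    }
}
```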

Now let's look at the source:

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    /**
     * Pointer to first node.
     * Invariant: (first == null && last == null) ||
     *            (first.prev == null && first.item != null)
     */
    transient Node<E> first;

    /**
     * Pointer to last node.
     * Invariant: (first == null && last == null) ||
     *            (last.next == null && last.item != null)
     */
    transient Node<E> last;

    /**
     * Constructs an empty list.
     */
    public LinkedList() {
    }

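Like ArrayList, LinkedList uses a size field to record the number of elements. It also has two very important fields, first and last, both of type Node: first references the head node of the list and last references the tail node. Next, we walk through some commonly used LinkedList methods to see how they work.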

The add operation:

    public boolean add(E e) {
        linkLast(e);
        return true;
    }

    /**
     * Links e as last element.
     */
    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

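The add method we use every day simply calls linkLast. As its Javadoc comment says, this method links the new element in as the last node.

It first constructs a node whose prev is the old tail, whose item is the new element, and whose next is null, and then updates last to point to this new node.

It then checks whether the old last was null. If it was, the list was empty and the new node is also the first node, so first is updated to it as well. Otherwise, the next reference of the old tail is changed from null to the new node. Finally, size and modCount (the structural-modification counter that iterators use to detect concurrent modification) are both incremented.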
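The append steps can be tried out in isolation with a small sketch like the following. AppendDemo and its nested Node class are hypothetical names, not JDK API, and modCount is left out to keep the example short.

```java
// Sketch of append-at-tail, mirroring the linkLast logic quoted above.
// AppendDemo and its members are hypothetical names; modCount is omitted.
final class AppendDemo<E> {
    static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    Node<E> first; // head node, null while the list is empty
    Node<E> last;  // tail node, null while the list is empty
    int size;

    boolean add(E e) {
        final Node<E> oldLast = last;
        final Node<E> newNode = new Node<>(oldLast, e, null);
        last = newNode;
        if (oldLast == null) {
            first = newNode;        // list was empty: new node is also the head
        } else {
            oldLast.next = newNode; // otherwise hook it behind the old tail
        }
        size++;
        return true;
    }

    public static void main(String[] args) {
        AppendDemo<String> list = new AppendDemo<>();
        list.add("a");
        System.out.println(list.first == list.last);      // a single node is both head and tail
        list.add("b");
        System.out.println(list.first.next == list.last); // head now links forward to the new tail
        System.out.println(list.last.prev.item);          // the tail links back to the old tail
    }
}
```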

The get operation:

    /**
     * Returns the element at the specified position in this list.
     *
     * @param index index of the element to return
     * @return the element at the specified position in this list
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }

    private void checkElementIndex(int index) {
        if (!isElementIndex(index))
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    /**
     * Tells if the argument is the index of an existing element.
     */
    private boolean isElementIndex(int index) {
        return index >= 0 && index < size;
    }

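get first calls checkElementIndex() to validate the index. If the index is out of range (negative, or greater than or equal to size), an IndexOutOfBoundsException is thrown.

Once the index has been validated, node() locates the node at that index and get returns its item field, which is the element stored in the node. Let's see how node() is implemented.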
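The bounds check is easy to observe from the outside with java.util.LinkedList. The following is a small sketch; GetBoundsDemo and getThrows are hypothetical helper names introduced only for this example.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Sketch: observing the index check performed by get(). GetBoundsDemo is a hypothetical name.
final class GetBoundsDemo {
    /** Returns true if list.get(index) throws IndexOutOfBoundsException. */
    static boolean getThrows(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));
        System.out.println(list.get(2));         // valid: the largest legal index is size - 1
        System.out.println(getThrows(list, 3));  // index == size is rejected
        System.out.println(getThrows(list, -1)); // negative indexes are rejected
    }
}
```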

    /**
     * Returns the (non-null) Node at the specified element index.
     */
    Node<E> node(int index) {
        // assert isElementIndex(index);

        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

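Shifting right by one bit is equivalent to dividing by 2 here, since size is never negative. A shift is traditionally cheaper than a division, although modern JIT compilers generally optimize division by a constant power of two as well. The method first decides whether the requested node lies in the first half or the second half of the list. If it is in the first half, it starts at the first node and follows next references one node at a time until it reaches index. Otherwise it starts at the last node and follows prev references backward.

The reason is that in a linked list a node can only be reached from its neighbor: the next node through the previous node's next reference, or the previous node through the next node's prev reference. Unlike an array, there is no way to jump directly to an element by index.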
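To get a feel for what the two-ended search saves, the sketch below counts how many next or prev hops the loops in node() would perform for a given index and size. LookupCostDemo, hops and worstCase are hypothetical names; the arithmetic simply restates the two loop bounds from the code above.

```java
// Sketch: counting link traversals for the two-ended lookup used by node(index).
// LookupCostDemo, hops and worstCase are hypothetical names for illustration.
final class LookupCostDemo {
    /** Number of next/prev hops needed to reach index in a list of the given size. */
    static int hops(int index, int size) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        if (index < (size >> 1)) {
            return index;            // forward loop runs for i = 0 .. index - 1
        } else {
            return size - 1 - index; // backward loop runs for i = size - 1 .. index + 1
        }
    }

    /** Largest hop count over all valid indexes of a list of the given size. */
    static int worstCase(int size) {
        int worst = 0;
        for (int i = 0; i < size; i++) {
            worst = Math.max(worst, hops(i, size));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(hops(0, 10));   // the head is reached without any hop
        System.out.println(hops(4, 10));   // last index still searched from the front
        System.out.println(hops(5, 10));   // first index searched from the back
        System.out.println(worstCase(10)); // never more than about half the list
    }
}
```

The cost still grows linearly with the size of the list, so the optimization halves the work but does not turn indexed access into a constant-time operation.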

The remove operation:

    public boolean remove(Object o) {
        if (o == null) {
            for (Node<E> x = first; x != null; x = x.next) {
                if (x.item == null) {
                    unlink(x);
                    return true;
                }
            }
        } else {
            for (Node<E> x = first; x != null; x = x.next) {
                if (o.equals(x.item)) {
                    unlink(x);
                    return true;
                }
            }
        }
        return false;
    }

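As the code shows, remove(Object) traverses from the first node until it finds a matching node (compared with equals(), or with == null when the argument is null) and deletes it. Only the first match is removed, and the method returns false if nothing matches. The actual deletion is done by unlink().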
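The first-match and null-handling behaviour can be checked with a short example against java.util.LinkedList. RemoveFirstMatchDemo and sample are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.LinkedList;

// Sketch: remove(Object) deletes only the first matching element and accepts null.
// RemoveFirstMatchDemo is a hypothetical name for illustration.
final class RemoveFirstMatchDemo {
    /** Sample list containing a duplicate and a null element. */
    static LinkedList<String> sample() {
        return new LinkedList<>(Arrays.asList("a", null, "b", "a"));
    }

    public static void main(String[] args) {
        LinkedList<String> list = sample();
        System.out.println(list.remove("a"));           // true: the first "a" is unlinked
        System.out.println(list);                       // [null, b, a]
        System.out.println(list.remove((Object) null)); // true: null is matched with ==
        System.out.println(list);                       // [b, a]
        System.out.println(list.remove("z"));           // false: nothing matched
    }
}
```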

    /**
     * Unlinks non-null node x.
     */
    E unlink(Node<E> x) {
        // assert x != null;
        final E element = x.item;
        final Node<E> next = x.next;
        final Node<E> prev = x.prev;

        if (prev == null) {
            first = next;
        } else {
            prev.next = next;
            x.prev = null;
        }

        if (next == null) {
            last = prev;
        } else {
            next.prev = prev;
            x.next = null;
        }

        x.item = null;
        size--;
        modCount++;
        return element;
    }

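That is the source of unlink(). Let's walk through how a node is deleted from the list.

First it checks whether the node has a previous node, that is, a predecessor. If there is no predecessor, the node being deleted is the first node of the list, so its successor simply becomes the new first. If there is a predecessor, the predecessor's next reference is pointed at the node's successor.

Then it checks whether the node has a next node, that is, a successor. If there is no successor, the node is the last node of the list, so its predecessor simply becomes the new last. If there is a successor, the successor's prev reference is pointed at the node's predecessor.

The core of the operation is to point the predecessor's next at the successor and the successor's prev at the predecessor. The removed node's own prev, next, and item are also set to null so that it no longer holds references to anything. That completes the deletion.

The figure below illustrates the delete operation.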
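The three cases (removing a middle node, the head, and the only remaining node) can be exercised with a self-contained sketch. UnlinkDemo, addLast and toList are hypothetical names; the unlink body follows the JDK code quoted above, minus modCount.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of node removal in a doubly linked list, following the unlink logic above.
// UnlinkDemo, addLast and toList are hypothetical names; modCount is omitted.
final class UnlinkDemo<E> {
    static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    Node<E> first;
    Node<E> last;
    int size;

    /** Appends e and returns its node so the caller can unlink it later. */
    Node<E> addLast(E e) {
        Node<E> oldLast = last;
        Node<E> newNode = new Node<>(oldLast, e, null);
        last = newNode;
        if (oldLast == null) first = newNode; else oldLast.next = newNode;
        size++;
        return newNode;
    }

    /** Detaches x from the list and returns the element it held. */
    E unlink(Node<E> x) {
        E element = x.item;
        Node<E> next = x.next;
        Node<E> prev = x.prev;
        if (prev == null) { first = next; } else { prev.next = next; x.prev = null; }
        if (next == null) { last = prev; } else { next.prev = prev; x.next = null; }
        x.item = null; // drop the element reference held by the detached node
        size--;
        return element;
    }

    /** Copies the elements, head to tail, into an ArrayList for easy inspection. */
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (Node<E> x = first; x != null; x = x.next) out.add(x.item);
        return out;
    }

    public static void main(String[] args) {
        UnlinkDemo<String> list = new UnlinkDemo<>();
        Node<String> a = list.addLast("a");
        Node<String> b = list.addLast("b");
        Node<String> c = list.addLast("c");
        list.unlink(b);                    // middle node: both neighbours are relinked
        System.out.println(list.toList()); // [a, c]
        list.unlink(a);                    // head node: first moves to the successor
        System.out.println(list.toList()); // [c]
        list.unlink(c);                    // only node: first and last both become null
        System.out.println(list.first == null && list.last == null);
    }
}
```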

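From this analysis of add and remove, we can see that inserting or deleting a node only rewrites the references of its previous and next nodes and touches no other node. Once the node has been located this takes constant time, whereas ArrayList has to shift every element after the insertion or deletion point by copying part of its backing array (appending at the end of an ArrayList is cheap unless the array has to grow). Note that remove(Object) still has to traverse the list to find the node first. get(), on the other hand, is much slower than in ArrayList: it has to walk from the first or the last node toward the requested index, visiting up to half of the list, while ArrayList reads the element directly by array index.

So which is better, ArrayList or LinkedList? It depends on the situation. If the collection is frequently inserted into or deleted from, consider LinkedList; if it is queried frequently and lookup performance matters, consider ArrayList.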
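One practical consequence, shown here as a hedged sketch rather than a benchmark: when working through a LinkedList, walking it with an iterator avoids repeated get(i) calls, each of which would search from one end of the list, and Iterator.remove() deletes the current node by relinking its neighbours without a second search. IteratorDemo, removeEvens and sum are hypothetical names.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Sketch: traversing and pruning a LinkedList with an iterator instead of get(i).
// IteratorDemo, removeEvens and sum are hypothetical names for illustration.
final class IteratorDemo {
    /** Removes every even number in a single pass using Iterator.remove(). */
    static void removeEvens(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // removes the element just returned by next()
            }
        }
    }

    /** Sums the elements with a for-each loop, which iterates node by node. */
    static long sum(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 1; i <= 10; i++) {
            list.add(i);
        }
        removeEvens(list);
        System.out.println(list);      // [1, 3, 5, 7, 9]
        System.out.println(sum(list)); // 25
    }
}
```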
