How LinkedList Is Implemented in Java

Preface

When we work with List in Java, ArrayList is usually the implementation we reach for. ArrayList is backed by an array, but a List can also be built on another familiar data structure, the linked list, which in Java is LinkedList. Let's see how it works by reading the source code of Java's LinkedList.

Implementation of LinkedList

Definition of LinkedList

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable {
    // body omitted
}

Besides the List interface, LinkedList also implements the Deque interface. Deque stands for double-ended queue, a versatile data structure: stacks and queues are both special cases of it (a stack adds and removes only at the head, while a queue adds at the tail and removes at the head).
Deque in turn extends the Queue interface, so LinkedList can serve as both a queue and a stack. Another class that can do both is ArrayDeque, which is backed by an array; when thread safety is not a concern, it can be used in place of Java's Stack class.
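
As a rough illustration of the point above, the following sketch uses one LinkedList as a FIFO queue and another as a LIFO stack, and runs the same stack code against ArrayDeque. The class name QueueAndStackDemo and its helper methods are made up for this example; only the java.util calls are real.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;
import java.util.Queue;

class QueueAndStackDemo {
    // FIFO: elements leave in the order they were added.
    static String drainQueue() {
        Queue<String> queue = new LinkedList<>();
        queue.offer("a");
        queue.offer("b");
        queue.offer("c");
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            out.append(queue.poll());
        }
        return out.toString();
    }

    // LIFO: push() and pop() both work on the head of the deque.
    static String drainStack(Deque<String> stack) {
        stack.push("a");
        stack.push("b");
        stack.push("c");
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drainQueue());                    // abc
        System.out.println(drainStack(new LinkedList<>()));  // cba
        System.out.println(drainStack(new ArrayDeque<>()));  // cba
    }
}
```

Because drainStack() is written against the Deque interface, either implementation can be passed in without changing the code.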

The following is the definition of Deque and Queue.

Definition of Queue

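public interface Queue<E> extends Collection<E> {

    /**
     * Adds an element at the tail.
     * Throws IllegalStateException when the queue is full.
     */
    boolean add(E e);

    /**
     * Adds an element at the tail.
     * Returns false instead of throwing when the queue is full.
     */
    boolean offer(E e);

    /**
     * Returns the head element and removes it from the queue.
     * Throws NoSuchElementException when the queue is empty.
     */
    E remove();

    /**
     * Returns the head element and removes it from the queue.
     * Returns null when the queue is empty.
     */
    E poll();

    /**
     * Returns the head element without modifying the queue.
     * Throws NoSuchElementException when the queue is empty.
     */
    E element();

    /**
     * Returns the head element without modifying the queue.
     * Returns null when the queue is empty.
     */
    E peek();
}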

The operations of Queue come in pairs: one version throws an exception on failure, and the other returns a special value (false or null). Which one is appropriate depends on the nature of the queue. For a bounded queue, for example, it may be necessary to throw an exception when the queue is full. LinkedList is backed by a linked list and has no capacity limit, so add() and offer() behave identically. The removal and inspection pairs still differ when the list is empty: remove() and element() throw, while poll() and peek() return null.
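
A small sketch of how the paired methods behave on a LinkedList. The class EmptyQueueDemo and its helper method are invented for this example; the behavior shown is that of java.util.LinkedList.

```java
import java.util.LinkedList;
import java.util.NoSuchElementException;
import java.util.Queue;

class EmptyQueueDemo {
    // Returns true if remove() throws NoSuchElementException on the given queue.
    static boolean removeThrows(Queue<String> queue) {
        try {
            queue.remove();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Queue<String> queue = new LinkedList<>();
        System.out.println(queue.poll());          // null
        System.out.println(queue.peek());          // null
        System.out.println(removeThrows(queue));   // true
        System.out.println(queue.add("x"));        // true
        System.out.println(queue.offer("y"));      // true
    }
}
```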

Definition of Deque

public interface Deque<E> extends Queue<E> {

    // Classic stack methods
    void push(E e);
    E pop();

    // Methods specific to a double-ended queue
    E getFirst();
    E getLast();
    boolean offerFirst(E e);
    boolean offerLast(E e);
    E peekFirst();
    E peekLast();
    E pollFirst();
    E pollLast();
    E removeFirst();
    E removeLast();
}

Deque extends the Queue interface and adds push() and pop(), the classic stack operations, both of which act on the head element. The xxxFirst() and xxxLast() methods operate explicitly on the head and the tail, respectively.
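
To make the head and tail operations concrete, here is a minimal sketch that mixes push()/pop() with the xxxFirst()/xxxLast() methods on a LinkedList. The class name DequeEndsDemo is hypothetical.

```java
import java.util.Deque;
import java.util.LinkedList;

class DequeEndsDemo {
    static String demo() {
        Deque<Integer> deque = new LinkedList<>();
        deque.offerLast(2);            // [2]
        deque.offerFirst(1);           // [1, 2]
        deque.offerLast(3);            // [1, 2, 3]
        deque.push(0);                 // push adds at the head: [0, 1, 2, 3]
        int head = deque.peekFirst();  // 0, deque unchanged
        int tail = deque.peekLast();   // 3, deque unchanged
        int popped = deque.pop();      // removes the head: [1, 2, 3]
        int last = deque.pollLast();   // removes the tail: [1, 2]
        return head + "," + tail + "," + popped + "," + last + "," + deque;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 0,3,0,3,[1, 2]
    }
}
```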

Internal implementation of LinkedList

Now that we have seen the interfaces LinkedList implements, let's look at its internals. LinkedList is implemented as a doubly linked list: each node keeps references to both its predecessor and its successor, so either neighbor can be reached directly when elements are inserted or removed.

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable {

    transient int size = 0;
    transient Node<E> first;    // head node
    transient Node<E> last;     // tail node

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }
}

The data structure is fairly simple. LinkedList maintains three fields: the size of the list, a reference to the head node, and a reference to the tail node. Let's see how each operation is implemented.

Add element

    public boolean add(E e) {
        linkLast(e);
        return true;
    }

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);    // #1
        last = newNode;    // #2
        if (l == null)     // #3
            first = newNode;
        else
            l.next = newNode;
        size++;            // #4
        modCount++;
    }

Viewed as a queue, add() appends an element to the tail. When a new element is added:

  1. A new node is created whose predecessor is the current tail and whose successor is null.
  2. The tail pointer last is updated to point to the new node.
  3. Special case: if the list was empty (the old tail l is null), first must also point to the new node. Otherwise, the next pointer of the old tail l is set to the new node.
  4. The size of the list is incremented by 1.
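
The four steps can be tried out in isolation with a stripped-down list. This is only a sketch modeled on the JDK code above, not the JDK class itself: TinyList, addLast(), forward() and backward() are names invented for this example, and modCount is left out.

```java
class TinyList<E> {
    private static final class Node<E> {
        E item;
        Node<E> prev;
        Node<E> next;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    void addLast(E e) {
        Node<E> oldLast = last;
        Node<E> newNode = new Node<>(oldLast, e, null); // step 1: new node after the old tail
        last = newNode;                                 // step 2: tail now points to it
        if (oldLast == null) {                          // step 3: list was empty
            first = newNode;
        } else {
            oldLast.next = newNode;
        }
        size++;                                         // step 4
    }

    int size() {
        return size;
    }

    // Walks the next pointers from the head.
    String forward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = first; x != null; x = x.next) {
            sb.append(x.item);
        }
        return sb.toString();
    }

    // Walks the prev pointers from the tail.
    String backward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = last; x != null; x = x.prev) {
            sb.append(x.item);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TinyList<String> list = new TinyList<>();
        list.addLast("a");
        list.addLast("b");
        list.addLast("c");
        System.out.println(list.forward());  // abc
        System.out.println(list.backward()); // cba
        System.out.println(list.size());     // 3
    }
}
```

Walking the list in both directions is a quick way to check that both the next and the prev links were maintained.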

Access elements based on index

    public E get(int index) {
        checkElementIndex(index);    // check that the index is valid
        return node(index).item;
    }

    Node<E> node(int index) {
        // assert isElementIndex(index);

        // Small optimization: decide whether the index is closer to the head or the tail
        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

Because of the nature of a linked list, looking up an element by index means walking the nodes one at a time, so the average time complexity is O(n).
There is a small optimization here. Before traversing, the index is compared with half the size of the list (size >> 1). If the index is smaller than the midpoint, the target is closer to the head and traversal starts from the head node; otherwise, it starts from the tail node. As a result, at most about half of the nodes are visited.
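
The effect of the midpoint check can be estimated without touching a real list. The sketch below mirrors the two loops in node(index) and only counts how many next or prev hops they would make; NodeLookupCost and hops() are hypothetical names, not part of the JDK.

```java
class NodeLookupCost {
    // Number of pointer hops node(index) would perform on a list of the given size.
    static int hops(int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        if (index < (size >> 1)) {
            return index;            // walk forward from the head
        }
        return size - 1 - index;     // walk backward from the tail
    }

    public static void main(String[] args) {
        System.out.println(hops(10, 0)); // 0
        System.out.println(hops(10, 4)); // 4
        System.out.println(hops(10, 5)); // 4
        System.out.println(hops(10, 9)); // 0
    }
}
```

The worst case is an index near the middle, which is why get(index) is still O(n) even with the optimization.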

Insert element

Inserting at an index has the same average time complexity as lookup, O(n): the node at that index must be located first, after which the insertion itself only updates a few pointers.

    public void add(int index, E element) {
        checkPositionIndex(index);    // check that the index is valid

        if (index == size)
            linkLast(element);
        else
            linkBefore(element, node(index));
    }

    void linkBefore(E e, Node<E> succ) {
        // assert succ != null;
        final Node<E> pred = succ.prev;    // #1
        final Node<E> newNode = new Node<>(pred, e, succ);    // #2
        succ.prev = newNode;    // #3
        if (pred == null)       // #4
            first = newNode;
        else
            pred.next = newNode;
        size++;
        modCount++;
    }

If the insertion index equals the size of the list, the element is simply appended to the tail with linkLast(), which was covered in the section on adding elements, so it is not repeated here.
Otherwise the work is done by linkBefore():

  1. Get pred, the predecessor of succ. Here succ is the node currently at the insertion index, already located by node(index).
  2. Create the new node newNode, with pred as its predecessor and succ as its successor.
  3. Point the prev pointer of succ to newNode.
  4. Special case: because the new node is inserted before succ, if succ was the head node (pred is null), first must be updated to newNode. Otherwise, the next pointer of pred is set to newNode.
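
The three paths through add(int, E) can be exercised from the outside with an ordinary LinkedList. InsertDemo is a made-up class name for this sketch; the comments show the list contents after each call.

```java
import java.util.LinkedList;
import java.util.List;

class InsertDemo {
    static List<String> demo() {
        List<String> list = new LinkedList<>();
        list.add("b");
        list.add("d");               // [b, d]
        list.add(0, "a");            // before the head, so first changes: [a, b, d]
        list.add(2, "c");            // before "d", the normal case: [a, b, c, d]
        list.add(list.size(), "e");  // index == size, handled by linkLast: [a, b, c, d, e]
        return list;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, b, c, d, e]
    }
}
```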

Delete elements based on index

Removing an element by index also requires locating the node first and then relinking the nodes before and after it. The average time complexity is O(n).

    public E remove(int index) {
        checkElementIndex(index);
        return unlink(node(index));
    }

    E unlink(Node<E> x) {
        // assert x != null;
        final E element = x.item;
        final Node<E> next = x.next;    // #1
        final Node<E> prev = x.prev;

        if (prev == null) {    // #2
            first = next;
        } else {
            prev.next = next;
            x.prev = null;
        }

        if (next == null) {    // #3
            last = prev;
        } else {
            next.prev = prev;
            x.next = null;
        }

        x.item = null;
        size--;    // #4
        modCount++;
        return element;
    }

The basic idea of removing a node is to connect its predecessor directly to its successor. The steps are as follows:

  1. Get the predecessor prev and the successor next of the node being removed.
  2. Special case: if prev is null, the node being removed is the head, so first must be updated to next. Otherwise, the next pointer of prev is set to next.
  3. Likewise for the tail: if next is null, the node being removed is the tail, so last must be updated to prev. Otherwise, the prev pointer of next is set to prev.
  4. Finally, the size of the list is decremented by 1.
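
The head, tail, and middle cases of unlink() can be observed through remove(int) on a regular LinkedList. RemoveDemo is a hypothetical class name used only for this sketch.

```java
import java.util.Arrays;
import java.util.LinkedList;

class RemoveDemo {
    static String demo() {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("a", "b", "c", "d", "e"));
        String head = list.remove(0);                // prev == null case: [b, c, d, e]
        String tail = list.remove(list.size() - 1);  // next == null case: [b, c, d]
        String mid = list.remove(1);                 // both neighbors exist: [b, d]
        return head + tail + mid + list;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // aec[b, d]
    }
}
```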

Summary

LinkedList implements List on top of a doubly linked list. Compared with ArrayList, it has the following characteristics:

  1. Memory is allocated on demand, one node per element, so no capacity needs to be reserved in advance.
  2. Efficient random access is not supported. Accessing an element by index has an average time complexity of O(n).
  3. It implements both queue and stack behavior through the Deque interface.
  4. Operations on the head and tail elements are very efficient: both are O(1).

Origin blog.csdn.net/vipshop_fin_dev/article/details/111397643