JavaSE Supplement | Understand data structure and source code analysis of commonly used collections

Table of contents

One: data structure

1. Data structure analysis

1.1 Research object 1: logical relationship between data

1.2 Research object 2: data storage structure (or physical structure)

1.3 Research Object 3: Computational Structure

2. Common storage structure: array

3. Common storage structure: linked list

4. Common storage structure: stack

5. Common storage structure: queue

6. Common storage structures: tree and binary tree

6.1 Tree understanding

6.2 Basic concept of binary tree

6.3 Binary tree traversal

6.4 Classic Binary Tree

Two: Source code analysis of commonly used collections

1. List interface analysis

1.1 List interface features

1.2 Dynamic Array ArrayList and Vector

1.3 Linked List LinkedList

2. Map interface analysis

2.1 Physical structure of hash table

2.2 Data adding process in HashMap

2.3 LinkedHashMap


One: data structure

Put simply, data structures are a methodology for optimizing program design. The field studies the logical structure and physical structure of data and the relationship between them, and defines the corresponding operations on those structures. The goal is to speed up program execution and reduce the memory a program occupies.

1. Data structure analysis

1.1 Research object 1: logical relationship between data

The logical structure of data describes the logical relationships between data elements. It has nothing to do with how the data is stored and is independent of the computer.

Set structure: the elements in the data structure have no relationship to each other apart from belonging to the same set. There is no logical relationship between the elements of the set.

Linear structure: the elements in the data structure have a one-to-one relationship. For example: people waiting in line. The structure must have exactly one first element and exactly one last element. Embodied as: one-dimensional array, linked list, stack, queue.

Tree structure: the elements in the data structure have a one-to-many relationship. For example: family tree, file system, organizational structure.

Graph structure: the elements in the data structure have a many-to-many relationship. For example: national railway network, subway map.

1.2 Research object 2: data storage structure (or physical structure)

Physical structure (storage structure) of data: it covers the representation of data elements and the representation of the relationships between them. The storage structure of data is the realization of the logical structure in a computer language, and it depends on that language.

Structure 1: Sequential structure

The sequential structure is to use a group of continuous storage units to store logically adjacent elements in sequence.

Advantages: It only needs to apply for the memory space for storing the data itself, supports subscript access, and can also achieve random access.

Disadvantages: Continuous space must be allocated statically, and the utilization rate of memory space is relatively low. Insertion or deletion may require moving a large number of elements, which is relatively inefficient.

Structure 2: chain structure

Instead of using continuous storage space to store the elements of the structure, a node is constructed for each element.

In addition to storing the data itself, the node also needs to store a pointer to the next node! 

Advantages: Because continuous storage space is not required, memory utilization is relatively high, and the need to predict the number of elements in advance (a drawback of the sequential storage structure) disappears. Inserting or deleting elements does not require moving a large number of elements.

Disadvantages: Additional space is required to express the logical relationship between data, subscript access and random access are not supported.

Structure 3: Index structure

In addition to establishing storage node information, an additional index table is also established to record the address of each element node. An index table consists of several index entries. The general form of an index entry is: (keyword, address).

Advantages: Use the index number of the node to determine the storage address of the node, and the retrieval speed is fast.

Disadvantages: Additional index tables are added, which will take up more storage space. When adding and deleting data, the index table needs to be modified, so it will take more time.

Structure 4: Hash Structure

Calculate the storage address of the element directly according to the keyword of the element, also known as Hash storage.

Advantages: The operations of retrieving, adding and deleting nodes are very fast.

Disadvantages: sorting is not supported, and generally requires more space than linear table storage, and the recorded keywords cannot be repeated.

Summary: In development, it is customary to understand the storage structure in the following way

①Linear list (one-to-one relationship): one-dimensional array, one-way linked list, two-way linked list, stack, queue, etc.

② Tree (one-to-many relationship): binary tree, B+ tree, etc.

③ graph (many-to-many relationship).

④Hash table, such as: HashSet, HashMap, etc.

1.3 Research Object 3: Computational Structure

Operations performed on data include the definition and implementation of operations. The definition of operation is aimed at the logic structure, indicating the function of the operation; the realization of the operation is aimed at the storage structure, indicating the specific operation steps of the operation.

① Allocate resources, build structures, and release resources. 

② Insert and delete. 

③Acquisition and traversal. 

④ Modify and sort.

2. Common storage structure: array

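In Java, an array stores a collection of elements of the same data type. Note that only elements of the same data type can be stored!

// Declares only the type and the length
dataType[] arrayName = new dataType[arrayLength];

// Declares the type and initializes the elements; the size is determined by the number of elements
dataType[] arrayName = {element1, element2, ...};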

Example: array of integers

Example: array of objects
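The two declaration forms above can be tried out with a minimal sketch like the following. The class name ArrayDeclDemo and the sample values are illustrative assumptions, not part of the original article.

```java
import java.util.Arrays;

public class ArrayDeclDemo {
    // Form 1: declare the type and length; int elements start at the default value 0.
    static int[] makeInts() {
        int[] nums = new int[3];
        nums[0] = 10;
        nums[2] = 30;
        return nums;
    }

    // Form 2: declare and initialize; the length equals the number of listed elements.
    static String[] makeNames() {
        String[] names = {"Tom", "Jerry"};
        return names;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(makeInts()));  // [10, 0, 30]
        System.out.println(Arrays.toString(makeNames())); // [Tom, Jerry]
    }
}
```

An array of objects stores references, so every slot of a newly created String[] or Object[] is null until it is assigned.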

Physical structure features:

① Apply for memory: Apply for a large continuous space at a time. Once the application is made, the memory will be fixed.

② It cannot be expanded dynamically (if the initial size is too large, space is wasted; if it is too small, there is not enough room). Appending at the end and access by index are fast, while deletion and searching by value are slow because elements have to be moved or scanned.

③Storage features: All data is stored in this continuous space, and each element in the array is a specific data (or object), and all data are closely arranged without intervals.

The details are as follows:

Common operations:

package com.zl.array;

public class ArrayTest01 {
    public static void main(String[] args) {
        Array array = new Array(10);
        // Add
        array.add(1);
        array.add(2);
        array.add(3);
        array.add(4);
        array.add(5);
        // Find
        System.out.println(array.find(3)); // 2
        // Remove
        System.out.println(array.remove(2)); // true
        // Print
        array.print();  // 1	3	4	5

    }
}

// Custom array
class Array{
    private Object[] elementData;
    private int size;
    // Initialize with the given capacity
    public Array(int capacity) {
        elementData = new Object[capacity];
        this.size = 0;
    }
    /**
     * Add an element
     */
    public void add(Object value){
        if (size >= elementData.length) {
            throw new RuntimeException("The array is full, cannot add!");
        }
        // Store the value in the next free slot
        elementData[size] = value;
        size++;
    }
    /**
     * Find the index of the element value in the array (-1 if it is absent)
     */
    public int find(Object value){
        for (int i = 0; i < size; i++) {
            // Null-safe comparison, so a stored null cannot cause a NullPointerException
            if (value == null ? elementData[i] == null : value.equals(elementData[i])){
                return i;
            }
        }
        return -1;
    }
    /**
     * Remove the first occurrence of value from the array
     */
    public boolean remove(Object value){
        // Find the index that corresponds to value
        int index = find(value);
        if (index == -1){
            return false;
        }
        // Shift the following elements one position to the left
        for (int i = index; i < size-1; i++) {
            elementData[i] = elementData[i+1];
        }
        elementData[size-1] = null;
        size--;
        return true;
    }

    /**
     * Print all data in the array
     */
    public void print(){
        for (int i = 0; i < size; i++) {
            System.out.print(elementData[i]+"\t");
        }
        System.out.println();
    }

}

3. Common storage structure: linked list

① Logical structure: linear structure.

②Physical structure: Continuous storage space is not required.

③ Storage features: A linked list is composed of a series of nodes (each element in the linked list is called a node), and the nodes can be created dynamically while the code runs. Each node consists of two parts: a data field that stores the data element, and a pointer field that stores the address of the next node.

A common linked list structure has the following form:

Singly linked list storage structure diagram

Simulation implementation:

package com.zl.array;

public class LinkedTest {
    public static void main(String[] args) {
        SingleLinked singleLinked = new SingleLinked();
        // Add
        singleLinked.add(1);
        singleLinked.add(3);
        singleLinked.add(2);
        // Print
        singleLinked.print();
    }
}



// Define the node
class Node{
    // The stored data
    Object data;
    // The address of the next node
    Node next;
    // Constructors
    public Node() {
    }
    public Node(Object data, Node next) {
        this.data = data;
        this.next = next;
    }
}

// Define the linked list
class SingleLinked{
    // Head node
    Node header;
    // Number of elements
    int size;

    // Add an element
    public void add(Object data){
        // If the list is empty, the newly added node becomes the head node
        if (header == null){
            header = new Node(data,null);
        }else {
            // If the list is not empty, walk from the head node to the tail node and append there
            Node lastNode = findLastNode(header);
            lastNode.next = new Node(data,null);
        }
        size++;
    }

    // Walk forward from the given node until the last node is reached
    public Node findLastNode(Node node) {
        Node cur = node;
        while (cur.next != null){
            cur = cur.next;
        }
        return cur;
    }

    // Print
    public void print(){
        Node cur = header;
        while (cur != null){
            System.out.print(cur.data+"\t");
            cur = cur.next;
        }
    }
}

Doubly linked list storage structure diagram

4. Common storage structure: stack

(1) A stack is a linear list that restricts insertion and deletion to one end of the list only.

(2) A stack stores data according to the first-in, last-out principle (FILO, first in last out): the data that enters first is pushed to the bottom of the stack, and the data that enters last sits at the top. Each removal always removes the most recently inserted (pushed) element currently in the stack, and the first inserted element stays at the bottom and cannot be removed until the very end.

(3) The stack structure in the core class library includes Stack and LinkedList.

①Stack is a sequential stack, which is a subclass of Vector.

②LinkedList is a chained stack.

(4) The operation method that embodies the stack structure:

① peek() method: view the top element of the stack without popping it

② pop() method: pop the top element off the stack

③ push(E e) method: push an element onto the stack

(5) Time complexity:

①Search:O(n)

②Insert:O(1)

③Remove:O(1)
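The peek/pop/push methods listed above can be seen on both library classes with a minimal sketch such as this one. The class name StackDemo and its helper methods are illustrative assumptions.

```java
import java.util.LinkedList;
import java.util.Stack;

public class StackDemo {
    // Pushes 1, 2, 3 onto a java.util.Stack, then pops everything and records the order.
    static String popOrderWithStack() {
        Stack<Integer> stack = new Stack<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        StringBuilder order = new StringBuilder();
        while (!stack.isEmpty()) {
            order.append(stack.pop());
        }
        return order.toString();
    }

    // LinkedList offers the same push/pop/peek methods, operating on the head of the list.
    static int peekThenPopWithLinkedList() {
        LinkedList<Integer> stack = new LinkedList<>();
        stack.push(1);
        stack.push(2);
        int top = stack.peek();   // looks at 2 without removing it
        int popped = stack.pop(); // removes 2
        return top * 10 + popped;
    }

    public static void main(String[] args) {
        System.out.println(popOrderWithStack());         // 321
        System.out.println(peekThenPopWithLinkedList()); // 22
    }
}
```

The elements come out in the reverse of the order in which they went in, which is the first-in, last-out behavior described above.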

Graphic:

Simulation implementation

public class MyStack {
    
    private Object[] elements;
    private int index;

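    /**
     * No-argument constructor. The default initial stack capacity is 10.
     */
    public MyStack() {
        // Dynamically initialize the one-dimensional array
        // The default initial capacity is 10.
        this.elements = new Object[10];
        // Initialize index; -1 means the stack is empty
        this.index = -1;
    }

    /**
     * Push method
     * @param obj the element to push
     */
    public void push(Object obj) throws Exception {
        if(index >= elements.length - 1){
            throw new Exception("Push failed, the stack is full!");
        }
        // Reaching this point means the stack is not full
        // Add one element to the stack; the top-of-stack index moves up by one.
        index++;
        elements[index] = obj;
        System.out.println("Pushed element " + obj + ", top-of-stack index is now " + index);
    }

    /**
     * Pop method: takes an element out of the array. Each time an element is taken out, the top-of-stack index moves down by one.
     * @return the element that was on top of the stack
     */
    public Object pop() throws Exception {
        if (index < 0) {
            // The stack is empty, so there is nothing to pop
            throw new Exception("Pop failed, the stack is empty!");
        }
        // Reaching this point means the stack is not empty.
        Object obj = elements[index];
        System.out.print("Popped element " + obj + ", ");
        elements[index] = null;
        // The top-of-stack index moves down by one.
        index--;
        return obj;
    }

  
    // Encapsulation: step 1, make the fields private; step 2, expose set and get methods.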
    public Object[] getElements() {
        return elements;
    }

    public void setElements(Object[] elements) {
        this.elements = elements;
    }

    public int getIndex() {
        return index;
    }

    public void setIndex(int index) {
        this.index = index;
    }
}

5. Common storage structure: queue

(1) A queue is a linear list that allows insertion only at one end and deletion only at the other end.

(2) A queue is a logical structure; its physical structure can be an array or a linked list.

(3) The rule for modifying a queue: a queue is modified according to the first-in, first-out (FIFO) principle. Newcomers always join the tail of the queue (cutting in line is not allowed), and the member who leaves is always the one at the head of the queue (leaving from the middle is not allowed). In other words, the current "oldest" member is the one to leave.

Graphic:
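The first-in, first-out rule can be observed with the standard java.util.Queue interface. The following is a minimal sketch; the class name QueueDemo and the choice of ArrayDeque as the implementation are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class QueueDemo {
    // Enqueues A, B, C at the tail, then dequeues from the head until the queue is empty.
    static String serveOrder() {
        Queue<String> queue = new ArrayDeque<>();
        queue.offer("A");
        queue.offer("B");
        queue.offer("C");
        StringBuilder order = new StringBuilder();
        while (!queue.isEmpty()) {
            order.append(queue.poll()); // always removes the oldest element
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(serveOrder()); // ABC
    }
}
```

ArrayDeque is array-backed; replacing it with LinkedList gives a linked-list-backed queue with the same behavior, which matches the point that a queue is a logical structure.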

6. Common storage structures: tree and binary tree

6.1 Tree understanding

Explanation of proper nouns:

Node: The data elements in a tree are called nodes.

Root node: The topmost node is called the root. A tree has exactly one root and grows from it. From another perspective, every node can be regarded as the root of its own subtree.

Parent node: The node one level above a node. As shown in the figure, the parent of node K is E, and the parent of node L is G.

Child node: The node one level below a node. As shown in the figure, the child of node E is K, and the child of node G is L.

Sibling nodes: Nodes with the same parent are called siblings. F, G, and H in the figure are siblings.

Degree of a node: The number of subtrees a node has is called its degree. For example, the degree of node B is 3.

Leaf: A node with degree 0, also called a terminal node. D, K, F, L, H, I, and J in the figure are all leaves.

Non-terminal node (or branch node): Any node other than a leaf, that is, a node whose degree is not 0. The root, A, B, C, E, and G in the figure are all non-terminal nodes.

Depth (or height) of the tree: The maximum level of any node in the tree. The depth of the tree in the figure is 4.

Level of a node: Determined by the number of branches on the path from the root to that node. The level of the root is defined as 1, and the level of any other node equals the level of its parent + 1.

Same generation: Nodes on the same level of the same tree.

6.2 Basic concept of binary tree

The binary tree is an important kind of tree structure. Its defining characteristic is that each node has at most two subtrees, and the two are distinguished as the left subtree and the right subtree. The data structures abstracted from many practical problems take the form of a binary tree, and the storage structure and algorithms of a binary tree are relatively simple, which is why the binary tree is particularly important.

6.3 Binary tree traversal

Preorder traversal: root, left, right

That is, visit the root node first, then traverse the left subtree in preorder, and finally traverse the right subtree in preorder. The preorder traversal visits each node of the binary tree in the order root, left, right.

Inorder traversal: left, root, right

That is, first traverse the left subtree in inorder, then visit the root node, and finally traverse the right subtree in inorder. The inorder traversal visits each node of the binary tree in the order left, root, right.

Post-order traversal: left, right, root

That is, first traverse the left subtree in post-order, then traverse the right subtree in post-order, and finally visit the root node. The post-order traversal visits each node of the binary tree in the order left, right, root.

Preorder traversal: ABDHIECFG

Inorder traversal: HDIBEAFCG

Post-order traversal: HIDEBFGCA  
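The three traversals can be sketched recursively as follows. The tree shape used here (A at the root, B and C as its children, D and E under B, H and I under D, F and G under C) is an assumption reconstructed from the three sequences above, since the original figure is not available. The class and method names are illustrative.

```java
public class TraversalDemo {
    static class TreeNode {
        final char value;
        final TreeNode left;
        final TreeNode right;

        TreeNode(char value, TreeNode left, TreeNode right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static TreeNode leaf(char value) {
        return new TreeNode(value, null, null);
    }

    // Assumed shape of the sample tree (reconstructed from the traversal results).
    static TreeNode sampleTree() {
        TreeNode d = new TreeNode('D', leaf('H'), leaf('I'));
        TreeNode b = new TreeNode('B', d, leaf('E'));
        TreeNode c = new TreeNode('C', leaf('F'), leaf('G'));
        return new TreeNode('A', b, c);
    }

    // Root, left, right
    static void walkPre(TreeNode node, StringBuilder out) {
        if (node == null) return;
        out.append(node.value);
        walkPre(node.left, out);
        walkPre(node.right, out);
    }

    // Left, root, right
    static void walkIn(TreeNode node, StringBuilder out) {
        if (node == null) return;
        walkIn(node.left, out);
        out.append(node.value);
        walkIn(node.right, out);
    }

    // Left, right, root
    static void walkPost(TreeNode node, StringBuilder out) {
        if (node == null) return;
        walkPost(node.left, out);
        walkPost(node.right, out);
        out.append(node.value);
    }

    static String preorder(TreeNode root) {
        StringBuilder out = new StringBuilder();
        walkPre(root, out);
        return out.toString();
    }

    static String inorder(TreeNode root) {
        StringBuilder out = new StringBuilder();
        walkIn(root, out);
        return out.toString();
    }

    static String postorder(TreeNode root) {
        StringBuilder out = new StringBuilder();
        walkPost(root, out);
        return out.toString();
    }

    public static void main(String[] args) {
        TreeNode root = sampleTree();
        System.out.println(preorder(root));  // ABDHIECFG
        System.out.println(inorder(root));   // HDIBEAFCG
        System.out.println(postorder(root)); // HIDEBFGCA
    }
}
```

The only difference between the three methods is the position of the line that visits the current node relative to the two recursive calls.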

6.4 Classic Binary Tree

Full binary tree

A binary tree in which every node on every level except the last has two child nodes, and the nodes on the last level have no children. Level n contains 2^(n-1) nodes, and a full binary tree with n levels contains 2^n - 1 nodes in total.

Complete binary tree

Leaf nodes can appear only in the bottom two levels, and the leaves on the bottom level are packed to the left with no gaps.

Binary sort/lookup/search tree

That is, a BST (binary search/sort tree). It satisfies the following properties:

(1) If its left subtree is not empty, the values of all nodes on the left subtree are less than the value of its root node;

(2) If its right subtree is not empty, the values of all nodes on the right subtree are greater than the value of its root node;

(3) Its left and right subtrees are themselves binary sort/lookup/search trees.

Note: An in-order traversal of a binary search tree yields the elements in sorted order, which makes lookup convenient.
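The note about in-order traversal can be checked with a small binary search tree. This is a minimal sketch without any balancing; the class name BstDemo, the sample values, and the decision to ignore duplicate values are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BstDemo {
    static class Node {
        final int value;
        Node left;
        Node right;

        Node(int value) {
            this.value = value;
        }
    }

    // Inserts value into the subtree rooted at root and returns the (possibly new) root.
    static Node insert(Node root, int value) {
        if (root == null) {
            return new Node(value);
        }
        if (value < root.value) {
            root.left = insert(root.left, value);   // smaller values go left
        } else if (value > root.value) {
            root.right = insert(root.right, value); // larger values go right
        }
        // Equal values are ignored in this sketch.
        return root;
    }

    // Left, root, right: visits the values in ascending order.
    static void inorder(Node node, List<Integer> out) {
        if (node == null) return;
        inorder(node.left, out);
        out.add(node.value);
        inorder(node.right, out);
    }

    static List<Integer> sorted(int... values) {
        Node root = null;
        for (int v : values) {
            root = insert(root, v);
        }
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sorted(50, 30, 70, 20, 40, 60, 80)); // [20, 30, 40, 50, 60, 70, 80]
    }
}
```

If the values were inserted in already sorted order, this unbalanced tree would degenerate into a chain, which is the problem that balanced trees address.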

Balanced binary tree

A balanced binary tree (self-balancing binary search tree, AVL) is first of all a binary sort tree, and it also has the following properties:

(1) It is either an empty tree, or the absolute value of the height difference between its left and right subtrees does not exceed 1;

(2) Its left and right subtrees are also balanced binary trees;

(3) Non-leaf nodes are not required to have two child nodes.

Note: The purpose of balancing a binary tree is to reduce the height of the binary search tree and so improve search speed. Common implementations of balanced binary trees include red-black trees, AVL, scapegoat trees, Treap, and splay trees.

 

Red-black tree

That is, the Red-Black Tree. Each node of a red-black tree has a storage bit indicating the color of the node, which can be red (Red) or black (Black). A red-black tree is a self-balancing binary search tree, a data structure used in computer science that was invented by Rudolf Bayer in 1972. A red-black tree is complex, but its operations have good worst-case running time and are efficient in practice: it can do lookup, insertion, and deletion in O(log n) time, where n is the number of elements in the tree.

Characteristics of red-black tree:

①Each node is red or black.

②The root node is black.

③Each leaf node (NIL) is black. (Note: The leaf node here refers to the leaf node that is empty (NIL or NULL)).

④ Both child nodes of every red node are black. (No path from a leaf to the root can contain two consecutive red nodes.)

⑤ All paths from any node to each of its leaves contain the same number of black nodes (together with ④, this ensures that no path is more than twice as long as any other).

When we insert or delete a node, the existing red-black tree may be destroyed, so that it does not meet the above five requirements, then it needs to be processed at this time, so that it continues to meet the above five requirements:

① Recolor: change a node to red or black.

② Rotation: rotate some branches of the red-black tree (left rotation or right rotation).

The red-black tree can ensure the balance of the binary tree as much as possible through the red nodes and black nodes. It is mainly used to store ordered data, and its time complexity is O(logN), which is very efficient! 

Two: Source code analysis of commonly used collections

1. List interface analysis

1.1 List interface features

(1) All elements of a List collection are stored in a linear fashion. For example, if the elements are stored in the order 11, 22, 33, then they are held in the collection in the order 11, 22, 33.

(2) It is a collection with ordered access. That is, the order in which elements are stored and the order in which they are retrieved are guaranteed to be the same.

(3) It is an indexed collection, and the elements in the collection can be manipulated precisely by index (in the same way as the index of an array).

(4) It may contain duplicate elements. Whether two elements are duplicates is determined by the equals method of the elements.

 Note: The List collection cares about whether the elements are ordered, not whether they are repeated!
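The four features above can be observed with a few calls on a List. This is a minimal sketch; the class name ListFeaturesDemo is an assumption, and ArrayList is used only as one possible implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class ListFeaturesDemo {
    static List<Integer> build() {
        List<Integer> list = new ArrayList<>();
        list.add(11);
        list.add(22);
        list.add(33);
        list.add(22); // duplicates are allowed
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = build();
        System.out.println(list);                 // [11, 22, 33, 22] (insertion order is kept)
        System.out.println(list.get(2));          // 33 (access by index)
        System.out.println(list.indexOf(22));     // 1 (first match, found via equals)
        System.out.println(list.lastIndexOf(22)); // 3 (last match)
    }
}
```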

The main implementation classes of the List interface:

– ArrayList: dynamic array;

– Vector: dynamic array;

– LinkedList: doubly linked list;

– Stack: stack;

1.2 Dynamic Array ArrayList and Vector

There are two dynamic array implementations among the implementation classes of Java's List interface: ArrayList and Vector!

(1) The difference between ArrayList and Vector

Their underlying physical structures are arrays, which we call dynamic arrays!

①ArrayList is a new version of dynamic array, which is not thread-safe and has high efficiency. Vector is an old version of dynamic array, which is thread-safe and has low efficiency.

②The expansion mechanism of dynamic arrays is different. The default expansion of ArrayList is 1.5 times of the original size, and the default expansion of Vector is 2 times of the original size.

③ The initial capacity of the array differs. If no initial capacity is specified explicitly when constructing an ArrayList or Vector, the internal array of Vector has an initial capacity of 10 by default. ArrayList also used 10 in JDK 6.0 and earlier versions. From JDK 8.0 on, ArrayList is initialized with an empty array of length 0, and an array of length 10 is created only when the first element is added. Reason: creating the array only when it is actually used avoids waste. Many methods have ArrayList as their return type and therefore need to return an ArrayList object. For example, a method that queries objects from a database often returns an ArrayList. If the queried data does not exist, such a method returns either null or an ArrayList object with no elements.

(2) Partial source code analysis of ArrayList

In JDK7:

① When instantiating an object

In the new ArrayList() collection, the underlying layer will actually initialize an Object array with a length of 10:

ArrayList<String> list = new ArrayList<>();
// which is equivalent to
Object[] elementData = new Object[10];

② When the add method is called

When the add method is called, the underlying layer actually assigns the value to a slot of the Object array

list.add("AA");
// which is equivalent to
elementData[0] = "AA";

③ When expansion is required

When the 11th element is to be added, and the underlying elementData array is full, it needs to be expanded; the default expansion is 1.5 times the original

// Grow to 1.5 times the original capacity
int newCapacity = oldCapacity + (oldCapacity >> 1); 

and copy the elements of the original array to the new array 

// Copy into a new, larger array
elementData = Arrays.copyOf(elementData, newCapacity);

In JDK8:

① In JDK8, an Object array is also created on instantiation, but it is an array of length 0

ArrayList<String> list = new ArrayList<>();
// which is equivalent to
Object[] elementData = new Object[]{}; // statically initializes an array of length 0

② When the add method is called for the first time, the array is first initialized to length 10, and then the element is added

// Step 1
elementData = new Object[10];
// Step 2
elementData[0] = "AA";

Summary: In JDK7 and earlier, ArrayList resembles the eager ("hungry") style of the singleton pattern: the array is created as soon as the list is constructed. From JDK8 on, ArrayList resembles the lazy style of the singleton pattern: the array is created only when it is needed. Vector is also initialized with an Object array of length 10, but it grows to twice its original size!
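The JDK8-style behavior described above (empty array at construction, capacity 10 on the first add, growth by 1.5 times afterwards) can be illustrated with a simplified sketch. This is not the JDK source: the class SimpleDynamicArray and its capacity() helper are hypothetical names introduced for illustration, and overflow handling is left out.

```java
import java.util.Arrays;

public class SimpleDynamicArray {
    private static final int DEFAULT_CAPACITY = 10;

    // Lazy style: start with an empty array and allocate on the first add.
    private Object[] elementData = {};
    private int size;

    public void add(Object value) {
        if (size == elementData.length) {
            grow();
        }
        elementData[size++] = value;
    }

    private void grow() {
        int oldCapacity = elementData.length;
        // First allocation uses the default capacity; later ones grow by 1.5 times.
        int newCapacity = (oldCapacity == 0)
                ? DEFAULT_CAPACITY
                : oldCapacity + (oldCapacity >> 1);
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    public Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return elementData[index];
    }

    public int size() {
        return size;
    }

    // Hypothetical helper that exposes the length of the internal array.
    public int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        SimpleDynamicArray array = new SimpleDynamicArray();
        System.out.println(array.capacity()); // 0
        array.add("AA");
        System.out.println(array.capacity()); // 10
        for (int i = 0; i < 10; i++) {
            array.add(i);
        }
        System.out.println(array.size());     // 11
        System.out.println(array.capacity()); // 15
    }
}
```

A Vector-like variant would differ only in allocating the array of length 10 in the constructor and in doubling the capacity inside grow().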

1.3 Linked List LinkedList

There is an implementation of a double-linked list in Java: LinkedList, which is an implementation class of the List interface.

LinkedList is a doubly linked list, as shown in the figure:

The difference between linked list and dynamic array

①The underlying physical structure of a dynamic array is an array, so the efficiency of index access is very high. But insertions and deletions at non-end positions are not efficient because elements are moved. In addition, adding operations involves expansion issues, which will increase the consumption of time and space.

② The underlying physical structure of LinkedList is a linked list, so access by index is not efficient, that is, finding elements is slow. However, insertion and deletion do not need to move elements; they only need to modify the links of the neighboring nodes, so inserting and deleting are fast once the position has been located. In addition, adding to a linked list never involves expansion.

LinkedList source code analysis

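// Fields
transient Node<E> first; // the first node of the list
transient Node<E> last; // the last node of the list
transient int size = 0; // the number of elements

// Constructor
public LinkedList() {
}

// Methods related to add()
public boolean add(E e) {
    linkLast(e); // by default the new element is linked to the tail of the list
    return true;
}

void linkLast(E e) {
    final Node<E> l = last; // l records the original last node
    // Create the new node
    final Node<E> newNode = new Node<>(l, e, null);
    // The new node is now the last node
    last = newNode;
    // If l == null, the list was empty before
    if (l == null)
        // so the new node is also the first node
        first = newNode;
    else
        // otherwise link the new node into the next field of the original last node
        l.next = newNode;
    // Increase the element count
    size++;
    // Increase the modification count
    modCount++;
}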

// The Node class is defined as follows
private static class Node<E> {
    E item; // the element data
    Node<E> next; // the next node
    Node<E> prev; // the previous node

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}
// Methods related to get()
public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
} 

// Methods related to inserting with add()
public void add(int index, E element) {
    checkPositionIndex(index); // check the range of index

    if (index == size) // if index == size, link to the tail of the current list
        linkLast(element);
    else
        linkBefore(element, node(index));
}

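Node<E> node(int index) {
    // assert isElementIndex(index);
    /*
    index < (size >> 1) applies a bisection idea: index is first compared with half of size.
    If index < size/2, traverse forward from position 0 to position index; otherwise, traverse
    backward from position size - 1 to position index. This avoids part of the unnecessary traversal.
    */
    // If index < size/2, search for the target node from front to back
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else { // otherwise search for the target node from back to front
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}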

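// Insert the new node in front of succ, the node at position [index]
void linkBefore(E e, Node<E> succ) { // succ is the node at position [index]
    // assert succ != null;
    final Node<E> pred = succ.prev; // the node before position [index]

    // prev of the new node is the node that was before position [index]
    // next of the new node is the node that was at position [index]
    final Node<E> newNode = new Node<>(pred, e, succ);

    // prev of the node at position [index] now points to the new node
    succ.prev = newNode;

    // If the node at position [index] was the first node, the new node is now the first node
    if (pred == null)
        first = newNode;
    else
        pred.next = newNode; // next of the node before position [index] points to the new node
    size++;
    modCount++;
}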

// Methods related to remove()
public boolean remove(Object o) {
    // Two cases, depending on whether o is null
    if (o == null) {
        // Find the node x that corresponds to o
        for (Node<E> x = first; x != null; x = x.next) {
            if (x.item == null) {
                unlink(x); // remove node x
                return true;
            }
        }
    } else {
        // Find the node x that corresponds to o
        for (Node<E> x = first; x != null; x = x.next) {
            if (o.equals(x.item)) {
                unlink(x); // remove node x
                return true;
            }
        }
    }
    return false;
}
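E unlink(Node<E> x) { // x is the node to be removed
    // assert x != null;
    final E element = x.item; // the data of the removed node
    final Node<E> next = x.next; // the node after the removed node
    final Node<E> prev = x.prev; // the node before the removed node

    // If there is no node before the removed node, the removed node is the first node
    if (prev == null) {
        // so the node after the removed node becomes the first node
        first = next;
    } else { // the removed node is not the first node
        // next of the previous node points to the node after the removed node
        prev.next = next;
        // Break the link between the removed node and the previous node
        x.prev = null; // helps GC
    }

    // If there is no node after the removed node, the removed node is the last node
    if (next == null) {
        // so the node before the removed node becomes the last node
        last = prev;
    } else { // the removed node is not the last node
        // prev of the next node points to the node before the removed node
        next.prev = prev;
        // Break the link between the removed node and the next node
        x.next = null; // helps GC
    }
    // Clear the data of the removed node as well, to help GC
    x.item = null;
    // Decrease the element count
    size--;
    // Increase the modification count
    modCount++;
    // Return the data of the removed node
    return element;
}

public E remove(int index) { // index is the position of the element to remove
    checkElementIndex(index);
    return unlink(node(index));
}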
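The methods analyzed above can be exercised through the public API of java.util.LinkedList. In the following minimal sketch, the comments name the internal method that each call is expected to reach according to the source shown above; the class name LinkedListDemo is an illustrative assumption.

```java
import java.util.LinkedList;

public class LinkedListDemo {
    static LinkedList<String> build() {
        LinkedList<String> list = new LinkedList<>();
        list.add("A");    // linkLast on an empty list: [A]
        list.add("C");    // linkLast: [A, C]
        list.add(1, "B"); // linkBefore(node(1)): [A, B, C]
        list.add(3, "D"); // index == size, so linkLast: [A, B, C, D]
        list.remove("A"); // remove(Object) unlinks the first node: [B, C, D]
        list.remove(1);   // remove(int) unlinks node(1): [B, D]
        return list;
    }

    public static void main(String[] args) {
        LinkedList<String> list = build();
        System.out.println(list);        // [B, D]
        System.out.println(list.get(1)); // D
    }
}
```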

2. Map interface analysis

2.1 Physical structure of hash table

The underlying structure of HashMap and Hashtable is a hash table, which maintains an array of Entry type named table. In HashMap, the length of this array is always a power of 2. Each index position of the array is called a bucket. An added mapping (key, value) is ultimately wrapped in an object of type Map.Entry and placed in some table[index] bucket.

The purpose of using an array is efficient lookup and insertion: a given table[index] can be located directly from the index.

2.2 Data adding process in HashMap

In JDK7:

(1) HashMap map = new HashMap(); Under the hood, this creates an Entry[] array of length 16: Entry[] table = new Entry[16];

(2) map.put(key1, value1); adds (key1, value1) to the current HashMap object. First, the hashCode() method of key1's class is called to compute hash value 1 of key1. Hash value 1 is then processed by the hash() method to obtain hash value 2. Finally, hash value 2 is processed by the indexFor() method to determine the index position i in the underlying table array.

① If there is no data at index i of the array, (key1, value1) is added directly and successfully --- position 1;

② If there is already data at index i, for example (key2, value2), a further check is needed: compare hash value 2 of key1 with the hash value of key2:

③ If the hash values differ, (key1, value1) is added successfully --- position 2; if the hash values are the same, the equals() method of key1's class is called with key2 as the argument:

④ If equals returns false, (key1, value1) is added successfully --- position 3; if equals returns true, value1 replaces value2 by default.

(3) Position 1: (key1, value1) is stored as an Entry object directly at index i of the table array.

(4) Position 2 and position 3: (key1, value1) and the existing elements are stored at index i of the table array as a linked list, with the newly added element pointing to the previously added element (JDK7 uses head insertion).

(5) In the case of continuous addition, the capacity will be expanded if the following conditions are met:

if ((size >= threshold) && (null != table[bucketIndex]))

By default, expansion is considered once the number of stored elements reaches 12 (that is, array length 16 * loadFactor, where the load factor is 0.75) and the target bucket is not empty.

(6) The structure of Entry is as follows:

static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;  // assigned from hash value 2 computed from the key

    /**
         * Creates new entry.
         */
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }
}
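Some of the steps above can be sketched in a few lines. In the sketch below, indexFor mirrors the idea of masking the hash with length - 1, and threshold reproduces the 16 * 0.75 = 12 calculation. The class HashMapPutDemo and its helper methods are illustrative assumptions, not the JDK source, and the extra mixing performed by the hash() method is skipped. The second half uses the real java.util.HashMap with the keys "Aa" and "BB", which happen to share the same hashCode, to show the position 3 case and the overwrite case.

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapPutDemo {
    // For a power-of-two length, (length - 1) is an all-ones bit mask,
    // so the result always falls in the range [0, length).
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // The element count at which expansion is considered.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(17, 16));     // 1
        System.out.println(indexFor(-1, 16));     // 15
        System.out.println(threshold(16, 0.75f)); // 12

        // "Aa" and "BB" have the same hashCode but are not equal.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true

        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("Aa", 1)); // null (empty bucket)
        System.out.println(map.put("BB", 2)); // null (same hash, equals is false, both kept)
        System.out.println(map.put("Aa", 3)); // 1 (same key, value replaced, old value returned)
        System.out.println(map.size());       // 2
    }
}
```

The mask only works as a modulo when the length is a power of 2, which is one reason HashMap keeps its table length at a power of 2.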

In JDK8:

Differences between JDK8 and JDK7:

① In JDK8, creating a HashMap instance does not initialize the underlying table array. The first time a (key, value) pair is added, the table is checked and, if it has not been initialized yet, the array is created.
② In JDK8, HashMap defines the inner class Node, which replaces the Entry inner class of JDK7. This means the array that gets created is a Node[].
③ In JDK8, suppose the current (key, value) has passed the series of checks and can be added at array index i, and that index already holds an element. In JDK7 the new (key, value) points to the existing element (head insertion), while in JDK8 the existing element points to the new (key, value) (tail insertion). Mnemonic: "seven up, eight down".
JDK7: array + singly linked list; JDK8: array + singly linked list + red-black tree.
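The difference between head insertion and tail insertion can be seen on a bare singly linked list. The sketch below is illustrative only; BucketInsertionDemo and its methods are hypothetical names and do not appear in the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these names are made up and are not JDK classes.
class BucketInsertionDemo {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // JDK 7 style: the new node becomes the head and points to the old head.
    static Node insertHead(Node head, String key) {
        return new Node(key, head);
    }

    // JDK 8 style: walk to the last node; the old tail points to the new node.
    static Node insertTail(Node head, String key) {
        Node added = new Node(key, null);
        if (head == null) {
            return added;
        }
        Node p = head;
        while (p.next != null) {
            p = p.next;
        }
        p.next = added;
        return head;
    }

    // Collects the keys from head to tail.
    static List<String> keys(Node head) {
        List<String> out = new ArrayList<>();
        for (Node p = head; p != null; p = p.next) {
            out.add(p.key);
        }
        return out;
    }

    public static void main(String[] args) {
        Node jdk7 = null;
        Node jdk8 = null;
        for (String k : new String[] {"A", "B", "C"}) {
            jdk7 = insertHead(jdk7, k);
            jdk8 = insertTail(jdk8, k);
        }
        System.out.println(keys(jdk7)); // [C, B, A]
        System.out.println(keys(jdk8)); // [A, B, C]
    }
}
```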
When will the one-way linked list become a red-black tree?

If the number of elements at index i of the array reaches 8 and the length of the array reaches 64, the elements at index i are converted to a red-black tree for storage. If the array is still shorter than 64, the array is expanded instead. (Why convert? put/get/remove on a red-black tree take O(log n) time, which is better than the O(n) of a singly linked list, so performance is higher.)

When does a red-black tree turn back into a singly linked list?

When the number of elements in the red-black tree at index i drops to 6 or fewer, the tree degenerates back into a singly linked list.
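The conversion rules can be written down as two small decision functions. This is a sketch of the rules as described here, not the JDK code: the class TreeifyRules, the Action enum and both methods are hypothetical. The three constant values match the constants of the same names in JDK8's HashMap, but the exact point at which the real source counts the nodes may differ by one, and the sketch assumes the untreeify check is "6 or fewer".

```java
// Sketch of the decision rules only; TreeifyRules and Action are hypothetical names.
class TreeifyRules {
    // Values of the constants with the same names in JDK 8's java.util.HashMap.
    static final int TREEIFY_THRESHOLD = 8;
    static final int UNTREEIFY_THRESHOLD = 6;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE_TABLE, CONVERT_TO_TREE }

    // What happens to a bucket holding nodesInBucket nodes in a table of tableLength slots.
    static Action onLongBucket(int nodesInBucket, int tableLength) {
        if (nodesInBucket < TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE_TABLE;      // table too small: expand instead of treeifying
        }
        return Action.CONVERT_TO_TREE;
    }

    // Assumption: a tree bucket reverts to a list at 6 or fewer nodes.
    static boolean shouldUntreeify(int nodesInTreeBucket) {
        return nodesInTreeBucket <= UNTREEIFY_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(onLongBucket(8, 16)); // RESIZE_TABLE
        System.out.println(onLongBucket(8, 64)); // CONVERT_TO_TREE
        System.out.println(shouldUntreeify(6));  // true
    }
}
```

The gap between 8 and 6 means a bucket whose size hovers around one value is not converted back and forth on every insertion and removal.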

2.3 LinkedHashMap

LinkedHashMap is a subclass of HashMap. On top of the array + singly linked list + red-black tree used by HashMap, LinkedHashMap adds a doubly linked list (a pair of references per entry) that records the order in which the (key, value) pairs were added, which makes it convenient to traverse all key-value pairs in that order.

The source code shows that calling put on a LinkedHashMap actually runs HashMap's put method. After hash value 1 and hash value 2 have been computed and the array index has been determined, the element is actually added by calling the newNode method.

// LinkedHashMap overrides the following HashMap method
Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    LinkedHashMap.Entry<K,V> p = new LinkedHashMap.Entry<K,V>(hash, key, value, e);
    linkNodeLast(p);
    return p;
}

The newNode method calls the constructor of LinkedHashMap.Entry, which shows that LinkedHashMap has its own Entry class internally.

Underlying structure: LinkedHashMap defines an inner class Entry:
static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after; // the added pair of references that form the doubly linked list
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}
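The effect of the before/after links is visible from the outside: iterating a LinkedHashMap returns the keys in the order they were first added. A minimal sketch follows; InsertionOrderDemo, fill and keysInIterationOrder are names invented for this example, and the keys and values are arbitrary sample data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative demo; the class and method names are made up for this example.
class InsertionOrderDemo {
    // Adds three keys, then re-puts the first key with a new value.
    static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("Tom", 23);
        map.put("Jerry", 19);
        map.put("Anna", 31);
        map.put("Tom", 24); // overwrites the value; the position of "Tom" is unchanged
        return map;
    }

    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap: iteration follows insertion order.
        System.out.println(keysInIterationOrder(fill(new LinkedHashMap<>()))); // [Tom, Jerry, Anna]
        // HashMap: iteration order depends on the bucket layout, not on insertion order.
        System.out.println(keysInIterationOrder(fill(new HashMap<>())));
    }
}
```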

Classic interview example

① p1 is added under the hash value computed from 1001 and "AA"; p2 is added under the hash value computed from 1002 and "BB".

② After p1.name is changed to "CC", remove(p1) fails. The removal looks for an entry with the hash value of 1001 and "CC", but p1 was stored under the hash value of 1001 and "AA", and the two differ.

③ new Person(1001, "CC") can now be added. An equal object (p1) is already in the set, but it is stored under the hash value of 1001 and "AA", while the new object is looked up by the hash value of 1001 and "CC", so no duplicate is found.

④ new Person(1001, "AA") can also be added. Its hash value matches the one p1 was stored under, so equals() is called. p1 now holds 1001 and "CC" while the new object holds 1001 and "AA", so equals() returns false and the addition succeeds.

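package com.atguigu03.map.interview;

import java.util.HashSet;

// Person has overridden both hashCode() and equals()
public class HashSetDemo {
    public static void main(String[] args) {
        HashSet<Person> set = new HashSet<>();
        Person p1 = new Person(1001, "AA");
        Person p2 = new Person(1002, "BB");

        set.add(p1);
        set.add(p2);
        System.out.println(set);

        // p1 stays stored under its old hash value, so remove() cannot find it
        p1.name = "CC";
        set.remove(p1);
        System.out.println(set);

        // the hash of (1001, "CC") differs from the hash stored for p1: added
        set.add(new Person(1001, "CC"));
        System.out.println(set);

        // same hash as stored for p1, but equals() now returns false: added
        set.add(new Person(1001, "AA"));
        System.out.println(set);
    }
}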

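Result: the four printouts show 2, 2, 3 and 4 elements in turn. remove(p1) deletes nothing, and both new Person objects are added.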
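The demo above depends on a Person class that is not shown. The following self-contained variant is a sketch under the assumption that Person builds equals() and hashCode() from both id and name; the Person class here and the name MutableKeyDemo are hypothetical stand-ins, not the original author's classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Self-contained variant of the interview example; Person here is a hypothetical stand-in.
class MutableKeyDemo {
    static final class Person {
        final int id;
        String name; // mutable on purpose, to reproduce the problem

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }
    }

    // Returns the set size after each of the four steps of the example.
    static int[] sizesAfterEachStep() {
        Set<Person> set = new HashSet<>();
        Person p1 = new Person(1001, "AA");
        Person p2 = new Person(1002, "BB");
        set.add(p1);
        set.add(p2);
        int afterAdd = set.size();

        p1.name = "CC";     // changes p1's hashCode after it was stored
        set.remove(p1);     // looked up by the new hash: nothing is removed
        int afterRemove = set.size();

        set.add(new Person(1001, "CC"));  // no entry is stored under this hash: added
        int afterCc = set.size();

        set.add(new Person(1001, "AA"));  // hash matches p1's entry, equals() is false: added
        int afterAa = set.size();

        return new int[] {afterAdd, afterRemove, afterCc, afterAa};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizesAfterEachStep())); // [2, 2, 3, 4]
    }
}
```

The practical lesson is to avoid changing fields that take part in hashCode() while the object is stored in a HashSet or used as a HashMap key.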


Origin blog.csdn.net/m0_61933976/article/details/130198303