Stage 10: Summary of the topic (Chapter 1: Basics)


Chapter 1: Basics

Key points of the Basics chapter: algorithms, data structures, and basic design patterns

1. Binary search

Requirements

  • Be able to describe the binary search algorithm in your own language
  • Be able to hand-write binary search code and master its details
  • Quickly answer multiple-choice questions about binary search
  • Be able to answer variations of common test questions

Algorithm Description

  1. Prerequisite: the array A is already sorted (assume this has been done)

  2. Define the left boundary L and the right boundary R to determine the search range, then loop through steps 3 and 4

  3. Get the middle index M = Floor((L + R) / 2) (Floor means rounding down; in JS, (L + R) / 2 can yield a decimal)

  4. Compare A[M], the value at the middle index, with the target value T

    ① A[M] == T: found, return the middle index

    ② A[M] > T: every element to the right of the middle value is also greater than T and need not be compared; search the left half by setting the right boundary to M - 1 and searching again

    ③ A[M] < T: every element to the left of the middle value is also less than T and need not be compared; search the right half by setting the left boundary to M + 1 and searching again

  5. When L > R, the target is not present and the loop ends

For a more vivid description, please refer to: binary_search.html

Algorithm implementation

public static int binarySearch(int[] a, int t) {
    int l = 0, r = a.length - 1, m;
    while (l <= r) {
        // m = (l + r) / 2;
        m = (l + r) >>> 1; // unsigned right shift by one; efficient; for non-negative numbers this equals dividing by two
        if (a[m] == t) {
            return m;
        } else if (a[m] > t) {
            r = m - 1;
        } else {
            l = m + 1;
        }
    }
    return -1;
}

Test code

public static void main(String[] args) {
    int[] array = {1, 5, 8, 11, 19, 22, 31, 35, 40, 45, 48, 49, 50};
    int target = 47;
    int idx = binarySearch(array, target);
    System.out.println(idx);
}

The program prints: -1 (47 is not in the array)

Solving the integer overflow problem
When l and r are both large, l + r may exceed the int range (overflow), producing a wrong result:

There are two solutions:

int m = l + (r - l) / 2;

Another one is:

int m = (l + r) >>> 1;   // unsigned right shift by one; efficient; for non-negative numbers this is equivalent to dividing by two
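A minimal sketch (not from the original material) showing the overflow in action; the boundary values below are hypothetical, chosen only to make l + r exceed the int range:

public class MidpointOverflowDemo {
    public static void main(String[] args) {
        // hypothetical indices close to Integer.MAX_VALUE, chosen only to trigger overflow
        int l = 2_000_000_000;
        int r = 2_100_000_000;

        int overflowed = (l + r) / 2;   // l + r overflows int, so the midpoint comes out negative
        int safe1 = l + (r - l) / 2;    // subtract first: no overflow
        int safe2 = (l + r) >>> 1;      // unsigned shift treats the overflowed sum as an unsigned value

        System.out.println(overflowed); // a wrong, negative midpoint
        System.out.println(safe1);      // 2050000000
        System.out.println(safe2);      // 2050000000
    }
}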

Typical test questions

  1. Given the sorted list 1, 5, 8, 11, 19, 22, 31, 35, 40, 45, 48, 49, 50, how many comparisons are needed to successfully find the element 48 with binary search?

  2. When using binary search to find the element 81 in the sequence 1, 4, 6, 7, 15, 33, 39, 50, 64, 78, 75, 81, 89, 96, ( ) comparisons are required

  3. Binary search in an array of 128 elements requires at most how many comparisons?

  For the first two questions, remember a quick rule of thumb: when splitting an odd number of elements, take the middle one; when splitting an even number of elements, take the element just left of the middle (the next split does not include this element). For the last question, you need to know the formula:

n = log₂N = log₁₀N / log₁₀2

where n is the number of comparisons and N is the number of elements

Summary:

  • When splitting an odd number of elements, take the middle one; when splitting an even number of elements, take the element just left of the middle (the next split does not include this element)
  • n = log₂N = log₁₀N / log₁₀2

where n is the number of comparisons and N is the number of elements
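To verify answers to questions like these, one option is a small sketch based on the binarySearch code above with a counter added; note that counting one comparison per examined middle element is an assumption, since some textbooks count a[m] == t and a[m] > t as separate comparisons:

// Counts how many middle elements are examined before the target is found.
public static int countComparisons(int[] a, int t) {
    int l = 0, r = a.length - 1, count = 0;
    while (l <= r) {
        int m = (l + r) >>> 1;
        count++;                 // one middle element examined
        if (a[m] == t) {
            return count;
        } else if (a[m] > t) {
            r = m - 1;
        } else {
            l = m + 1;
        }
    }
    return count; // not found; number of middle elements examined
}

// Example: with the 13-element list from question 1 and target 48,
// countComparisons returns 4 under this counting convention.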


2. Bubble sort (code test)

Sorting-related interview questions:

  • Master the implementation ideas of common sorting algorithms (quick sort, bubble sort, selection sort, insertion sort, etc.)
  • Be able to hand-write bubble sort and quick sort code
  • Know the characteristics of each sorting algorithm, such as time complexity and whether it is stable

Requirements

  • Be able to describe the bubble sort algorithm in your own language
  • Ability to handwrite bubble sort code
  • Learn about some optimization methods for bubble sorting

Algorithm Description

  1. Compare adjacent elements of the array from left to right; if a[j] > a[j+1], swap them. Comparing all adjacent pairs once is called one round of bubbling; after a round, the largest element ends up at the end
  2. Repeat the above steps until the entire array is sorted

For a more vivid description, please refer to: bubble_sort.html

Algorithm implementation

public static void bubble(int[] a) {
    for (int j = 0; j < a.length - 1; j++) {
        // one round of bubbling
        boolean swapped = false; // whether a swap happened in this round
        for (int i = 0; i < a.length - 1 - j; i++) {
            // (a.length - 1 - j) reduces the number of comparisons each round
            System.out.println("comparison " + i);
            if (a[i] > a[i + 1]) {
                swap(a, i, i + 1); // swap
                swapped = true;    // a swap happened
            }
        }
        System.out.println("round " + j + " of bubbling: "
                           + Arrays.toString(a));
        if (!swapped) {
            // no swap in this round: already sorted, stop early
            break;
        }
    }
}

Swap function:
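The swap helper was shown only as an image in the original post; a minimal sketch of what it presumably looks like:

// Swaps the elements at indices i and j of array a.
public static void swap(int[] a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
}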

  • Optimization point 1: after each round of bubbling, the inner loop can do one fewer comparison
  • Optimization point 2: if no swap happens in a round of bubbling, all the data is already in order and the outer loop can stop

Further optimization

public static void bubble_v2(int[] a) {
    int n = a.length - 1;
    while (true) {
        int last = 0; // index of the last swap in this round
        for (int i = 0; i < n; i++) {
            System.out.println("comparison " + i);
            if (a[i] > a[i + 1]) {
                swap(a, i, i + 1);
                last = i; // index of the last swap in this round; after the for loop, everything to its right is sorted
            }
        }
        n = last;
        System.out.println("round of bubbling: "
                           + Arrays.toString(a));
        if (n == 0) {
            break;
        }
    }
}
  • In each round of bubbling, the index of the last swap can be used as the comparison limit for the next round; if that value is 0, the whole array is in order and the outer loop can exit


3. Selection sort (code test)

Requirements

  • Be able to describe the selection sort algorithm in your own language
  • Able to compare selection sort and bubble sort
  • Understand unstable sorting and stable sorting

Algorithm Description

  1. Divide the array into two subsets, sorted and unsorted. In each round, select the smallest element from the unsorted subset and put it into the sorted subset.

  2. Repeat the above steps until the entire array is sorted

For a more vivid description, please refer to: selection_sort.html

Algorithm implementation
The left part is already sorted (growing from the start of the array) and the right part is unsorted; keep looking to the right for the smallest element and swap it into place on the left:

public static void selection(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        // i is the target index where this round's minimum will be placed
        int s = i; // index of the minimum element found so far
        for (int j = s + 1; j < a.length; j++) {
            if (a[s] > a[j]) {
                // element j is smaller than element s, so update s
                s = j; // s records the index of the smallest value on the right
            }
        }
        if (s != i) {
            swap(a, s, i); // swap
        }
        System.out.println(Arrays.toString(a));
    }
}
  • Optimization point: to reduce the number of swaps, first find the index of the smallest element in each round, and swap only once at the end of the round (the code above already does this)

Compare with bubble sort

  1. The average time complexity of both is O(n²)

  2. Selection sort is generally faster than bubble sort because it requires fewer exchanges.

  3. But if the data is already highly ordered, bubble sort performs better than selection sort

  4. Bubble sort is a stable sorting algorithm, while selection sort is unstable

    • Stable sorting means that when the same data is sorted repeatedly by different fields, elements with equal values keep their relative order
    • Unstable sorting does not guarantee this

Stable sorting and unstable sorting
(the code below is just to illustrate the problem and is not important)

System.out.println("================= unstable ================");
Card[] cards = getStaticCards();
System.out.println(Arrays.toString(cards));
selection(cards, Comparator.comparingInt((Card a) -> a.sharpOrder).reversed());
System.out.println(Arrays.toString(cards));
selection(cards, Comparator.comparingInt((Card a) -> a.numberOrder).reversed());
System.out.println(Arrays.toString(cards));

System.out.println("================= stable =================");
cards = getStaticCards();
System.out.println(Arrays.toString(cards));
bubble(cards, Comparator.comparingInt((Card a) -> a.sharpOrder).reversed());
System.out.println(Arrays.toString(cards));
bubble(cards, Comparator.comparingInt((Card a) -> a.numberOrder).reversed());
System.out.println(Arrays.toString(cards));
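The Card class, getStaticCards() and the comparator-based selection/bubble overloads are not shown in the post. A rough, hypothetical sketch of the data they might use (the field names sharpOrder and numberOrder are taken from the comparators above; everything else is an assumption), matching the initial output [[♠7], [♠2], [♠4], [♠5], [♥2], [♥5]]:

// Hypothetical sketch of the Card type and fixture used by the stability demo (placed inside the test class).
static class Card {
    int sharpOrder;   // suit order, e.g. ♠ = 4, ♥ = 3, ♣ = 2, ♦ = 1
    int numberOrder;  // rank order, e.g. 2 .. 7

    Card(int sharpOrder, int numberOrder) {
        this.sharpOrder = sharpOrder;
        this.numberOrder = numberOrder;
    }

    @Override
    public String toString() {
        String suit = sharpOrder == 4 ? "♠" : "♥"; // only two suits appear in the demo output
        return "[" + suit + numberOrder + "]";
    }
}

static Card[] getStaticCards() {
    return new Card[]{new Card(4, 7), new Card(4, 2), new Card(4, 4),
                      new Card(4, 5), new Card(3, 2), new Card(3, 5)};
}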

Both runs sort by suit first (♠♥♣♦), and then by rank (A K Q J …)

  • When the unstable algorithm then sorts by rank, it disrupts the original suit order among cards with the same rank:

    [[♠7], [♠2], [♠4], [♠5], [♥2], [♥5]]
    [[♠7], [♠5], [♥5], [♠4], [♥2], [♠2]]
    

    Originally ♠2 came before ♥2; after sorting by rank their relative positions changed.

  • When the stable sorting algorithm sorts by numbers, it will retain the original order of suits with the same value, as shown below. The relative positions of ♠2 and ♥2 remain unchanged.

    [[♠7], [♠2], [♠4], [♠5], [♥2], [♥5]]
    [[♠7], [♠5], [♥5], [♠4], [♠2], [♥2]]
    

4. Insertion sort

Requirements

  • Be able to describe the insertion sort algorithm in your own language
  • Able to compare insertion sort with selection sort

Algorithm Description

  1. Divide the array into two areas, the sorted area and the unsorted area. In each round, the first element is taken from the unsorted area and inserted into the sorted area (the order needs to be guaranteed)

  2. Repeat the above steps until the entire array is sorted

  3. No swap function is needed (elements are shifted instead of swapped)

For a more vivid description, please refer to: insertion_sort.html

Algorithm implementation
The left part is already sorted (in ascending order) and the right part is unsorted. Each round takes the first element of the right part, compares it with the sorted elements on the left, and inserts it so that the left part always stays sorted:

// The code has been adjusted so its structure matches shell sort
public static void insert(int[] a) {
    // i is the index of the element to be inserted
    for (int i = 1; i < a.length; i++) {
        int t = a[i]; // the value to be inserted
        int j = i;
        System.out.println(j);
        while (j >= 1) {
            if (t < a[j - 1]) {
                // j-1 is the previous element's index; if that element is > t, shift it right
                a[j] = a[j - 1];
                j--;
            } else {
                // if a[j-1] is already <= t, then j is the insertion position
                break; // exit the loop to avoid unnecessary comparisons
            }
        }
        a[j] = t; // insert the element at its proper position
        System.out.println(Arrays.toString(a) + " " + j);
    }
}

Compare with selection sort

  1. The average time complexity of both is O(n²)

  2. In most cases, insertion is slightly better than selection

  3. For an already sorted collection, insertion sort runs in O(n)

  4. Insertion is a stable sorting algorithm, while selection is an unstable sorting algorithm.

Hint

Insertion sort is often underestimated, but it is actually important: for small amounts of data, insertion sort is the preferred choice.

Test questions:
The answer to the first question is D and the answer to the second question is B.

5. Shell sort (an improved version of insertion sort; only the idea needs to be mastered, no code required)

Requirements

  • Be able to describe the shell sort algorithm in your own language

Algorithm Description

  1. First select a gap sequence, such as (n/2, n/4, ..., 1), where n is the length of the array

  2. In each round, elements separated by the same gap (for example indices 0, 2, 4, 6, 8 when the gap is 2) are treated as one group, and insertion sort is applied within each group, for two purposes:

    ① Insertion sort on a small number of elements is very fast

    ② It lets elements with larger values move toward the back faster

  3. When the gap gradually decreases to 1, the final pass completes the sort

For a more vivid description, please refer to: shell_sort.html

Algorithm implementation

private static void shell(int[] a) {
    int n = a.length;
    for (int gap = n / 2; gap > 0; gap /= 2) {
        // integer division halves the gap each round
        // i is the index of the element to be inserted
        for (int i = gap; i < n; i++) {
            int t = a[i]; // the value to be inserted
            int j = i;
            while (j >= gap) {
                // insertion sort against the previous element that is gap positions away
                if (t < a[j - gap]) {
                    // j-gap is the previous element's index; if that element is > t, shift it right
                    a[j] = a[j - gap];
                    j -= gap;
                } else {
                    // if a[j-gap] is already <= t, then j is the insertion position
                    break;
                }
            }
            a[j] = t;
            System.out.println(Arrays.toString(a) + " gap:" + gap);
        }
    }
}


6. Quick sort

Requirements

  • Be able to describe the quick sort algorithm in your own language
  • Be able to hand-write at least one of the single-sided loop and double-sided loop versions
  • Be able to explain the characteristics of quick sort
  • Understand the performance comparison of the Lomuto and Hoare partition schemes

Algorithm Description

  1. Each round of sorting selects a pivot element and partitions around it
    1. Elements smaller than the pivot go into one partition, elements larger than the pivot go into the other
    2. When partitioning completes, the pivot element is in its final position
  2. Repeat the process within each sub-partition until a sub-partition has at most one element; this embodies the idea of divide and conquer
  3. As the description shows, the key is the partitioning algorithm; common schemes include the Lomuto partition scheme, the double-sided loop partition scheme, and the Hoare partition scheme

For a more vivid description, please refer to: quick_sort.html

Single-sided loop quick sort (Lomuto partition scheme)

  1. Select the rightmost element as the pivot

  2. The j pointer looks for elements smaller than the pivot; once one is found, it is swapped with the element at i

  3. The i pointer maintains the boundary of elements smaller than the pivot and is also the target index of each swap

  4. Finally, the pivot is swapped with the element at i, and i is the partition position

In the following code, a is the array, l is the left boundary, and h is the right boundary:

public static void quick(int[] a, int l, int h) {
    // recursion
    if (l >= h) {
        return;
    }
    int p = partition(a, l, h); // p is the pivot's final index
    quick(a, l, p - 1); // recurse into the left partition
    quick(a, p + 1, h); // recurse into the right partition
}

private static int partition(int[] a, int l, int h) {
    // perform the partition
    int pv = a[h]; // pivot element
    int i = l;     // i maintains the boundary of elements smaller than the pivot and is the target index of each swap
    for (int j = l; j < h; j++) {
        // j looks for elements smaller than the pivot; once one is found, swap it with the element at i
        if (a[j] < pv) {
            if (i != j) {
                swap(a, i, j);
            }
            i++;
        }
    }
    if (i != h) {
        swap(a, h, i); // swap the pivot with the element at i; i is the partition position
    }
    System.out.println(Arrays.toString(a) + " i=" + i);
    // the return value is the pivot's final index, used to determine the next round's partition boundaries
    return i;
}

Double-sided loop quick sort (not exactly equivalent to the Hoare partition scheme)

  1. Select the leftmost element as the pivot
  2. The j pointer looks from right to left for elements smaller than the pivot, and the i pointer looks from left to right for elements larger than the pivot; once both are found they are swapped, until i and j meet
  3. Finally, the pivot is swapped with the element at i (i and j are equal at this point), and i is the partition position

Main points

  1. The pivot is on the left edge, so j must move before i

  2. while (i < j && a[j] > pv) j--

  3. while (i < j && a[i] <= pv) i++

In the following code, a is the array, l is the left boundary, and h is the right boundary:

private static void quick(int[] a, int l, int h) {
    if (l >= h) {
        return;
    }
    int p = partition(a, l, h);
    quick(a, l, p - 1);
    quick(a, p + 1, h);
}

private static int partition(int[] a, int l, int h) {
    int pv = a[l];
    int i = l;
    int j = h;
    while (i < j) {
        // j scans from the right for an element smaller than the pivot
        while (i < j && a[j] > pv) {
            j--;
        }
        // i scans from the left for an element larger than the pivot
        while (i < j && a[i] <= pv) {
            i++;
        }
        swap(a, i, j); // swap the elements at positions i and j
    }
    swap(a, l, j); // now i == j; swap the element at that position with the pivot
    System.out.println(Arrays.toString(a) + " j=" + j);
    return j;
}

Characteristics of quick sort

  1. The average time complexity is O(n·log₂n), and the worst-case time complexity is O(n²)

  2. When the amount of data is large, the advantages are very obvious

  3. It is an unstable sort

Lomuto partition scheme vs. Hoare partition scheme

Supplementary code description

  • day01.sort.QuickSort3 demonstrates a double-sided quick sort improved with the pit (hole-filling) technique, which uses fewer comparisons
  • day01.sort.QuickSortHoare demonstrates the implementation of Hoare partitioning
  • day01.sort.LomutoVsHoare compares the number of element moves of the four partition implementations

7. ArrayList

Requirements

  • Master ArrayList expansion rules

Expansion rules

  1. ArrayList() initially uses an empty array of length 0

  2. ArrayList(int n) uses an array with the specified capacity n

  3. public ArrayList(Collection<? extends E> c) uses the size of collection c as the array capacity

  4. add(Object o): the first expansion grows the array to capacity 10; after that, each expansion grows it to 1.5 times the previous capacity

  5. addAll(Collection c): when the list is empty, it expands to Math.max(10, actual number of elements); when it already has elements, it expands to Math.max(1.5 × old capacity, actual number of elements)

Point 4 is a must-know; the rest are optional. A reflection-based sketch that observes these rules follows below.
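A minimal sketch (not the day01 test code) that observes rule 4 by reading the backing array's length via reflection; on JDK 9+ it needs the --add-opens java.base/java.util=ALL-UNNAMED option mentioned in the hint below:

import java.lang.reflect.Field;
import java.util.ArrayList;

public class ArrayListGrowDemo {
    // Reads the length of ArrayList's internal elementData array via reflection.
    static int capacityOf(ArrayList<?> list) throws Exception {
        Field f = ArrayList.class.getDeclaredField("elementData");
        f.setAccessible(true);
        return ((Object[]) f.get(list)).length;
    }

    public static void main(String[] args) throws Exception {
        ArrayList<Integer> list = new ArrayList<>();
        System.out.println("initial capacity: " + capacityOf(list)); // 0: the array is created lazily
        int last = 0;
        for (int i = 1; i <= 100; i++) {
            list.add(i);
            int cap = capacityOf(list);
            if (cap != last) { // print only when an expansion has just happened
                System.out.println("size=" + i + " capacity=" + cap); // 10, 15, 22, 33, ...
                last = cap;
            }
        }
    }
}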

Hint

  • See the test code day01.list.TestArrayList, which will not be listed here.
  • Note that the example uses reflection to show ArrayList's expansion behavior more directly. Because of the module system, JDK 9+ restricts reflection, so you need to add the VM option --add-opens java.base/java.util=ALL-UNNAMED when running the test code. The later examples have the same requirement.

Code description

  • day01.list.TestArrayList#arrayListGrowRule demonstrates the expansion rule of the add(Object) method; the parameter n controls how many expansions' worth of array lengths are printed

8. Iterator

Used to traverse collections

Requirements

  • Understand what fail-fast is and what fail-safe is

Fail-Fast and Fail-Safe

  • ArrayList is a typical fail-fast implementation: it must not be modified while it is being traversed, and it fails as early as possible (by throwing ConcurrentModificationException)

  • CopyOnWriteArrayList is a typical fail-safe implementation: it can be modified while being traversed. The principle is to separate reads from writes (copy-on-write); see the sketch below
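A minimal sketch (not the day01 demo) contrasting the two behaviors; the modification happens in the same thread purely to trigger the difference:

import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class FailFastVsFailSafeDemo {
    public static void main(String[] args) {
        List<Integer> failFast = new ArrayList<>(List.of(1, 2, 3));
        try {
            for (Integer i : failFast) {
                failFast.add(4); // structural modification during traversal
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("ArrayList failed fast: " + e);
        }

        List<Integer> failSafe = new CopyOnWriteArrayList<>(List.of(1, 2, 3));
        for (Integer i : failSafe) {
            failSafe.add(4); // allowed: the iterator works on a snapshot of the backing array
        }
        System.out.println("CopyOnWriteArrayList after traversal: " + failSafe);
    }
}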

Hint

  • See the test code day01.list.FailFastVsFailSafe, which will not be listed here.

9. LinkedList (linked list)

Requirements

  • Be able to clearly explain the differences between LinkedList and ArrayList, and be aware of some common misconceptions

LinkedList (rarely used in real development)

  1. Based on doubly linked list, no need for continuous memory
  2. Random access is slow (traversing along the linked list)
  3. Insertion and deletion at the head or tail is fast; insertion and deletion in the middle performs very poorly
  4. Occupies more memory

ArrayList

  1. Array-based, requires contiguous memory
  2. Fast random access (referring to access based on subscripts)
  3. Insertion and deletion at the tail is fine; insertion and deletion elsewhere requires moving data, so performance is lower
  4. It benefits from the CPU cache and the principle of locality (main memory is slow to read and write compared with the CPU cache; when the cache loads one array element it also loads the neighboring elements, whereas a linked list does not make good use of the CPU cache)

A linked list is not well suited to lookups.

Code description

  • day01.list.ArrayListVsLinkedList#randomAccess Comparison of random access performance
  • day01.list.ArrayListVsLinkedList#addMiddle compares the performance of inserting into the middle
  • day01.list.ArrayListVsLinkedList#addFirst Comparison of header insertion performance
  • day01.list.ArrayListVsLinkedList#addLast comparison of tail insertion performance
  • day01.list.ArrayListVsLinkedList#linkedListSize prints the memory occupied by a LinkedList
  • day01.list.ArrayListVsLinkedList#arrayListSize prints the memory occupied by an ArrayList

10. HashMap (implementation details and underlying principles; important)

Requirements

  • Master the basic data structure of HashMap
  • Master treeification (conversion to a red-black tree)
  • Understand how the index is calculated, the purpose of the secondary hash, and the impact of capacity on index calculation
  • Master the put process, expansion, and the load factor
  • Understand the problems that can occur when HashMap is used concurrently
  • Understand the requirements for key design

1) Basic data structure

Q: What is the difference between 1.7 and 1.8 in the underlying storage structure?

  • 1.7 Array + linked list
  • 1.8 Array + (linked list | red-black tree)

    Linked lists and red-black trees can convert into each other: when a linked list has too many elements it is converted into a red-black tree, and when a red-black tree has too few elements it degenerates back into a linked list.

For a more vivid demonstration, see hash-demo.jar in the course materials. Running it requires JDK 14 or above; enter the jar's directory and execute the following command:

java -jar --add-exports java.base/jdk.internal.misc=ALL-UNNAMED hash-demo.jar

  A hash table supports fast lookup: to check whether the table contains an element a, compute a's hash code, take it modulo the array length to get a's bucket index, and then check that bucket in the array; only a few comparisons are needed.

  Expansion: when the number of stored elements reaches 3/4 of the array capacity, an expansion is triggered; bucket indexes are recomputed and the elements are redistributed across the array.
  This also helps keep the linked lists from growing too long.
  However, when several elements share the same hash value, they stay in the same linked list no matter how much the capacity grows, and the list keeps getting longer; the solution is to convert the linked list into a red-black tree.

2) Treeification and degeneration

Question: Why use a red-black tree? Why not treeify from the start? Why is the treeification threshold 8? When does treeification happen? When does the tree degenerate back into a linked list?

The purpose of treeification

  • The red-black tree is used to defend against DoS attacks (an attacker can construct many keys with the same hash value; without a red-black tree the linked list would become very long and hurt performance) and to prevent performance degradation when a list grows too long. Treeification should be the exception, a last-resort strategy
  • Hash table lookup and update is O(1), while red-black tree lookup and update is O(log₂n); a TreeNode also takes more space than an ordinary Node, so a linked list is preferred whenever possible
  • If hash values are random enough, chain lengths in the hash table follow a Poisson distribution; with a load factor of 0.75, the probability of a list longer than 8 is 0.00000006. The treeification threshold of 8 was chosen to make treeification sufficiently unlikely (the intent is to avoid treeifying at all)

There are two conditions for treeification (both must be met)

  • When the length of a linked list exceeds the treeification threshold of 8, the map first tries to expand the array in order to shorten the list
  • If the array capacity is already >= 64, treeification is performed (in the tree, the child to the left of a parent node is smaller than it, and the child to the right is larger)

Degeneration rules

  • Case 1: during expansion, when a tree is split, if the number of tree elements is <= 6 the tree degenerates into a linked list
  • Case 2: when removing a tree node (checked before removal), if any of root, root.left, root.right, or root.left.left is null, the tree also degenerates into a linked list

3) Index calculation

Q: How is the index calculated? We already have hashCode(); why provide an extra hash() method? Why must the array capacity be a power of 2?

Index calculation method

  • First, compute the object's hashCode()
  • Then call HashMap's hash() method to perform a secondary hash
    • The secondary hash() mixes in the high-order bits to make the hash distribution more even (to help keep linked lists short)
  • Finally, bitwise-AND with (capacity - 1) to get the index

Why must the array capacity be a power of 2?

  1. Index calculation is more efficient: if the capacity is a power of 2, a bitwise AND can replace the modulo operation
  2. Recomputing indexes during expansion is more efficient: elements whose hash & oldCap == 0 stay at their original position; otherwise the new position = old position + oldCap (see the sketch below)
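A small sketch illustrating both points: for a power-of-2 capacity, hash & (capacity - 1) equals hash % capacity, and during a doubling expansion the hash & oldCap bit decides whether an element stays or moves by oldCap. The hash values below are arbitrary examples:

public class IndexCalcDemo {
    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        int[] sampleHashes = {5, 21, 37, 100, 1_000_003}; // arbitrary example hash values

        for (int h : sampleHashes) {
            int idx = h & (oldCap - 1);      // same as h % oldCap when oldCap is a power of 2
            int newIdx = (h & oldCap) == 0   // this bit decides the slot after doubling
                    ? idx                    // stays at the old index
                    : idx + oldCap;          // moves to old index + oldCap
            System.out.printf("hash=%d old=%d new=%d check=%d%n",
                    h, idx, newIdx, h & (newCap - 1)); // check always equals newIdx
        }
    }
}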

Notice

  • The secondary hash exists to complement the power-of-2 capacity design; if the hash table's capacity is not a power of 2 (which by itself disperses hashes well), a secondary hash is unnecessary
  • A power-of-2 capacity makes index calculation more efficient but gives worse hash dispersion, so the secondary hash is needed as compensation; a typical example that does not follow this design is Hashtable

4) put and expansion

Question 1: Describe the put process. What is the difference between 1.7 and 1.8?
Question 2: Why does the load factor default to 0.75f?

Put process:

  1. HashMap creates its array lazily; it is only created on first use
  2. Compute the index (bucket subscript)
  3. If the bucket is not yet occupied, create a Node there and return
  4. If the bucket is already occupied
    1. If it is already a TreeNode, use the red-black tree's add or update logic
    2. If it is an ordinary Node, use the linked list's add or update logic; if the list length exceeds the treeification threshold, switch to the treeification logic
  5. Before returning, check whether the size exceeds the threshold, and expand once it does

The difference between 1.7 and 1.8

  1. When inserting a node into a linked list, 1.7 uses head insertion while 1.8 uses tail insertion

  2. 1.7 expands when the size is greater than or equal to the threshold and the target bucket is not empty, while 1.8 expands when the size is greater than the threshold

  3. 1.8 optimizes the calculation of node indexes during expansion

Why does the expansion (loading) factor default to 0.75f?

  1. It is a good trade-off between space usage and query time
  2. A larger value saves space, but linked lists become longer and performance suffers
  3. A smaller value reduces collisions, but expansion happens more often and more space is used

5) Concurrency issues

Question: What problems will occur under multi-threading?
① Dead (cyclic) links during expansion (1.7)
② Data corruption (1.7, 1.8)

① Dead links during expansion (exists in 1.7)

The 1.7 source code is as follows:

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
  • e and next are local variables used to point to the current node and the next node

  • The temporary variables e and next of thread 1 (green) have just referenced these two nodes. Before the node can be moved, a thread switch occurs, and thread 2 (blue) completes the expansion and migration.

  • The expansion of thread 2 is completed. Due to the head insertion method, the order of the linked list is reversed. However, the temporary variables e and next of thread 1 still refer to these two nodes, and the migration needs to be done again.

  • First iteration

    • This iteration runs before the thread switch; note that e points to node a and next points to node b
    • Node a is inserted at the head of the new bucket (the diagram draws node a twice, but there is only one; two copies were drawn to keep the arrows readable)
    • At the end of the iteration, e moves to next, i.e. node b


  • Second iteration
    • next points to node a
    • Node b is inserted at the head
    • At the end of the iteration, e moves to next, i.e. node a


  • Third iteration
    • next points to null
    • Node a is inserted at the head; a.next now points to b (a.next had been null until now) and b.next points to a, forming a cycle (dead link)
    • At the end of the iteration, e moves to next, which is null, so the fourth iteration exits the loop normally


② Data corruption (exists in both 1.7 and 1.8)

  • See the code in day01.map.HashMapMissData; refer to the video for the detailed debugging steps

Supplementary code description

  • day01.map.HashMapDistribution demonstrates that the length of the linked list in the map conforms to the Poisson distribution
  • day01.map.DistributionAffectedByCapacity demonstrates the impact of capacity and hashCode values on the distribution
    • day01.map.DistributionAffectedByCapacity#hashtableGrowRule demonstrates the expansion rule of Hashtable
    • day01.sort.Utils#randomArray If the hashCode is random enough, whether the capacity is the nth power of 2 has little impact.
    • day01.sort.Utils#lowSameArray If the hashCode has the same low bits and the capacity is 2 to the nth power, it will lead to uneven distribution.
    • day01.sort.Utils#evenArray If the hashCode values are all even and the capacity is a power of 2, the distribution will be uneven
    • It follows that for a design with a capacity of 2 raised to the nth power, the secondary hash is very important.
  • day01.map.HashMapVsHashtable demonstrates the difference in distribution between putting the same number of word strings into HashMap and Hashtable

6) Key design

Question 1: Can the key be null? What are the requirements for an object used as a key?
Question 2: How is the hashCode() of the String object designed? Why is it multiplied by 31 each time?

①Key design requirements

  1. The key of a HashMap can be null, but some other Map implementations (such as Hashtable) do not allow it (they throw a NullPointerException)

  2. An object used as a key must implement hashCode and equals, and the key's content must not be modified (it should be immutable)

    hashCode gives keys a better distribution in the HashMap and improves query performance;
    when two keys compute to the same index, equals is then used to determine whether they are the same key;

  3. The key's hashCode should have good hashing properties
    If the key is mutable, for example if its age field is modified after insertion, the entry can no longer be found:

public class HashMapMutableKey {
    public static void main(String[] args) {
        HashMap<Student, Object> map = new HashMap<>();
        Student stu = new Student("张三", 18);
        map.put(stu, new Object());

        System.out.println(map.get(stu));

        stu.age = 19;
        System.out.println(map.get(stu));
    }

    static class Student {
        String name;
        int age;

        public Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public int getAge() {
            return age;
        }

        public void setAge(int age) {
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Student student = (Student) o;
            return age == student.age && Objects.equals(name, student.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }
}

② Design of String's hashCode()

  • The goal is a more uniform hash distribution, so that each string's hashCode is sufficiently distinct
  • Each character of the string can be treated as a number S_i, where i ranges from 0 to n - 1
  • The hash formula is: S_0·31^(n-1) + S_1·31^(n-2) + … + S_i·31^(n-1-i) + … + S_(n-1)·31^0
  • Using 31 in the formula gives good hashing properties, and 31 * h can be optimized for better performance (a quick check follows below):
    • as 32 * h - h
    • i.e. 2⁵ * h - h
    • i.e. (h << 5) - h
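A small sketch that recomputes the polynomial with Horner's rule (h = 31 * h + s[i]) and compares it against String.hashCode(); the sample strings are arbitrary:

public class StringHashDemo {
    // Recomputes S_0*31^(n-1) + S_1*31^(n-2) + ... + S_(n-1)*31^0 using Horner's rule.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
            // equivalent form: h = (h << 5) - h + s.charAt(i);  since 31 * h = 32 * h - h
        }
        return h;
    }

    public static void main(String[] args) {
        for (String s : new String[]{"abc", "hashCode", "HashMap"}) { // arbitrary samples
            System.out.println(s + ": " + polyHash(s) + " == " + s.hashCode());
        }
    }
}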

11. Singleton pattern

Requirements

  • Master the five ways to implement a singleton
  • Understand why the static variable must be volatile in the DCL (double-checked locking) implementation
  • Know where the singleton pattern appears in the JDK

Eager (hungry-style) singleton

The eager style has no thread-safety issue, because the object is created during static (class) initialization

public class Singleton1 implements Serializable {
    // 1. private constructor
    private Singleton1() {
        if (INSTANCE != null) {
            throw new RuntimeException("The singleton instance must not be created twice");
        }
        System.out.println("private Singleton1()");
    }

    // 2. a static member variable of the singleton type, holding the only instance created with the private constructor
    private static final Singleton1 INSTANCE = new Singleton1();

    // static variables are usually private and cannot be accessed directly;
    // 3. a public static method whose implementation simply returns the static member variable above
    public static Singleton1 getInstance() {
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }

    public Object readResolve() {
        return INSTANCE;
    }
}
  • The constructor throws an exception to prevent reflection from destroying the singleton
  • readResolve() prevents deserialization from breaking the singleton
  • It cannot prevent Unsafe from breaking the singleton (Unsafe can allocate an instance without calling the constructor)

Eager enum singleton

The eager enum style has no thread-safety issue, because the instance is created during the enum class's static initialization

public enum Singleton2 {
    INSTANCE; // a single enum constant, so there is only one instance

    // everything below is optional
    private Singleton2() {
        System.out.println("private Singleton2()");
    }

    @Override
    public String toString() {
        // printing the hash code as well makes it easy to see whether two references are the same object
        return getClass().getName() + "@" + Integer.toHexString(hashCode());
    }

    public static Singleton2 getInstance() {
        // static public method that returns the singleton
        return INSTANCE;
    }

    public static void otherMethod() {
        // used to test whether the style is eager or lazy
        System.out.println("otherMethod()");
    }
}
  • The enum style naturally prevents reflection and deserialization from breaking the singleton, but it still cannot prevent Unsafe

Lazy singleton

The lazy style must consider thread safety, which is solved by locking:

public class Singleton3 implements Serializable {
    private Singleton3() {
        System.out.println("private Singleton3()");
    }

    private static Singleton3 INSTANCE = null;

    // synchronized on a static method locks on Singleton3.class, i.e. the whole class
    public static synchronized Singleton3 getInstance() {
        // thread safety must be considered; synchronized on the static method handles it
        if (INSTANCE == null) {
            INSTANCE = new Singleton3();
        }
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }
}
  • In fact, synchronization (acquiring the lock) is only needed the first time the singleton is created, but this code synchronizes on every call, which costs performance
  • Hence the following double-checked locking improvement

Double-checked locking (DCL) lazy singleton

Double-checked locking uses two if checks:

public class Singleton4 implements Serializable {
    private Singleton4() {
        System.out.println("private Singleton4()");
    }

    private static volatile Singleton4 INSTANCE = null; // volatile for visibility and ordering

    public static Singleton4 getInstance() {
        // double-checked locking: two if checks
        if (INSTANCE == null) {
            synchronized (Singleton4.class) {
                if (INSTANCE == null) {
                    INSTANCE = new Singleton4();
                }
            }
        }
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }
}

Why it is necessary to add volatile:

  • INSTANCE = new Singleton4() is not atomic; it consists of three steps: create the object, call the constructor, and assign the reference to the static variable. The last two steps may be reordered by instruction-reordering optimizations, becoming assign first, then call the constructor
  • If thread 1 has performed the assignment first, and thread 2, executing the first INSTANCE == null check, finds that INSTANCE is no longer null, it will return an object that has not been fully constructed

Static inner class (holder) lazy singleton [recommended]

public class Singleton5 implements Serializable {
    private Singleton5() {
        System.out.println("private Singleton5()");
    }

    private static class Holder {
        static Singleton5 INSTANCE = new Singleton5(); // created during the inner class's static initialization
    }

    public static Singleton5 getInstance() {
        // accessing the inner class's field triggers its loading, linking and initialization,
        // and the Singleton5 instance is created during that initialization
        return Holder.INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }
}
  • Avoids the shortcomings of double check locking

Singletons in the JDK

Rather than just saying "the singleton pattern is used in our project", point to concrete singletons in the JDK:

  • Runtime is an eager singleton (a quick check follows below)
  • Console is a double-checked-locking lazy singleton
  • The inner class EmptyNavigableSet in Collections is an inner-class lazy singleton
  • ReverseComparator.REVERSE_ORDER is an inner-class lazy singleton
  • Comparators.NaturalOrderComparator.INSTANCE is an enum singleton
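A trivial check confirming the Runtime case:

public class RuntimeSingletonDemo {
    public static void main(String[] args) {
        // getRuntime() always returns the same eagerly created instance
        System.out.println(Runtime.getRuntime() == Runtime.getRuntime()); // true
    }
}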


Source: blog.csdn.net/weixin_52223770/article/details/128712348