Dark Horse Programmer (Bilibili) Java Interview Question Study Notes

Video URL: https://www.yuque.com/linxun-bpyj0/linxun/vy91es9lyg7kbfnr

Outline

Basic

Key points: algorithms, data structures, basic design patterns

1. Binary search

Requirements

  • Be able to describe the binary search algorithm in your own language
  • Be able to hand-write binary search code
  • Be able to answer common variations of the question

Algorithm Description

  1. Premise: there is a sorted array A (sorting is assumed to have been done already)
  2. Define the left boundary L and the right boundary R to determine the search range, then binary search in a loop (steps 3 and 4)
  3. Get the middle index M = Floor((L+R) / 2)
  4. Compare the value at the middle index, A[M], with the target value T
    ① A[M] == T means the target is found; return the middle index
    ② A[M] > T: every element to the right of the middle value is also greater than T and need not be compared; search the left side by setting M - 1 as the new right boundary and repeat
    ③ A[M] < T: every element to the left of the middle value is also less than T and need not be compared; search the right side by setting M + 1 as the new left boundary and repeat
  5. When L > R, the target was not found and the loop ends

For a more vivid description, please refer to: binary_search.html

Algorithm implementation

public static int binarySearch(int[] a, int t) {
    int l = 0, r = a.length - 1, m;
    while (l <= r) {
        m = (l + r) / 2;
        if (a[m] == t) {
            return m;
        } else if (a[m] > t) {
            r = m - 1;
        } else {
            l = m + 1;
        }
    }
    return -1;
}

Test code

public static void main(String[] args) {
    int[] array = {1, 5, 8, 11, 19, 22, 31, 35, 40, 45, 48, 49, 50};
    int target = 47;
    int idx = binarySearch(array, target);
    System.out.println(idx);
}

Solve the integer overflow problem

When both l and r are large, l + r may exceed the int range and produce a wrong result. There are two solutions:

int m = l + (r - l) / 2;

Another is:

int m = (l + r) >>> 1;

Question variations

  1. Given the sorted list 1, 5, 8, 11, 19, 22, 31, 35, 40, 45, 48, 49, 50, how many comparisons are needed for a binary search to successfully find the value 48?
  2. When using binary search to find the element 81 in the sequence 1, 4, 6, 7, 15, 33, 39, 50, 64, 78, 75, 81, 89, 96, how many comparisons are needed?
  3. When binary searching an array of 128 elements, what is the maximum number of comparisons required?

For the first two questions, remember a quick rule: with an odd number of elements, binary search picks the exact middle; with an even number, it picks the left one of the two middle elements. For the last question you need the formula:

n = ⌊log₂N⌋ + 1, where n is the maximum number of comparisons and N is the number of elements
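
A small sketch of how one might verify the first two answers by counting comparisons (one comparison per loop round, matching the implementation above; the helper name countComparisons is made up):

public static int countComparisons(int[] a, int t) {
    int l = 0, r = a.length - 1, count = 0;
    while (l <= r) {
        int m = (l + r) >>> 1;
        count++;               // this round compares one array element with t
        if (a[m] == t) {
            return count;      // found: count is the number of comparisons used
        } else if (a[m] > t) {
            r = m - 1;
        } else {
            l = m + 1;
        }
    }
    return count;              // not found: comparisons made before giving up
}

Note that some exam answers count == and > as separate comparisons, so state your counting rule.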

2. Bubble sort

Requirements

  • Be able to describe the bubble sort algorithm in your own language
  • Ability to write bubble sort code by hand
  • Understand some optimization methods of bubble sorting

Algorithm Description

  1. Compare adjacent elements of the array in turn; if a[j] > a[j+1], swap the two. Comparing every adjacent pair once is called one round of bubbling, and its result is that the largest element ends up at the end
  2. Repeat the above steps until the entire array is sorted

For a more vivid description, please refer to: bubble_sort.html

Algorithm implementation

public static void bubble(int[] a) {
    for (int j = 0; j < a.length - 1; j++) {
        // one round of bubbling
        boolean swapped = false; // whether a swap happened in this round
        for (int i = 0; i < a.length - 1 - j; i++) {
            System.out.println("comparison " + i);
            if (a[i] > a[i + 1]) {
                Utils.swap(a, i, i + 1);
                swapped = true;
            }
        }
        System.out.println("after round " + j + ": "
                           + Arrays.toString(a));
        if (!swapped) {
            break;
        }
    }
}
  • Optimization 1: after each round of bubbling, the inner loop can be shortened by one
  • Optimization 2: if no swap happens in a round, all the data is already sorted and the outer loop can end

Advanced optimization

public static void bubble_v2(int[] a) {
    int n = a.length - 1;
    while (true) {
        int last = 0; // index of the last swap in this round
        for (int i = 0; i < n; i++) {
            System.out.println("comparison " + i);
            if (a[i] > a[i + 1]) {
                Utils.swap(a, i, i + 1);
                last = i;
            }
        }
        n = last;
        System.out.println("after this round: "
                           + Arrays.toString(a));
        if (n == 0) {
            break;
        }
    }
}
  • In each round of bubbling, the index of the last swap can serve as the comparison limit for the next round; if this value is zero, the whole array is sorted and the outer loop can exit

3. Selection sort

Requirements

  • Be able to describe the selection sort algorithm in your own language
  • Ability to compare selection sort with bubble sort
  • Understanding Unstable Sorting and Stable Sorting

Algorithm Description

  1. Divide the array into two subsets, sorted and unsorted; each round selects the smallest element from the unsorted subset and appends it to the sorted subset
  2. Repeat the above steps until the entire array is sorted

For a more vivid description, please refer to: selection_sort.html

Algorithm implementation

public static void selection(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        // i is the target index where this round's minimum will be placed
        int s = i; // index of the smallest element found so far
        for (int j = s + 1; j < a.length; j++) {
            if (a[s] > a[j]) { // element j is smaller than element s, so update s
                s = j;
            }
        }
        if (s != i) {
            swap(a, s, i);
        }
        System.out.println(Arrays.toString(a));
    }
}
  • Optimization: to reduce the number of swaps, first find the index of the minimum in each round and swap only once at the end of the round

Compare with Bubble Sort

  1. The average time complexity of both is O(n²)
  2. Selection sort is generally faster than bubble sort because it performs fewer swaps
  3. But if the collection is already highly ordered, bubble sort beats selection sort
  4. Bubble sort is a stable sorting algorithm, while selection sort is unstable
    • Stable sorting means that when objects are sorted repeatedly by different fields, elements with equal values keep their relative order
    • Unstable sorting does not guarantee this

Stable sort and unstable sort

System.out.println("=================不稳定================");
Card[] cards = getStaticCards();
System.out.println(Arrays.toString(cards));
selection(cards, Comparator.comparingInt((Card a) -> a.sharpOrder).reversed());
System.out.println(Arrays.toString(cards));
selection(cards, Comparator.comparingInt((Card a) -> a.numberOrder).reversed());
System.out.println(Arrays.toString(cards));

System.out.println("=================稳定=================");
cards = getStaticCards();
System.out.println(Arrays.toString(cards));
bubble(cards, Comparator.comparingInt((Card a) -> a.sharpOrder).reversed());
System.out.println(Arrays.toString(cards));
bubble(cards, Comparator.comparingInt((Card a) -> a.numberOrder).reversed());
System.out.println(Arrays.toString(cards));

Both runs sort first by suit (♠♥♣♦) and then by rank (A K Q J …)

  • When the unstable algorithm re-sorts by rank, it disrupts the original suit order of cards with equal rank
[[♠7], [♠2], [♠4], [♠5], [♥2], [♥5]]
[[♠7], [♠5], [♥5], [♠4], [♥2], [♠2]]

Originally ♠2 came before ♥2; after re-sorting by rank, their relative positions have changed

  • When the stable algorithm re-sorts by rank, it keeps the original suit order for equal ranks; as shown below, the relative positions of ♠2 and ♥2 remain unchanged
[[♠7], [♠2], [♠4], [♠5], [♥2], [♥5]]
[[♠7], [♠5], [♥5], [♠4], [♠2], [♥2]]

4. Insertion sort

Requirements

  • Be able to describe the insertion sort algorithm in your own language
  • Ability to compare insertion sort with selection sort

Algorithm Description

  1. Divide the array into two regions, sorted and unsorted; each round takes the first element from the unsorted region and inserts it into the sorted region (keeping that region ordered)
  2. Repeat the above steps until the entire array is sorted

For a more vivid description, please refer to: insertion_sort.html

Algorithm implementation

// code adjusted to be consistent with the Shell sort implementation
public static void insert(int[] a) {
    // i is the index of the element to be inserted
    for (int i = 1; i < a.length; i++) {
        int t = a[i]; // the value to be inserted
        int j = i;
        System.out.println(j);
        while (j >= 1) {
            if (t < a[j - 1]) { // a[j-1] is the previous element; if it is > t, shift it right
                a[j] = a[j - 1];
                j--;
            } else { // a[j-1] <= t, so j is the insertion position
                break;
            }
        }
        a[j] = t;
        System.out.println(Arrays.toString(a) + " " + j);
    }
}

Compare with selection sort

  1. The average time complexity of both is O(n²)
  2. Insertion sort is slightly better than selection sort in most cases
  3. For an already sorted collection, insertion sort's time complexity is O(n)
  4. Insertion sort is a stable sorting algorithm, while selection sort is unstable

Hint

Insertion sort is often underestimated, but it is important: for sorting small amounts of data, insertion sort is usually the first choice

5. Shell sort

Requirements

  • Be able to describe the Shell sort algorithm in your own language

Algorithm Description

  1. First select a gap sequence, such as (n/2, n/4 ... 1), where n is the length of the array
  2. In each round, elements separated by the current gap are treated as one group and insertion sort is performed within each group. This serves two purposes:
    ① insertion sort on a small number of elements is very fast
    ② elements with larger values move toward the back of the array more quickly
  3. When the gap shrinks to 1, sorting is completed

For a more vivid description, please refer to: shell_sort.html

Algorithm implementation

private static void shell(int[] a) {
    int n = a.length;
    for (int gap = n / 2; gap > 0; gap /= 2) {
        // i is the index of the element to be inserted
        for (int i = gap; i < n; i++) {
            int t = a[i]; // the value to be inserted
            int j = i;
            while (j >= gap) {
                // insertion sort against the previous element at distance gap
                if (t < a[j - gap]) { // a[j-gap] is the previous element; if it is > t, shift it right
                    a[j] = a[j - gap];
                    j -= gap;
                } else { // a[j-gap] <= t, so j is the insertion position
                    break;
                }
            }
            a[j] = t;
            System.out.println(Arrays.toString(a) + " gap:" + gap);
        }
    }
}


6. Quick Sort

Requirements

  • Be able to describe the quicksort algorithm in your own language
  • Be able to hand-write at least one of the single-sided loop and double-sided loop versions
  • Be able to explain the characteristics of quicksort
  • Understand the performance comparison between the Lomuto and Hoare partition schemes

Algorithm Description

  1. Each round selects a pivot element and partitions the array around it
    1. Elements smaller than the pivot go into one partition, elements larger than the pivot go into the other
    2. When the partition is complete, the pivot element is in its final position
  2. Repeat the process within each sub-partition until a sub-partition has at most one element; this reflects the idea of divide and conquer
  3. As the description shows, the key is the partition algorithm; common schemes are the Lomuto partition scheme, the double-sided loop scheme, and the Hoare partition scheme

For a more vivid description, please refer to: quick_sort.html

Single-sided loop quicksort (Lomuto partition scheme)

  1. Select the rightmost element as the pivot
  2. The j pointer looks for elements smaller than the pivot; each time one is found it is swapped with the element at i
  3. The i pointer maintains the boundary of elements smaller than the pivot and is also the target index of each swap
  4. Finally, the pivot is swapped with the element at i, and i is the partition position
public static void quick(int[] a, int l, int h) {
    if (l >= h) {
        return;
    }
    int p = partition(a, l, h); // p is the index of the pivot after partitioning
    quick(a, l, p - 1); // determine the range of the left partition
    quick(a, p + 1, h); // determine the range of the right partition
}

private static int partition(int[] a, int l, int h) {
    int pv = a[h]; // the pivot element
    int i = l;
    for (int j = l; j < h; j++) {
        if (a[j] < pv) {
            if (i != j) {
                swap(a, i, j);
            }
            i++;
        }
    }
    if (i != h) {
        swap(a, h, i);
    }
    System.out.println(Arrays.toString(a) + " i=" + i);
    // the return value is the correct index of the pivot; it determines the boundaries of the next round
    return i;
}

Double-sided loop quicksort (not exactly equivalent to the Hoare partition scheme)

  1. Select the leftmost element as the pivot
  2. The j pointer scans from right to left for an element smaller than the pivot, and the i pointer scans from left to right for an element larger than the pivot; once both are found the two are swapped, until i and j meet
  3. Finally the pivot is swapped with the element at i (at this point i and j are equal), and i is the partition position

Key points

  1. The pivot is on the left, so within the loop j must move before i
  2. while (i < j && a[j] > pv) j--
  3. while (i < j && a[i] <= pv) i++
private static void quick(int[] a, int l, int h) {
    if (l >= h) {
        return;
    }
    int p = partition(a, l, h);
    quick(a, l, p - 1);
    quick(a, p + 1, h);
}

private static int partition(int[] a, int l, int h) {
    int pv = a[l];
    int i = l;
    int j = h;
    while (i < j) {
        // j scans from the right for an element smaller than the pivot
        while (i < j && a[j] > pv) {
            j--;
        }
        // i scans from the left for an element larger than the pivot
        while (i < j && a[i] <= pv) {
            i++;
        }
        swap(a, i, j);
    }
    swap(a, l, j);
    System.out.println(Arrays.toString(a) + " j=" + j);
    return j;
}

Quicksort characteristics

  1. The average time complexity is O(nlogn), and the worst-case time complexity is O(n²)
  2. When the amount of data is large, its advantage is very obvious
  3. It is an unstable sort

Lomuto partition scheme vs Hoare partition scheme

Supplementary Code Description

  • day01.sort.QuickSort3 demonstrates a hole-based improvement of double-sided quicksort with fewer comparisons
  • day01.sort.QuickSortHoare demonstrates an implementation of Hoare partitioning
  • day01.sort.LomutoVsHoare compares the number of element moves of the four partition implementations

7. ArrayList

Requirements

  • Master ArrayList expansion rules

Expansion rules

  1. ArrayList() will use a zero-length array
  2. ArrayList(int initialCapacity) will use an array with the specified capacity
  3. public ArrayList(Collection<? extends E> c) will use the size of c as the array capacity
  4. add(Object o) expands the capacity to 10 for the first time, and expands the capacity again to 1.5 times the previous capacity
  5. addAll(Collection c) expands to Math.max(10, the actual number of elements) when there are no elements, and Math.max(1.5 times the original capacity, the actual number of elements) when there are elements

Among them, the fourth point must be known, and the other points depend on individual circumstances.

Hint

  • See the test code day01.list.TestArrayList; it is not listed here
  • Note that the example uses reflection to show ArrayList's expansion behaviour more directly, but since JDK 9 modularization places more restrictions on reflection, the VM parameter --add-opens java.base/java.util=ALL-UNNAMED must be added when running the test code; the following examples have the same requirement

code description

  • day01.list.TestArrayList#arrayListGrowRule demonstrates the expansion rules of the add(Object) method, and the input parameter n represents how many times to print the expanded array length
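
As a rough illustration of rule 4 (a minimal sketch, not the day01 test code; the class name GrowRuleDemo is made up), the backing array length can be observed growing 10 → 15 → 22 → ... via reflection, given the --add-opens parameter above:

import java.lang.reflect.Field;
import java.util.ArrayList;

public class GrowRuleDemo {
    public static void main(String[] args) throws Exception {
        ArrayList<Integer> list = new ArrayList<>();
        Field f = ArrayList.class.getDeclaredField("elementData"); // internal backing array
        f.setAccessible(true);
        int lastCapacity = -1;
        for (int i = 0; i < 40; i++) {
            list.add(i);
            int capacity = ((Object[]) f.get(list)).length;
            if (capacity != lastCapacity) { // print only when the capacity changes
                System.out.println("size=" + list.size() + " capacity=" + capacity);
                lastCapacity = capacity;
            }
        }
    }
}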

8. Iterator

Requirements

  • Master what is Fail-Fast and what is Fail-Safe

Fail-Fast and Fail-Safe

  • ArrayList is a typical fail-fast collection: it must not be structurally modified while being traversed, and it fails as early as possible (by throwing ConcurrentModificationException)
  • CopyOnWriteArrayList is a typical fail-safe collection: it can be modified while being traversed; the principle is read-write separation (writes go to a copy)

Hint

  • See the test code day01.list.FailFastVsFailSafe, it will not be listed here
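
A rough illustration (a minimal sketch, not the day01.list.FailFastVsFailSafe code; the class name is made up):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class FailFastVsFailSafeDemo {
    public static void main(String[] args) {
        List<Integer> failFast = new ArrayList<>(List.of(1, 2, 3));
        try {
            for (Integer i : failFast) {
                failFast.add(4); // structural modification during traversal
            }
        } catch (Exception e) {
            System.out.println("fail-fast: " + e); // ConcurrentModificationException
        }

        List<Integer> failSafe = new CopyOnWriteArrayList<>(List.of(1, 2, 3));
        for (Integer i : failSafe) {
            failSafe.add(4); // allowed: the iterator keeps reading the old copy of the array
        }
        System.out.println("fail-safe size: " + failSafe.size()); // 6
    }
}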

9. LinkedList

Requirements

  • Be able to clearly explain the differences between LinkedList and ArrayList, and be ready to correct some common misconceptions

LinkedList

  1. Based on a doubly linked list; no contiguous memory required
  2. Random access is slow (requires traversing the list)
  3. Insertion and deletion at the head or tail are fast
  4. Occupies more memory

ArrayList

  1. Array-based; requires contiguous memory
  2. Random access is fast (access by index)
  3. Insertion and deletion at the tail perform well, but inserting or deleting elsewhere has to shift data, so performance is lower
  4. Benefits from the CPU cache (principle of locality)

code description

  • day01.list.ArrayListVsLinkedList#randomAccess compares random access performance
  • day01.list.ArrayListVsLinkedList#addMiddle compares the performance of inserting into the middle
  • day01.list.ArrayListVsLinkedList#addFirst compares head insertion performance
  • day01.list.ArrayListVsLinkedList#addLast compares tail insertion performance
  • day01.list.ArrayListVsLinkedList#linkedListSize prints the memory occupied by a LinkedList
  • day01.list.ArrayListVsLinkedList#arrayListSize prints the memory occupied by an ArrayList

10. HashMap

Requirements

  • Master the basic data structure of HashMap
  • Master treeification (conversion of linked lists to red-black trees)
  • Understand the index calculation method, the purpose of the secondary hash, and the impact of capacity on index calculation
  • Master the put process, expansion, and the load factor
  • Understand the problems that concurrent use of HashMap may cause
  • Understand key design requirements

1) Basic data structure

  • 1.7 Array + linked list
  • 1.8 Array+ (linked list | red-black tree)

For a more vivid demonstration, see hash-demo.jar in the course materials. It requires JDK 14 or higher. Enter the jar package directory and execute the following command

java -jar --add-exports java.base/jdk.internal.misc=ALL-UNNAMED hash-demo.jar

2) Treeification and degeneration

Purpose of treeification

  • The red-black tree is used to defend against DoS attacks and to prevent performance degradation when a linked list gets too long. Treeification should be an exceptional case; it is a last-resort strategy
  • Hash table lookup and update have O(1) time complexity, while red-black tree lookup and update are O(log n), and a TreeNode takes more space than an ordinary Node. So the linked list is preferred whenever possible
  • If hash values are sufficiently random, chain lengths in the hash table follow a Poisson distribution. With a load factor of 0.75, the probability of a chain longer than 8 is 0.00000006. The treeification threshold of 8 is chosen so that treeification is sufficiently rare

Treeification rules

  • When the length of a linked list exceeds the treeification threshold of 8, expansion is tried first to shorten the list; only if the array capacity is already >= 64 is the list converted to a red-black tree

Degeneration rules

  • Case 1: when a tree is split during expansion, if the number of tree elements is <= 6 it degenerates back into a linked list
  • Case 2: when removing a tree node, if one of root, root.left, root.right, root.left.left is null, the tree degenerates into a linked list

3) Index calculation

Index Calculation Method

  • First, the object's hashCode() is computed
  • Then HashMap's hash() method performs a secondary hash
    • The secondary hash mixes in the high bits so the hash is distributed more uniformly
  • Finally, index = hash & (capacity - 1), as sketched below

Why the array capacity is a power of 2

  1. Index calculation is more efficient: if the capacity is 2ⁿ, the modulo can be replaced with a bitwise AND
  2. Recomputing indices during expansion is more efficient: an element with hash & oldCap == 0 stays at its original index, otherwise its new index = old index + oldCap

Notice

  • The secondary hash is there to support the design premise that the capacity is a power of 2; if the table capacity were not a power of 2, no secondary hash would be needed
  • Making the capacity a power of 2 makes index computation more efficient, but it weakens the dispersion of the hash, so a secondary hash is needed as compensation. A typical design that does not follow this approach is Hashtable

4) put and expansion

put process

  1. HashMap creates the array lazily; the array is created only when it is first used
  2. Calculate the index (bucket index)
  3. If the bucket index is not occupied yet, create a Node there and return
  4. If the bucket index is already occupied
    1. If it is already a TreeNode, follow the red-black tree add or update logic
    2. If it is an ordinary Node, follow the linked-list add or update logic; if the list length exceeds the treeification threshold, follow the treeification logic
  5. Before returning, check whether the size exceeds the threshold; if it does, expand the table

Differences between 1.7 and 1.8

  1. When inserting a node into a linked list, 1.7 uses head insertion while 1.8 uses tail insertion
  2. 1.7 expands when the size is >= threshold and the target bucket is not empty; 1.8 expands as soon as the size exceeds the threshold
  3. 1.8 optimizes the recalculation of Node indices during expansion

Why the (load) factor defaults to 0.75f

  1. It is a good trade-off between space usage and query time
  2. A larger value saves space but makes linked lists longer, hurting performance
  3. A smaller value reduces collisions but makes expansion more frequent and uses more space

5) Concurrency issues

Dead (circular) link during expansion (exists in 1.7)

The 1.7 source code is as follows:

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while(null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
  • e and next are local variables pointing to the current node and the next node
  • The temporary variables e and next of thread 1 (green in the figure) have just referenced these two nodes; before the nodes can be moved, a thread switch occurs and thread 2 (blue) completes the expansion and migration

  • Thread 2's expansion is complete. Because of head insertion, the order of the linked list is reversed. But thread 1's temporary variables e and next still reference these two nodes, so thread 1 will migrate them once more

  • First loop iteration
    • This iteration runs before the thread switch; note that at this point e points to node a and next points to node b
    • Node a is head-inserted into the new bucket. Note that the figure shows two copies of node a, but there is really only one (two are drawn so the arrows do not get tangled)
    • When the iteration ends, e points to next, i.e. node b

  • Second loop iteration
    • next points to node a
    • Node b is head-inserted
    • When the iteration ends, e points to next, i.e. node a


  • Third loop iteration
    • next points to null
    • Node a is head-inserted: a.next now points to b (previously a.next was null) and b.next points to a, so a circular (dead) link has formed
    • When the iteration ends, e points to next, i.e. null, so the fourth iteration exits normally

Data corruption (exists in both 1.7 and 1.8)

  • See the code in day01.map.HashMapMissData; refer to the video for the concrete debugging steps

Supplementary Code Description

  • day01.map.HashMapDistribution demonstrates that the length of the linked list in the map conforms to the Poisson distribution
  • day01.map.DistributionAffectedByCapacity demonstrates the influence of capacity and hashCode value on distribution
    • day01.map.DistributionAffectedByCapacity#hashtableGrowRule demonstrates the expansion rule of Hashtable
    • day01.sort.Utils#randomArray If the hashCode is random enough, whether the capacity is 2 to the power of n has little effect
    • day01.sort.Utils#lowSameArray If the hashCode has the same number of low bits, the capacity is 2 to the nth power, which will lead to uneven distribution
    • day01.sort.Utils#evenArray If there are many even numbers of hashCode and the capacity is 2 to the nth power, the distribution will be uneven
    • From this, it can be concluded that the second hash is very important for the design whose capacity is the nth power of 2
  • day01.map.HashMapVsHashtable demonstrates the difference in the distribution of HashMap and Hashtable for the same number of word strings

6) Key design

Key design requirements

  1. HashMap's key can be null, but not all other Map implementations allow it (for example, Hashtable and ConcurrentHashMap do not)
  2. A key object must implement hashCode and equals properly, and the key's content must not be modified afterwards (it should be immutable)
  3. The key's hashCode should disperse values well

If the key is mutable, for example if the age field is modified after insertion, the entry can no longer be found on a later lookup

public class HashMapMutableKey {
    public static void main(String[] args) {
        HashMap<Student, Object> map = new HashMap<>();
        Student stu = new Student("张三", 18);
        map.put(stu, new Object());

        System.out.println(map.get(stu));

        stu.age = 19;
        System.out.println(map.get(stu));
    }

    static class Student {
        String name;
        int age;

        public Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public int getAge() {
            return age;
        }

        public void setAge(int age) {
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Student student = (Student) o;
            return age == student.age && Objects.equals(name, student.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }
}

hashCode() design of the String object

  • The goal is a reasonably uniform hash, with the hashCode of each string being sufficiently distinct
  • Each character of the string can be expressed as a number s[i], where i ranges from 0 to n - 1
  • The hash formula is: s[0] ∗ 31^(n-1) + s[1] ∗ 31^(n-2) + … + s[n-1]
  • 31 gives the formula good hash properties, and 31 ∗ h can be optimized as (see the loop sketch below):
    • 32 ∗ h − h
    • i.e. 2⁵ ∗ h − h
    • i.e. (h << 5) − h
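
A small sketch of the loop form of the same formula (in the spirit of String.hashCode(); the method name is made up):

static int stringHash(String s) {
    int h = 0;
    for (int i = 0; i < s.length(); i++) {
        h = 31 * h + s.charAt(i); // equivalent to (h << 5) - h + s.charAt(i)
    }
    return h;
}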

11. Singleton pattern

Requirements

  • Master the five singleton implementations
  • Understand why the DCL implementation needs volatile on the static variable
  • Understand where singletons appear in the JDK

Eager (hungry-style) singleton

public class Singleton1 implements Serializable {
    private Singleton1() {
        if (INSTANCE != null) {
            throw new RuntimeException("The singleton must not be created more than once");
        }
        System.out.println("private Singleton1()");
    }

    private static final Singleton1 INSTANCE = new Singleton1();

    public static Singleton1 getInstance() {
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }

    public Object readResolve() {
        return INSTANCE;
    }
}
  • The constructor throws an exception to prevent reflection from breaking the singleton
  • readResolve() prevents deserialization from breaking the singleton

Enum (eager-style) singleton

public enum Singleton2 {
    INSTANCE;

    private Singleton2() {
        System.out.println("private Singleton2()");
    }

    @Override
    public String toString() {
        return getClass().getName() + "@" + Integer.toHexString(hashCode());
    }

    public static Singleton2 getInstance() {
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }
}
  • An enum naturally prevents both reflection and deserialization from breaking the singleton

Lazy singleton

public class Singleton3 implements Serializable {
    private Singleton3() {
        System.out.println("private Singleton3()");
    }

    private static Singleton3 INSTANCE = null;

    // Singleton3.class
    public static synchronized Singleton3 getInstance() {
        if (INSTANCE == null) {
            INSTANCE = new Singleton3();
        }
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }

}
  • In fact, synchronization is only needed the first time the singleton is created, but this code synchronizes on every call
  • Hence the double-checked locking improvement below

Double-checked locking (DCL) lazy singleton

public class Singleton4 implements Serializable {
    private Singleton4() {
        System.out.println("private Singleton4()");
    }

    private static volatile Singleton4 INSTANCE = null; // for visibility and ordering

    public static Singleton4 getInstance() {
        if (INSTANCE == null) {
            synchronized (Singleton4.class) {
                if (INSTANCE == null) {
                    INSTANCE = new Singleton4();
                }
            }
        }
        return INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }
}

Why volatile is required:

  • INSTANCE = new Singleton4() is not atomic; it consists of three steps: allocate the object, call the constructor, and assign the reference to the static variable. Instruction reordering may move the assignment before the constructor call
  • If thread 1 performs the assignment first and thread 2 then reaches the outer INSTANCE == null check, it sees a non-null INSTANCE and returns an object whose construction is not yet complete

Static inner class lazy singleton

public class Singleton5 implements Serializable {
    private Singleton5() {
        System.out.println("private Singleton5()");
    }

    private static class Holder {
        static Singleton5 INSTANCE = new Singleton5();
    }

    public static Singleton5 getInstance() {
        return Holder.INSTANCE;
    }

    public static void otherMethod() {
        System.out.println("otherMethod()");
    }
}
  • Avoids the drawbacks of double-checked locking: the JVM guarantees that the holder class is initialized lazily and thread-safely

Singletons in the JDK

  • Runtime is an eager (hungry-style) singleton
  • Console uses a double-checked locking lazy singleton
  • EmptyNavigableSet in Collections is an inner-class lazy singleton
  • ReverseComparator.REVERSE_ORDER is an inner-class lazy singleton
  • Comparators.NaturalOrderComparator.INSTANCE is an enum singleton

Concurrency

1. Thread state

Requirements

  • Master the six states of Java threads
  • Master Java thread state transitions
  • Be able to explain the difference between the five-state and six-state models

Six states and transitions

They are:

  • New
    • A thread object has been created but start() has not been called yet; the thread is in the new state
    • At this point it is not yet associated with an underlying operating-system thread
  • Runnable
    • After start() is called, the thread moves from new to runnable
    • It is now associated with an underlying thread and is scheduled for execution by the operating system
  • Terminated
    • The thread's code has finished executing; the thread moves from runnable to terminated
    • At this point the association with the underlying thread is released
  • Blocked
    • When lock acquisition fails, the thread moves from runnable into the Monitor's blocking queue (blocked) and consumes no CPU time
    • When the lock-holding thread releases the lock, blocked threads in the blocking queue are woken up according to certain rules and become runnable again
  • Waiting
    • When the lock is acquired successfully but a condition is not satisfied and wait() is called, the thread releases the lock, moves from runnable into the Monitor's waiting set, and consumes no CPU time
    • When another lock-holding thread calls notify() or notifyAll(), waiting threads in the waiting set are woken up according to certain rules and become runnable again
  • Timed waiting
    • When the lock is acquired successfully but a condition is not satisfied and wait(long) is called, the thread releases the lock, moves from runnable into the Monitor's waiting set for a timed wait, and consumes no CPU time
    • When another lock-holding thread calls notify() or notifyAll(), timed-waiting threads in the waiting set are woken up according to certain rules, become runnable, and compete for the lock again
    • If the wait times out, the thread also leaves the timed-waiting state, becomes runnable, and competes for the lock again
    • Calling sleep(long) also moves a thread from runnable to timed waiting, but this has nothing to do with the Monitor and needs no explicit wakeup; when the timeout expires the thread becomes runnable again

Other situations (just need to know)

  • interrupt() can interrupt waiting and timed-waiting threads and restore them to the runnable state
  • Methods such as park and unpark can also make threads wait and wake them up
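
A minimal sketch (the class name is made up) that prints three of the six states above via Thread.getState():

public class ThreadStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(); // the thread will show up as WAITING
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        System.out.println(t.getState()); // NEW: start() not called yet
        t.start();
        Thread.sleep(100);                // give t time to enter wait()
        System.out.println(t.getState()); // WAITING: parked in the Monitor's waiting set
        synchronized (lock) {
            lock.notify();                // wake t up; it re-acquires the lock and finishes
        }
        t.join();
        System.out.println(t.getState()); // TERMINATED
    }
}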

Five states

The five-state model comes from the operating-system level of description

  • Running: has been allocated CPU time and is actually executing the thread's code
  • Ready: eligible for CPU time, but not its turn yet
  • Blocked: not eligible for CPU time
    • Covers the blocked, waiting and timed-waiting states of the Java model
    • It also covers blocking I/O: when a thread makes a blocking I/O call, the actual work is done by the I/O device and the thread can do nothing but wait
  • New and terminated: similar to the Java states of the same names; no further explanation needed

2. Thread pool

Requirements

  • Master the 7 core parameters of the thread pool

Seven parameters

  1. corePoolSize core thread count - the maximum number of threads kept alive in the pool
  2. maximumPoolSize maximum thread count - core threads + emergency (non-core) threads
  3. keepAliveTime keep-alive time - how long an emergency thread survives; if no new task arrives within this time, the thread's resources are released
  4. unit time unit - the unit of the emergency thread's keep-alive time, e.g. seconds or milliseconds
  5. workQueue - when no core thread is idle, new tasks are queued here; when the queue is full, emergency threads are created to execute tasks
  6. threadFactory thread factory - customizes thread creation, e.g. thread name, whether it is a daemon thread, etc.
  7. handler rejection policy - triggered when all threads are busy and the workQueue is full (see the construction sketch after this list)
    1. Throw an exception: java.util.concurrent.ThreadPoolExecutor.AbortPolicy
    2. Let the caller execute the task: java.util.concurrent.ThreadPoolExecutor.CallerRunsPolicy
    3. Discard the task: java.util.concurrent.ThreadPoolExecutor.DiscardPolicy
    4. Discard the oldest queued task: java.util.concurrent.ThreadPoolExecutor.DiscardOldestPolicy
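
Putting the seven parameters together (a minimal sketch; all values and the class name PoolDemo are made up):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                   // corePoolSize
                4,                                   // maximumPoolSize = core + emergency threads
                60, TimeUnit.SECONDS,                // keepAliveTime + unit for emergency threads
                new ArrayBlockingQueue<>(10),        // workQueue
                r -> new Thread(r, "worker-" + counter.getAndIncrement()), // threadFactory
                new ThreadPoolExecutor.CallerRunsPolicy()                  // rejection policy
        );
        pool.execute(() -> System.out.println(Thread.currentThread().getName() + " running"));
        pool.shutdown();
    }
}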


code description

day02.TestThreadPoolExecutor demonstrates the core composition of the thread pool in a more vivid way

3. wait vs sleep

Requirements

  • Be able to explain the differences

One commonality, three differences

Common ground

  • wait(), wait(long) and sleep(long) all make the current thread temporarily give up the CPU and enter a blocked/waiting state

Differences

  • They belong to different classes
    • sleep(long) is a static method of Thread
    • wait() and wait(long) are instance methods of Object; every object has them
  • They wake up at different times
    • Threads executing sleep(long) or wait(long) wake up after the given number of milliseconds
    • wait(long) and wait() can also be woken up by notify; a wait() that is never notified waits forever
    • All of them can be woken up early by interruption
  • They differ in lock behaviour (key point, see the sketch below)
    • The wait method can only be called while holding the lock of the wait object; sleep has no such restriction
    • The wait method releases the object lock while waiting, allowing other threads to acquire it (I give up the CPU, but you can still use the lock)
    • sleep, if executed inside a synchronized block, does not release the object lock (I give up the CPU, and you still cannot use the lock)
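
A minimal sketch (hypothetical class name) of the lock difference: with wait(2000) the main thread gets the lock almost immediately; swap in sleep(2000) and it is blocked for about two seconds:

public class WaitVsSleep {
    static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(2000);          // releases the lock while waiting
                    // Thread.sleep(2000);    // swap in to see the main thread blocked instead
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        holder.start();
        Thread.sleep(100); // let holder acquire the lock first
        synchronized (lock) {
            // reached almost immediately when holder used wait(2000),
            // delayed by roughly 2 seconds if holder used sleep(2000)
            System.out.println("main got the lock");
        }
    }
}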

4. lock vs synchronized

Requirements

  • Master the difference between lock and synchronized
  • Understand ReentrantLock's fair and unfair locks
  • Understanding condition variables in ReentrantLock

Three levels

Differences

  • Syntax level
    • synchronized is a keyword; its implementation lives inside the JVM and is written in C++
    • Lock is an interface; its implementations are provided by the JDK and written in Java
    • With synchronized, the lock is released automatically when the synchronized block is exited; with Lock, you must call unlock() manually
  • Feature level
    • Both are pessimistic locks and provide basic mutual exclusion, synchronization, and reentrancy
    • Lock offers features synchronized lacks, such as querying the waiting state, fair locking, interruptible and timed acquisition, and multiple condition variables
    • Lock has implementations for different scenarios, such as ReentrantLock and ReentrantReadWriteLock
  • Performance level
    • Under no contention, synchronized has many optimizations (biased locking, lightweight locking) and its performance is decent
    • Under heavy contention, Lock implementations generally perform better

Fair lock

  • What "fair" means for a fair lock
    • Threads already in the blocking queue (ignoring timeouts) are always treated fairly, first in first out
    • A fair lock means that a thread not yet in the blocking queue that competes for the lock dutifully goes to the end of the queue whenever the queue is not empty
    • An unfair lock means that a thread not in the blocking queue competes directly with the thread woken from the head of the queue; whoever grabs the lock first wins
  • Fair locks reduce throughput and are generally not used

Condition variables

  • A condition variable in ReentrantLock plays the same role as wait/notify with synchronized: when a thread holds the lock but finds a condition unsatisfied, it waits temporarily in the condition's (linked-list) waiting queue
  • The difference from the synchronized waiting set is that a ReentrantLock can have multiple condition variables, allowing finer-grained waiting and wake-up control, as in the sketch below
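
A minimal sketch (hypothetical class and field names) with two condition variables on one lock:

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static final Condition notEmpty = lock.newCondition(); // consumers wait here
    private static final Condition notFull  = lock.newCondition(); // producers wait here
    private static int items = 0;

    static void put() throws InterruptedException {
        lock.lock();
        try {
            while (items >= 10) {
                notFull.await();   // release the lock and wait on this condition only
            }
            items++;
            notEmpty.signal();     // wake up one waiting consumer, not the producers
        } finally {
            lock.unlock();
        }
    }

    static void take() throws InterruptedException {
        lock.lock();
        try {
            while (items == 0) {
                notEmpty.await();
            }
            items--;
            notFull.signal();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        put();
        take();
        System.out.println("items = " + items); // 0
    }
}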

code description

  • day02.TestReentrantLock demonstrates the internal structure of ReentrantLock in a more vivid way

5. volatile

Requirements

  • Master the three aspects of thread safety
  • Master which problems volatile can and cannot solve

Atomicity

  • Cause: under multithreading, instructions from different threads interleave, corrupting reads and writes of shared variables
  • Solution: use a pessimistic or optimistic lock; volatile cannot solve atomicity problems

Visibility

  • Cause: compiler optimizations, cache optimizations, or CPU instruction reordering can make a modification of a shared variable invisible to other threads
  • Solution: marking the shared variable volatile prevents such compiler optimizations and makes one thread's modification of the shared variable visible to other threads

Ordering

  • Cause: compiler optimizations, cache optimizations, or CPU instruction reordering can make the actual execution order of instructions differ from the written order
  • Solution: marking the shared variable volatile adds barriers on reads and writes of the variable that stop other reads and writes from crossing them, thereby preventing reordering
  • Notice:
    • The barrier on a volatile write prevents other writes above it from being reordered below the volatile write
    • The barrier on a volatile read prevents other reads below it from being reordered above the volatile read
    • The barriers added by volatile reads and writes only prevent instruction reordering within the same thread
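
As an illustration of the visibility problem (a minimal sketch in the spirit of day02.threadsafe.ForeverLoop; the class name is made up and the behaviour depends on the JIT):

// Without volatile, the JIT may hoist the read of stop out of the loop and the worker may never end;
// declaring stop as volatile makes the main thread's write visible and the loop terminates
public class VisibilityDemo {
    static boolean stop = false; // try: static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy loop; no synchronization, so the update to stop may never be observed
            }
            System.out.println("worker stopped");
        });
        worker.start();
        Thread.sleep(1000);
        stop = true; // without volatile this write may never become visible to worker
        System.out.println("main set stop = true");
    }
}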

code description

  • day02.threadsafe.AddAndSubtract demonstrates atomicity
  • day02.threadsafe.ForeverLoop demonstrates visibility
    • Note: This example has been proven to be a visibility problem caused by compiler optimization
  • day02.threadsafe.Reordering demonstrates ordering
    • It needs to be packaged into a jar package and tested
  • Please also refer to the video explanation

6. Pessimistic locking vs optimistic locking

Requirements

  • Master the difference between pessimistic locking and optimistic locking

Comparing pessimistic locking and optimistic locking

  • The representatives of pessimistic locking are synchronized and Lock
    • The core idea is: a thread may operate on the shared variable only while it holds the lock; only one thread at a time succeeds in taking the lock, and threads that fail to take it must stop and wait
    • Going from running to blocked and from blocked back to awake involves thread context switches; if this happens frequently it hurts performance
    • In practice, when a thread acquires a synchronized or Lock lock and the lock is already held, it retries a few times first to reduce the chance of blocking
  • The representative of optimistic locking is AtomicInteger, which uses CAS to guarantee atomicity (see the sketch after this list)
    • The core idea is: no lock is needed; only one thread at a time succeeds in modifying the shared variable, and the threads that fail do not stop, they simply retry until they succeed
    • Because the threads keep running and never block, no thread context switch is involved
    • It requires multi-core CPU support, and the number of threads should not exceed the number of CPU cores
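
A minimal sketch (hypothetical class name) of the optimistic-lock idea: keep retrying CAS until it succeeds:

import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger balance = new AtomicInteger(100);
        while (true) {
            int old = balance.get();                // read the current value
            int next = old - 10;                    // compute the new value
            if (balance.compareAndSet(old, next)) { // succeeds only if nobody changed it meanwhile
                break;                              // success: no blocking, no lock
            }
            // failure: another thread modified the value first; loop and retry
        }
        System.out.println(balance.get()); // 90
    }
}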

code description

  • day02.SyncVsCas demonstrates the use of optimistic locks and pessimistic locks to solve atomic assignments
  • Please also refer to the video explanation

7. Hashtable vs ConcurrentHashMap

Requirements

  • Master the difference between Hashtable and ConcurrentHashMap
  • Master the implementation differences of ConcurrentHashMap in different versions

For a more vivid demonstration, see hash-demo.jar in the course materials. It requires JDK 14 or higher. Enter the jar package directory and execute the following command

java -jar --add-exports java.base/jdk.internal.misc=ALL-UNNAMED hash-demo.jar

Hashtable vs ConcurrentHashMap

  • Both Hashtable and ConcurrentHashMap are thread-safe Map collections
  • Hashtable has low concurrency, the entire Hashtable corresponds to a lock, and only one thread can operate it at the same time
  • ConcurrentHashMap has high concurrency, and the entire ConcurrentHashMap corresponds to multiple locks. As long as threads access different locks, there will be no conflicts

ConcurrentHashMap 1.7

  • Data structure: Segment (large array) + HashEntry (small array) + linked list; each Segment corresponds to one lock, so threads accessing different Segments do not conflict
  • Concurrency level: the size of the Segment array is the concurrency level, which determines how many threads can access the map concurrently. The Segment array cannot be expanded, so the concurrency level is fixed when the ConcurrentHashMap is created
  • Index calculation
    • Assuming the length of the large array is 2^m, the index of a key in the large array is the high m bits of the key's secondary hash value
    • Assuming the length of a small array is 2^n, the index of a key in the small array is the low n bits of the key's secondary hash value
  • Expansion: each small array expands independently; when a small array exceeds the load factor, it is expanded, doubling in size each time
  • Segment[0] prototype: when the other small arrays are first created, this prototype is used as the basis; the array length and load factor are taken from the prototype

ConcurrentHashMap 1.8

  • Data structure: Node array + linked list or red-black tree; each head node of the array acts as a lock, so threads accessing different head nodes do not conflict. If contention occurs when a head node is first created, CAS is used instead of synchronized, further improving performance
  • Concurrency level: the size of the Node array equals the concurrency level; unlike 1.7, the Node array can be expanded
  • Expansion condition: expansion is triggered when the Node array is 3/4 full
  • Expansion unit: migration proceeds linked list by linked list, from the back of the array to the front; after a list is migrated, the old array's head node is replaced with a ForwardingNode
  • Concurrent get during expansion
    • Whether to search the new array or the old one is decided by whether a ForwardingNode is encountered; get never blocks
    • If a linked list is longer than 1, nodes may need to be copied (new nodes created), because their next pointers may change after migration
    • The last few elements of a list whose indices do not change after expansion do not need to be copied
  • Concurrent put during expansion
    • If the put thread operates on the same linked list that the expansion thread is migrating, the put thread blocks
    • If the linked list the put thread operates on has not been migrated yet, i.e. its head node is not a ForwardingNode, put can proceed concurrently
    • If the linked list the put thread operates on has already been migrated, i.e. its head node is a ForwardingNode, the put thread can help with the expansion
  • Unlike 1.7, initialization is lazy
  • capacity represents the estimated number of elements; capacity / factor is used to compute the initial array size, which is then adjusted to a nearby power of 2 (2^n)
  • loadFactor is only used when computing the initial array size; afterwards expansion is fixed at the 3/4 threshold
  • Expansion when the treeification threshold is exceeded: if the capacity is already 64, treeify directly; otherwise do 3 rounds of expansion based on the original capacity

8. ThreadLocal

Requirements

  • Master the purpose and principle of ThreadLocal
  • Master when ThreadLocal memory is released

Purpose

  • ThreadLocal provides thread isolation of a resource object: each thread uses its own copy, avoiding thread-safety problems caused by contention
  • ThreadLocal also enables sharing the resource within a thread

principle

Each thread has a member variable of ThreadLocalMap type, which is used to store resource objects

  • Calling the set method is to use ThreadLocal itself as the key and the resource object as the value, and put it into the ThreadLocalMap collection of the current thread
  • Calling the get method is to use ThreadLocal itself as the key to find the associated resource value in the current thread
  • Calling the remove method is to use ThreadLocal itself as the key to remove the resource value associated with the current thread

Some features of ThreadLocalMap

  • The hash value of the key is uniformly distributed
  • The initial capacity is 16, the expansion factor is 2/3, and the expansion capacity is doubled
  • After a collision on a key's index, open addressing is used to resolve the conflict

Weak-reference keys

The keys in ThreadLocalMap are designed as weak references for the following reason

  • A Thread may need to run for a long time (for example, threads in a thread pool); if a key is no longer used, the memory it occupies should be reclaimable when memory runs low (via GC)

When memory is released

  • Passively, by GC releasing the key
    • Only the key's memory is released; the memory of the associated value is not
  • Lazily and passively, when values are cleaned up
    • On get, if a stale entry with a null key is found, its value is released
    • On set, a heuristic scan clears the values of nearby null-key entries; the number of probes depends on the element count and on whether a null key is found
  • Actively, by calling remove to release both key and value
    • The key's and the value's memory are released at the same time, and the values of nearby null-key entries are also cleared
    • This is the recommended approach, because a ThreadLocal is usually held in a static variable (a strong reference), so you cannot rely on GC to reclaim it passively; see the sketch below
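
A minimal sketch (all names are made up): a static ThreadLocal with per-thread values and an explicit remove():

public class ThreadLocalDemo {
    // typically a static (strong) reference, e.g. one connection-like resource per thread
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            try {
                BUFFER.get().append(Thread.currentThread().getName()); // each thread sees its own copy
                System.out.println(BUFFER.get());
            } finally {
                BUFFER.remove(); // actively release key and value; important for pooled threads
            }
        };
        Thread t1 = new Thread(task, "t1");
        Thread t2 = new Thread(task, "t2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}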


Origin blog.csdn.net/weixin_60257072/article/details/132262280