Conquering Java Multithreading and High Concurrency: Multithreading Basics

This article is a set of notes taken while learning Java multithreading and high concurrency.

The material is extensive and is divided into 5 parts:

  1. Multithreading basics
  2. JUC (java.util.concurrent)
  3. Synchronized containers and concurrent containers
  4. Thread pools
  5. Message queues (MQ)

This article covers the multithreading basics.

Table of Contents

1 What is a thread?

2 Thread status

3 Thread creation

4 Thread synchronization

5 synchronized

5.1 The structure of Java objects

5.2 synchronized lock upgrade process

5.2.1 Biased lock

5.2.2 Lightweight lock (spin lock)

5.2.3 Heavyweight lock

6 volatile

6.1 volatile guarantees thread visibility

6.1.1 Why should thread visibility be guaranteed?

6.1.2 How does volatile guarantee thread visibility?

6.2 volatile prohibits instruction reordering

7 Inter-thread communication

7.1 Producer and consumer issues

7.2 wait and notify

7.3 while polling


1 What is a thread?

Thread is the basic unit of operating system scheduling.

All computing ultimately runs on the CPU, and each (single-core) CPU can process requests from only one thread at a time.

The operating system acts as the CPU's broker. In effect it tells every application: "The CPU can handle only one thread's requests at a time, so split your process into threads and come back."

After an application splits its process into threads, the operating system pairs kernel threads with application threads one-to-one and asks the CPU to process the kernel threads' requests.

As shown in the figure:

A running application corresponds to one process, and one process corresponds to multiple threads.

 

2 Thread status

Generally speaking, a thread has 5 states:

  • New state (New): after the thread object is created, it enters the new state.
  • Ready state (Runnable): after the new thread object calls start(), it enters the ready state; a ready thread may be scheduled for execution by the CPU at any time.
  • Running state (Running): the thread has been scheduled for execution by the CPU. Note that a thread can enter the running state only from the ready state.
  • Blocked state (Blocked): the thread gives up the CPU for some reason and temporarily stops running; it has a chance to run again only after returning to the ready state.
  • Dead state (Dead): the thread finishes execution or exits on an exception, ending its life cycle.

Java's thread implementation defines 6 states:

  • NEW (new): the thread object has been created but not yet started.
  • RUNNABLE (runnable): in Java, the ready state and running state are merged into the single RUNNABLE state.
  • BLOCKED (blocked): the thread has temporarily stopped running while waiting to acquire a lock.
  • WAITING (waiting): the thread is waiting for another thread to perform a specific action (a notification or an interrupt).
  • TIMED_WAITING (timed waiting): like WAITING, but the thread also returns to the ready state after a specified time elapses.
  • TERMINATED (terminated): the thread has finished executing.
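These six states can be observed directly with Thread.getState(). A minimal sketch (the class name and sleep durations are illustrative choices):

```java
public class ThreadStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(200); // keep the thread in TIMED_WAITING for a while
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println(t.getState()); // NEW: created but not started
        t.start();
        Thread.sleep(50);                 // give the thread time to reach sleep()
        System.out.println(t.getState()); // TIMED_WAITING: inside sleep(200)
        t.join();
        System.out.println(t.getState()); // TERMINATED: run() has finished
    }
}
```

A thread parked on a lock would instead show BLOCKED, and one inside wait() with no timeout would show WAITING.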

 

3 Thread creation

In Java, there are two basic ways to create threads:

  • Inherit the Thread class
  • Implement the Runnable interface
public class CreateThread{
    static class MyThread extends Thread{ // extend the Thread class
        @Override
        public void run(){ // override the run() method
            System.out.println("Hello MyThread!");
        }
    }
 
    static class MyRunnable implements Runnable{ // implement the Runnable interface
        @Override
        public void run(){ // override the run() method
            System.out.println("Hello MyRunnable!");
        }
    }
 
    public static void main(String[] args){
        new MyThread().start(); // call start() to launch the thread
        new Thread(new MyRunnable()).start(); // call start() to launch the thread
    }
}

The java.util.concurrent package provides a Callable interface; a task created by implementing Callable can return a value:

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class CreateThread {
    static class MyCallable implements Callable<Integer> { // the type parameter fixes the return type
        @Override
        public Integer call() throws Exception { // override call(), Callable's counterpart of Runnable's run()
            System.out.println("Hello MyCallable");
            return 1024;
        }
    }

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        MyCallable myCallable = new MyCallable();
        FutureTask<Integer> futureTask = new FutureTask<>(myCallable); // adapter pattern: FutureTask is a Runnable
        new Thread(futureTask).start(); // call start() to launch the thread
        // print the return value (get() blocks until call() completes)
        Integer result = futureTask.get();
        System.out.println(result);
    }
}

In addition, a lambda expression can be used to create a thread, since Runnable is a functional interface.

public class CreateThread{
    public static void main(String[] args){
        new Thread(()->{
            System.out.println("Hello Lambda!");
        }).start();
    }
}

Common methods in the Thread class:

sleep(): puts the thread into the blocked state for a given time, giving other threads a chance to run; when the sleep time ends, the thread returns to the ready state and competes with other threads for CPU time.

yield(): a courtesy hint that moves the thread back to the ready state, letting it compete with other threads for CPU time.

join(): blocks the current thread until the joined thread finishes, then lets the current thread continue. It can be used to enforce sequential execution between threads.
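As a sketch, join() can enforce execution order between two threads (the thread body and messages are illustrative):

```java
public class JoinDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> System.out.println("worker finished"));
        worker.start();
        worker.join(); // main blocks here until worker's run() completes
        System.out.println("main continues"); // always printed after the worker's line
    }
}
```

Without the join() call, the two lines could print in either order.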

 

4 thread synchronization

Thread synchronization: while one thread is operating on a piece of memory, no other thread may operate on that memory address until the operation completes; until then, the other threads wait.

The synchronized keyword is used in Java to achieve thread synchronization.

The synchronized keyword locks an object, and at any given moment a lock can be held by only one thread.

Question: Does synchronized lock the object or the code?

Answer: it locks the object. In the code below, the accurate description is that synchronized locks o; a thread may execute the code inside the braces {} only after acquiring o's lock.

public class Test{
    private int count = 10;
    private Object o = new Object();
    public void test(){
        synchronized(o){
            count--;
            System.out.println(Thread.currentThread().getName() + " count = " + count);
        }
    }
}

Because creating a new Object just to serve as a lock is cumbersome, the snippet above can be rewritten as:

public class Test{
    private int count = 10;
    public synchronized void test(){
        count--;
        System.out.println(Thread.currentThread().getName() + " count = " + count);
    }
}

In the above program, the object locked by synchronized is this.

If the synchronized modified method is a static method (static), then the locked object is Test.class.
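The two cases can be made explicit with equivalent synchronized blocks. A sketch (the class name is illustrative):

```java
public class LockTargets {
    // instance method: the lock object is `this`
    public synchronized void instanceMethod() {
        System.out.println("locked this");
    }

    // equivalent explicit form of instanceMethod()
    public void explicitInstance() {
        synchronized (this) {
            System.out.println("locked this");
        }
    }

    // static method: the lock object is LockTargets.class
    public static synchronized void staticMethod() {
        System.out.println("locked LockTargets.class");
    }

    // equivalent explicit form of staticMethod()
    public static void explicitStatic() {
        synchronized (LockTargets.class) {
            System.out.println("locked LockTargets.class");
        }
    }

    public static void main(String[] args) {
        new LockTargets().instanceMethod();
        staticMethod();
    }
}
```

Because the two cases lock different objects, an instance-synchronized method and a static-synchronized method of the same class do not block each other.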

 

5 synchronized

5.1 The structure of Java objects

Before talking about the underlying implementation of synchronized, you need to understand the structure of Java objects.

A small Java object typically occupies 16 bytes in the heap on a 64-bit HotSpot JVM (array objects differ), of which:

The first 8 bytes are the mark word, which stores lock information, the hashcode, GC age, and so on.

In HotSpot (the most widely used JVM), the information recorded in the object's mark word is shown in the figure:

Bytes 9 to 12 are the klass pointer (compressed by default), which points to the object's class metadata.

Bytes 13 to 16 may hold instance data (member variables). If the object has no instance data, they are padding for alignment, which rounds the object's size up to a multiple of 8 bytes.

(An array object has an additional array length field between the klass pointer and the instance data.)

Returning to the idea that synchronized locks an object: locking essentially means recording lock information in the object's mark word.

So what lock information is recorded?

A pointer related to the current thread: the thread pointer itself for a biased lock, or a pointer to a lock record or monitor for the other lock states.

5.2 synchronized lock upgrade process

In early JDKs, synchronized was very inefficient because only heavyweight locks existed. Later JDK versions optimized it; a synchronized lock now has four states: new (unlocked), biased lock, lightweight lock (also called a spin lock), and heavyweight lock.

The heavier the lock, the more system resources it consumes, so a problem should be solved with the lightest lock that works; only when a lower-level lock no longer suffices is the lock upgraded.

The lock upgrade path is:

    Newly created object (new) --> biased lock --> lightweight lock (spin lock) --> heavyweight lock

(A rather confusing term also appears in this area: "lock-free", which refers to the non-heavyweight states. Understand it, but using the term is not recommended.)

In fact, the lock upgrade process is more complicated, as shown in the figure:

5.2.1 Bias lock

Biased locking assumes that, in most cases, a synchronized object is only ever used by one thread.

Giving a newly created object a biased lock means recording the current thread's pointer (JavaThread*) in the object's mark word.

As soon as another thread contends for the object (light contention), the biased lock is upgraded to a lightweight lock.

By default, the biased-locking mechanism starts only after the JVM has been running for 4 seconds. The start is delayed because many threads inevitably contend for objects during JVM startup, which would make biasing counterproductive there.

Once biased locking is active, every newly created object is biasable by default: the biased-lock bit in its mark word is set to 1. If the biased-lock bit is 1 but no thread pointer is recorded in the mark word, the state is called an anonymous bias. When a thread acquires the object, its pointer is recorded and the anonymous bias becomes a real biased lock.

5.2.2 Lightweight lock (spin lock)

(Under light contention) the biased lock is upgraded to a lightweight lock: first the object's biased state is revoked, then each contending thread creates a Lock Record (LR) in its own thread stack and tries to write a pointer to its own LR into the object's mark word. Whichever thread succeeds in the write holds the object's lightweight lock.

Lightweight locks are implemented by means of CAS.

CAS: Compare And Swap, compare and exchange.

As shown in the figure: the thread reads a value E, computes with it, and when writing the result back compares the location's current value (N in the figure) with the E it originally read. If they are equal, the result is written; if not, the thread takes the current value, redoes the computation, and repeats until the write succeeds.

There are two classic questions about CAS:

(1) The ABA problem: during the operation, other threads modified the value several times, but the final value happens to equal the original, producing a "this A is not that A" situation. How is it solved?

Add a version number.
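This versioned CAS is what java.util.concurrent.atomic.AtomicStampedReference provides: every compareAndSet must match a stamp (version number) as well as the value. A minimal sketch:

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    public static void main(String[] args) {
        // value 100 with initial stamp (version) 0
        AtomicStampedReference<Integer> ref = new AtomicStampedReference<>(100, 0);
        int oldStamp = ref.getStamp();

        // another thread performs A -> B -> A, bumping the stamp each time
        ref.compareAndSet(100, 101, oldStamp, oldStamp + 1);
        ref.compareAndSet(101, 100, oldStamp + 1, oldStamp + 2);

        // the value is back to 100, but the stale stamp exposes the intervening changes
        boolean swapped = ref.compareAndSet(100, 102, oldStamp, oldStamp + 1);
        System.out.println(swapped); // false: "this A is not that A"
    }
}
```

A plain AtomicReference would have let the final CAS succeed, silently missing the A -> B -> A history.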

(2) Atomicity: how are the two steps, "compare the current value with the value read earlier" and "write back the result", made atomic as a pair?

On x86, CAS boils down to the assembly instruction: lock cmpxchg

The lock prefix locks the bus (or cache line), so the CPU cannot be interrupted by other CPUs while executing the instruction.

CAS is essentially a busy loop, which consumes CPU while spinning, so when thread contention becomes fierce the lightweight lock is upgraded to a heavyweight lock.
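In Java, the CAS loop is exposed through the atomic classes: for example, AtomicInteger.incrementAndGet() retries a CAS until it wins, keeping a shared counter correct without a synchronized block. A sketch:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounterDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.incrementAndGet(); // a CAS retry loop (lock cmpxchg on x86)
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter.get()); // 20000: no increments are lost
    }
}
```

With a plain int and counter++, some increments would be lost to the read-modify-write race.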

5.2.3 Heavyweight lock

A heavyweight lock places the fiercely competing threads into a wait queue and lets the operating system handle thread scheduling.

Threads placed in the waiting queue do not occupy CPU resources.

Under what circumstances can a lightweight lock be upgraded to a heavyweight lock?

Before JDK 1.6, the lock was upgraded when a thread spun more than 10 times, or when the number of spinning threads exceeded half the number of CPU cores.

Since JDK 1.6, adaptive spinning is enabled by default and the JVM itself decides when to upgrade.

 

6 volatile

volatile is a type qualifier (the keyword originated in C). It tells the compiler not to optimize away accesses to the variable, forcing a direct read or write every time.

For example, consider the following C program:

XBYTE[2]=55;
XBYTE[2]=56;
XBYTE[2]=57;
XBYTE[2]=58;

To external hardware, the statements above mean XBYTE[2] is assigned four times, but the compiler would optimize them, treating only XBYTE[2]=58 as effective (ignoring the first three statements and emitting a single machine instruction). If XBYTE is declared volatile, the compiler compiles each statement and emits the corresponding machine code (four machine instructions).

volatile has two effects in Java: it guarantees thread visibility and it forbids instruction reordering.

6.1 volatile guarantees thread visibility

When one thread's modification of main memory can be observed promptly by other threads, the property is called visibility.

Two issues will be discussed next:

  • Why ensure thread visibility?
  • How does volatile guarantee thread visibility?

6.1.1 Why should thread visibility be guaranteed?

First, note that the CPU does not read data directly from main memory; caches sit in between.

The cache (high-speed cache) sits between the CPU and main memory and is divided into three levels: L1, L2, and L3. When the CPU reads data, it first looks in L1; on a miss it looks in L2, then L3, and finally in main memory. Conversely, data read from memory is loaded into L3 first, then L2, then L1.

The cache reads data from main memory in blocks (following the principle of locality of reference). Each block is called a cache line and is typically 64 bytes. A given line of main memory is therefore likely to be cached by several CPUs at once. After one CPU modifies the line, even though the modified data is written back to main memory, the other CPUs would still read the stale, pre-modification data from their own caches. That cannot be allowed.

When the same line of data is loaded into different CPUs, the copies in each CPU must be kept consistent.

6.1.2 How does volatile guarantee thread visibility?

volatile guarantees thread visibility by guaranteeing consistency between the cache lines holding the data.

The CPU keeps cache lines consistent by following the MESI cache-coherency protocol, falling back to locking the bus when the protocol alone does not suffice.

MESI is Intel's cache-coherency protocol, named after its four line states: Modified, Exclusive, Shared, and Invalid.

Locking the bus uses the assembly lock prefix.
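A minimal visibility sketch: the reader thread spins on a flag that the main thread clears. With volatile, the reader is guaranteed to see the write and terminate; without volatile, the reader might loop forever on a stale cached value (the field and message names are illustrative):

```java
public class VisibilityDemo {
    // volatile forces every read of `running` to observe the latest write
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (running) {
                // busy-wait until the write to `running` becomes visible
            }
            System.out.println("reader observed running = false");
        });
        reader.start();
        Thread.sleep(100); // let the reader start spinning
        running = false;   // this write is visible to the reader thread
        reader.join();
        System.out.println("done");
    }
}
```

Dropping the volatile modifier may make this program hang, since the JIT is then free to hoist the read of `running` out of the loop.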

6.2 volatile prohibits instruction reordering

CPU out-of-order execution (instruction reordering):

When the CPU has two instructions to execute, the first relatively slow and the second relatively fast, and the two instructions are independent, it may execute the second before the first.

Out-of-order execution causes no problems in a single thread, but it can cause problems across multiple threads.

volatile forbids the CPU from reordering its instructions. The implementation is to insert memory barriers; instructions before and after a barrier may not be reordered across it.

The JVM requires four memory barriers to be implemented:

LoadLoad: a barrier between read instructions; the reads below the barrier may execute only after all reads above it have completed.

StoreStore: a barrier between write instructions; the writes below the barrier may execute only after all writes above it have completed.

LoadStore: a barrier between reads and writes; the writes below the barrier may execute only after all reads above it have completed.

StoreLoad: a barrier between writes and reads; the reads below the barrier may execute only after all writes above it have completed.

These four memory barriers are ultimately implemented with assembly instructions.

Barriers placed around volatile operations (as described in the JSR-133 cookbook): a StoreStore barrier before and a StoreLoad barrier after each volatile write; a LoadLoad and a LoadStore barrier after each volatile read.

 

7 Inter-thread communication

7.1 Producer and consumer issues

The producer-consumer problem:

Multiple threads operate on the same shared variable, number:

  • Each time the producer runs, it does number++
  • Each time a consumer runs, it does number--

When number == 0, consumers must not operate on it.

Code:

public class Test {
    public static void main(String[] args) {
        Data data = new Data();
        int n = 4; // number of consumers
        int k = 5; // items consumed by each consumer
        new Thread(() -> {
            for (int i = 0; i < n * k; i++) {
                try {
                    data.increment();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }, "Producer").start(); // the producer
        for (int id = 0; id < n; id++) {
            new Thread(() -> {
                for (int i = 0; i < k; i++) {
                    try {
                        data.decrement();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }, "Consumer" + id).start(); // the consumers
        }
    }
}

class Data {
    private int number = 0;
    // synchronized method
    public synchronized void increment() throws InterruptedException {
        while (number > 3) { // while guard, re-checked after every wakeup
            this.wait(); // wait until a consumer makes room
        }
        number++;
        System.out.println(Thread.currentThread().getName() + "=>" + number);
        this.notifyAll(); // notify the other threads
    }
    // synchronized method
    public synchronized void decrement() throws InterruptedException {
        while (number <= 0) { // while guard, re-checked after every wakeup
            this.wait(); // wait until the producer makes an item
        }
        number--;
        System.out.println(Thread.currentThread().getName() + "=>" + number);
        this.notifyAll(); // notify the other threads
    }
}

Three key points in the producer-consumer problem:

  • synchronized
  • wait and notify
  • while polling

synchronized has already been covered, so it is not repeated here.

7.2 wait and notify

wait(): makes the current thread wait; it can be woken up with notify() or notifyAll().

The differences between wait() and sleep():

  • wait() is a method of the Object class, while sleep() is a method of the Thread class
  • Calling wait() releases the lock (the thread enters the waiting state), while calling sleep() does not release the lock (the thread enters the blocked state)
  • wait() can only be called inside a synchronized block, while sleep() can be called anywhere
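The second difference can be demonstrated: while one thread sits inside wait(), another thread can acquire the same lock, which would be impossible if wait() held on to it. A sketch (the 100 ms sleep just lets the waiter enter wait() first):

```java
public class WaitReleasesLockDemo {
    public static void main(String[] args) throws InterruptedException {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    System.out.println("waiter: calling wait(), releasing the lock");
                    lock.wait(); // releases `lock` until notified
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("waiter: woke up and reacquired the lock");
            }
        });
        waiter.start();
        Thread.sleep(100); // let the waiter enter wait() first
        synchronized (lock) { // acquirable only because wait() released the lock
            System.out.println("main: acquired the lock while the waiter waits");
            lock.notify();
        }
        waiter.join();
    }
}
```

If the waiter had called Thread.sleep() instead of lock.wait(), the main thread would block at the synchronized block until the sleep ended.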

notifyAll(): wakes up all waiting threads.

notify(): wakes up one waiting thread.

The JDK source comments state that the thread notify() wakes is chosen arbitrarily, depending on the JVM implementation.

HotSpot's implementation of notify() actually wakes threads in order, i.e., first in, first out.

7.3 while polling

wait() always appears inside a loop in order to guard against spurious wakeups under multithreading.

In the producer-consumer model, the product count may be 0 while several consumer threads are waiting. When the producer then produces one item, all waiting consumer threads are woken up, but only one of them can obtain the item; only its wakeup is effective. The other consumer threads must go back to waiting; their wakeups are spurious.

 

Learning video link:

https://www.bilibili.com/video/BV1xK4y1C7aT

Keep it up! (d • _ •) d


Source: blog.csdn.net/qq_42082161/article/details/113861872