interview5-multi-threading

1. Basic knowledge of threads

(1) Threads and processes

A program consists of instructions and data, but for these instructions to run and data to be read and written, the instructions must be loaded into the CPU and the data into the memory. During the execution of instructions, disks, networks and other devices are also required. The process is used to load instructions, manage memory, and manage IO.

Process : When a program is run and the program's code is loaded from disk to memory, a process is started.

A thread is an instruction stream, which delivers instructions in the instruction stream to the CPU for execution in a certain order.

A process can be divided into one or more threads.

Comparison between the two

A process is an instance of a running program. The process contains threads, and each thread performs different tasks.
Different processes use different memory spaces, and all threads under the current process can share the memory space.
Threads are more lightweight, and thread context switching costs are generally lower than process context switching (context switching refers to switching from one thread to another)

(2) Parallelism and concurrency

Concurrency is the ability to deal with multiple things at once.

Parallelism is the ability to do multiple things at once.

Modern CPUs are almost all multi-core. On a multi-core CPU:

Concurrency means handling multiple things over the same period of time: multiple threads take turns using one or more CPUs.
Parallelism means doing multiple things at the same instant: a 4-core CPU executes 4 threads simultaneously.

(3) How to create threads

There are four ways to create threads, namely:

Extend the Thread class

Implement the Runnable interface

Implement the Callable interface

Submit tasks to a thread pool
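The four approaches listed above can be sketched side by side. This is a minimal illustration, not a recommended structure; the class and method names (ThreadCreationDemo, MyThread, MyRunnable, MyCallable, runAll) are made up for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;

final class ThreadCreationDemo {

    // 1. Extend Thread and override run().
    static class MyThread extends Thread {
        @Override
        public void run() {
            System.out.println("extends Thread: " + getName());
        }
    }

    // 2. Implement Runnable; the task is separate from the thread that runs it.
    static class MyRunnable implements Runnable {
        @Override
        public void run() {
            System.out.println("implements Runnable");
        }
    }

    // 3. Implement Callable; call() returns a value.
    static class MyCallable implements Callable<String> {
        @Override
        public String call() {
            return "result from Callable";
        }
    }

    /** Runs all four variants and returns the Callable's result. */
    static String runAll() throws Exception {
        Thread t1 = new MyThread();
        t1.start();
        t1.join();

        Thread t2 = new Thread(new MyRunnable());
        t2.start();
        t2.join();

        // FutureTask adapts a Callable so that a plain Thread can run it.
        FutureTask<String> task = new FutureTask<>(new MyCallable());
        new Thread(task).start();
        String result = task.get(); // blocks until call() has returned

        // 4. Hand the task to a thread pool instead of creating a thread by hand.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            pool.submit(() -> System.out.println("thread pool task")).get();
        } finally {
            pool.shutdown();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll());
    }
}
```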

(4) Runnable and Callable

The differences:

The run() method of the Runnable interface has no return value.
The call() method of the Callable interface returns a value of a generic type. Combined with Future or FutureTask, it can be used to obtain the result of asynchronous execution.
The call() method of the Callable interface may throw checked exceptions, while the run() method of the Runnable interface cannot: any checked exception must be handled inside run() and cannot propagate to the caller.
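As a rough sketch of the last two points, the example below submits two Callable tasks to a single-thread executor: one returns a value through Future.get(), the other throws a checked exception, which reaches the caller wrapped in ExecutionException. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CallableVsRunnableDemo {

    /** Returns the value computed by call(). */
    static int succeedingCallable() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = pool.submit(() -> 6 * 7);
            return future.get(); // waits for the asynchronous result
        } finally {
            pool.shutdown();
        }
    }

    /** Returns the message of the checked exception thrown inside call(). */
    static String failingCallable() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // call() may throw a checked exception; run() could not.
            Callable<Integer> task = () -> {
                throw new IOException("disk unavailable");
            };
            Future<Integer> future = pool.submit(task);
            try {
                future.get();
                return "no exception";
            } catch (ExecutionException e) {
                // The original exception is delivered as the cause.
                return e.getCause().getMessage();
            }
        } finally {
            pool.shutdown();
        }
    }
}
```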

(5) Thread status

The State enum in the JDK's Thread class defines six states:

New (NEW), runnable (RUNNABLE), blocked (BLOCKED), waiting (WAITING), timed waiting (TIMED_WAITING), terminated (TERMINATED)

Thread status changes:

A newly created thread object is in the new state.
Calling start() moves it to the runnable state.
Once the thread has obtained CPU time and finished executing run(), it moves to the terminated state.
While runnable, a thread that is not executing on the CPU may also switch to one of the other states:
- If it fails to acquire a synchronized lock, it enters the blocked state; once it acquires the lock, it becomes runnable again. (A thread waiting for a java.util.concurrent Lock is parked and reports WAITING or TIMED_WAITING rather than BLOCKED.)
- If it calls wait(), it enters the waiting state; after another thread calls notify() or notifyAll() to wake it, it can become runnable again.
- If it calls sleep(50), it enters the timed waiting state and becomes runnable again once the time is up.
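The transitions above can be observed with Thread.getState(). The following is a minimal sketch under the assumption that polling getState() is acceptable for a demo; ThreadStateDemo and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ThreadStateDemo {
    private final Object lock = new Object();
    private boolean released = false; // guarded by lock

    /** Drives one thread through several states and records what was observed. */
    List<Thread.State> observe() throws InterruptedException {
        List<Thread.State> seen = new ArrayList<>();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                while (!released) { // loop guards against spurious wakeups
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        seen.add(t.getState()); // NEW: created but not started

        synchronized (lock) {
            t.start();
            // The main thread holds the lock, so t cannot enter its synchronized block.
            seen.add(awaitState(t, Thread.State.BLOCKED));
        }
        // The lock is free now: t enters the block and calls wait().
        seen.add(awaitState(t, Thread.State.WAITING));

        synchronized (lock) {
            released = true;
            lock.notifyAll();
        }
        t.join();
        seen.add(t.getState()); // TERMINATED: run() has finished
        return seen;
    }

    // Polls until the thread reports the expected state.
    private static Thread.State awaitState(Thread t, Thread.State expected)
            throws InterruptedException {
        while (t.getState() != expected) {
            Thread.sleep(1);
        }
        return expected;
    }
}
```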

(6) The difference between wait and sleep methods

The wait and sleep methods both make the current thread give up the CPU and pause temporarily, but there are several major differences between them:

Source: wait() is a method of the Object class, while sleep() is a static method of the Thread class.
Lock release: wait() releases the object's lock so that other threads can enter the synchronized method or synchronized block, while sleep() does not release any lock while it pauses.
Usage scenarios: wait(), notify() and notifyAll() can only be called inside a synchronized method or synchronized block, while sleep() can be called anywhere.
Timing: sleep(long) always takes a duration and returns automatically when it elapses. wait() comes in two forms: wait(long) returns after the timeout at the latest, while wait() with no argument waits indefinitely until another thread calls notify() or notifyAll() on the same object.
Wake-up method: sleep() does not need to be woken; it returns by itself when the time is up. wait() returns when it is notified or when its timeout, if any, expires. Both can be cut short by interrupt(), in which case they throw InterruptedException.
Different lock characteristics (key points):
1. Calling wait() requires first holding the lock of the object being waited on, while sleep() has no such restriction.
2. After wait() is called, the object lock is released, allowing other threads to obtain it (I give up the CPU, and you can still use the lock).
3. If sleep() is executed inside a synchronized block, the object lock is not released (I give up the CPU, but you cannot use the lock).

In general, although both methods pause the current thread, they differ significantly in where they come from, how they treat locks, where they can be used, and how the thread is woken up.

(7) How to ensure the sequential execution of threads

Create three new threads T1, T2, and T3. How to ensure that they are executed in order?

This can be solved with the join() method of Thread: T2 calls join() on T1 before doing its work, and T3 calls join() on T2.
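A minimal sketch of the join() approach follows. JoinOrderDemo, runInOrder and joinQuietly are hypothetical names, and the shared list only exists to record the order in which the threads did their work.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class JoinOrderDemo {

    /** Runs T1, T2, T3 so that their work happens strictly in that order. */
    static List<String> runInOrder() throws InterruptedException {
        List<String> log = new CopyOnWriteArrayList<>();

        Thread t1 = new Thread(() -> log.add("T1"));
        Thread t2 = new Thread(() -> {
            joinQuietly(t1); // wait for T1 to finish first
            log.add("T2");
        });
        Thread t3 = new Thread(() -> {
            joinQuietly(t2); // wait for T2 to finish first
            log.add("T3");
        });

        // join() returns immediately for a thread that has not been started,
        // so each thread is started before the one that waits on it.
        t1.start();
        t2.start();
        t3.start();

        t3.join();
        return log;
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }
}
```

The start order matters here: if T3 were started before T2, its call to t2.join() could return at once and the ordering guarantee would be lost.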

(8) The difference between notify() and notifyAll()

notify: wakes up only one of the waiting threads, chosen arbitrarily

notifyAll: wakes up all waiting threads

(9) The difference between thread run() and start()

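start(): starts a new thread, which then calls run() to execute the logic defined there. start() can only be called once per thread object; calling it again throws IllegalThreadStateException.

run(): encapsulates the code to be executed by the thread. It can be called multiple times, but calling it directly runs the code in the calling thread like an ordinary method and does not start a new thread.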

(10) Stop the thread

There are three ways to stop a thread:

Use an exit flag so that the thread exits normally, that is, the thread terminates after its run method completes.
Use the stop method to terminate it forcibly (not recommended; the method is deprecated).
Use the interrupt method to interrupt the thread:
- Interrupting a thread that is blocked in sleep, wait or join makes it throw InterruptedException.
- Interrupting a normally running thread only sets its interrupt status; the thread can check that status and decide whether to exit.

In Java, the Thread.stop() method has been deprecated because it may stop the thread at any point, which can cause many problems such as data inconsistency and resource leaks.

In contrast, the Thread.interrupt() method is a safer way to terminate a thread. It sets the thread's interrupt status and then lets the thread decide how to respond to the interrupt. Threads can check the interrupt status and decide how to stop themselves gracefully. For example, if a thread is executing a loop, it can check the interrupt status on each iteration and exit the loop when appropriate.

Using the Thread.interrupt() method gives threads a chance to clean up resources, save state, and stop in a predictable manner.
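The cooperative shutdown described above might look like the sketch below. The worker checks its interrupt status on each loop iteration, handles the InterruptedException thrown by sleep(), and runs its cleanup in a finally block. InterruptDemo and stopWithInterrupt are hypothetical names, and the sleep() call stands in for real work.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptDemo {

    /** Starts a worker, interrupts it, and reports whether its cleanup code ran. */
    static boolean stopWithInterrupt() throws InterruptedException {
        AtomicBoolean cleanedUp = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            try {
                // Normal running: check the interrupt status on every iteration.
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(10); // simulated unit of work
                }
            } catch (InterruptedException e) {
                // Blocked in sleep(): the exception clears the status, so restore it.
                Thread.currentThread().interrupt();
            } finally {
                cleanedUp.set(true); // release resources, save state, and so on
            }
        });

        worker.start();
        Thread.sleep(50);   // let the worker run for a moment
        worker.interrupt(); // request a stop; the worker decides how to react
        worker.join();
        return cleanedUp.get();
    }
}
```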

2. Concurrency safety in threads

(1) The underlying principle of the synchronized keyword

synchronized takes an object lock and uses mutual exclusion so that at most one thread can hold the object lock at any time.
Under the hood it is implemented by a monitor, a JVM-level object (implemented in C++). To acquire the lock, a thread associates the lock object with a monitor.
A monitor has three internal attributes: Owner, EntryList, and WaitSet.
Owner is the thread that currently holds the lock, and there can be only one; EntryList holds the threads in the BLOCKED state that are waiting for the lock; WaitSet holds the threads in the WAITING state that have called wait().

The lock implemented by the monitor is a heavyweight lock. Do you know about lock upgrading?

synchronized in Java has three lock forms: biased lock, lightweight lock, and heavyweight lock. They correspond to three situations: the lock is only ever held by one thread, different threads hold the lock alternately, and multiple threads compete for the lock. (Biased locking was deprecated and disabled by default in JDK 15.)

Lock type	Description
Heavyweight lock	Implemented with the Monitor at the bottom layer. It involves switching between user mode and kernel mode and context switching, so the cost is high and the performance is relatively low.
Lightweight lock	When threads lock at staggered times (that is, there is no competition), a lightweight lock can be used as an optimization. It modifies the lock flag in the object header, and its performance is much better than the heavyweight lock. Each modification is a CAS operation, which ensures atomicity.
Biased lock	If the lock is used by only one thread for a long period of time, a biased lock can be used. The first acquisition performs one CAS operation. After that, when the same thread acquires the lock again, it only needs to check whether the thread ID in the mark word is its own, which avoids the comparatively expensive CAS instruction.

Once lock competition occurs, it will be upgraded to a heavyweight lock.

(2) JMM

The JMM (Java Memory Model) defines the behavioral rules for how multi-threaded programs read and write shared memory. These rules standardize memory reads and writes so that instructions execute correctly.

The JMM divides memory into two parts: a private working area for each thread (working memory) and an area shared by all threads (main memory).

Threads' working memories are isolated from each other, so threads must interact through main memory.

(3) CAS

CAS stands for Compare And Swap. It embodies the idea of optimistic locking and guarantees the atomicity of thread operations on shared data without using a lock.

Many classes implemented under the JUC (java.util.concurrent) package use CAS operations:

AbstractQueuedSynchronizer (AQS framework)
AtomicXXX classes

CAS is usually combined with spinning (retrying in a loop) when operating on shared variables, which is more efficient than blocking when contention is low. Under the hood, CAS calls methods of the Unsafe class; these are native methods implemented in another language (C/C++) and ultimately backed by atomic CPU instructions.

CAS is based on the idea of optimistic locking: it optimistically assumes that other threads will not modify the shared variable. If one does, it does not matter: the operation simply fails and is retried.
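The "fail and retry" idea can be sketched with AtomicInteger.compareAndSet. This is only an illustration of the retry loop (AtomicInteger already offers incrementAndGet for this exact case); CasCounterDemo and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasCounterDemo {
    private final AtomicInteger value = new AtomicInteger(0);

    /** Lock-free increment: read, compute, then compare-and-set; retry on conflict. */
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            // Succeeds only if no other thread changed the value since the read.
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // Another thread won the race: loop and try again with the new value.
        }
    }

    int get() {
        return value.get();
    }

    /** Increments one shared counter from several threads and returns the total. */
    static int countWithThreads(int threads, int perThread) throws InterruptedException {
        CasCounterDemo counter = new CasCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter.get();
    }
}
```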

(4) AQS

AQS stands for AbstractQueuedSynchronizer, the abstract queued synchronizer. It is the basic framework for building locks and other synchronization components.

synchronized	AQS
Keyword, implemented in C++	Implemented in Java
Pessimistic lock, released automatically	Pessimistic lock, acquired and released manually
Under fierce lock contention it always becomes a heavyweight lock, with poor performance	Under fierce lock contention it offers a variety of solutions

Common implementation classes of AQS:

ReentrantLock: blocking lock
Semaphore: semaphore
CountDownLatch: countdown latch

AQS maintains a first-in-first-out doubly linked queue internally, which stores the threads that are queuing for the resource.

AQS also has an internal state attribute, which represents the resource. The default is 0 (unlocked). If a thread successfully changes state to 1, that thread has acquired the resource.

state is modified with a CAS operation, which guarantees atomicity when multiple threads try to modify it.
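To make the state and queue description more concrete, here is a minimal sketch of a non-reentrant mutex built on AbstractQueuedSynchronizer. It is an illustration only, not how ReentrantLock is written; SimpleMutex and countWithMutex are hypothetical names. The subclass only decides how state changes between 0 and 1; queuing and parking of waiting threads are handled by AQS itself.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

final class SimpleMutex {

    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            // CAS state from 0 (unlocked) to 1 (locked); only one thread can win.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false; // AQS will enqueue and park the calling thread
        }

        @Override
        protected boolean tryRelease(int ignored) {
            if (getExclusiveOwnerThread() != Thread.currentThread()) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0); // back to unlocked; AQS then wakes the next queued thread
            return true;
        }

        boolean isLocked() {
            return getState() == 1;
        }
    }

    private final Sync sync = new Sync();

    void lock() { sync.acquire(1); }
    void unlock() { sync.release(1); }
    boolean isLocked() { return sync.isLocked(); }

    /** Increments a plain int from several threads, protected only by the mutex. */
    static int countWithMutex(int threads, int perThread) throws InterruptedException {
        SimpleMutex mutex = new SimpleMutex();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter[0];
    }
}
```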

(5) Implementation principle of ReentrantLock

ReentrantLock means reentrant lock. Compared with synchronized, it has the following characteristics:

Interruptible
Timeout can be set
Fair lock can be set
Support multiple condition variables
Like synchronized, both support reentrancy.

ReentrantLock is implemented mainly with CAS plus the AQS queue. It supports both fair and non-fair locks, whose implementations are similar. The constructor accepts an optional fair parameter (non-fair by default): true gives a fair lock, false a non-fair lock. Fair locks are usually less efficient than non-fair locks; when many threads access the lock, fair locks show lower throughput.
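Two of these characteristics, reentrancy and timeouts, can be sketched as follows. The example also shows the fair constructor argument and the usual lock/try/finally/unlock pattern. ReentrantLockDemo and its methods are hypothetical names, and the 50 ms timeout is an arbitrary choice.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class ReentrantLockDemo {

    /** Reentrancy: the owning thread may acquire the same lock again. */
    static int holdCountAfterNestedLock() {
        ReentrantLock lock = new ReentrantLock(); // non-fair by default
        lock.lock();
        try {
            lock.lock(); // same thread, second acquisition
            try {
                return lock.getHoldCount();
            } finally {
                lock.unlock();
            }
        } finally {
            lock.unlock(); // every lock() needs a matching unlock()
        }
    }

    /** Timeout: tryLock gives up instead of blocking forever. */
    static boolean tryLockWhileHeldElsewhere() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock(true); // true requests a fair lock
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown(); // signal that the lock is now held
                done.await();     // keep holding until the caller has tried
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        held.await();

        boolean acquired = lock.tryLock(50, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }
        done.countDown();
        holder.join();
        return acquired;
    }
}
```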

(6) What is the difference between synchronized and Lock?

Syntax level:
- synchronized is a keyword; its source code is in the JVM, implemented in C++.
- Lock is an interface; its source code is provided by the JDK, implemented in Java.
- With synchronized, the lock is released automatically when the synchronized block exits. With Lock, you must call unlock() manually to release the lock.
Functional level:
- Both are pessimistic locks and both provide basic mutual exclusion, synchronization, and reentrancy.
- Lock provides many features that synchronized lacks, such as fair locking, interruptible acquisition, timeouts, and multiple condition variables.
- Lock has implementations suited to different scenarios, such as ReentrantLock and ReentrantReadWriteLock (read-write lock).
Performance level:
- When there is no contention, synchronized has many optimizations, such as biased locks and lightweight locks, and its performance is not bad.
- When contention is high, Lock implementations generally provide better performance.

(7) Deadlock

Condition for deadlock: when a thread needs to hold multiple locks at the same time, deadlock is prone to occur, for example when two threads each hold one lock and wait for the lock held by the other.

How is a deadlock diagnosed?

When a deadlock occurs in a program, we can use the tools that come with the JDK: jps and jstack.

jps: lists the JVM processes running on the machine and their process IDs
jstack: prints the thread stack information of a Java process
The visual tools jconsole and VisualVM can also detect deadlocks

(8) volatile

Once a shared variable (instance field or static field) is declared volatile, it has two levels of semantics: it guarantees visibility between threads and it prohibits instruction reordering.

Guarantee visibility between threads: declaring a shared variable volatile prevents the compiler and runtime from applying optimizations that would hide updates, so a modification made by one thread is visible to other threads.
Prohibit instruction reordering: reads and writes of a volatile variable insert memory barriers that other reads and writes may not cross, which prevents reordering around the variable.
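The visibility guarantee is often illustrated with a stop flag, as in the sketch below. With the volatile modifier the worker is guaranteed to see the write to running; without it, the JIT compiler may hoist the read out of the loop and the worker might never stop. VolatileFlagDemo and its members are hypothetical names.

```java
final class VolatileFlagDemo {
    // volatile: a write by one thread is visible to later reads by other threads.
    private volatile boolean running = true;
    private long iterations = 0; // only touched by the worker thread

    void spin() {
        while (running) { // re-reads the flag on every iteration
            iterations++;
        }
    }

    /** Returns true if the worker stopped after the flag was cleared. */
    static boolean stopsAfterFlagCleared() throws InterruptedException {
        VolatileFlagDemo demo = new VolatileFlagDemo();
        Thread worker = new Thread(demo::spin);
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();

        Thread.sleep(50);       // let the loop run (and possibly get JIT-compiled)
        demo.running = false;   // volatile write from the main thread

        worker.join(2000);      // wait up to two seconds for the worker to exit
        return !worker.isAlive();
    }
}
```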

(9) ConcurrentHashMap

ConcurrentHashMap is a thread-safe and efficient Map collection.

Underlying data structure:

In JDK 1.7 it is implemented with a segmented array + linked lists.
In JDK 1.8 it uses the same data structure as the JDK 1.8 HashMap: array + linked list/red-black tree.

Locking method:

JDK 1.7 uses Segment locks (lock striping), implemented with ReentrantLock.
JDK 1.8 uses CAS to add a node to an empty bucket and uses synchronized to lock the first node of the bucket's linked list or red-black tree. Compared with Segment locks, the granularity is finer and the performance is better.
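From the caller's side, the thread safety shows up as atomic per-key operations. The sketch below counts words from several threads with merge(), which updates each key atomically, so no external lock is needed. WordCountDemo and countConcurrently are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WordCountDemo {

    /** Every thread counts every word once; merge() makes each update atomic. */
    static Map<String, Integer> countConcurrently(String[] words, int threads)
            throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (String word : words) {
                    // Atomic read-modify-write for this key.
                    counts.merge(word, 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counts;
    }
}
```

A separate get() followed by put() would not be atomic as a pair, even on a ConcurrentHashMap; compound updates should go through methods such as merge(), compute() or putIfAbsent().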

(10) The root cause of problems in concurrent programs

How to ensure the safety of multi-threaded execution in Java programs?

Three major characteristics of Java concurrent programming: atomicity, visibility, and ordering

Atomicity: an operation performed by a thread cannot be suspended or interrupted partway through; it either completes entirely or does not execute at all.
Memory visibility: a modification to a shared variable made by one thread is visible to other threads.
Ordering: instructions may be reordered. To improve efficiency, the processor may optimize the input code, so it does not guarantee that statements execute in the order written; it only guarantees that the final result is the same as executing the code sequentially, which holds within a single thread but not necessarily across threads.

3. Thread pool

(1) Core parameters of thread pool

corePoolSize: number of core threads
maximumPoolSize: maximum number of threads = (core threads + emergency threads)
keepAliveTime: survival time of an emergency thread. If no new task arrives within this time, the thread's resources are released.
unit: time unit of the emergency threads' survival time, such as seconds or milliseconds
workQueue: when there are no idle core threads, new tasks are added to this queue. If the queue is full, an emergency thread is created to execute the task.
threadFactory: thread factory, which can customize how thread objects are created, such as setting the thread name or whether it is a daemon thread
handler: rejection policy, triggered when all threads are busy and the workQueue is full

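The execution principle of the thread pool:

If fewer than corePoolSize threads are running, a new core thread is created to run the task.
Otherwise the task is added to the workQueue.
If the workQueue is full and fewer than maximumPoolSize threads are running, an emergency thread is created to run the task.
If the workQueue is full and maximumPoolSize has been reached, the rejection policy is triggered.

Rejection policies:

AbortPolicy: throws an exception directly (RejectedExecutionException); this is the default policy.
CallerRunsPolicy: runs the task in the caller's thread.
DiscardOldestPolicy: discards the oldest task at the head of the blocking queue and then tries to submit the current task again.
DiscardPolicy: silently discards the task.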
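The interplay of the seven parameters and the rejection policy can be traced with a deliberately tiny pool: one core thread, at most two threads, and a queue of capacity one. Four tasks that block on a latch are submitted in turn, so the first occupies the core thread, the second is queued, the third gets an emergency thread, and the fourth is rejected. This is a sketch with hypothetical names (PoolFlowDemo, trace, snapshot); the sizes are chosen only to keep the trace short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolFlowDemo {

    /** Submits four blocking tasks to a pool that can only hold three. */
    static List<String> trace() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1,                                     // corePoolSize
                2,                                     // maximumPoolSize
                30, TimeUnit.SECONDS,                  // keepAliveTime + unit
                new ArrayBlockingQueue<>(1),           // workQueue with capacity 1
                Executors.defaultThreadFactory(),      // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // handler (the default policy)

        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await(); // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        List<String> trace = new ArrayList<>();
        try {
            pool.execute(blocked);      // 1st task: runs on a new core thread
            trace.add(snapshot(pool));
            pool.execute(blocked);      // 2nd task: core thread busy, goes to the queue
            trace.add(snapshot(pool));
            pool.execute(blocked);      // 3rd task: queue full, emergency thread created
            trace.add(snapshot(pool));
            try {
                pool.execute(blocked);  // 4th task: queue full and pool at maximum
                trace.add("accepted");
            } catch (RejectedExecutionException e) {
                trace.add("rejected");
            }
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return trace;
    }

    private static String snapshot(ThreadPoolExecutor pool) {
        return "threads=" + pool.getPoolSize() + " queued=" + pool.getQueue().size();
    }
}
```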

(2) Common blocking queues in thread pools

workQueue - When there are no idle core threads, new tasks will be added to this queue. If the queue is full, an emergency thread will be created to execute the task.

ArrayBlockingQueue: bounded blocking queue based on an array, FIFO.
LinkedBlockingQueue: optionally bounded blocking queue based on a linked list, FIFO.
DelayedWorkQueue: a priority queue that ensures each task dequeued is the one with the earliest scheduled execution time in the current queue.
SynchronousQueue: a blocking queue that stores no elements. Each insert operation must wait for a corresponding remove operation.

LinkedBlockingQueue	ArrayBlockingQueue
Unbounded by default, bounded is supported	Always bounded
Backed by a linked list	Backed by an array
Lazy: a Node is created only when data is added	The array is allocated in advance
Each enqueue creates a new Node	The slots are created in advance
Two locks (head and tail)	One lock

(3) How to determine the number of core threads

Generally speaking, IO-intensive tasks include file reading and writing, DB reading and writing, network requests, etc. For these, the number of core threads is set to 2N+1, where N is the number of CPU cores.

CPU-intensive tasks generally include computational code, Bitmap conversion, Gson conversion, etc. For these, the number of core threads is set to N+1.

// Check the number of CPU cores on this machine
System.out.println(Runtime.getRuntime().availableProcessors());

High concurrency with long business execution time: the key to this kind of task lies not in the thread pool but in the design of the overall architecture. The first step is to see whether some of the data in these businesses can be cached; the second step is to add servers. Thread pool settings come only after that.

(4) What are the types of thread pools?

1. newFixedThreadPool: creates a fixed-size thread pool, which caps the maximum number of concurrent threads. Excess tasks wait in the queue.

2. newSingleThreadExecutor: creates a single-threaded pool that uses its one worker thread to execute tasks, ensuring that all tasks run in submission order (FIFO).

3. newCachedThreadPool: creates a cacheable thread pool. If the pool has more threads than it needs, idle threads are reclaimed; if no idle thread is available for a new task, a new thread is created.

4. newScheduledThreadPool: creates a thread pool that can run delayed tasks and supports scheduled and periodic execution.

Why is it not recommended to use Executors to create a thread pool?

Executors is a common way to create a thread pool in Java, but it is not recommended, mainly for the following reasons:

Risk of resource exhaustion: the pools created by Executors are effectively unlimited. newFixedThreadPool and newSingleThreadExecutor use an unbounded LinkedBlockingQueue (capacity Integer.MAX_VALUE), and newCachedThreadPool allows up to Integer.MAX_VALUE threads. If tasks are submitted much faster than they are executed, queued tasks or threads keep accumulating until system resources are exhausted (OOM).
Thread management issues: the factory methods hide settings such as the maximum number of threads, the task queue, and the rejection policy behind defaults, which can make the program's behavior under load hard to predict.
Shutdown must be handled explicitly: the pool keeps its threads until shutdown() or shutdownNow() is called, and shutdownNow() interrupts running tasks, which can cause unexpected behavior if tasks do not handle interruption.
Not conducive to performance tuning: because the factory methods expose few of the pool's parameters, it is hard to tune the pool to actual needs.

Therefore, although Executors provides a simple way to create thread pools, in production environments we prefer to create thread pools with ThreadPoolExecutor directly, because it provides more control and flexibility.

(5) Thread pool usage scenarios

Batch import: use thread pool + CountDownLatch to import data from the database into ES (or any other store) in batches to avoid OOM.

Data summary: call multiple interfaces to summarize data. If all (or some) of the interfaces have no dependencies on one another, thread pool + Future can be used to call them in parallel and improve performance.

Asynchronous thread (thread pool): to keep a lower-level method from slowing down the upper-level method that calls it, the lower-level method can be called on an asynchronous thread (when its return value is not needed), which improves the response time of the upper-level method.
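The data summary scenario above could be sketched as follows. The three static methods are hypothetical stand-ins for independent remote interfaces, and the pool sizes are arbitrary; invokeAll() submits all calls and returns once every Future has completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class SummaryDemo {

    // Hypothetical stand-ins for three interfaces with no mutual dependencies.
    static int orderCount()  { return 3; }
    static int cartCount()   { return 5; }
    static int couponCount() { return 2; }

    /** Calls the three interfaces in parallel and sums their results. */
    static int summarize() throws Exception {
        ExecutorService pool = new ThreadPoolExecutor(
                3, 3, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(16));
        try {
            List<Callable<Integer>> calls = new ArrayList<>();
            calls.add(SummaryDemo::orderCount);
            calls.add(SummaryDemo::cartCount);
            calls.add(SummaryDemo::couponCount);

            int total = 0;
            // invokeAll blocks until all tasks are done; get() then returns at once.
            for (Future<Integer> future : pool.invokeAll(calls)) {
                total += future.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```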

(6) ThreadLocal

ThreadLocal is a utility class for addressing thread safety in multi-threaded code. It gives each thread its own independent copy of a variable, which avoids conflicts from concurrent access. ThreadLocal also enables resource sharing within a thread.

ThreadLocal has two main functions:

The first is thread isolation of resource objects: each thread uses its own resource object, avoiding thread safety issues caused by contention.
The second is resource sharing within a thread: code running at different points in the same thread can reach the same resource object.

Each thread has a member variable of type ThreadLocalMap, which is used to store resource objects. Principle:

Calling the set method uses ThreadLocal itself as the key and the resource object as the value, and puts it into the ThreadLocalMap collection of the current thread.
Calling the get method uses ThreadLocal itself as the key to find the associated resource value in the current thread.
Calling the remove method uses ThreadLocal itself as the key to remove the resource value associated with the current thread.

ThreadLocal memory leak problem: the key in ThreadLocalMap is a weak reference and the value is a strong reference. The key can be reclaimed by GC, but the memory associated with the value may remain reachable for as long as the thread lives. It is recommended to call remove() actively to clear both the key and the value.
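A small sketch of set, get and remove follows, showing that each thread sees only its own value and that remove() is called in a finally block, as recommended above. ThreadLocalDemo, CURRENT_USER and isolatedValues are hypothetical names.

```java
final class ThreadLocalDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Returns what the worker saw before set, after set, and what main still sees. */
    static String[] isolatedValues() throws InterruptedException {
        String[] seen = new String[3];
        CURRENT_USER.set("main-user"); // stored in the main thread's ThreadLocalMap
        try {
            Thread worker = new Thread(() -> {
                seen[0] = CURRENT_USER.get();    // null: the worker has its own map
                CURRENT_USER.set("worker-user");
                try {
                    seen[1] = CURRENT_USER.get(); // "worker-user"
                } finally {
                    CURRENT_USER.remove();        // clear the entry before the thread ends
                }
            });
            worker.start();
            worker.join();
            seen[2] = CURRENT_USER.get();         // still "main-user"
        } finally {
            CURRENT_USER.remove(); // important for pooled threads that are reused
        }
        return seen;
    }
}
```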