A must-ask topic in Ali interviews at P6 and above: concurrent programming

Java concurrent programming is widely used in real-world work: sometimes you need multithreading to run things asynchronously, and sometimes you need it to speed up a task. It is also one of the most frequently asked topics in Internet-company interviews. This article is fairly long and contains a lot of code, so please read it patiently; improvement is a gradual process.

Key concepts

Context switching

  1. Concept: The CPU allocates running time to runnable threads through the time-slice algorithm. When switching from one thread to another, it must save the state of the current thread and restore the state of the thread about to run. This save-and-restore process is a context switch.

  2. How to reduce or avoid context switching?

  • lock-free concurrent programming

  • CAS algorithm

  • Use the fewest threads

  • coroutine

Deadlock

  1. Concept: Two or more threads each hold a lock that the other is waiting for, so none of them can proceed.

  2. How to avoid deadlock?

  • Avoid having one thread acquire multiple locks at the same time

  • Avoid having one thread occupy multiple resources inside a lock; try to make each lock protect only one resource

  • Prefer timed locks, such as tryLock with a timeout (see the sketch after this list)

  • For database locks, locking and unlocking must happen on the same database connection
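To make the timed-lock advice concrete, here is a minimal sketch using java.util.concurrent.locks.ReentrantLock.tryLock with a timeout; the lock names and the 1-second timeout are arbitrary choices for illustration:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TimedLockDemo {
    private static final ReentrantLock lockA = new ReentrantLock();
    private static final ReentrantLock lockB = new ReentrantLock();

    // Try to take both locks, but give up (and release what we hold) instead of waiting forever.
    static boolean transfer() throws InterruptedException {
        if (lockA.tryLock(1, TimeUnit.SECONDS)) {
            try {
                if (lockB.tryLock(1, TimeUnit.SECONDS)) {
                    try {
                        // ... do the work that needs both locks ...
                        return true;
                    } finally {
                        lockB.unlock();
                    }
                }
            } finally {
                lockA.unlock();
            }
        }
        return false; // caller can retry or back off, so no deadlock
    }
}

Because neither thread waits indefinitely while holding a lock, a circular wait cannot form.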

The underlying mechanism of Java concurrency

volatile

  1. Function: It guarantees the visibility of shared variables between threads on multiprocessor hardware; that is, when one thread modifies the variable, other threads can immediately see the latest value.

  2. Principle: When writing to a variable declared volatile, the processor does the following two things:

  • Write the data of the current processor cache line to system memory;

  • Invalidate the data cached at the memory address in other CPUs

  3. Usage notes:

  • Volatile only guarantees visibility, not atomicity. For example, if the new value of a variable depends on its previous value (such as count++), volatile alone cannot guarantee thread safety; see the sketch below.
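A minimal sketch of the visibility guarantee (class and field names are invented): the volatile flag lets the main thread stop the worker loop, while the plain counter is only safe here because a single thread writes it.

public class VolatileDemo {
    private static volatile boolean running = true; // the write below is immediately visible to the worker
    private static int count = 0;                   // only the worker touches this; with several writers,
                                                    // volatile would not be enough: use AtomicInteger or a lock

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {          // without volatile this loop might never observe the update
                count++;
            }
            System.out.println("stopped, count = " + count);
        });
        worker.start();
        Thread.sleep(100);
        running = false;               // published to the worker thread right away
        worker.join();
    }
}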

synchronized

  1. Definition: synchronized is Java's built-in mutual-exclusion mechanism for coordinating access between threads. It can be applied in three ways (a code sketch follows the usage notes below):

  • For ordinary synchronized methods, the lock is the current instance object

  • For static synchronized methods, the lock is the Class object of the current class

  • For synchronized code blocks, the lock is the object configured in the synchronized brackets

  2. Usage notes:

  • Constructors cannot be modified with synchronized

  • It is recommended to keep lock granularity as small as possible; for example, if a synchronized block meets the need, there is no reason to synchronize the whole method

  • If you can confirm that the locks in your application are contended by different threads most of the time, you can disable biased locking with -XX:-UseBiasedLocking to improve performance.
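A minimal sketch of the three forms listed above (the class and method names are invented for illustration):

public class SyncForms {
    private final Object lock = new Object();

    public synchronized void instanceMethod() {
        // lock is the current instance (this)
    }

    public static synchronized void staticMethod() {
        // lock is SyncForms.class
    }

    public void block() {
        synchronized (lock) {
            // lock is the object named in the parentheses
        }
    }
}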

  3. Principle: Two concepts are involved: the monitor record (a per-thread private data structure) and the Java object header. Their relationship: the Java object header stores the address of the monitor record, and the monitor record records the thread that holds it.

  • Monitor: a monitor is not a special object but a mechanism; Java controls access to an object through its monitor. Every Java object is associated with a monitor, and at any moment only one thread can lock it. While a monitor is locked by a thread, any other thread trying to lock it can only block and wait.

  • Object header: The synchronized lock state is described in the header of the Java object. Object headers include Mark word and Klass word.

  1. In a 32-bit virtual machine the entire object header is 64 bits (i.e., 8 bytes), with the Mark Word and Klass Word occupying 4 bytes each.

  2. Lock states: Java has four lock states, from lowest to highest: no lock --> biased lock --> lightweight lock --> heavyweight lock. A biased lock relies on a field in the Mark Word that points to the owning thread; if the holder is the current thread, it enters the synchronized block directly. If biased locking is disabled (or revoked), the lightweight lock path is used: when two threads compete, the one that fails to acquire the lock first tries to obtain it by CAS spinning; if the CAS spin also fails, the lightweight lock inflates into a heavyweight lock and the thread that failed to acquire it enters the blocked state.

  3. Lock escalation only goes from low to high, never from high to low, which avoids unnecessary waste of resources. For example, once a lock has become a heavyweight lock, threads that later compete for it block directly and do not perform CAS spins. One of the diagrams from reference 7 illustrates this nicely:

[Figure: lock state transitions from biased to lightweight to heavyweight lock]

[Figure: Object header layout (32-bit virtual machine)]

Atomic operations

Atomic operations at the CPU level

Atomic operations at the CPU level are implemented by CPU instructions that operate on memory through the bus; the CPU provides two mechanisms:

  1. Bus lock: the processor asserts a LOCK signal on the bus; while it is asserted, other processors cannot access that memory, so the operation is atomic;

  2. Cache lock: when only the operation on a single memory address needs to be atomic, the processor can lock just its cache line instead of the whole bus.

Atomic operations in Java

Atomic operations can be implemented in Java through CAS and locks.

  1. Using CAS. Starting from Java 5, the java.util.concurrent.atomic package provides many classes that support atomic operations, such as AtomicInteger and AtomicLong. These classes can atomically increment or decrement the current value of a variable (see the sketch after this list);

  2. Using locks: the lock mechanism guarantees that only the thread holding the lock can operate on the protected variable.
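A minimal sketch of both approaches (the class and counter names are invented): AtomicInteger relies on CAS internally, while the synchronized method relies on the lock.

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicDemo {
    private final AtomicInteger casCounter = new AtomicInteger(0);
    private int lockedCounter = 0;

    // CAS-based: incrementAndGet retries compare-and-set until it succeeds
    public int incrementWithCas() {
        return casCounter.incrementAndGet();
    }

    // Lock-based: only the thread holding the monitor can update the variable
    public synchronized int incrementWithLock() {
        return ++lockedCounter;
    }
}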

With the basics covered, let's move on to what interviewers always ask about: how to use, extend, and tune the thread pool in concurrent programming.

Now it's time to code


In short, with a thread pool, "creating" a thread becomes taking an idle thread from the pool, and "closing" a thread becomes returning it to the pool, which greatly improves thread reuse.

The JDK has provided ready-made thread pool utilities since version 1.5, so let's learn how to use them. We will cover:

  1. What thread pools can the Executors thread pool factory create?

  2. How to manually create a thread pool

  3. How to extend the thread pool

  4. How to optimize the exception information of the thread pool

  5. How to design the number of threads in the thread pool

1. Which thread pools can the Executors thread pool factory create?

Let's start with the simplest example of thread pool usage:

static class MyTask implements Runnable {
    @Override
    public void run() {
        System.out.println(System.currentTimeMillis() + ": Thread ID: " + Thread.currentThread().getId());
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

public static void main(String[] args) {
    MyTask myTask = new MyTask();
    ExecutorService service1 = Executors.newFixedThreadPool(5);
    for (int i = 0; i < 10; i++) {
        service1.submit(myTask);
    }
    service1.shutdown();
}

The output:


We created a thread pool with 5 threads and submitted 10 tasks to it, each printing the current millisecond time and its thread ID. From the output you can see that only 5 distinct thread IDs appear, each printing the millisecond time more than once.

This is the simplest example.

Next, let's look at the other factory methods.

1. Fixed thread pool: ExecutorService service1 = Executors.newFixedThreadPool(5); This method returns a thread pool whose number of threads never changes. When a new task is submitted, it is executed immediately if an idle thread is available; otherwise it is stored in a task queue (by default an unbounded queue with capacity Integer.MAX_VALUE) and processed when a thread becomes idle.

2. Single-thread pool: ExecutorService service3 = Executors.newSingleThreadExecutor(); This method returns a pool with only one thread. If more than one task is submitted, the extra tasks are stored in a task queue (by default an unbounded queue with capacity Integer.MAX_VALUE) and executed in first-in, first-out order when the thread is idle.

3. Cached thread pool: ExecutorService service2 = Executors.newCachedThreadPool(); This method returns a pool that adjusts the number of threads to the workload, so the thread count is not fixed. If idle threads are available they are reused first; if all threads are busy and a new task arrives, a new thread is created for it. All threads return to the pool for reuse after finishing their current task.

4. Scheduled thread pool: ExecutorService service4 = Executors.newScheduledThreadPool(2); This method returns a ScheduledThreadPoolExecutor (through the ScheduledExecutorService interface) with the given number of core threads.

The first three pools are used in much the same way; the interesting one is the fourth. Although there are many task-scheduling frameworks, the scheduled thread pool is still worth learning. How is it used? Here's an example:

class A {
    public static void main(String[] args) {
        ScheduledThreadPoolExecutor service4 = (ScheduledThreadPoolExecutor) Executors
                .newScheduledThreadPool(2);

        // scheduleAtFixedRate: initialDelay is the delay before the first run; period is the interval.
        // A new run never starts while the previous one is still executing; if the task takes longer
        // than the period, the task time becomes the effective interval (tasks do not pile up).
        service4.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(10000);
                    System.out.println(System.currentTimeMillis() / 1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }, 0, 2, TimeUnit.SECONDS);

        // scheduleWithFixedDelay: initialDelay is the delay before the first run;
        // the effective interval is task execution time + delay.
        service4.scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(5000);
                    System.out.println(System.currentTimeMillis() / 1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }, 0, 2, TimeUnit.SECONDS);

        // schedule: run the task once after the given delay
        service4.schedule(new Runnable() {
            @Override
            public void run() {
                System.out.println("execute schedule after 5 seconds");
            }
        }, 5, TimeUnit.SECONDS);
    }
}

The code above creates a ScheduledThreadPoolExecutor and calls its three scheduling methods. The two that deserve attention are scheduleAtFixedRate and scheduleWithFixedDelay, which look very similar. For the former, the interval is determined by the specified period and the task execution time (with a 2-second period and a 10-second task, runs start every 10 seconds); for the latter, the interval is the task execution time plus the specified delay (with a 2-second delay and a 5-second task, runs start every 7 seconds). If you are interested, run the code above and observe the output.

So the JDK encapsulates 4 factory methods for creating thread pools. Note, however, that because these methods are highly encapsulated, problems can be hard to diagnose when they are misused. I therefore suggest that programmers create thread pools manually, and the prerequisite for doing so is a solid understanding of the thread pool's parameters. Let's look at how to create one manually.

2. How to manually create a thread pool

The following is a template for manually creating a thread pool:

/**
 * Core pool size 5 (the minimum number of threads kept in the pool),
 * maximum 20 threads,
 * idle time 0 seconds (how long idle threads beyond the core count survive before being destroyed),
 * waiting queue length 1024,
 * thread name "My-Task-%d" for easy tracing,
 * rejection policy: throw RejectedExecutionException when the task queue is full.
 */
private static ThreadPoolExecutor threadPool = new ThreadPoolExecutor(5, 20, 0L,
        TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(1024),
        new ThreadFactoryBuilder().setNameFormat("My-Task-%d").build(),  // ThreadFactoryBuilder is from Google Guava
        new AbortPolicy());

As you can see, ThreadPoolExecutor, the thread pool itself, takes 7 parameters. Let's go through them:

  1. corePoolSize: the number of core threads in the pool

  2. maximumPoolSize: the maximum number of threads

  3. keepAliveTime: the idle time (how long idle threads beyond the core count survive before being destroyed)

  4. unit: the time unit for keepAliveTime

  5. workQueue: the queue that holds tasks when all core threads are busy

  6. threadFactory: the factory used to create threads

  7. handler: the rejection policy applied when the queue is full

The first few parameters are straightforward; the interesting ones are the last few: the queue, the thread factory, and the rejection policy.

Let's look at the queue first. The thread pool works with 4 kinds of queues by default (a sketch mapping each to a concrete class follows this list).

  1. Unbounded queue: the default capacity is Integer.MAX_VALUE, so it may exhaust system memory and cause an OOM, which is very dangerous.

  2. Direct handoff queue: it has no capacity and stores nothing, so every task directly triggers a new thread; a large maximum pool size is needed, otherwise the rejection policy is hit easily, which is also dangerous.

  3. Bounded queue: if the core threads are busy, tasks are stored in the queue; if the queue is also full, new threads are created up to maximumPoolSize; if the queue is full and the maximum has been reached, the rejection policy is executed.

  4. Priority queue: tasks are executed according to priority; its size can also be set.
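As a rough mapping for illustration (the capacities shown are arbitrary), the four queue types above commonly correspond to these BlockingQueue implementations:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class QueueChoices {
    // Unbounded queue: capacity defaults to Integer.MAX_VALUE
    static BlockingQueue<Runnable> unbounded = new LinkedBlockingQueue<>();

    // Direct handoff queue: holds nothing; every offered task must be taken immediately
    static BlockingQueue<Runnable> handoff = new SynchronousQueue<>();

    // Bounded queue: at most 1024 waiting tasks (the capacity here is arbitrary)
    static BlockingQueue<Runnable> bounded = new ArrayBlockingQueue<>(1024);

    // Priority queue: elements are ordered by their natural ordering or a supplied Comparator
    static BlockingQueue<Runnable> priority = new PriorityBlockingQueue<>();
}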

In my own project I used a LinkedBlockingQueue but set its capacity to 1024, making it effectively bounded. If you have a lot of different tasks, it is recommended to split them across multiple thread pools; don't put all your eggs in one basket.

Now the rejection policy. What is it? It decides what happens to tasks that are still being submitted when the queue is full. The JDK provides 4 policies by default.

  1. AbortPolicy : Throws an exception directly, preventing the system from working properly.

  2. CallerRunsPolicy : As long as the thread pool is not closed, this policy runs the currently discarded task directly in the caller thread. Obviously doing this will not actually drop the task, however, the performance of the task submitting thread will most likely drop drastically.

  3. DiscardOldestPolicy: This policy will discard the oldest request, that is, a task that is about to be executed, and try to submit the current task again.

  4. DiscardPolicy: This policy silently discards tasks that cannot be processed without any processing. If tasks are allowed to be lost, I think this is the best solution.

Of course, if none of the JDK's rejection policies suit you, you can implement your own: just implement the RejectedExecutionHandler interface and override the rejectedExecution method, as sketched below.
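For example, a minimal sketch of a custom handler that simply logs and drops rejected tasks (the class name and log message are invented):

import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

public class LogAndDiscardPolicy implements RejectedExecutionHandler {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        // Called by the pool when the queue is full and no more threads can be created.
        System.err.println("Task rejected: " + r + ", pool state: " + executor);
        // Here we simply drop the task; you could also block, persist it, or run it in the caller.
    }
}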

Finally, the thread factory. Every thread in the pool is created by the thread factory, and the default one is rather limited. Let's see how the default factory creates threads:

/**
 * The default thread factory
 */
static class DefaultThreadFactory implements ThreadFactory {
    private static final AtomicInteger poolNumber = new AtomicInteger(1);
    private final ThreadGroup group;
    private final AtomicInteger threadNumber = new AtomicInteger(1);
    private final String namePrefix;

    DefaultThreadFactory() {
        SecurityManager s = System.getSecurityManager();
        group = (s != null) ? s.getThreadGroup() :
                              Thread.currentThread().getThreadGroup();
        namePrefix = "pool-" +
                      poolNumber.getAndIncrement() +
                      "-thread-";
    }

    public Thread newThread(Runnable r) {
        Thread t = new Thread(group, r,
                              namePrefix + threadNumber.getAndIncrement(),
                              0);
        if (t.isDaemon())
            t.setDaemon(false);
        if (t.getPriority() != Thread.NORM_PRIORITY)
            t.setPriority(Thread.NORM_PRIORITY);
        return t;
    }
}

As you can see, the thread name is pool- + pool number + -thread- + thread number, the thread is set to non-daemon, and the priority is the default.

What if we want to change the name? Right: implement the ThreadFactory interface and override the newThread method (a sketch follows). But the wheel has already been invented; for example, the template above uses the ThreadFactoryBuilder factory from Google's Guava, which lets you customize the thread name, daemon flag, priority, exception handling, and more.
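For completeness, a minimal hand-rolled factory that only customizes the name prefix and daemon flag (the class name and prefix are invented):

import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    public NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.getAndIncrement());
        t.setDaemon(false);                  // keep the JVM alive while tasks run
        t.setPriority(Thread.NORM_PRIORITY); // default priority
        return t;
    }
}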

3. How to extend the thread pool

Can we extend the thread pool's functionality, for example to record how long each task takes? In fact, the JDK's thread pool already reserves hooks for us: two methods in the core of the thread pool are empty and meant to be overridden, plus a third method that is called when the pool terminates. Let's look at an example:

/**
 * How to extend the thread pool: override beforeExecute, afterExecute and terminated,
 * which are empty by default.
 *
 * They let you record the start and end of each task, or add other custom behaviour.
 *
 * The worker's runWorker method calls these hooks around each task.
 */
public class ExtendThreadPoolDemo {

    static class MyTask implements Runnable {
        String name;

        public MyTask(String name) {
            this.name = name;
        }

        @Override
        public void run() {
            System.out.println("Executing: Thread ID: " + Thread.currentThread().getId()
                    + ", Task Name = " + name);
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService es = new ThreadPoolExecutor(5, 5, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>()) {
            @Override
            protected void beforeExecute(Thread t, Runnable r) {
                System.out.println("Ready to execute: " + ((MyTask) r).name);
            }

            @Override
            protected void afterExecute(Runnable r, Throwable t) {
                System.out.println("Execution complete: " + ((MyTask) r).name);
            }

            @Override
            protected void terminated() {
                System.out.println("Thread pool exit");
            }
        };

        for (int i = 0; i < 5; i++) {
            MyTask myTask = new MyTask("TASK-GEYM-" + i);
            es.execute(myTask);
            Thread.sleep(10);
        }
        es.shutdown();
    }
}

We override the beforeExecute method, which is called before each task runs, and the afterExecute method, which is called after it finishes. There is also a terminated method, which is called when the thread pool terminates. What does the output look like?


As you can see, the before and after hooks are called around every task, much like an around aspect, and terminated is called after shutdown completes. A sketch that uses these hooks to measure task execution time follows.
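Building on the hooks above, here is a minimal sketch (the class name is invented, and this is only one possible approach) that records each task's execution time with a ThreadLocal:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TimingThreadPool extends ThreadPoolExecutor {
    // Each worker thread remembers when its current task started.
    private final ThreadLocal<Long> startTime = new ThreadLocal<>();

    public TimingThreadPool(int core, int max) {
        super(core, max, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startTime.set(System.nanoTime());   // runs on the same worker thread as the task
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        long elapsedMs = (System.nanoTime() - startTime.get()) / 1_000_000;
        System.out.println(r + " took " + elapsedMs + " ms");
        super.afterExecute(r, t);
    }
}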

4. How to optimize the exception information of the thread pool

How can we improve the exception information from a thread pool? Before answering, let's look at a bug that is easy to miss.

Look at the code:

public static void main(String[] args) throws ExecutionException, InterruptedException {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 0L,
            TimeUnit.MILLISECONDS, new SynchronousQueue<>());
    for (int i = 0; i < 5; i++) {
        executor.submit(new DivTask(100, i));
    }
}

static class DivTask implements Runnable {
    int a, b;

    public DivTask(int a, int b) {
        this.a = a;
        this.b = b;
    }

    @Override
    public void run() {
        double re = a / b;
        System.out.println(re);
    }
}

The output:


Note: there are only 4 results; one task's output was swallowed with no information at all. Why? Reading the code carefully, you can see that 100 / 0 must throw an error, yet no error message appears, which is a headache. The reason is that with the execute method the error message would be printed, but with submit the exception is captured and stored inside the returned Future; it only resurfaces when you call get() on that Future (see the sketch below).
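A minimal sketch showing how the stored exception can be recovered by calling get() on the Future returned by submit (variable names are invented):

import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureGetDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<?> future = pool.submit(() -> {
            int re = 100 / 0;   // throws ArithmeticException inside the pool
        });
        try {
            future.get();       // the stored exception is rethrown, wrapped in ExecutionException
        } catch (ExecutionException e) {
            e.getCause().printStackTrace();  // prints: java.lang.ArithmeticException: / by zero
        }
        pool.shutdown();
    }
}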

What can we do? We could simply use the execute method, but there is another way: override the submit (and execute) methods to wrap the task. Here is an example:

static class TraceThreadPoolExecutor extends ThreadPoolExecutor {

    public TraceThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime,
            TimeUnit unit, BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
    }

    @Override
    public void execute(Runnable command) {
        super.execute(wrap(command, clientTrace(), Thread.currentThread().getName()));
    }

    @Override
    public Future<?> submit(Runnable task) {
        return super.submit(wrap(task, clientTrace(), Thread.currentThread().getName()));
    }

    // Captures the stack of the thread that submitted the task
    private Exception clientTrace() {
        return new Exception("Client stack trace");
    }

    private Runnable wrap(final Runnable task, final Exception clientStack,
            String clientThreadName) {
        return new Runnable() {
            @Override
            public void run() {
                try {
                    task.run();
                } catch (Exception e) {
                    e.printStackTrace();
                    clientStack.printStackTrace();
                    throw e;
                }
            }
        };
    }
}

We overrode the submit method and wrapped the task so that exception information is preserved; if an exception occurs, the stack traces are printed. Let's see the result when we use this rewritten thread pool.


From the output we can clearly see the cause of the error: / by zero! The stack trace is explicit, which makes troubleshooting easy and improves on the default thread pool's behaviour.

5. How to design the number of threads in the thread pool

The size of the thread pool has a real impact on system performance: a pool that is too large or too small cannot deliver optimal performance, but the size does not need to be determined very precisely either. As long as you avoid the two extremes, the impact on performance is limited. In general, sizing a thread pool should take into account the number of CPUs, the amount of memory, and similar factors. The book "Java Concurrency in Practice" gives an empirical formula for estimating the pool size:

Nthreads = Ncpu * Ucpu * (1 + W/C), where Ncpu is the number of CPUs, Ucpu is the target CPU utilization (between 0 and 1), and W/C is the ratio of wait time to compute time.

The formula looks a bit complicated, but it boils down to this: for CPU-bound work, setting the thread count equal to the number of CPU cores is enough, which avoids a lot of useless context switching; for IO-bound work, where threads spend most of their time waiting, you can set a higher count, for example twice the number of CPU cores.

As for how to get the number of CPU cores, Java provides a method (a small sizing sketch follows):

Runtime.getRuntime().availableProcessors();

It returns the number of available CPU cores.
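A minimal sizing sketch applying the formula above; the target utilization and the 50 ms / 5 ms wait and compute estimates are invented and should be measured for a real workload:

public class PoolSizing {
    public static void main(String[] args) {
        int nCpu = Runtime.getRuntime().availableProcessors();
        double targetUtilization = 1.0;   // aim to use the whole machine
        double waitTimeMs = 50;           // estimated time a task spends waiting (e.g. on IO)
        double computeTimeMs = 5;         // estimated time a task spends on the CPU

        int nThreads = (int) (nCpu * targetUtilization * (1 + waitTimeMs / computeTimeMs));
        System.out.println("CPU cores: " + nCpu + ", suggested pool size: " + nThreads);
    }
}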

Summary

Well, by now we have a good picture of high concurrency and of how to use thread pools. My suggestion is to create thread pools manually, so that you understand each parameter precisely, which helps a lot when troubleshooting or tuning the system: choosing a suitable core thread count, the maximum thread count, the rejection policy, the thread factory, the queue size and type, and so on. You can also use Guava's ThreadFactoryBuilder to customize the threads.
