Java concurrency models in one go

In this article, we will discuss concurrency models and how concurrent systems are designed around them.

Concurrent systems can be implemented using different concurrency models. A concurrency model describes how the threads in a system cooperate to complete the tasks given to them. Different concurrency models split the tasks in different ways, and the threads may communicate and collaborate in different ways.

Concurrency models are very similar to distributed systems

In fact, the concurrency model is very similar to the distributed system model: in a concurrency model, threads communicate with each other, while in a distributed system, processes communicate with each other. In essence, processes and threads are very much alike, which is why the two kinds of models resemble each other so closely.

Distributed systems usually face more challenges than concurrent systems, such as inter-process communication, network failures, or remote machines going down. But a concurrent system may also face problems such as a CPU failing, a network card going bad, or a disk failing.

Because the concurrency model and the distributed model are so similar, they can borrow ideas from each other. For example, the model used for distributing work among threads is similar to the load balancing model in a distributed system.

To put it plainly, the ideas behind distributed models grew out of the development of concurrency models.

Shared state and independent state

An important aspect of a concurrency model is whether its threads share state or keep independent state. Shared state means that some state is shared between different threads.

The state here is data, for example one or more objects. When threads share data, problems such as race conditions or deadlocks can arise. Of course, these problems are only possibilities; whether they occur depends on whether you use and access the shared objects safely.

Independent state means that state is not shared between multiple threads. If the threads need to communicate, they can exchange immutable objects to do so. This is the most effective way to avoid concurrency problems, as shown in the figure below.
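As a minimal sketch of exchanging immutable state, the hypothetical `ImmutableTask` class below (not from the original article) has only final fields and no setters, so a thread that receives it cannot modify it and no synchronization is needed to read it.

```java
// A minimal sketch of an immutable object that threads can safely exchange.
// The class name and fields are illustrative only.
public final class ImmutableTask {
    private final long id;
    private final String payload;

    public ImmutableTask(long id, String payload) {
        this.id = id;
        this.payload = payload;
    }

    public long getId() {
        return id;
    }

    public String getPayload() {
        return payload;
    }

    // "Modification" returns a new object instead of mutating this one.
    public ImmutableTask withPayload(String newPayload) {
        return new ImmutableTask(id, newPayload);
    }
}
```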

Using independent state makes the design easier, because only one thread accesses its own objects, and even when objects are exchanged between threads, they are immutable.

Concurrency model

Parallel Worker

The first concurrency model is the parallel worker model. The client hands tasks over to a delegator (Delegator), and the delegator assigns the work to different workers (Worker). As shown below

The core idea of the parallel worker model is that it has two roles: the delegator and the worker. The Delegator is responsible for receiving tasks from the client and handing them to specific Workers for processing. After a Worker finishes processing, it returns the result to the Delegator. Once the Delegator has received the Workers' results, it aggregates them and delivers them to the client.

The parallel worker model is a very common concurrency model in Java. Many of the concurrency utilities in the java.util.concurrent package use this model.
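A minimal sketch of the parallel worker model using java.util.concurrent: a fixed thread pool plays the role of the Delegator and the submitted tasks play the role of the Workers. The task and result types here are illustrative, not from the original article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelWorkerExample {
    public static void main(String[] args) throws Exception {
        // The thread pool acts as the Delegator: it receives tasks and
        // distributes them to worker threads.
        ExecutorService delegator = Executors.newFixedThreadPool(4);

        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 1; i <= 8; i++) {
            final int taskId = i;
            // Each submitted Callable is a Worker processing one task.
            results.add(delegator.submit(() -> taskId * taskId));
        }

        // The Delegator collects the Workers' results and returns them
        // to the client (here, the main thread).
        for (Future<Integer> result : results) {
            System.out.println("Result: " + result.get());
        }

        delegator.shutdown();
    }
}
```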

Advantages of Parallel Worker

A very obvious advantage of the parallel worker model is that it is easy to understand. To increase the parallelism of the system, you simply add more Workers.

Another advantage of the parallel worker model is that it splits a task into multiple smaller tasks and executes them concurrently. After receiving a Worker's result, the Delegator returns it to the Client. The whole Worker -> Delegator -> Client flow is asynchronous.

Disadvantages of Parallel Worker

That said, the parallel worker model also has some less obvious shortcomings.

Shared state can become very complicated

Parallel workers in practice are more complicated than the figure suggests, mainly because they usually access shared data in memory or in a shared database.

This shared state may include work queues holding business data, data caches, database connection pools, and so on. When threads communicate, a thread must make sure that its changes to shared state become visible to the other threads (pushed to main memory), rather than staying only in the CPU cache of the thread that made the change. These are issues programmers need to consider when designing the system. Threads also need to avoid race conditions, deadlocks, and many other concurrency problems caused by shared state.

When multiple threads access shared data, some parallelism is lost, because the program must ensure that only one thread modifies the data at a time, which leads to contention over the shared data. Threads that fail to acquire the resource are blocked.
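A minimal sketch of this contention problem, assuming a hypothetical counter class: the unsynchronized increment loses updates under concurrent access (a race condition), while the synchronized variant fixes it by blocking all but one thread at a time.

```java
public class SharedCounterExample {
    private long count = 0;

    // Not thread-safe: count++ is a read-modify-write sequence, so
    // concurrent callers can overwrite each other's updates (race condition).
    public void unsafeIncrement() {
        count++;
    }

    // Thread-safe but blocking: only one thread at a time may enter,
    // so other threads contend for the lock and must wait.
    public synchronized void safeIncrement() {
        count++;
    }

    public synchronized long getCount() {
        return count;
    }
}
```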

Modern non-blocking concurrent algorithms can reduce contention and improve performance, but non-blocking algorithms are more difficult to implement.
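A minimal sketch of a non-blocking counter using AtomicLong from java.util.concurrent.atomic, which relies on a compare-and-swap (CAS) retry loop instead of a lock; the class name is illustrative.

```java
import java.util.concurrent.atomic.AtomicLong;

public class NonBlockingCounterExample {
    private final AtomicLong count = new AtomicLong(0);

    // Lock-free increment: internally retries a CAS until it succeeds,
    // so no thread is ever blocked waiting for a lock.
    public long increment() {
        return count.incrementAndGet();
    }

    // An explicit CAS loop, equivalent in effect to incrementAndGet().
    public long incrementWithCas() {
        long current;
        do {
            current = count.get();
        } while (!count.compareAndSet(current, current + 1));
        return current + 1;
    }

    public long get() {
        return count.get();
    }
}
```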

Persistent data structures are another option. A persistent data structure always preserves the previous version of itself when it is modified. Therefore, if multiple threads reference the same persistent data structure and one of them modifies it, the modifying thread gets a reference to the new version of the structure.

Although persistent data structures are an elegant solution, they have some problems in practice. For example, a persistent list adds new elements to the head of the list and returns a reference to the newly added element, but the other threads still hold references to the previous first element of the list, so they cannot see the newly added element.
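A minimal sketch of a persistent singly linked list: prepending creates a new node that shares the old tail, so threads holding the old reference continue to see the old version. The class is illustrative, not a standard library type.

```java
// A minimal persistent (immutable) singly linked list.
// "Adding" an element returns a new list; the old list is unchanged.
public final class PersistentList<T> {
    private final T head;                      // first element, null for the empty list
    private final PersistentList<T> tail;      // rest of the list, shared between versions

    private PersistentList(T head, PersistentList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    public static <T> PersistentList<T> empty() {
        return new PersistentList<>(null, null);
    }

    // Returns a NEW list whose head is the new element; the existing list
    // (and any thread holding a reference to it) is untouched.
    public PersistentList<T> prepend(T element) {
        return new PersistentList<>(element, this);
    }

    public T head() {
        return head;
    }

    public PersistentList<T> tail() {
        return tail;
    }
}
```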

Persistent data structures such as linked lists (LinkedList) also perform poorly on modern hardware. Each element in the list is an object, and these objects are scattered across the computer's memory. Modern CPUs are much faster at sequential access, so sequential data structures such as arrays achieve higher performance: the CPU can load a large contiguous block into its cache and then access the data directly from the cache. With a linked list, it is practically impossible to lay the elements out contiguously in RAM.

Stateless worker

Shared state can be modified by other threads. Therefore, a worker must re-read the shared state every time it needs it, to make sure it works on the latest copy. A worker that does not keep state inside the thread is called a stateless worker.
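A minimal sketch of a stateless worker, assuming the shared state lives in a hypothetical ConcurrentHashMap (it could just as well be a database): the worker keeps no fields of its own and re-reads the shared state on every call, so it never works on a stale private copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StatelessWorkerExample {
    // Shared state lives outside the worker.
    private static final Map<String, Integer> sharedPrices = new ConcurrentHashMap<>();

    // The worker holds no fields: it re-reads the shared state on every call
    // instead of caching it, so it always sees the latest value.
    static class PriceWorker {
        int quote(String product) {
            Integer price = sharedPrices.get(product);
            return price == null ? 0 : price;
        }
    }

    public static void main(String[] args) {
        sharedPrices.put("book", 42);
        System.out.println(new PriceWorker().quote("book")); // prints 42
    }
}
```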

The order of work is uncertain

Another disadvantage of the parallel worker model is that the order of job execution is non-deterministic: there is no guarantee which job is executed first or last. Task A may be handed to a worker before task B, yet task B may still be executed before task A.

Assembly line (pipeline)

The second concurrency model is the pipeline (assembly line) concurrency model, like the ones we often see in production workshops. The following is a diagram of the pipeline design model

This organization is like workers on a factory assembly line: each worker completes only part of the total work, and once that part is done, the worker forwards the work to the next worker.

Each worker runs in its own thread and shares no state with the others. This model is therefore also known as the shared-nothing concurrency model.

Systems using the pipeline concurrency model are usually designed with non-blocking I/O, meaning that when no task is currently assigned to a worker, the worker does other work. Non-blocking I/O means that when a worker starts an I/O operation, such as reading a file or fetching data from the network, the worker does not wait for the I/O call to finish. I/O operations are slow, so waiting for I/O wastes time; while waiting, the CPU can do other things, and when the I/O operation completes, its result is passed on to the next worker in the pipeline. The following is a flowchart of non-blocking I/O
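A minimal sketch of a non-blocking pipeline using CompletableFuture: each stage acts as a worker, results flow forward when the previous stage completes, and no stage blocks a thread while waiting. The stage contents are illustrative stand-ins for real I/O work.

```java
import java.util.concurrent.CompletableFuture;

public class PipelineExample {
    public static void main(String[] args) {
        CompletableFuture<String> pipeline =
                // Stage 1: simulate a slow I/O-like task on another thread.
                CompletableFuture.supplyAsync(() -> "raw-data")
                // Stage 2: parse the data; runs when stage 1 completes.
                .thenApply(raw -> raw.toUpperCase())
                // Stage 3: format the result; runs when stage 2 completes.
                .thenApply(parsed -> "result=" + parsed);

        // The main thread is free to do other work here; join() is only
        // used at the very end to collect the final result.
        System.out.println(pipeline.join()); // prints result=RAW-DATA
    }
}
```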

In practice, tasks usually do not flow along a single assembly line. Since most programs need to do many different things, tasks flow between different workers depending on the work to be done, as shown in the figure below.

A task may also require multiple workers to participate in completing it.

Reactive, event-driven systems

Systems that use the pipeline model are sometimes called reactive systems or event-driven systems. Such a system reacts to external events, which might be an incoming HTTP request or a file finishing loading into memory.

Actor model

In the Actor model, each Actor is actually a Worker, and each Actor can handle tasks.

Simply put, the Actor model is a concurrency model that defines a set of general rules for how the components of a system should behave and interact. The most famous programming language built on these rules is Erlang. An Actor responds to a message it receives, and in doing so it can create more actors or send more messages, while getting ready to receive the next message.
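Java has no built-in actors (libraries such as Akka provide them), so the sketch below fakes a minimal actor: a private mailbox (a BlockingQueue) drained by a single thread, so the actor processes one message at a time and shares no state with its callers. The class and message types are illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MiniActorExample {
    // A minimal "actor": one mailbox, one thread, no shared state.
    static class PrinterActor {
        private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();

        PrinterActor() {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        // Process messages strictly one at a time.
                        String message = mailbox.take();
                        System.out.println("Actor received: " + message);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        // Other threads communicate only by sending messages.
        void send(String message) {
            mailbox.offer(message);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PrinterActor actor = new PrinterActor();
        actor.send("hello");
        actor.send("world");
        Thread.sleep(200); // give the daemon thread time to print
    }
}
```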

Channels model

In the channels model, workers usually do not communicate with each other directly. Instead, they publish their messages (events) to different channels, and other workers can then pick up messages from those channels. The following is a diagram of the channels model

Sometimes a worker does not need to know exactly who the next worker is; it only needs to write its result to the channel. Workers listening on the channel can subscribe or unsubscribe, which reduces the coupling between workers.
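A minimal sketch of the channels idea, using a BlockingQueue as the channel: the producer worker only knows the channel, not who consumes from it. The names and message contents are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ChannelExample {
    public static void main(String[] args) {
        // The channel: producers and consumers know this queue, not each other.
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(16);

        // Producer worker: writes messages to the channel.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 3; i++) {
                    channel.put("message-" + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumer worker: subscribes to the channel by taking from it.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    System.out.println("Got " + channel.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}
```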

Advantages of pipeline design

Compared with the parallel worker model, the pipeline model has some advantages, as follows.

No shared state

Because in a pipeline design each worker passes the work on to the next worker once its own processing is done, workers do not need to share any state with each other, so there are no shared-state concurrency problems to worry about. From an implementation perspective, you can even treat each worker as if it were single-threaded.

Stateful worker

Because a worker knows that no other thread modifies its data, workers in a pipeline design can be stateful. Stateful means they can keep the data they operate on in memory. A stateful worker is usually faster than a stateless one.

Better hardware integration

Because each stage of the pipeline can be treated as single-threaded, it gains the advantages of single-threaded code: it works with the hardware rather than against it. And because stateful workers usually keep their data cached close to the CPU, they can access that cached data faster.

Task ordering is possible

Tasks in the pipeline concurrency model can be ordered; this is typically used for writing tasks to a log and for recovery after a failure.

Disadvantages of pipeline design

The disadvantage of the pipeline concurrency model is that the processing of a task involves multiple workers, so the logic may be spread across several classes in the project's code. It is therefore hard to see exactly which task each worker is carrying out. Pipeline code can also be harder to write: code with many nested callback handlers is often referred to as callback hell, and callback hell is hard to trace and debug.

Functional parallelism

Functional parallelism is a more recently proposed concurrency model. Its basic idea is to implement a program with function calls. Passing a message is equivalent to a function call, and the parameters passed to a function are copied, so nothing outside the function can manipulate the data inside it. This makes a function execution behave like an atomic operation: each function call can execute independently of any other function call.

When each function call executes independently, each call can run on its own CPU. In other words, functional parallelism amounts to each CPU carrying out its own piece of the work separately.

JDK 1.7 introduced the ForkJoinPool class, which implements functional parallelism. Java 8 introduced the concept of streams, and parallel streams can be used to iterate over large collections in parallel.
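A minimal sketch of functional parallelism with ForkJoinPool: a RecursiveTask splits a sum into independent sub-calls that can run on different CPUs, followed by the same computation using a Java 8 parallel stream. The threshold and class names are illustrative choices.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

public class FunctionalParallelismExample {

    // Each task sums a range; large ranges are split into two independent
    // sub-tasks that can run on different CPUs.
    static class SumTask extends RecursiveTask<Long> {
        private final long from, to;

        SumTask(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= 1_000) {             // small enough: compute directly
                long sum = 0;
                for (long i = from; i <= to; i++) sum += i;
                return sum;
            }
            long mid = (from + to) / 2;
            SumTask left = new SumTask(from, mid);
            SumTask right = new SumTask(mid + 1, to);
            left.fork();                          // run the left half asynchronously
            return right.compute() + left.join(); // combine the two halves
        }
    }

    public static void main(String[] args) {
        long forkJoinSum = ForkJoinPool.commonPool().invoke(new SumTask(1, 1_000_000));

        // The same computation using a Java 8 parallel stream.
        long streamSum = LongStream.rangeClosed(1, 1_000_000).parallel().sum();

        System.out.println(forkJoinSum + " == " + streamSum);
    }
}
```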

The difficulty with functional parallelism is knowing how the function calls decompose and which CPUs execute which functions. Function calls that cross CPUs also bring extra coordination overhead.

Hello, I'm cxuan. I have written four PDFs myself: a Java basics summary, an HTTP core summary, computer fundamentals, and an operating system core summary. You can follow the public account Java builders and reply "PDF" to receive them.
