The difference between multithreading and multiprocessing

Fish or bear's paw: on choosing between multi-process and multi-threading

On the subject of multi-process versus multi-threading, the classic textbook sentence is "A process is the smallest unit of resource allocation, and a thread is the smallest unit of CPU scheduling." That sentence is basically enough for an exam, but when you face a real design choice between the two, things are not so simple, and a poor choice will cost you dearly.

 

I often see people on the Internet asking "Is multi-process or multi-threading better?", "Should I use multi-process or multi-threading under Linux?" and so on. I can only say: there is no best, only better suited. Whichever fits the actual situation is the better one.

 

Let's compare multi-threading and multi-process along several dimensions. (Note: the comparison is qualitative and relative; it does not mean that one is excellent and the other is hopeless.)

 

 

1) Prefer threads when workers must be created and destroyed frequently

See the comparison above for the reasons.

The most common application of this principle is a web server: a thread is created when a connection is established and destroyed when the connection closes. If a process were used instead, the cost of creation and destruction would be too high.
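The thread-per-connection idea can be sketched roughly as follows. This is only a minimal illustration, not a real web server: it echoes bytes instead of speaking HTTP, and the names `handle_client`, `serve` and `echo_once` are made up for this example.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """Echo bytes back until the client disconnects; then the thread ends."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # an empty read means the peer closed the connection
                break
            conn.sendall(data)


def serve(listener: socket.socket, max_connections: int) -> None:
    """Accept connections and give each one its own short-lived thread."""
    workers = []
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        worker = threading.Thread(target=handle_client, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()


def echo_once(message: bytes) -> bytes:
    """Hypothetical helper: serve one local connection and echo one message."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]  # port 0 lets the OS pick a free port
        server = threading.Thread(target=serve, args=(listener, 1))
        server.start()
        reply = b""
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            while len(reply) < len(message):
                chunk = client.recv(1024)
                if not chunk:
                    break
                reply += chunk
        server.join()  # the handler thread exits once the client has closed
    return reply
```

Calling `echo_once(b"ping")` starts the listener, lets one thread serve the connection, and returns the echoed bytes once that thread has finished.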

2) Prefer threads for work that requires a lot of computation

Large-scale computation naturally consumes a lot of CPU and involves frequent switching. In this case, threads are the most suitable.

The most common applications of this principle are image processing and algorithm processing.
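As a rough sketch of splitting one computation across threads, the example below divides a range into chunks and sums the squares of each chunk in a thread pool. The function names and the choice of workload are assumptions made for illustration. One caveat about the language used here: in standard CPython the global interpreter lock keeps pure-Python threads from executing bytecode in parallel, so this shows the structure of the approach rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(start: int, stop: int) -> int:
    """CPU-bound work for one chunk of the range [start, stop)."""
    return sum(n * n for n in range(start, stop))


def parallel_sum_of_squares(limit: int, workers: int = 4) -> int:
    """Split [0, limit) into roughly equal chunks, one task per chunk."""
    step = max(1, -(-limit // workers))  # ceiling division, at least 1
    chunks = [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(lambda chunk: sum_of_squares(*chunk), chunks)
        return sum(partial_sums)
```

Because the threads share one address space, the chunks need no copying or serialization; each task only returns a single integer.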

3) Use threads for strongly related processing and processes for weakly related processing

What are strong and weak correlation? They are difficult to define in theory, but a simple example makes them clear.

A typical server needs to complete the following tasks: message sending and receiving, and message processing. "Message sending and receiving" and "message processing" are weakly related tasks. "Message processing" may be further divided into "message decoding" and "business processing", and these two tasks are much more closely related. Therefore, "message sending and receiving" and "message processing" can be split into separate processes, while "message decoding" and "business processing" can be split into separate threads.
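The strongly related half of this example, decoding and business processing as two threads in one process, might look like the sketch below. The queue-based hand-off, the `_STOP` sentinel, and the stand-in "business logic" (upper-casing the text) are all assumptions made for illustration.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def decode_worker(raw_q: queue.Queue, decoded_q: queue.Queue) -> None:
    """Stage 1: turn raw bytes into text and pass it to the next stage."""
    while True:
        raw = raw_q.get()
        if raw is _STOP:
            decoded_q.put(_STOP)  # propagate shutdown downstream
            return
        decoded_q.put(raw.decode("utf-8"))


def business_worker(decoded_q: queue.Queue, results: list) -> None:
    """Stage 2: placeholder business logic that upper-cases each message."""
    while True:
        message = decoded_q.get()
        if message is _STOP:
            return
        results.append(message.upper())


def process_messages(raw_messages: list) -> list:
    """Run both stages as threads of one process and collect the results."""
    raw_q, decoded_q, results = queue.Queue(), queue.Queue(), []
    stages = [
        threading.Thread(target=decode_worker, args=(raw_q, decoded_q)),
        threading.Thread(target=business_worker, args=(decoded_q, results)),
    ]
    for stage in stages:
        stage.start()
    for raw in raw_messages:
        raw_q.put(raw)
    raw_q.put(_STOP)
    for stage in stages:
        stage.join()
    return results
```

The two stages hand objects to each other directly through in-memory queues, which is what makes threads a comfortable fit for closely coupled steps.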

Of course, this division is not fixed and can be adjusted according to the actual situation.

4) Use processes if the work may be extended to multiple machines, and threads if it may be extended to multiple cores

See the comparison above for the reasons.

5) When both approaches meet the requirements, use the one you are most familiar with and best at

As for the "complex versus simple" ratings in the "data sharing, synchronization", "programming, debugging" and "reliability" dimensions, I can only say that they do not point to a clear choice. But here is one selection principle: if both multi-process and multi-threading can meet the requirements, choose the one you are most familiar with and best at.

A reminder: although I have given all these selection principles, real applications are usually a combination of "process + thread". Don't fall into the trap of treating it as an either-or choice.

 

Resource consumption:

From the kernel's point of view, a process is the basic unit for allocating system resources (CPU time, memory, etc.). A thread is one execution flow within a process and is the basic unit of CPU scheduling and dispatch; it is smaller than a process and can run independently.

Threads in the same process use the same address space and share most of their data. Starting a thread takes much less space than starting a process, and switching between threads takes much less time than switching between processes. According to statistics, in general the overhead of a process is about 30 times that of a thread, though on specific systems this figure may be quite different.
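Since the ratio depends on the system, it is easy enough to measure on your own machine. The sketch below times starting and joining a batch of do-nothing threads and a batch of do-nothing processes. It assumes a POSIX system, because it asks for the "fork" start method explicitly; the helper names are made up for this example, and no particular ratio is promised.

```python
import multiprocessing
import threading
import time


def noop() -> None:
    """Worker body that does nothing, so only start/stop cost is measured."""


def time_start_join(make_worker, count: int) -> float:
    """Create, start and join `count` workers one by one; return seconds."""
    begin = time.perf_counter()
    for _ in range(count):
        worker = make_worker()
        worker.start()
        worker.join()
    return time.perf_counter() - begin


def compare(count: int = 20) -> tuple:
    """Return (thread_seconds, process_seconds) for `count` workers each."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
    thread_s = time_start_join(lambda: threading.Thread(target=noop), count)
    process_s = time_start_join(lambda: ctx.Process(target=noop), count)
    return thread_s, process_s
```

Running `compare()` a few times and dividing the second value by the first gives a rough local figure for the process-to-thread creation cost.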

Communication method:

Data can be passed between processes only through inter-process communication, which is time-consuming and inconvenient. Threads share most of their data (variables local to a thread function are not shared), which is fast and convenient. However, shared data must be synchronized with locks, especially static variables.
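Both halves of this contrast can be sketched in a few lines. In the first part, several threads update one shared counter under a lock; in the second, a child process has to send its result back to the parent through a pipe. The function names are made up for this example, and the process part assumes a POSIX system because it requests the "fork" start method.

```python
import multiprocessing
import threading

counter = 0  # module-level state, visible to every thread in the process
counter_lock = threading.Lock()


def add_many(times: int) -> None:
    """Increment the shared counter, taking the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter += 1


def threaded_count(threads: int = 4, times: int = 10_000) -> int:
    """Threads share `counter` directly; the lock keeps updates consistent."""
    global counter
    counter = 0
    workers = [threading.Thread(target=add_many, args=(times,))
               for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter


def child_sum(conn, numbers: list) -> None:
    """Runs in the child process: send the result through the pipe."""
    conn.send(sum(numbers))
    conn.close()


def process_sum(numbers: list) -> int:
    """A process cannot just write to the parent's variables; it uses IPC."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
    parent_conn, child_conn = ctx.Pipe()
    child = ctx.Process(target=child_sum, args=(child_conn, numbers))
    child.start()
    total = parent_conn.recv()  # blocks until the child sends its result
    child.join()
    return total
```

The thread version needs no transfer step at all but does need the lock, while the process version needs no lock but has to serialize its result and send it across.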

Advantages specific to threads:

They improve application responsiveness and make multi-CPU systems more efficient: when the number of threads is not greater than the number of CPUs, the operating system can run different threads on different CPUs.

They improve program structure. A long and complex process can be divided into multiple threads that run as independent or semi-independent parts, which makes the program easier to understand and modify.

 

