Talking about the efficiency of multithreading in Python

Why can a multi-process program make use of a multi-core CPU, while a single-process, multi-threaded program cannot?

Explanation: This is essentially a limitation of CPython, the official interpreter. In the 1990s, when CPUs were single-core, Guido van Rossum (known as "Uncle Turtle" in the Chinese Python community) introduced the global interpreter lock (GIL) to avoid the many problems caused by threads competing for shared resources. As hardware moved to multi-core CPUs, the cost of that design became more and more prominent. Guido has responded in the community that the GIL has been part of CPython for many years and is tightly entangled with the rest of the interpreter, so forcibly removing it could have knock-on effects throughout CPython and cause all sorts of problems. As a result, Python's multi-threaded concurrency problem has never been properly solved.

So how efficient is multithreading in Python?

This should be viewed from two aspects, CPU-intensive operations and IO-intensive operations:

  • 1. CPU-intensive code (loops, counting, and other computation-heavy work such as scientific computing). Here the tick count quickly reaches its threshold, which triggers a release of the GIL and a new round of competition for it. Switching back and forth between threads consumes resources of its own, so multithreading under CPython is unfriendly to CPU-intensive code.

  • 2. I/O-intensive code (file processing, web crawlers, etc.). Multithreading can effectively improve efficiency here. A single thread doing I/O has to sit through every I/O wait, which wastes time. With multiple threads, while thread A waits the interpreter automatically switches to thread B, so CPU resources are not wasted and the program runs more efficiently. CPython's multithreading is therefore friendly to I/O-intensive code.

  • 3. In Python 3.x (since 3.2), the GIL no longer counts ticks. It uses a timer instead: once the running thread's execution time reaches the threshold, it releases the GIL. This is friendlier to CPU-intensive programs, but it does not solve the underlying problem that the GIL lets only one thread execute at any given time, so efficiency is still unsatisfactory.

  • 4. Multi-threading on multiple cores can perform worse than multi-threading on a single core. On a single core, each time the GIL is released the thread that wakes up can acquire it, so execution continues seamlessly. On multiple cores, after the thread on CPU0 releases the GIL, threads on the other CPUs compete for it, but the thread on CPU0 may immediately reacquire it. The threads woken on the other CPUs then stay awake without the lock until the next switch and go back to waiting, which causes thread thrashing and lowers efficiency.
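
The difference between points 1 and 2 can be observed with a small timing experiment. The following is only a rough sketch, not a rigorous benchmark: the names cpu_task, io_task, run_sequential and run_threaded are made up for illustration, time.sleep stands in for real I/O, and the numbers will vary with the machine and Python version. On a standard CPython build one would expect the threaded CPU run to take about as long as the sequential one, while the threaded I/O run finishes in roughly the time of a single wait.

```python
import sys
import threading
import time


def cpu_task(n):
    """CPU-bound work: a pure-Python loop that needs the GIL to run."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_task(seconds):
    """Stand-in for blocking I/O: time.sleep releases the GIL while waiting."""
    time.sleep(seconds)


def run_sequential(func, arg, count):
    """Call func(arg) count times in the main thread; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        func(arg)
    return time.perf_counter() - start


def run_threaded(func, arg, count):
    """Call func(arg) in count threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=func, args=(arg,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # The timer threshold from point 3, in seconds (0.005 by default).
    print("switch interval:", sys.getswitchinterval())
    print("CPU sequential: ", run_sequential(cpu_task, 1_000_000, 4))
    print("CPU threaded:   ", run_threaded(cpu_task, 1_000_000, 4))
    print("I/O sequential: ", run_sequential(io_task, 0.2, 4))
    print("I/O threaded:   ", run_threaded(io_task, 0.2, 4))
```

The timer threshold can be changed with sys.setswitchinterval, but that only affects how often threads take turns, not how many can run at once.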

Summary: Back to the original question. Experienced developers often say: "If you want to make full use of multi-core CPUs under the CPython interpreter, use multiple processes." Why?
The reason is that each process has its own independent GIL, and these locks do not interfere with one another, so the processes can execute in parallel in the true sense. In CPython on a multi-core CPU, multiple processes therefore run CPU-intensive work more efficiently than multiple threads.
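
As a minimal sketch of that advice, the same kind of CPU-bound loop can be handed to worker processes through the standard library's concurrent.futures.ProcessPoolExecutor. The function names cpu_task and run_in_processes and the workload size are assumptions chosen for illustration. Starting processes and passing data between them has a cost of its own, so any gain depends on the workload and the number of cores.

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor


def cpu_task(n):
    """CPU-bound work: a pure-Python loop that needs the GIL to run."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_processes(n, count):
    """Run cpu_task(n) in count worker processes; return (results, seconds)."""
    start = time.perf_counter()
    # Each worker is a separate interpreter process with its own GIL.
    with ProcessPoolExecutor(max_workers=count) as pool:
        results = list(pool.map(cpu_task, [n] * count))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # The __main__ guard matters: on platforms that start workers by
    # re-importing this module, unguarded code would run in every worker.
    workers = min(4, os.cpu_count() or 1)
    results, elapsed = run_in_processes(1_000_000, workers)
    print(f"{workers} processes finished in {elapsed:.3f} s")
```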

Origin blog.csdn.net/qq_41475067/article/details/112293009