Python Processes and Threads: Multithreading

These are my Python study notes, recorded here and shared in the hope that they will be helpful to you.

Multithreading

Multitasking can be completed by multiple processes, or by multiple threads within a process.

We mentioned earlier that a process is composed of several threads, and a process has at least one thread.

Because threads are execution units supported directly by the operating system, high-level languages usually have built-in multithreading support, and Python is no exception: Python threads are real POSIX threads, not simulated threads.

Python's standard library provides two modules: _thread and threading. _thread is the low-level module, and threading is the higher-level module that wraps _thread. In most cases we only need to use the higher-level threading module.
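As a small side-by-side sketch (added here for illustration, not part of the original notes), the low-level _thread module only launches a function in a new thread, while threading gives you a Thread object that you can name and join:

import _thread, threading, time

def hello(tag):
    print('hello from %s' % tag)

# Low-level interface: just starts the function in a new thread.
_thread.start_new_thread(hello, ('_thread',))

# High-level interface: returns a Thread object that can be named, joined, etc.
t = threading.Thread(target=hello, args=('threading',))
t.start()
t.join()

time.sleep(0.5)  # give the low-level thread a moment to print before the program exits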

To start a thread, create a Thread instance with the target function passed in, and then call start() to begin execution:

import time, threading

# Code executed by the new thread
def loop():
    print('thread %s is running...' % threading.current_thread().name)
    n = 0
    while n < 5:
        n = n + 1
        print('thread %s >>> %s' % (threading.current_thread().name, n))
        time.sleep(1)
    print('thread %s ended.' % threading.current_thread().name)

print('thread %s is running...' % threading.current_thread().name)
t = threading.Thread(target=loop, name='LoopThread')
t.start()
t.join()
print('thread %s ended.' % threading.current_thread().name)

The execution results are as follows:

thread MainThread is running...
thread LoopThread is running...
thread LoopThread >>> 1
thread LoopThread >>> 2
thread LoopThread >>> 3
thread LoopThread >>> 4
thread LoopThread >>> 5
thread LoopThread ended.
thread MainThread ended.

Since any process starts one thread by default, we call that thread the main thread, and the main thread can in turn start new threads. Python's threading module has a current_thread() function that always returns the instance of the current thread. The main thread instance is named MainThread; a child thread's name is specified when it is created, and here we named the child thread LoopThread. The name is only used for display when printing and has no other meaning; if you don't give a thread a name, Python automatically names it Thread-1, Thread-2, and so on.
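As a small illustration of the automatic naming (a sketch added here, not part of the original example):

import threading

def work():
    print('running in %s' % threading.current_thread().name)

# No name= is given, so Python assigns names automatically
# (Thread-1, Thread-2, ...; recent versions may also append the target name).
for _ in range(2):
    t = threading.Thread(target=work)
    t.start()
    t.join()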

Lock

The biggest difference between multithreading and multiprocessing is that in multiprocessing each process has its own copy of every variable, so processes do not affect each other, whereas in multithreading all variables are shared by all threads, so any variable can be modified by any thread. The biggest danger of sharing data between threads is therefore that several threads may modify the same variable at the same time and corrupt its contents.

Let's look at how multiple threads operating on the same variable at the same time can corrupt its contents:

import time, threading

# Assume this is your bank balance:
balance = 0

def change_it(n):
    # Deposit first, then withdraw; the result should be 0:
    global balance
    balance = balance + n
    balance = balance - n

def run_thread(n):
    for i in range(100000):
        change_it(n)

t1 = threading.Thread(target=run_thread, args=(5,))
t2 = threading.Thread(target=run_thread, args=(8,))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)

We define a shared variable balance with an initial value of 0 and start two threads that each deposit and then withdraw; in theory the result should be 0. But because thread scheduling is decided by the operating system, t1 and t2 execute alternately, and once the number of iterations is large enough, the final value of balance is not necessarily 0.

The reason is that a single statement in a high-level language corresponds to several instructions when the CPU executes it. Even a simple calculation such as:

balance = balance + n

is actually executed in two steps:

  1. Calculate balance + n and store it in a temporary variable;
  2. Assign the value of the temporary variable to balance.

It can be viewed as:

x = balance + n
balance = x

Since x is a local variable, each of the two threads has its own x. When the code is executed normally:

Initial value: balance = 0

t1: x1 = balance + 5 # x1 = 0 + 5 = 5
t1: balance = x1     # balance = 5
t1: x1 = balance - 5 # x1 = 5 - 5 = 0
t1: balance = x1     # balance = 0

t2: x2 = balance + 8 # x2 = 0 + 8 = 8
t2: balance = x2     # balance = 8
t2: x2 = balance - 8 # x2 = 8 - 8 = 0
t2: balance = x2     # balance = 0
    
Result: balance = 0

But t1 and t2 run alternately. If the operating system executes t1 and t2 in the following order:

Initial value: balance = 0

t1: x1 = balance + 5  # x1 = 0 + 5 = 5

t2: x2 = balance + 8  # x2 = 0 + 8 = 8
t2: balance = x2      # balance = 8

t1: balance = x1      # balance = 5
t1: x1 = balance - 5  # x1 = 5 - 5 = 0
t1: balance = x1      # balance = 0

t2: x2 = balance - 8  # x2 = 0 - 8 = -8
t2: balance = x2      # balance = -8

Result: balance = -8

The root cause is that modifying balance takes more than one statement, and a thread may be interrupted between those statements, which lets multiple threads corrupt the contents of the same object.

When two threads deposit and withdraw at the same time, the balance can end up wrong, and you certainly don't want your bank balance to inexplicably go negative. Therefore, we must make sure that while one thread is modifying balance, no other thread can change it.

If we want to ensure that balance is computed correctly, we have to put a lock on change_it(). When a thread starts executing change_it(), we say it has acquired the lock, so other threads cannot execute change_it() at the same time; they can only wait until the lock is released and then acquire it themselves before making their changes. Since there is only one lock, no matter how many threads there are, at most one of them holds the lock at any moment, so there is no conflicting modification. A lock is created with threading.Lock():

balance = 0
lock = threading.Lock()

def run_thread(n):
    for i in range(100000):
        # Acquire the lock first:
        lock.acquire()
        try:
            # Now it is safe to modify the shared variable:
            change_it(n)
        finally:
            # Always release the lock when done:
            lock.release()

When multiple threads execute lock.acquire() at the same time, only one of them succeeds in acquiring the lock and continues running; the others keep waiting until they in turn acquire the lock.

A thread that has acquired the lock must release it when it is done, otherwise the threads waiting for the lock will wait forever and become dead threads. That is why we use try...finally to guarantee that the lock is always released.
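As a side note (not in the original article), a Lock can also be used as a context manager, which is the more idiomatic way to get the same acquire/release-on-any-exit behavior as the try...finally above:

def run_thread(n):
    for i in range(100000):
        # 'with lock' acquires the lock here and releases it automatically
        # when the block exits, even if change_it() raises an exception.
        with lock:
            change_it(n)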

The advantage of a lock is that it guarantees that a critical piece of code is executed by only one thread from beginning to end. Of course there are drawbacks too. First, it prevents genuinely concurrent execution: code that holds the lock effectively runs single-threaded, so efficiency drops significantly. Second, since there can be multiple locks, different threads may each hold a different lock while trying to acquire the lock held by the other, which can cause a deadlock: the threads hang, unable either to run or to finish, and can only be terminated forcibly by the operating system.
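To make the deadlock risk concrete, here is a minimal sketch (the locks, the worker function and the Barrier-based synchronization are all added here for illustration). Each thread grabs one lock and then tries to take the other thread's lock; a blocking acquire at that point would hang both threads forever, so the demo uses a non-blocking attempt just to report the problem:

import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
barrier = threading.Barrier(2)  # makes sure both threads hold their first lock before going on

def worker(first, second, name):
    with first:                      # grab one lock...
        barrier.wait()               # ...and wait until the other thread holds its lock too
        # Trying to block on the other lock here would deadlock forever,
        # so we only probe it without blocking.
        if not second.acquire(blocking=False):
            print('%s would deadlock here waiting for the other lock' % name)
        else:
            second.release()
        barrier.wait()               # hold the first lock until both probes are finished

t1 = threading.Thread(target=worker, args=(lock_a, lock_b, 'thread-1'))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, 'thread-2'))
t1.start(); t2.start()
t1.join(); t2.join()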

Multi-core CPU

If you are unlucky enough to own a multi-core CPU, you are probably thinking that multiple cores should be able to execute several threads at the same time.

If you write an infinite loop, what will happen?

Open Activity Monitor on Mac OS X or Task Manager on Windows, and you can watch the CPU usage of a given process.

You will see that a single infinite-loop thread occupies 100% of one CPU.

With two infinite-loop threads on a multi-core CPU, you can see CPU usage reach 200%, that is, two CPU cores are occupied.

To keep all cores of an N-core CPU busy, you would have to start N infinite-loop threads.

Try writing an infinite loop in Python:

import threading, multiprocessing

def loop():
    x = 0
    while True:
        x = x ^ 1

for i in range(multiprocessing.cpu_count()):
    t = threading.Thread(target=loop)
    t.start()

Starting as many threads as there are CPU cores, on a 4-core CPU you can see that CPU usage stays at only about 102%, that is, only one core is being used.

But if you rewrite the same infinite loop in C, C++ or Java, it saturates all the cores directly: 4 cores run at 400% and 8 cores at 800%. Why can't Python do the same?

Although Python's threads are real threads, the interpreter executes code under a GIL (Global Interpreter Lock). Before any Python thread runs, it must first acquire the GIL, and after every 100 bytecode instructions the interpreter automatically releases the GIL so that other threads get a chance to run (newer CPython versions switch after a short time interval instead). This global lock effectively serializes the bytecode execution of all threads, so in Python multiple threads can only run alternately: even 100 threads on a 100-core CPU can use only one core.
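For reference (not part of the original article), CPython 3 exposes this switch interval through the sys module:

import sys

# Default GIL switch interval in CPython 3.x is 0.005 seconds (5 ms).
print(sys.getswitchinterval())

# It can be tuned, although this rarely helps CPU-bound threads:
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())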

The GIL is a historical legacy of the Python interpreter's design. The interpreter we usually use is the official CPython implementation, and we cannot truly exploit multiple cores with threads unless we use an interpreter that has no GIL, which would mean rewriting the interpreter.

So in Python you can use multiple threads, but don't expect them to make effective use of multiple cores. If you really must drive multiple cores from multiple threads, it can only be done through C extensions, but that sacrifices Python's simplicity and ease of use.

Don't worry too much, though. Although Python cannot use multithreading to spread work across cores, it can do so with multiple processes: each Python process has its own independent GIL, so they do not interfere with one another.
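As a sketch of that idea (added here, not part of the original article), the same busy loop run in separate processes with the standard multiprocessing module can keep every core busy, because each child process has its own interpreter and its own GIL:

import multiprocessing

def loop():
    x = 0
    while True:
        x = x ^ 1

if __name__ == '__main__':
    # One busy-loop process per CPU core; each process has its own GIL.
    for _ in range(multiprocessing.cpu_count()):
        p = multiprocessing.Process(target=loop)
        p.start()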

Origin blog.csdn.net/qq_36478920/article/details/99828910