Python Web Study Notes: Multithreading Basics

Understanding multithreading

Multithreading is a way of running several tasks at the same time. For example, treat each iteration of a loop as one task: we would like the second iteration to start before the first one has finished, so as to save time.

In Python, the point of running tasks concurrently like this is to put waiting time to use, so that the CPU is not left idle. It follows that if a program is slow not because it waits but because it has a lot to compute, multithreading will not make it faster.
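To make the distinction between waiting and computing concrete, here is a minimal sketch, not a benchmark. The names wait_task, compute_task and timed_in_threads are made up for this illustration, and the 0.2 second sleep stands in for any kind of waiting (network, disk). It uses join(), which is explained later in these notes, to wait for the threads before stopping the clock.

```python
import time
from threading import Thread

def wait_task():
    # Simulates waiting (e.g. for a network response); the CPU is idle here.
    time.sleep(0.2)

def compute_task():
    # Pure computation; the CPU is busy the whole time.
    total = 0
    for i in range(200_000):
        total += i * i

def timed_in_threads(task, n):
    """Run `task` in n threads and return the elapsed wall-clock time."""
    start = time.perf_counter()
    threads = [Thread(target=task) for _ in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()  # wait for every thread before stopping the clock
    return time.perf_counter() - start

wait_elapsed = timed_in_threads(wait_task, 5)
compute_elapsed = timed_in_threads(compute_task, 5)
print("5 waiting tasks in threads:  ", round(wait_elapsed, 2), "s")
print("5 computing tasks in threads:", round(compute_elapsed, 2), "s")
```

On a typical machine the waiting tasks finish in about 0.2 seconds in total instead of 1 second, because the sleeps overlap. The computing tasks gain little or nothing from threads in standard CPython, where the global interpreter lock lets only one thread execute Python bytecode at a time.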

For more background on multithreading, see the following materials:

Liao Xuefeng's tutorial
A Zhihu answer
There are also plenty of explanations on Baidu, so I won't repeat them here.

Basic usage

Consider the following function:

import time
def myfun():
    time.sleep(1)
    a = 1 + 1
    print(a)

If we want to run this function 5 times, the running time is dominated by the 1 second of sleep in each call; the 1 + 1 calculation takes almost no time. In this situation multithreading can improve efficiency.

Let's compare the running time without multithreading and with it.

Without multithreading:

t = time.time()
for _ in range(5):
    myfun()
print(time.time() - t)

The result is 5.002434492111206 seconds.

Now with multithreading:

from threading import Thread
for _ in range(5):
    th = Thread(target = myfun)
    th.start()

This is all it takes to use multithreading. After about 1 second, five 2s are printed at almost the same moment, which shows that the five iterations really do run almost simultaneously.

Multithreading here involves only two steps:

Create a thread. Here Thread is called in each iteration to create a new thread, and each thread executes myfun once.
Call start() to start the thread. Every thread has to be started explicitly in this way. Once a thread is started, the program does not wait for it to finish; it moves straight on to the code that follows, which here is the next iteration (so the second thread is created and started while the first is still running, and so on).
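The fact that start() returns immediately can be checked with a small sketch. The events list and the worker function are hypothetical names used only for this illustration; the final join() call (covered in the next part of these notes) is there only so that the list is complete before it is printed.

```python
import time
from threading import Thread

events = []  # records the order in which things happen

def worker():
    time.sleep(0.2)
    events.append("worker finished")

th = Thread(target=worker)  # step 1: create the thread
th.start()                  # step 2: start it; this call returns immediately
events.append("main continued")

th.join()                   # wait for the worker so the list is complete
print(events)
```

The main program records its entry first, even though the worker was started earlier, because the worker is still sleeping when the line after start() runs.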

One thing to note: the thread is created inside the loop, one per iteration. You cannot write an ordinary loop first and then turn it into a multithreaded one from the outside.

The reader may have noticed that the version without multithreading measures its running time, while the multithreaded version does not. Measuring the time requires some extra code, which would obscure the simplest possible multithreading example, so it was left out at first. Next we cover join() and use it to measure the time.

Using join()

A thread's join() method makes the calling code wait until that thread has finished before continuing. Consider the following example:

import time
from threading import Thread

t = time.time()
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
    th.join()  # wait for this thread before starting the next one
print(time.time() - t)
# The result is 5.0047078132629395 seconds

Because join() is called immediately after start(), each thread must run to completion before the next iteration begins, so this is no different from not using multithreading. Still, join() is what we need in order to measure the running time of multithreaded code.

Now let's see what happens without join():

import time
from threading import Thread

t = time.time()
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
print(time.time() - t)  # runs at once, without waiting for the threads
# The result is 0.0009980201721191406 seconds

The result is printed in well under 1 second, and the five 2s are printed only afterwards. This is because print(time.time() - t) runs in the main thread, which acts as a sixth thread alongside the five started in the loop; it does not wait for those threads to finish before running. So this approach cannot measure the running time of the five threads. We need join() to wait until all five have finished.

The code is as follows:

import time
from threading import Thread

t = time.time()
ths = []
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
    ths.append(th)  # keep a reference so we can join it later
for th in ths:
    th.join()
print(time.time() - t)
# The result is 1.0038363933563232 seconds

The ths list defined above stores the threads, and the final loop makes sure every thread has finished before the time difference is computed.

join() is not just for timing. Whenever a step depends on earlier threaded work having completed, call join() before that step.
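As an illustration of join() used for a dependency rather than for timing, here is a minimal sketch. The names data, load_data and loader are hypothetical, and the sleep stands in for a slow operation such as reading a file or calling a web service.

```python
import time
from threading import Thread

data = []

def load_data():
    # Stand-in for a slow operation; fills `data` when it is done.
    time.sleep(0.2)
    data.extend([1, 2, 3])

loader = Thread(target=load_data)
loader.start()

# Work that does not need `data` can go on while the loader runs.
label = "sum of loaded values"

loader.join()      # from here on we need `data`, so wait for the loader
total = sum(data)
print(label, total)
```

Without the join() call, sum(data) would most likely run while the list is still empty.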

With this, we have covered the general usage of multithreading, which is enough for most scenarios. Some further details follow.

Other details

(1) Thread name

Let's look directly at the code below:

import threading
import time

# Name of the thread that runs the top-level code
print(threading.current_thread().getName())

def myfun():
    time.sleep(1)
    print(threading.current_thread().name)
    a = 1 + 1

for i in range(5):
    th = threading.Thread(target=myfun, name='thread {}'.format(i))
    th.start()

# Output (the order of the five worker lines may vary between runs)
# MainThread
# thread 0
# thread 1
# thread 4
# thread 3
# thread 2

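Some explanation:

threading.current_thread() returns the current thread; its name attribute or its getName() method gives the thread's name.
Every process starts one thread by default, named MainThread. In other words, the main program occupies one thread. This thread and the threads added later with Thread are independent of each other: the main thread carries on without waiting for the other threads to finish. The reason the running time could not be measured earlier without join() is that the main thread got to the print first.
Thread starts a new thread that runs the given function. If a name argument is passed, printing the thread name inside that function shows the value of that argument.
There are two ways to print a name inside the loop. print(threading.current_thread().name) prints MainThread, while print(th.name) prints thread 1 and so on.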
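The difference between the two names can be seen side by side in the following sketch. The lists seen_in_worker and seen_in_loop and the function report are hypothetical helpers for this illustration; the names are collected in lists instead of being printed directly so that the output is not interleaved.

```python
import threading

seen_in_worker = []

def report():
    # Runs inside the new thread, so this is the name given to Thread(...)
    seen_in_worker.append(threading.current_thread().name)

seen_in_loop = []
threads = []
for i in range(3):
    th = threading.Thread(target=report, name='thread {}'.format(i))
    th.start()
    threads.append(th)
    # The loop body runs in the thread that executes the loop, not in th
    seen_in_loop.append((threading.current_thread().name, th.name))

for th in threads:
    th.join()

print(seen_in_loop)
print(sorted(seen_in_worker))
```

When run as an ordinary script, the first element of each pair in seen_in_loop is MainThread, and the second is the name passed to Thread.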

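(2) The Thread constructor

So far we have used the target and name parameters of Thread. Here are its other parameters:

args specifies the arguments for the target function, passed as a tuple, for example args = (3, )
daemon is False for the main thread; if it is not specified, a new thread inherits the value of the thread that created it. If True, the thread is stopped when the main thread finishes; if False, the thread keeps running until it completes, regardless of the main thread. (To see the effect of this parameter, write the code in a .py file and run it from cmd, not in a Jupyter notebook, because the notebook runs extra threads that interfere.)
group is a reserved parameter, intended for a future ThreadGroup class extension; it currently has no use.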
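A short sketch of args and daemon in use follows. The function times_two and the results list are hypothetical names for this illustration.

```python
import threading

results = []

def times_two(n):
    results.append(n * 2)

# args must be a tuple, hence the trailing comma for a single argument
th = threading.Thread(target=times_two, args=(3,), daemon=True)
print(th.daemon)   # the value passed to the constructor
th.start()
th.join()
print(results)

# Without daemon=..., the value is inherited from the creating thread
inherited = threading.Thread(target=times_two, args=(4,))
print(inherited.daemon == threading.current_thread().daemon)
```

Forgetting the trailing comma, as in args=(3), passes a plain integer instead of a tuple, and the thread then fails with a TypeError when it starts.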

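(3) Thread objects

Both threading.Thread and threading.current_thread() above give you a Thread object (the former creates one, the latter returns the one that is currently running). Thread objects have the following attributes and methods:

getName() and .name get the thread name (getName() is deprecated since Python 3.10 in favor of .name)
setName() sets the thread name (likewise deprecated in favor of assigning to .name)
start() and join() were covered above
join() takes a timeout parameter: if waiting for the thread takes longer than this many seconds, the caller stops waiting and continues with the code that follows, but the thread itself is not interrupted
run() also executes the thread's target, but it does so in the calling thread, so the code that follows only continues once it has finished (replacing every start above with run is equivalent to not using multithreading at all)
is_alive() is True if the thread has not finished running, otherwise False
daemon returns the thread's daemon value
setDaemon(True) sets the thread's daemon value (deprecated since Python 3.10 in favor of assigning to .daemon)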
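The behavior of join(timeout), is_alive() and run() can be observed with the following sketch. The functions slow and where_am_i and the thread name 'worker' are made up for this illustration.

```python
import threading
import time

def slow():
    time.sleep(0.5)

th = threading.Thread(target=slow)
th.start()
th.join(timeout=0.1)                 # give up waiting after 0.1 s
alive_after_timeout = th.is_alive()  # the thread itself keeps running
th.join()                            # now wait until it really finishes
alive_after_join = th.is_alive()

ran_in = []

def where_am_i():
    ran_in.append(threading.current_thread().name)

caller = threading.current_thread().name
t_run = threading.Thread(target=where_am_i, name='worker')
t_run.run()        # executes the target in the calling thread
t_start = threading.Thread(target=where_am_i, name='worker')
t_start.start()    # executes the target in a new thread named 'worker'
t_start.join()

print(alive_after_timeout, alive_after_join, ran_in)
```

The first entry of ran_in is the name of the calling thread (MainThread in an ordinary script), which shows that run() did not start a new thread; only start() did.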

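(4) The threading module

Some functions that are called directly on the module:

threading.currentThread(): returns the current thread object (older spelling of threading.current_thread(), deprecated since Python 3.10)
threading.enumerate(): returns a list of the threads that are currently alive
threading.activeCount(): returns the number of threads that are currently alive, the same as len(threading.enumerate()) (older spelling of threading.active_count(), deprecated since Python 3.10)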
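The sketch below shows these module-level functions while some threads are alive. It uses threading.Event, which these notes do not cover, purely as a way to keep the worker threads alive until they have been inspected; the names release, wait_for_release and w0 to w2 are hypothetical.

```python
import threading

release = threading.Event()  # keeps the workers alive until we say so

def wait_for_release():
    release.wait()

workers = [threading.Thread(target=wait_for_release, name='w{}'.format(i))
           for i in range(3)]
for th in workers:
    th.start()

alive_names = [th.name for th in threading.enumerate()]
count = threading.active_count()  # current spelling of activeCount()

release.set()                     # let the workers finish
for th in workers:
    th.join()

still_listed = [th for th in workers if th in threading.enumerate()]
print(alive_names, count, still_listed)
```

While the workers are waiting, enumerate() lists them together with the main thread; once they have been joined, they no longer appear in the list.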

References