join() and setDaemon() in Python multithreading

A demo is the best teacher!

Reference link: https://www.cnblogs.com/cnkai/p/7504980.html

Knowledge point one (setDaemon(False)):
When a process starts, it creates a main thread by default, because a thread is the smallest unit of program execution. In a multithreaded program, the main thread creates the child threads. In Python, threads created from the main thread are non-daemon by default (equivalent to setDaemon(False)): after the main thread finishes its own work, the child threads keep running until their tasks complete, and only then does the process exit.

import threading
import time


def run():
    time.sleep(2)
    print('Current thread name: ', threading.current_thread().name)
    time.sleep(2)


if __name__ == '__main__':

    start_time = time.time()

    print('This is the main thread:', threading.current_thread().name)
    thread_list = []
    for i in range(5):
        t = threading.Thread(target=run)
        thread_list.append(t)

    # Threads are non-daemon by default, so the process waits for them.
    for t in thread_list:
        t.start()

    print('Main thread finished!', threading.current_thread().name)
    print('Total time:', time.time() - start_time)

Output:

This is the main thread: MainThread
Main thread finished! MainThread
Total time: 0.001401662826538086
Current thread name:  Thread-3
Current thread name:  Thread-4
Current thread name:  Thread-2
Current thread name:  Thread-1
Current thread name:  Thread-5

Process finished with exit code 0
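The default can also be checked directly on the thread objects. The following is a minimal sketch, not part of the original demo: the helper names worker and start_workers, the thread names, and the short delays are assumptions chosen for illustration.

```python
import threading
import time


def worker(finished, delay):
    # Simulate some work, then record which thread completed it.
    time.sleep(delay)
    finished.append(threading.current_thread().name)


def start_workers(count, finished, delay=0.2):
    # Hypothetical helper: start `count` threads with default settings.
    threads = [
        threading.Thread(target=worker, args=(finished, delay), name=f"worker-{i}")
        for i in range(count)
    ]
    for t in threads:
        t.start()
    return threads


if __name__ == "__main__":
    finished = []
    threads = start_workers(3, finished)
    # The daemon flag is inherited from the creating thread; the main
    # thread is not a daemon, so every flag printed here is False.
    print("daemon flags:", [t.daemon for t in threads])
    print("finished so far:", finished)
    # No join() here: the interpreter still waits for the workers before exiting.
```

Reading t.daemon is the attribute form of the older isDaemon() call, so the sketch shows the same default that setDaemon(False) would set explicitly.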

Knowledge point two (setDaemon(True)):
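When we call setDaemon(True) to mark a child thread as a daemon thread, the program exits as soon as the main thread (and any other non-daemon thread) has finished, and daemon threads that are still running are stopped abruptly. A child thread may therefore be stopped before its task has completed. setDaemon() must be called before start(); since Python 3.10 it is deprecated in favor of the daemon attribute (t.daemon = True) or the daemon=True constructor argument.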

import threading
import time


def run():
    time.sleep(2)
    print('Current thread name: ', threading.current_thread().name)
    time.sleep(2)


if __name__ == '__main__':

    start_time = time.time()

    print('This is the main thread:', threading.current_thread().name)
    thread_list = []
    for i in range(5):
        t = threading.Thread(target=run)
        thread_list.append(t)

    for t in thread_list:
        # Mark the thread as a daemon before starting it.
        t.setDaemon(True)
        t.start()

    print('Main thread finished!', threading.current_thread().name)
    print('Total time:', time.time() - start_time)

Output:

This is the main thread: MainThread
Main thread finished! MainThread
Total time: 0.002007722854614258

Process finished with exit code 0
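The difference between the two settings can be observed from the outside by running the same small script twice in a separate interpreter. This is only a sketch under assumptions: CHILD_SCRIPT, the DAEMON_FLAG placeholder, and run_child are hypothetical names, and the one-second sleep stands in for real work.

```python
import subprocess
import sys
import textwrap

# Script run in a separate interpreter; DAEMON_FLAG is a placeholder
# that run_child() replaces with True or False.
CHILD_SCRIPT = textwrap.dedent("""
    import threading
    import time

    def run():
        time.sleep(1)
        print('child finished')

    t = threading.Thread(target=run, daemon=DAEMON_FLAG)
    t.start()
    print('main finished')
""")


def run_child(daemon):
    # Run the script and return the lines it printed before exiting.
    script = CHILD_SCRIPT.replace("DAEMON_FLAG", str(daemon))
    completed = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return completed.stdout.splitlines()


if __name__ == "__main__":
    print("daemon=True :", run_child(True))
    print("daemon=False:", run_child(False))
```

With daemon=True the child's line is missing because the interpreter exits while the thread is still sleeping; with daemon=False both lines appear.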

Knowledge point three (setDaemon(True) and join()):
This is where join() comes in. join() performs thread synchronization: the main thread blocks at the join() call and does not continue until the joined child thread has finished. Because the main thread waits at join() for every child thread, even daemon threads get to run to completion before the main thread exits. Using join() and setDaemon() together is therefore not contradictory.

import threading
import time


def run():
    time.sleep(2)
    print('Current thread name: ', threading.current_thread().name)
    time.sleep(2)


if __name__ == '__main__':

    start_time = time.time()

    print('This is the main thread:', threading.current_thread().name)
    thread_list = []
    for i in range(5):
        t = threading.Thread(target=run)
        thread_list.append(t)

    for t in thread_list:
        t.setDaemon(True)
        t.start()

    # Block until every child thread has finished.
    for t in thread_list:
        t.join()

    print('Main thread finished!', threading.current_thread().name)
    print('Total time:', time.time() - start_time)

Output:

This is the main thread: MainThread
Current thread name:  Thread-1
Current thread name:  Thread-3
Current thread name:  Thread-2
Current thread name:  Thread-5
Current thread name:  Thread-4
Main thread finished! MainThread
Total time: 4.003052711486816

Process finished with exit code 0
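The total of about 4 seconds in the output above, rather than 5 * 4 seconds, follows from the threads sleeping concurrently: joining them one after another costs roughly as long as the slowest thread. A small sketch of that effect, where start_and_join and the delay values are assumptions made for illustration:

```python
import threading
import time


def run(delay, done):
    # Simulated task: sleep, then record that this task completed.
    time.sleep(delay)
    done.append(delay)


def start_and_join(delays):
    # Hypothetical helper: start one daemon thread per delay, then join all.
    done = []
    threads = [
        threading.Thread(target=run, args=(d, done), daemon=True)
        for d in delays
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # blocks until this particular thread has finished
    elapsed = time.perf_counter() - start
    return elapsed, done


if __name__ == "__main__":
    elapsed, done = start_and_join([0.4, 0.2, 0.3])
    # Elapsed time is close to the longest delay, not the sum of all delays.
    print("completed tasks:", sorted(done))
    print("elapsed: %.2f seconds" % elapsed)
```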

Knowledge point four (join(timeout=param)):
join() takes a timeout parameter. join(timeout) blocks the main thread for at most timeout seconds and then returns whether or not the child thread has finished; join() itself never kills a thread. When the child threads are daemon threads, the main thread stops waiting once the timeouts have expired and runs to its end, and any daemon threads still running are stopped as the program exits. Because the join() calls run one after another, with 10 child threads the main thread waits at most the sum of the timeouts (10 * timeout). Put simply, the main thread gives the child threads a bounded amount of time; daemon threads that have not finished when the main thread exits are stopped along with the program.

import threading
import time


def run():
    time.sleep(2)
    print('Current thread name: ', threading.current_thread().name)
    time.sleep(2)


if __name__ == '__main__':

    start_time = time.time()

    print('This is the main thread:', threading.current_thread().name)
    thread_list = []
    for i in range(5):
        t = threading.Thread(target=run)
        thread_list.append(t)

    for t in thread_list:
        t.setDaemon(True)
        t.start()

    for t in thread_list:
        # The list has 5 threads, so the main thread waits here for up to 5 * 0.2 seconds
        t.join(timeout=0.2)

    print('Main thread finished!', threading.current_thread().name)
    print('Total time:', time.time() - start_time)

Output:

This is the main thread: MainThread
Main thread finished! MainThread
Total time: 1.0173823833465576

Process finished with exit code 0

When the child threads are not daemon threads, the main thread still waits for at most the sum of the timeouts and then runs to its end, but the child threads are not killed (they would be if they were daemon threads). They keep running, and the program exits only after all of them have finished.
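Since join() always returns None, the usual way to tell whether a timed join() gave up or the thread really finished is to call is_alive() afterwards. The sketch below illustrates this with a non-daemon thread; join_with_timeout and the delay values are hypothetical names and numbers picked for the example.

```python
import threading
import time


def run(delay, done):
    # Simulated task that signals completion through an Event.
    time.sleep(delay)
    done.set()


def join_with_timeout(delay, timeout):
    # Hypothetical helper: returns (timed_out, task_completed).
    done = threading.Event()
    t = threading.Thread(target=run, args=(delay, done))
    t.start()
    t.join(timeout=timeout)
    # join() returns None either way; is_alive() is True if the wait timed out.
    timed_out = t.is_alive()
    # The thread was not killed by the timeout; wait for it to finish normally.
    t.join()
    return timed_out, done.is_set()


if __name__ == "__main__":
    print(join_with_timeout(0.5, 0.05))  # timeout shorter than the task
    print(join_with_timeout(0.05, 2))    # timeout longer than the task
```

In both calls the task completes, which shows that the timeout only limits how long the caller waits.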

Finally, here is an example I encountered in my actual project:

Requirement: a src.rpm package has to be downloaded. Its file name is known, but only one of the URLs formed by appending the file name to the addresses in conf.ini is valid. How can the package be downloaded quickly?

conf.ini :

[centos]
src_package1=https://vault.centos.org/7.6.1810/centosplus/Source/SPackages/
src_package2=https://vault.centos.org/7.6.1810/cloud/Source/openstack-ocata/
src_package3=https://vault.centos.org/7.6.1810/cloud/Source/openstack-ocata/common/
src_package4=https://vault.centos.org/7.6.1810/cloud/Source/openstack-pike/
src_package5=https://vault.centos.org/7.6.1810/opstools/Source/common/
src_package6=https://vault.centos.org/7.6.1810/cloud/Source/openstack-queens/
src_package7=https://vault.centos.org/7.6.1810/cloud/Source/openstack-rocky/
src_package8=https://vault.centos.org/7.6.1810/cloud/Source/openstack-stein/

test.py :

import configparser
import os
import threading
import time


def download(download_url):
    global is_downloaded
    with pool_sema:
        status_code = os.system('wget -q %s' % download_url)
        if status_code == 0:
            # Download succeeded: set the global flag so the main thread can
            # see the signal, leave its loop, and end the program.
            is_downloaded = True


if __name__ == '__main__':
    is_downloaded = False
    src_package = 'centos-release-opstools-1-4.el7.src.rpm'
    conf = configparser.ConfigParser()
    conf.read('conf.ini', encoding="UTF-8")
    thread_list = []
    max_connections = 10  # maximum number of concurrent downloads
    pool_sema = threading.BoundedSemaphore(max_connections)
    t1 = time.perf_counter()
    for i in range(1, 9):
        url = conf.get('centos', 'src_package' + str(i))
        # Build the URL by hand; os.path.join would use backslashes on Windows.
        download_url = url.rstrip('/') + '/' + src_package
        thread = threading.Thread(target=download, args=(download_url,))
        thread_list.append(thread)

    # Make the threads daemons: when the main thread ends (download timed out
    # or the package was downloaded), all child threads are stopped with it,
    # so the remaining URLs no longer need to be tried with wget.
    for t in thread_list:
        t.setDaemon(True)
        t.start()
    # The main thread does not need to wait for every child thread to finish;
    # it can exit as soon as one child signals success through is_downloaded.
    # for t in thread_list:
    #     t.join()
    while True:
        t2 = time.perf_counter()
        # Leave the loop and stop once the download has timed out
        if t2 - t1 > 5 * 10:
            print('download timeout')
            break
        if is_downloaded:
            print('download success')
            break
        # Sleep briefly so the polling loop does not keep a CPU core busy
        time.sleep(0.1)
    t2 = time.perf_counter()
    print('download costs %.2f seconds' % (t2 - t1))

[root@localhost insight-tool]# python3 test.py
download success
download costs 4.35 seconds
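The "wait for the first success or a timeout" pattern used above can also be expressed with threading.Event, whose wait(timeout) blocks without a polling loop and returns True once the event is set. This is a sketch of an alternative, not the code from the project: try_download is a stand-in that only pretends to download, and download_first, valid_url, and the sample URLs are assumptions made for illustration.

```python
import threading
import time


def try_download(url, valid_url, found):
    # Stand-in for the wget call: pretend only valid_url can be downloaded.
    time.sleep(0.1)
    if url == valid_url:
        found.set()  # signal success to the waiting main thread


def download_first(urls, valid_url, timeout):
    # Hypothetical helper: True if any URL succeeded within `timeout` seconds.
    found = threading.Event()
    for url in urls:
        # Daemon threads, so leftover attempts do not keep the program alive.
        threading.Thread(
            target=try_download, args=(url, valid_url, found), daemon=True
        ).start()
    # Blocks until found.set() is called or the timeout expires.
    return found.wait(timeout)


if __name__ == "__main__":
    urls = ["https://example.com/a/", "https://example.com/b/"]
    ok = download_first(urls, "https://example.com/b/", timeout=2)
    print("download success" if ok else "download timeout")
```

Compared with a global flag and a loop, the Event wakes the main thread as soon as a worker succeeds and keeps the timeout handling in a single call.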

 

 

Origin blog.csdn.net/TomorrowAndTuture/article/details/114592674