Is your crawler slow? Learn about concurrent programming
Daemon Threads
In Python multi-threading, if other sub-threads are still running when the main thread's code finishes, the Python process does not exit until those sub-threads have finished. Take a look at an example.
import threading
import time

# Non-daemon thread
def normal_thread():
    for i in range(10000):
        time.sleep(1)
        print(f'normal thread {i}')

print(threading.current_thread().name, 'thread started')
thread1 = threading.Thread(target=normal_thread)
thread1.start()
print(threading.current_thread().name, 'thread ended')
From the output of this example, you can see that although the main thread (MainThread) has finished, the sub-thread keeps running; the program only really exits once the sub-thread is done. If you want unfinished threads to be terminated when the main thread ends, you can mark them as daemon threads. When only daemon threads are still running and the main thread has finished, the Python program exits. The threading module provides two ways of setting a daemon thread.
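threading.Thread(target=daemon_thread, daemon=True)
thread.setDaemon(True) (deprecated since Python 3.10 in favor of thread.daemon = True; either form must be used before start())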
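As a minimal sketch of the two approaches (the names `worker`, `t1`, and `t2` are illustrative and do not come from the article), the snippet below sets the flag once through the constructor and once through the `daemon` attribute, then shows that changing it after `start()` is rejected:

```python
import threading
import time


def worker():
    # Hypothetical task; it just sleeps briefly.
    time.sleep(0.1)


# Way 1: pass daemon=True to the constructor.
t1 = threading.Thread(target=worker, daemon=True)

# Way 2: set the flag on the thread object before start().
# The daemon attribute is the modern spelling of setDaemon(True).
t2 = threading.Thread(target=worker)
t2.daemon = True

flags_before_start = (t1.daemon, t2.daemon)

t1.start()
t2.start()

# The flag cannot be changed once the thread has been started.
try:
    t1.daemon = False
    changed_after_start = True
except RuntimeError:
    changed_after_start = False

t1.join()
t2.join()
print(flags_before_start, changed_after_start)  # (True, True) False
```

Both threads report `daemon` as `True` before they start, and the late assignment raises `RuntimeError`, which is why the flag has to be set before `start()`.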
import threading
import time

# Daemon thread (sleeps 1s on every iteration)
def daemon_thread():
    for i in range(5):
        time.sleep(1)
        print(f'daemon thread {i}')

# Non-daemon thread (no sleep)
def normal_thread():
    for i in range(5):
        print(f'normal thread {i}')

print(threading.current_thread().name, 'thread started')
thread1 = threading.Thread(target=daemon_thread, daemon=True)
thread2 = threading.Thread(target=normal_thread)
# thread1.setDaemon(True)  # alternative to daemon=True; must come before start()
thread1.start()
thread2.start()
print(threading.current_thread().name, 'thread ended')
Here thread1 is set as a daemon thread, so the program exits as soon as the non-daemon thread and the main thread (MainThread) have finished, and the output statements in daemon_thread() never get a chance to run. In the output, lines from normal_thread() may still appear after the MainThread end message: normal_thread() runs in a non-daemon thread, so the interpreter waits for it, and the daemon thread is only abandoned once every non-daemon thread has finished.
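One way to observe this behavior without a screenshot is to run a small script in a child interpreter and capture what it prints. This is only a sketch: the embedded script and the names `script`, `daemon_worker`, and `output` are made up for illustration.

```python
import subprocess
import sys
import textwrap

# Hypothetical script: the main thread finishes immediately while a
# daemon thread is still sleeping.
script = textwrap.dedent("""
    import threading
    import time

    def daemon_worker():
        time.sleep(2)
        print('daemon finished')

    threading.Thread(target=daemon_worker, daemon=True).start()
    print('main finished')
""")

# Run the script in a separate Python process and capture its output.
result = subprocess.run(
    [sys.executable, '-c', script],
    capture_output=True,
    text=True,
    timeout=30,
)
output = result.stdout
print(output)
```

The captured output should contain `main finished` but not `daemon finished`, since the child interpreter exits without waiting for its daemon thread.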
Daemon Thread Inheritance
A child thread inherits the daemon property of the thread that creates it. The main thread is a non-daemon thread, so threads created in the main thread are non-daemon by default; a thread created inside a daemon thread inherits that thread's daemon property and is therefore also a daemon thread.
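The inheritance rule can be checked with a short experiment. In this sketch the names `child`, `parent`, and `observed` are hypothetical, and the code is assumed to run in the main thread:

```python
import threading

observed = {}


def child():
    # Record the daemon flag this thread ended up with.
    current = threading.current_thread()
    observed[current.name] = current.daemon


def parent():
    # No daemon argument is given here, so the new thread inherits
    # the flag of the thread that creates it (a daemon thread).
    t = threading.Thread(target=child, name='child-of-daemon')
    t.start()
    t.join()


# Created in the main thread without a daemon argument: non-daemon.
plain = threading.Thread(target=child, name='child-of-main')
# Explicitly a daemon thread; it creates another thread inside parent().
daemon_parent = threading.Thread(target=parent, daemon=True)

plain.start()
daemon_parent.start()
plain.join()
daemon_parent.join()

print(observed)
```

The thread created from the main thread records `False`, while the one created inside the daemon thread records `True`, even though neither was given an explicit `daemon` argument.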
join() blocking
In a multi-threaded crawler, multiple threads usually crawl different pages at the same time, and the results are then analyzed, aggregated, and stored together. This means waiting for all sub-threads to finish before continuing, which is what the join() method is for.
The join() method blocks the calling thread (here, the main thread) until the thread it is called on has finished. Any thread the caller has not started yet stays unstarted in the meantime, simply because the caller is blocked. Look at an example.
import threading
import time

def block(second):
    print(threading.current_thread().name, 'thread is running')
    time.sleep(second)
    print(threading.current_thread().name, 'thread ended')

print(threading.current_thread().name, 'thread is running')
thread1 = threading.Thread(target=block, name='thread test 1', args=[3])
thread2 = threading.Thread(target=block, name='thread test 2', args=[1])
thread1.start()
thread1.join()  # the main thread waits here until thread1 has finished
thread2.start()
print(threading.current_thread().name, 'thread ended')
Here join() is only called on thread1. Note where it is called: before thread2.start(). While join() is waiting, the main thread is blocked and thread2 has not been started yet; only after thread1 has finished does the main thread go on to start thread2 and run to its end. Because thread2 is not a daemon thread, it keeps running after the main thread (MainThread) has finished.
At this point you may have a question: executed this way, the program is effectively single-threaded. That is caused by improper use of join(). Let's modify the code above slightly.
import threading
import time

def block(second):
    print(threading.current_thread().name, 'thread is running')
    time.sleep(second)
    print(threading.current_thread().name, 'thread ended')

print(threading.current_thread().name, 'thread is running')
thread1 = threading.Thread(target=block, name='thread test 1', args=[3])
thread2 = threading.Thread(target=block, name='thread test 2', args=[1])
thread1.start()
thread2.start()
thread1.join()  # both threads are already running; only the main thread waits
print(threading.current_thread().name, 'thread ended')
Now the program is truly multi-threaded. With join() called at this point, only the main thread is blocked, and it resumes once thread1 has finished.
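A rough way to see the difference between the two orderings is to time them. The sketch below uses shorter sleeps than the examples above and joins both threads so the total can be measured; the helper names `run_sequential` and `run_concurrent` are hypothetical.

```python
import threading
import time


def block(second):
    time.sleep(second)


def run_sequential():
    # start -> join -> start: the second thread only starts after the first ends.
    t1 = threading.Thread(target=block, args=[0.3])
    t2 = threading.Thread(target=block, args=[0.3])
    begin = time.perf_counter()
    t1.start()
    t1.join()
    t2.start()
    t2.join()
    return time.perf_counter() - begin


def run_concurrent():
    # start -> start -> join: both threads sleep at the same time.
    t1 = threading.Thread(target=block, args=[0.3])
    t2 = threading.Thread(target=block, args=[0.3])
    begin = time.perf_counter()
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return time.perf_counter() - begin


sequential = run_sequential()
concurrent = run_concurrent()
print(f'sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s')
```

On an otherwise idle machine the sequential version should take roughly the sum of the two sleeps, and the concurrent version roughly the longer of the two.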
Finally, note that join() blocks its caller regardless of which thread it is called on; it makes no difference whether the threads involved are daemon threads or the main thread. To get real multi-threading, start all the sub-threads first and only then call join(); otherwise the program degenerates into single-threaded execution!
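For the crawler scenario described earlier, a common pattern is to start every worker first and then join them all in a second loop. The following is a minimal sketch: `fake_fetch`, the example URLs, and the `results` dictionary are placeholders, and `time.sleep` stands in for a real network request.

```python
import threading
import time

results = {}
results_lock = threading.Lock()


def fake_fetch(url):
    # Placeholder for a real HTTP request.
    time.sleep(0.2)
    with results_lock:
        results[url] = len(url)


urls = [f'https://example.com/page/{n}' for n in range(5)]
threads = [threading.Thread(target=fake_fetch, args=(url,)) for url in urls]

begin = time.perf_counter()
for t in threads:  # start every thread first ...
    t.start()
for t in threads:  # ... then wait for all of them
    t.join()
elapsed = time.perf_counter() - begin

# All workers are done here, so the results can be processed together.
print(f'fetched {len(results)} pages in {elapsed:.2f}s')
```

Because no join() is called until every thread is running, the simulated requests overlap, and the total time should stay close to a single request rather than the sum of all five.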