Linux Study Notes (2) - Introduction and Examples of Processes and Threads
Foreword
This article introduces processes and threads in detail and compares the two. Both are used to improve the execution efficiency of a program.
Introduction to Multitasking
- Example: Baidu Netdisk downloads several files in parallel; the operating system itself is also a multitasking operating system.
- Benefit: makes full use of CPU resources and improves program execution efficiency.
Two forms of multitasking:
- Concurrency: multiple tasks are executed alternately over a period of time (number of tasks > number of CPU cores)
  - Example: the tasks take turns, A 0.01s -> B 0.01s -> C 0.01s -> A 0.01s
- Parallelism: multiple tasks are executed at the same moment (number of tasks <= number of CPU cores)
  - Example: on a multi-core CPU, the operating system assigns one task to each core, and the cores execute their tasks simultaneously
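The rule of thumb above (compare the number of tasks with the number of cores) can be sketched in a few lines. The helper name `execution_mode` is hypothetical and only illustrates the comparison; a real scheduler is free to behave differently.

```python
import os

def execution_mode(num_tasks, num_cores):
    """Classify how tasks would run, following the rule of thumb above."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    # More tasks than cores: the cores must switch between tasks (concurrency).
    # Otherwise every task can have a core to itself (parallelism).
    return "concurrency" if num_tasks > num_cores else "parallelism"

if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    for tasks in (1, cores, cores + 1):
        print(tasks, "task(s) on", cores, "core(s):", execution_mode(tasks, cores))
```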
Introduction to the process
One way to achieve multitasking is to use processes.
Definition: A process is the smallest unit of resource allocation. It is the basic unit by which the operating system allocates resources and schedules execution. Put simply, a running program is a process.
- For example: WeChat, QQ (they can be seen in the task manager)
The role of multiple processes (one way to achieve multitasking): to improve the efficiency of program execution
Using multiple processes to complete multiple tasks
Process creation steps
- Import the process package
  - import multiprocessing
- Create a process object through the Process class
  - process_object = multiprocessing.Process(target=task_name)
- Start the process to execute the task
  - process_object.start()
Parameters of the Process class:

| Parameter | Description |
| --- | --- |
| target | Name of the target task to execute, i.e. the function (method) name |
| name | Process name; usually not set |
| group | Process group; currently only None can be used |
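As a small illustration of the `name` parameter, the sketch below creates two process objects without starting them and inspects their names. The task function `coding` and the name "coding-process" are only placeholders for this example.

```python
import multiprocessing

def coding():
    print("Coding...")

# Passing name is optional; without it, multiprocessing picks a default name.
named = multiprocessing.Process(target=coding, name="coding-process")
unnamed = multiprocessing.Process(target=coding)

print(named.name)        # the name passed in
print(unnamed.name)      # an automatically generated name
print(named.is_alive())  # creating the object does not start the process
```

Nothing runs until `start()` is called, which is why `is_alive()` still reports that the process is not running.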
Process creation and startup code

    # Create the child processes
    coding_process = multiprocessing.Process(target=coding)
    music_process = multiprocessing.Process(target=music)
    # Start the processes
    coding_process.start()
    music_process.start()

Example

    import time
    import multiprocessing

    # Task functions
    def coding():
        for i in range(3):
            print("Coding...")
            time.sleep(0.2)

    def music():
        for i in range(3):
            print("music...")
            time.sleep(0.2)

    if __name__ == '__main__':
        # coding()
        # music()
        # Create the process objects
        coding_process = multiprocessing.Process(target=coding)
        music_process = multiprocessing.Process(target=music)
        # Start the processes
        coding_process.start()
        music_process.start()

Execution result:

    music...
    Coding...
    music...
    Coding...
    music...
    Coding...

Key point
Remember the order of the steps for creating a process: import the package, create the process object, start the process.
Executing a task with parameters in a process
- args: pass the arguments as a tuple (note that the order must match the order of the parameters)
- kwargs: pass the arguments as a dictionary (note that each key must match the name of a formal parameter)

    import time
    import multiprocessing

    # Task functions
    def coding(num, name):
        for i in range(num):
            print(name)
            print("Coding...")
            time.sleep(0.2)

    def music(count):
        for i in range(count):
            print("music...")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the process objects
        coding_process = multiprocessing.Process(target=coding, args=(3, "lalal"))
        music_process = multiprocessing.Process(target=music, kwargs={"count": 2})
        # Start the processes
        coding_process.start()
        music_process.start()

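A minimal sketch of two related details: `args` and `kwargs` can be combined in a single `Process`, and a key in `kwargs` that does not match a parameter name fails with a `TypeError`. The helper `call_like_process` is hypothetical; it only mimics, in the current process, how the target ends up being called with the positional and keyword arguments.

```python
import multiprocessing

def coding(num, name):
    for _ in range(num):
        print(name, "Coding...")

def call_like_process(target, args=(), kwargs=None):
    """Call target with positional arguments first, then keyword arguments."""
    return target(*args, **(kwargs or {}))

if __name__ == "__main__":
    # args and kwargs combined: num comes from args, name from kwargs.
    process = multiprocessing.Process(target=coding, args=(2,), kwargs={"name": "lalal"})
    process.start()
    process.join()

    # A kwargs key that is not a parameter name fails with TypeError.
    try:
        call_like_process(coding, args=(2,), kwargs={"nam": "lalal"})
    except TypeError as error:
        print("bad keyword:", error)
```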
Getting the process ID
- Get the ID of the current process: os.getpid()
- Get the ID of the current process's parent: os.getppid()
Example

    import time
    import multiprocessing
    import os

    def work():
        # Get the ID of the current process
        print("work process id:", os.getpid())
        # Get the ID of the parent of the current process
        print("work parent process id:", os.getppid())

    # Task functions
    def coding(num, name):
        print("coding>>%d:" % os.getpid())
        print("coding_father>>%d:" % os.getppid())
        for i in range(num):
            print(name)
            print("Coding...")
            time.sleep(0.2)

    def music(count):
        print("music>>%d:" % os.getpid())
        print("music_father>>%d:" % os.getppid())
        for i in range(count):
            print("music...")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the process objects
        coding_process = multiprocessing.Process(target=coding, args=(3, "lalal"))
        music_process = multiprocessing.Process(target=music, kwargs={"count": 2})
        # Start the processes
        coding_process.start()
        music_process.start()

There are three processes in the example. Running the program creates the main process, which then creates two child processes, coding and music. As the execution result shows, both child processes report the same parent process ID. Output lines from the two processes can interleave.

    music>>30304:
    coding>>43616:
    coding_father>>32496:
    lalal
    Coding...
    music_father>>32496:
    music...
    lalalmusic...
    Coding...
    lalal
    Coding...

Global variables are not shared between processes
Creating a child process copies the resources of the main process to produce a new process, so the main process and the child process are independent of each other.
Example:

    import multiprocessing
    import time

    # Define the global variable
    my_list = list()

    def write_data():
        for i in range(3):
            my_list.append(i)
            print("add", i)
        print("write_data", my_list)

    def read_data():
        print("read_data", my_list)

    if __name__ == "__main__":
        # Create the process that writes the data and the process that reads it
        write_process = multiprocessing.Process(target=write_data)
        read_process = multiprocessing.Process(target=read_data)
        # Start the write process
        write_process.start()
        time.sleep(1)
        # Alternatively, make the main process wait until the write process
        # has finished before continuing:
        # write_process.join()
        read_process.start()

Execution result:

    add 0
    add 1
    add 2
    write_data [0, 1, 2]
    read_data []

The result shows that the child processes are independent of each other and do not share global variables. It is like the little monkeys that the Monkey King conjures from his hairs: creating a child process copies the resources of the main process, so the child process is a copy of the main process, like a twin. The processes do not share the global variable because they are not operating on the global variable of the same process. The global variables in the different processes merely have the same name.
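The example above relies on a fixed `time.sleep(1)` and mentions `join()` only in a comment. The sketch below, a variation and not the author's code, uses `join()` so the reader process starts only after the writer has finished, and also prints the main process's own copy of the list.

```python
import multiprocessing

my_list = []

def write_data():
    for i in range(3):
        my_list.append(i)
    print("write_data", my_list)

def read_data():
    print("read_data", my_list)

if __name__ == "__main__":
    write_process = multiprocessing.Process(target=write_data)
    read_process = multiprocessing.Process(target=read_data)

    write_process.start()
    write_process.join()   # block until the writer has finished
    read_process.start()
    read_process.join()

    # The main process has its own copy as well, and it is still empty.
    print("main", my_list)
```

Waiting with `join()` does not change the conclusion: the writer only modified its own copy of `my_list`.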
End order of the main process and child processes
Take WeChat as an example:
- The main process opens WeChat, and a child process opens a chat window. When WeChat is closed, the chat window is closed first and then WeChat itself.
Example:

    import time
    import multiprocessing

    def work():
        for i in range(10):
            print("working..")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the process object
        work_process = multiprocessing.Process(target=work)
        # Start the process
        work_process.start()
        time.sleep(1)
        print("main process finished")

The execution result shows that the main process waits for the child process to finish before the program terminates:

    working..
    working..
    working..
    working..
    main process finished
    working..
    working..
    working..
    working..
    working..
    working..

Making the child a daemon process, or terminating it manually (either way the child process is destroyed when the main process ends)
If the child process is marked as a daemon, it is destroyed as soon as the main process exits, and the remaining code in the child process is no longer executed.
Example:

    import time
    import multiprocessing

    def work():
        for i in range(10):
            print("working..")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the process object
        work_process = multiprocessing.Process(target=work)
        # Mark the child as a daemon (must be set before start())
        work_process.daemon = True
        # Start the process
        work_process.start()
        time.sleep(1)
        print("main process finished")

Alternatively, terminate the child process manually (same work() function as above):

    if __name__ == '__main__':
        # Create the process object
        work_process = multiprocessing.Process(target=work)
        # Start the process
        work_process.start()
        time.sleep(1)
        # Terminate the child process manually
        work_process.terminate()
        print("main process finished")

With either version, the program ends as soon as the main process finishes:

    working..
    working..
    working..
    working..
    main process finished

Introduction to threads
A process is the smallest unit of resource allocation. Once a process is created, it is allocated a certain amount of resources. It is like having to open the QQ application twice in order to chat with two people.
Why use multithreading:
A thread is the smallest unit of program execution. The process is only responsible for allocating resources; it is the thread that uses these resources to execute the program. In other words, the process is a container for threads, and every process has at least one thread that executes the program. A thread itself does not own system resources; it needs only the few resources that are essential for running, but **it shares all the resources owned by the process with the other threads that belong to the same process**. This is like opening several windows (multiple threads) in one QQ application (one process) to chat with several people, which saves resources while still achieving multitasking.
The role of multithreading
Multithreading is another way to complete multiple tasks.
Thread creation steps
- Import the thread module
  - import threading
- Create a thread object through the Thread class
  - thread_object = threading.Thread(target=task_name)
- Start the thread to execute the task
  - thread_object.start()
Parameters of the Thread class:

| Parameter | Description |
| --- | --- |
| target | Name of the target task to execute, i.e. the function (method) name |
| name | Thread name; usually not set |
| group | Thread group; currently only None can be used |
Code for thread creation and startup

    # Create the child threads
    coding_thread = threading.Thread(target=coding)
    music_thread = threading.Thread(target=music)
    # Start the threads
    coding_thread.start()
    music_thread.start()

Example

    import time
    import threading

    # Task functions
    def coding():
        for i in range(3):
            print("Coding...")
            time.sleep(0.2)

    def music():
        for i in range(3):
            print("music...")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the child threads
        coding_thread = threading.Thread(target=coding)
        music_thread = threading.Thread(target=music)
        # Start the threads
        coding_thread.start()
        music_thread.start()

Execution result:

    Coding...
    music...
    Coding...
    music...
    Coding...
    music...

Executing a task with parameters in a thread (args and kwargs work the same way as for processes)
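A minimal sketch of the thread version, mirroring the process example: `args` is a tuple in parameter order and `kwargs` is a dictionary keyed by parameter name. The `results` list is an addition for this sketch so the effect can be inspected after the threads finish.

```python
import threading

results = []

def coding(num, name):
    for i in range(num):
        results.append(("coding", name, i))

def music(count):
    for i in range(count):
        results.append(("music", i))

# args: tuple in parameter order; kwargs: keys match the parameter names.
coding_thread = threading.Thread(target=coding, args=(3, "lalal"))
music_thread = threading.Thread(target=music, kwargs={"count": 2})

coding_thread.start()
music_thread.start()
coding_thread.join()
music_thread.join()

print(len(results), "entries recorded")
```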
End order of the main thread and child threads
The main thread waits for all child threads to finish before the program ends.
Example

    import time
    import threading

    def work():
        for i in range(10):
            print("working..")
            time.sleep(0.2)

    if __name__ == '__main__':
        # Create the thread object
        work_thread = threading.Thread(target=work)
        # Start the thread
        work_thread.start()
        time.sleep(1)
        print("main thread finished")

The result is the same as in the process example:

    working..
    working..
    working..
    working..
    working..
    main thread finished
    working..
    working..
    working..
    working..
    working..

Making the child a daemon thread so it is destroyed with the main thread (same idea as the process example)

    import time
    import threading

    def work():
        for i in range(10):
            print("work..")
            time.sleep(0.2)

    if __name__ == "__main__":
        # Way 1: pass daemon=True when creating the thread
        work_thread = threading.Thread(target=work, daemon=True)
        work_thread.start()
        time.sleep(1)
        print("main thread finished")

    import time
    import threading

    def work():
        for i in range(10):
            print("work..")
            time.sleep(0.2)

    if __name__ == "__main__":
        work_thread = threading.Thread(target=work)
        # Way 2: call setDaemon(True) before start()
        # (deprecated since Python 3.10; work_thread.daemon = True is equivalent)
        work_thread.setDaemon(True)
        work_thread.start()
        time.sleep(1)
        print("main thread finished")

The purpose of a daemon thread is that the child thread is destroyed when the main thread exits, so the main thread does not have to wait for the child thread to finish.
There are two ways to make a thread a daemon:
1. threading.Thread(target=task_name, daemon=True)
2. thread_object.setDaemon(True) (deprecated since Python 3.10 in favor of setting thread_object.daemon = True)
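A short sketch of setting the daemon flag through the `daemon` attribute, which is what `setDaemon()` does under the hood. It also illustrates that the flag has to be set before `start()`. The variable names are placeholders.

```python
import threading
import time

def work():
    time.sleep(0.5)

# Pass daemon=True to the constructor...
first = threading.Thread(target=work, daemon=True)

# ...or assign the attribute before calling start().
second = threading.Thread(target=work)
second.daemon = True

print(first.daemon, second.daemon)

# The flag cannot be changed once the thread has been started.
first.start()
try:
    first.daemon = False
except RuntimeError as error:
    print("too late:", error)
```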
Execution order among threads
Threads are executed in no fixed order; which thread runs first is determined by CPU scheduling.
Getting information about the current thread

    # Get the current thread object with current_thread()
    current_thread = threading.current_thread()
    # Printing it shows information about the thread, for example the order in which it was created
    print(current_thread)

    import time
    import threading

    def get_info():
        time.sleep(0.5)
        # Get the thread information
        current_thread = threading.current_thread()
        print(current_thread)

    if __name__ == '__main__':
        # Create the child threads
        for i in range(10):
            # Note: target is the function name, without parentheses
            sub_thread = threading.Thread(target=get_info)
            sub_thread.start()

Result:

    <Thread(Thread-1, started 20796)>
    <Thread(Thread-2, started 20740)>
    <Thread(Thread-3, started 38556)>
    <Thread(Thread-5, started 27744)>
    <Thread(Thread-4, started 13812)>
    <Thread(Thread-9, started 41968)>
    <Thread(Thread-6, started 26052)>
    <Thread(Thread-7, started 11520)>
    <Thread(Thread-8, started 10692)>
    <Thread(Thread-10, started 36928)>

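To make the ordering easier to inspect, threads can be given explicit names through the `name` parameter and each thread can record its own name. This is a sketch with made-up names ("worker-0" and so on); the recorded order is up to the scheduler and may differ from run to run.

```python
import threading

finished = []   # thread names, in the order the threads ran

def get_info():
    finished.append(threading.current_thread().name)

threads = [threading.Thread(target=get_info, name=f"worker-{i}") for i in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(finished)
```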
Global variables are shared between threads

    import threading
    import time

    # Define the global variable
    my_list = list()

    def write_data():
        for i in range(3):
            my_list.append(i)
            print("add", i)
        print("write_data", my_list)

    def read_data():
        print("read_data", my_list)

    if __name__ == "__main__":
        # Create the thread that writes the data and the thread that reads it
        write_thread = threading.Thread(target=write_data)
        read_thread = threading.Thread(target=read_data)
        # Start the write thread
        write_thread.start()
        time.sleep(1)
        # Alternatively, make the main thread wait until the write thread
        # has finished before continuing:
        # write_thread.join()
        read_thread.start()

The execution result confirms that global variables are shared between threads:

    add 0
    add 1
    add 2
    write_data [0, 1, 2]
    read_data [0, 1, 2]

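The same check can be written with `join()` in place of the fixed sleep, as the commented-out line in the example hints. A sketch, with the main thread reading the list after the writer has finished:

```python
import threading

my_list = []

def write_data():
    for i in range(3):
        my_list.append(i)

write_thread = threading.Thread(target=write_data)
write_thread.start()
write_thread.join()   # wait for the writer instead of sleeping

# The main thread sees the writer's changes: same process, same list.
print("main thread sees", my_list)
```

Compare this with the process version earlier, where the list in the other process stayed empty.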
Errors when threads share global variable data
Requirements
- Define two functions that each loop 1 million times and add 1 to a global variable on every iteration
- Create two child threads that execute the two functions, and check the computed result

    import threading

    # Define the global variable
    glob_v = 0

    # Add 1 to the global variable
    def sum_num1():
        global glob_v
        for i in range(1000000):  # test with 100,000 and with 1,000,000
            glob_v += 1
        print("glob_v1", glob_v)

    def sum_num2():
        global glob_v
        for i in range(1000000):
            glob_v += 1
        print("glob_v2", glob_v)

    if __name__ == "__main__":
        sum1_thread = threading.Thread(target=sum_num1)
        sum2_thread = threading.Thread(target=sum_num2)
        # Start the threads
        sum1_thread.start()
        sum2_thread.start()

Execution result:

    # 100,000 iterations
    glob_v1 100000
    glob_v2 200000
    # 1,000,000 iterations (wrong result)
    glob_v1 1166860
    glob_v2 1260019

While one thread is updating the global variable, the other thread may update it at the same time. Both threads can read the same old value, each add 1 to it, and write back the same result, so the addition is performed twice but the value increases by only 1.
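One way to see why the update can be lost is to look at the bytecode behind `glob_v += 1`. The sketch below assumes CPython; instruction names vary between versions, but the read of the global and the write back to it are separate instructions, and another thread can run in between.

```python
import dis

glob_v = 0

def add_one():
    global glob_v
    glob_v += 1

# Collect the names of the bytecode instructions behind add_one().
op_names = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(op_names)

# The read and the write of the global are separate instructions.
load_index = op_names.index("LOAD_GLOBAL")
store_index = op_names.index("STORE_GLOBAL")
print("read happens before write:", load_index < store_index)
```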
Solution
- Thread synchronization: the threads coordinate and run in a predetermined order, like walkie-talkies in real life (half-duplex communication), where only one side talks at a time
- With thread synchronization, only one thread can operate on the global variable at any given time
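One simple reading of "running in a predetermined order" is to let the first thread finish before the second one starts, using `join()`. This is a sketch of that idea, not necessarily the approach the author had in mind. It gives the correct total, at the cost of the two threads no longer overlapping.

```python
import threading

glob_v = 0

def sum_num():
    global glob_v
    for _ in range(1000000):
        glob_v += 1

first = threading.Thread(target=sum_num)
second = threading.Thread(target=sum_num)

first.start()
first.join()     # wait for the first thread before starting the second
second.start()
second.join()

print(glob_v)
```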
Thread synchronization method: mutual exclusion lock (mutex)
Definition: a mutex locks the shared data so that only one thread can operate on it at any given time.
Note: several threads compete for the mutex. The thread that obtains the lock runs first, and the threads that do not obtain it wait. Once the lock has been used and released, the waiting threads compete for it again.
Steps for usage
- Create the mutex
  - mutex = threading.Lock()
- Lock
  - mutex.acquire()
- Unlock
  - mutex.release()
Routine:

    import threading

    # Define the global variable
    glob_v = 0

    # Add 1 to the global variable
    def sum_num1():
        global glob_v
        # Lock
        mutex.acquire()
        for i in range(1000000):  # test with 100,000 and with 1,000,000
            glob_v += 1
        # Unlock
        mutex.release()
        print("glob_v1", glob_v)

    def sum_num2():
        global glob_v
        # Lock
        mutex.acquire()
        for i in range(1000000):  # test with 100,000 and with 1,000,000
            glob_v += 1
        # Unlock
        mutex.release()
        print("glob_v2", glob_v)

    if __name__ == "__main__":
        # Create the lock
        mutex = threading.Lock()
        sum1_thread = threading.Thread(target=sum_num1)
        sum2_thread = threading.Thread(target=sum_num2)
        # Start the threads
        sum1_thread.start()
        sum2_thread.start()

    # Execution result
    # glob_v1 1000000
    # glob_v2 2000000

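A variation on the routine above, as a sketch: the lock is used as a context manager (`with mutex:`), which acquires it on entry and releases it on exit, and it protects each single increment and not the whole loop. The iteration count is reduced to 100,000 per thread to keep the run short.

```python
import threading

glob_v = 0
mutex = threading.Lock()

def sum_num(times):
    global glob_v
    for _ in range(times):
        # The with block acquires the lock and always releases it on exit.
        with mutex:
            glob_v += 1

threads = [threading.Thread(target=sum_num, args=(100000,)) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(glob_v)
```

Holding the lock around the whole loop, as in the routine, makes the two threads run one after the other. Locking each increment lets them take turns, at the price of many more lock operations.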
Deadlock
- A deadlock occurs when threads keep waiting for the other party to release a lock
- Result: the application stops responding and cannot process any other tasks
In real work, take care to always release the lock.
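Two habits help with this, sketched below under the assumption of a plain `threading.Lock`: pass a `timeout` to `acquire()` so a forgotten release shows up as a failed acquire and not as a hang, and use `with` so the lock is released even when the protected code raises. The function `risky_update` is a made-up example.

```python
import threading

mutex = threading.Lock()

# A lock that is acquired and never released blocks every later acquire().
mutex.acquire()
got_lock = mutex.acquire(timeout=0.1)   # give up after 0.1 s instead of hanging
print("second acquire succeeded:", got_lock)
mutex.release()

def risky_update():
    # with releases the lock even when the body raises.
    with mutex:
        raise ValueError("something went wrong while holding the lock")

try:
    risky_update()
except ValueError:
    pass

print("lock still held after the error:", mutex.locked())
```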
Process vs. thread

| | Process | Thread |
| --- | --- | --- |
| Relationship | A process has one thread by default, but can contain several | Threads are attached to processes; there are no threads without a process |
| Differences | 1. Global variables cannot be shared; 2. Creating a process has a high resource overhead; 3. The process is the basic unit of system resource allocation | 1. Global variables are shared, so resource competition and deadlocks must be guarded against (available solutions: mutex or thread synchronization); 2. Creating a thread has a very small resource overhead; 3. The thread is the basic unit of CPU scheduling; 4. Threads cannot run on their own and must exist inside a process |
| Advantages and disadvantages | Can use multiple cores; high resource overhead | Low resource overhead; cannot use multiple cores (in standard CPython, because of the global interpreter lock) |