Preface
In the previous article, [Python] From Getting Started to Top: Multi-Threading (9), I covered the difference between Python threads and processes and explained that Python threads are subject to the GIL: any thread must acquire this lock before it can execute. As a result, Python multithreading cannot make effective use of multiple cores to run tasks in parallel.
1.Multiple processes
1.fork() system call
To write multi-process (multiprocessing) programs in Python, we first need some background on the operating system.
- Unix/Linux operating systems provide a fork() system call, and it is very special. An ordinary function is called once and returns once, but fork() is called once and returns twice, because the operating system automatically makes a copy of the current process (called the parent process); the copy is called the child process. The call then returns in both the parent process and the child process.
- In the child process fork() always returns 0, while in the parent process it returns the ID of the child process. The reason is that one parent process can fork many child processes, so the parent needs to record the ID of each child, whereas a child only needs to call getppid() to get the ID of its parent.
2.OS module
Python's os module wraps common system calls, including fork, so a child process can easily be created from a Python program:
import os
print('Process (%s) start...' % os.getpid())
# Only works on Unix/Linux/Mac:
pid = os.fork()
if pid == 0:
    print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
else:
    print('I (%s) just created a child process (%s).' % (os.getpid(), pid))
Since Windows does not have a fork call, the code above cannot run on Windows. macOS is built on a BSD (a Unix variant) kernel, so the code runs without problems on a Mac. I recommend using a Mac to learn Python!
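- With the fork call, a process that receives a new task can copy itself into a child process to handle that task. A common example is the Apache server: the parent process listens on a port, and whenever a new HTTP request arrives it forks a child process to handle that request.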
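The fork-per-task pattern described above can be sketched as follows. This is only an illustration for Unix-like systems, not how Apache is actually implemented: the function name fork_per_task and the string tasks are hypothetical, and the "work" is just a print.

```python
import os
import sys

def fork_per_task(tasks):
    """Fork one child per task (Unix only) and return {task: exit_code}."""
    children = {}  # child pid -> task; the parent has to record each child's ID
    for task in tasks:
        sys.stdout.flush()  # avoid duplicating buffered output in the child
        pid = os.fork()
        if pid == 0:
            # Child: fork() returned 0. Handle the task, then exit at once
            # so the child never continues the parent's loop.
            print('child %s handles %r (parent %s)'
                  % (os.getpid(), task, os.getppid()), flush=True)
            os._exit(0)
        children[pid] = task
    results = {}
    for pid, task in children.items():
        _, status = os.waitpid(pid, 0)  # wait for this particular child
        results[task] = os.WEXITSTATUS(status)
    return results

if __name__ == '__main__':
    print(fork_per_task(['request-1', 'request-2', 'request-3']))
```

The child leaves through os._exit() rather than returning, which is one common way to make sure only the parent keeps forking and waiting.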
3.multiprocessing module
If you plan to write a multi-process service program, Unix/Linux is undoubtedly the right choice. Since Windows does not have a fork call, is it impossible to write multi-process programs in Python on Windows?
- Since Python is cross-platform, it naturally provides cross-platform multi-process support. The multiprocessing module is the cross-platform multi-process module.
- The multiprocessing module provides a Process class to represent a process object.
For example: start a child process and wait for it to end:
from multiprocessing import Process
import os
# Code to be executed by the child process
def run_proc(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))
if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()
    p.join()
    print('Child process end.')
- To create a child process, you only need to create a Process instance, pass in the function to execute and its arguments, and start it with the start() method. This makes creating a process even simpler than calling fork().
- The join() method waits for the child process to finish before execution continues, and is usually used for synchronization between processes.
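The effect of start() and join() can be observed through a Process object's is_alive() method and exitcode attribute. The sketch below is only an illustration; the helper describe and the worker names are hypothetical.

```python
from multiprocessing import Process
import os

def greet(name):
    print('Hello %s from child %s' % (name, os.getpid()))

def describe(p):
    """Summarize a Process object's current state as a string."""
    return '%s alive=%s exitcode=%s' % (p.name, p.is_alive(), p.exitcode)

if __name__ == '__main__':
    procs = [Process(target=greet, args=(n,), name='worker-%d' % i)
             for i, n in enumerate(['a', 'b'])]
    for p in procs:
        print(describe(p))   # not started yet: not alive, no exit code
        p.start()
    for p in procs:
        p.join()             # block until this child has finished
        print(describe(p))   # finished: not alive, exit code is set
```

Before start() the exit code is None; after join() returns, a child that ended normally reports exit code 0.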
4. Process pool (multiprocessing Pool module)
If you want to start a large number of child processes, you can use a process pool to create them in batches:
from multiprocessing import Pool
import os, time, random
def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))
if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    # Create a process pool of size 4
    p = Pool(4)
    # Submit tasks in a loop, passing the function to call and its arguments
    for i in range(13):
        p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done...')
    # Close the pool and wait for all child processes in it to finish
    p.close()
    p.join()
    print('All subprocesses done.')
Code interpretation:
- Calling join() on a Pool object waits for all child processes to finish. close() must be called before join(), and once close() has been called no new tasks can be added.
- Note the output: with Pool(4) as written above, tasks 0, 1, 2 and 3 start immediately, while tasks 4-12 each have to wait for an earlier task to finish. If the pool is created with Pool() and no argument, its default size is the number of CPUs Python detects (os.cpu_count(), which counts logical rather than physical cores). My computer has 6 cores, so in that case tasks 0-5 start immediately and at least 7 tasks must be submitted to see the waiting effect: 13 tasks are submitted, but at most 6 run simultaneously. This is an intentional design limit of the Pool, not a limitation of the operating system; passing a different size to Pool changes how many tasks run at once.
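The pool example above only prints from inside the tasks. When the return values are needed, apply_async() hands back an AsyncResult object whose get() method waits for and returns the result. A minimal sketch, assuming a trivial task function square and a hypothetical helper run_pool:

```python
from multiprocessing import Pool

def square(n):
    """The task run in the worker processes."""
    return n * n

def run_pool(size, numbers):
    """Run square() over numbers in a pool of the given size."""
    with Pool(size) as pool:
        # Submit everything first, then collect; results keep submission order.
        pending = [pool.apply_async(square, args=(n,)) for n in numbers]
        return [r.get(timeout=10) for r in pending]

if __name__ == '__main__':
    print(run_pool(4, range(13)))
```

All results are collected inside the with block, because leaving the block terminates the pool.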
5. Subprocess (subprocess module)
Often the child process is not a copy of our own program but an external process. After creating the subprocess, we also need to control its input and output.
The subprocess module lets us start a child process very conveniently and then control its input and output.
For example, the following Python code runs the command nslookup www.baidu.com (which queries the DNS records of a domain name); this has the same effect as running it directly from the command line:
import subprocess
print('$ nslookup www.baidu.com')
r = subprocess.call(['nslookup', 'www.baidu.com'])
print('Exit code:', r)
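subprocess.call() only returns the exit code; the child's output goes straight to the terminal. To capture the output as well, subprocess.run() can be used. The sketch below is an illustration only: it starts the current Python interpreter as the child instead of nslookup so that it runs anywhere, and the helper name run_and_capture is hypothetical.

```python
import subprocess
import sys

def run_and_capture(args):
    """Run a command and return (exit_code, captured_stdout_text)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout

if __name__ == '__main__':
    code, out = run_and_capture([sys.executable, '-c', 'print("hello from child")'])
    print('Exit code:', code)
    print('Output:', out.strip())
```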
If the child process also needs input, it can be supplied through the communicate() method:
import subprocess
print('$ nslookup')
p = subprocess.Popen(['nslookup'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = p.communicate(b'set q=mx\nbaidu.com\nexit\n')
print(output.decode('gbk'))
print('Exit code:', p.returncode)
The above code is equivalent to executing the command nslookup on the command line and then manually entering:
set q=mx
baidu.com
exit
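The same pattern works with any child program that reads from standard input. The following sketch replaces nslookup with a tiny Python child that upper-cases each input line, so that it can be run without network access; the child script and the helper feed_lines are made up for this illustration.

```python
import subprocess
import sys

# Source of a small child program: echo every stdin line in upper case.
CHILD = 'import sys\nfor line in sys.stdin:\n    print(line.strip().upper())'

def feed_lines(lines):
    """Send lines to the child's stdin and return (exit_code, output_words)."""
    p = subprocess.Popen([sys.executable, '-c', CHILD],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    # communicate() writes the input, closes stdin, and reads until EOF.
    output, _ = p.communicate('\n'.join(lines).encode() + b'\n')
    return p.returncode, output.decode().split()

if __name__ == '__main__':
    print(feed_lines(['abc', 'def']))
```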
6. Inter-process communication
Processes certainly need to communicate with each other. The operating system provides many mechanisms for inter-process communication. Python's multiprocessing module wraps the underlying mechanisms and provides Queue, Pipe and other ways to exchange data.
- Taking Queue as an example, create two child processes in the parent process, one writes data to the Queue, and the other reads data from the Queue:
from multiprocessing import Process, Queue
import os, time, random
# Code executed by the writer process:
def write(q):
    print('Process to write: %s' % os.getpid())
    # Put values at the tail of the queue in a loop
    for value in ['1', '2', '3', '4', '5']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random())
# Code executed by the reader process:
def read(q):
    print('Process to read: %s' % os.getpid())
    # Read values from the head of the queue in a loop
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)
if __name__ == '__main__':
    # The parent process creates the Queue and passes it to each child process:
    q = Queue()
    # Writer process
    pw = Process(target=write, args=(q,))
    # Reader process
    pr = Process(target=read, args=(q,))
    # Start child process pw, which writes:
    pw.start()
    # Start child process pr, which reads:
    pr.start()
    # Wait for pw to finish:
    pw.join()
    # pr runs an infinite loop, so we cannot wait for it to finish; terminate it forcibly:
    pr.terminate()
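The example above has to call terminate() because the reader loops forever. A common alternative, sketched here under the assumption that None is never a real data value, is to have the writer put a sentinel on the queue so the reader can stop by itself and be joined normally. The SENTINEL name and the modified write/read signatures are choices made for this sketch.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumption: None never appears as a real value

def write(q, values):
    for value in values:
        q.put(value)
    q.put(SENTINEL)  # tell the reader that no more data will come

def read(q):
    received = []
    while True:
        value = q.get()
        if value is SENTINEL:
            break
        received.append(value)
    print('Reader got:', received)
    return received

if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q, ['1', '2', '3', '4', '5']))
    pr = Process(target=read, args=(q,))
    pw.start()
    pr.start()
    pw.join()
    pr.join()  # the reader exits on its own, so join() returns
```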
7. Summary
7.1. Learning summary
- Under Unix/Linux, you can call fork() to create multiple processes.
- To write cross-platform multi-process programs, use the multiprocessing module.
  - Under Unix/Linux, the multiprocessing module wraps the fork() system call, so we do not need to care about the details of fork().
  - Since Windows does not have a fork call, multiprocessing has to "simulate" the effect of fork(): all Python objects of the parent process must be serialized with pickle and passed to the child process. So if a multiprocessing call fails on Windows, first consider whether pickling failed.
- Inter-process communication is implemented through Queue, Pipe, etc.
7.2. Python distributed process error: pickle module cannot serialize lambda function
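Reason:
On Windows, multiprocessing cannot fork, so the callables registered with the manager have to be pickled and sent to the manager's server process, and pickle cannot serialize lambda functions. We therefore need to define our own named functions so that they can be serialized.
- Modify the code slightly and define two functions, return_task_queue and return_result_queue, so that serialization works: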
# task_master.py
import random, time, queue
from multiprocessing.managers import BaseManager
# Queue for sending tasks:
task_queue = queue.Queue()
# Queue for receiving results:
result_queue = queue.Queue()
# On Windows we must define our own functions so they can be serialized, then register them with QueueManager.register; Unix/Linux does not need this
def return_task_queue():
    global task_queue
    return task_queue
def return_result_queue():
    global result_queue
    return result_queue
# QueueManager inherits from BaseManager:
class QueueManager(BaseManager):
    pass
if __name__ == '__main__':
    # Register both Queues on the network; the callable argument ties each name to a Queue object
    QueueManager.register('get_task_queue', callable=return_task_queue)
    QueueManager.register('get_result_queue', callable=return_result_queue)
    # Bind to port 5000 and set the authkey to abc
    manager = QueueManager(address=('127.0.0.1', 5000), authkey=b'abc')
    # Start the manager
    manager.start()
    # Get the Queue objects that are accessed through the network
    task = manager.get_task_queue()
    result = manager.get_result_queue()
    # Put a few tasks in
    for i in range(10):
        n = random.randint(0, 1000)
        print('Add task %d' % n)
        task.put(n)
    # Read results from the result queue
    print('Try to get results')
    for i in range(10):
        r = result.get(timeout=10)
        print('Result: %s' % r)
    # Shut down
    manager.shutdown()
    print('master exit')
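The pickling limitation behind this error can be checked directly with the standard pickle module, without starting any processes. This is a minimal sketch: the helper name can_pickle is hypothetical, and os.path.join merely stands in for any named, module-level function.

```python
import os.path
import pickle

def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

if __name__ == '__main__':
    # A lambda has no importable name, so pickle cannot refer to it.
    print('lambda:', can_pickle(lambda: 0))
    # A named module-level function is pickled by reference to its name.
    print('named function:', can_pickle(os.path.join))
```

This is why replacing the lambdas with return_task_queue and return_result_queue avoids the error: pickle stores only the module and function name, and the receiving process looks the function up again.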
2. Distributed process
1.What is Python’s distributed process?
When choosing between threads and processes, Process should be preferred: processes are more stable, and they can be distributed across multiple machines, whereas threads can at most be distributed across multiple CPUs on the same machine.
- Python's multiprocessing module not only supports multiple processes; its managers submodule also supports distributing processes across multiple machines. A service process can act as a scheduler that distributes tasks to other processes, relying on network communication.
- Because the managers module is well encapsulated, distributed multi-process programs can be written easily without knowing the details of network communication.
- For example: suppose we already have a multi-process program that communicates through a Queue and runs on a single machine. Because the process that handles tasks has a heavy workload, we now want to put the process that sends tasks and the process that handles tasks on two separate machines. How can this be implemented with distributed processes?
- The original Queue can still be used; by exposing the Queue over the network through the managers module, processes on other machines can access it.
2. How to implement distributed processes
Write service process
- The service process is responsible for starting the Queue, registering the Queue on the network, and then writing tasks into the Queue:
# task_master.py
import random, time, queue
from multiprocessing.managers import BaseManager
# Queue for sending tasks:
task_queue = queue.Queue()
# Queue for receiving results:
result_queue = queue.Queue()
# On Windows we must define our own functions so they can be serialized, then register them with QueueManager.register; Unix/Linux does not need this
def return_task_queue():
    global task_queue
    return task_queue
def return_result_queue():
    global result_queue
    return result_queue
# QueueManager inherits from BaseManager:
class QueueManager(BaseManager):
    pass
if __name__ == '__main__':
    # Register both Queues on the network; the callable argument ties each name to a Queue object:
    QueueManager.register('get_task_queue', callable=return_task_queue)
    QueueManager.register('get_result_queue', callable=return_result_queue)
    # Bind to port 5000 and set the authkey to 'abc':
    manager = QueueManager(address=('127.0.0.1', 5000), authkey=b'abc')
    # Start the manager:
    manager.start()
    # Get the Queue objects that are accessed through the network:
    task = manager.get_task_queue()
    result = manager.get_result_queue()
    # Put a few tasks in:
    for i in range(10):
        n = random.randint(0, 10000)
        print('Put task %d...' % n)
        task.put(n)
    # Read results from the result queue:
    print('Try get results...')
    for i in range(10):
        r = result.get(timeout=10)
        print('Result: %s' % r)
    # Shut down:
    manager.shutdown()
    print('master exit.')
Important!
- When we write a multi-process program on a single machine, the Queue we create can be used directly.
- However, in a distributed multi-process environment, tasks must not be added by operating on the original task_queue directly, because that bypasses the QueueManager encapsulation; they must be added through the Queue interface obtained from manager.get_task_queue().
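The difference between the raw queue and the managed one can be made visible with a small experiment. In this sketch (the demo function is hypothetical, and port 0 is an assumption that asks the operating system for any free port), an item is put through the proxy and the sizes of both queues are compared. The item lives in the manager's server process, so the module-level task_queue in the parent process stays empty.

```python
import queue
from multiprocessing.managers import BaseManager

task_queue = queue.Queue()

def return_task_queue():
    return task_queue

class QueueManager(BaseManager):
    pass

def demo():
    """Return (size seen through the proxy, size of the parent's raw queue)."""
    QueueManager.register('get_task_queue', callable=return_task_queue)
    # Port 0: let the OS pick a free port (an assumption for this sketch).
    manager = QueueManager(address=('127.0.0.1', 0), authkey=b'abc')
    manager.start()
    try:
        task = manager.get_task_queue()  # proxy to the queue in the server process
        task.put(42)
        return task.qsize(), task_queue.qsize()
    finally:
        manager.shutdown()

if __name__ == '__main__':
    print(demo())
```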
Write task process
- Start the task process on another machine (it can also be started on this machine):
# task_worker.py
import time, sys, queue
from multiprocessing.managers import BaseManager
# Create a similar QueueManager:
class QueueManager(BaseManager):
    pass
if __name__ == '__main__':
    # This QueueManager only gets the Queues from the network, so only names are given when registering:
    QueueManager.register('get_task_queue')
    QueueManager.register('get_result_queue')
    # Connect to the server, that is, the machine running task_master.py:
    server_addr = '127.0.0.1'
    print('Connect to server %s...' % server_addr)
    # The port and authkey must match the settings in task_master.py exactly:
    m = QueueManager(address=(server_addr, 5000), authkey=b'abc')
    # Connect over the network:
    m.connect()
    # Get the Queue objects:
    task = m.get_task_queue()
    result = m.get_result_queue()
    # Take tasks from the task queue and write results to the result queue:
    for i in range(10):
        try:
            n = task.get(timeout=1)
            print('run task %d * %d...' % (n, n))
            r = '%d * %d = %d' % (n, n, n * n)
            time.sleep(1)
            result.put(r)
        except queue.Empty:
            print('task queue is empty.')
    # Processing finished:
    print('worker exit.')
- The task process needs to connect to the service process through the network, so the IP of the service process must be specified.
Start the task_master.py service process first; it prints:
Put task 0...
Put task 1...
Put task 2...
Put task 3...
Put task 4...
Put task 5...
Put task 6...
Put task 7...
Put task 8...
Put task 9...
Try get results...  # waiting for the task process to write results to the queue
- After the task_master.py process has sent the tasks, it starts waiting for results on the result queue.
Now start the task_worker.py process:
Connect to server 127.0.0.1...
run task 0 * 0...
run task 1 * 1...
run task 2 * 2...
run task 3 * 3...
run task 4 * 4...
run task 5 * 5...
run task 6 * 6...
run task 7 * 7...
run task 8 * 8...
run task 9 * 9...
worker exit.
- When the task_worker.py process ends, the task_master.py process continues and prints the results.
What is the use of this simple Master/Worker model?
- In fact, this is simple but genuine distributed computing. By slightly modifying the code and starting multiple workers, you can distribute tasks across several or even dozens of machines. For example, replace the code that calculates n*n with code that sends email, and you have asynchronous sending from a mail queue.
Where are Queue objects stored?
- Notice that task_worker.py contains no code that creates a Queue at all, so the Queue objects are stored in the task_master.py process.
- The Queue can be accessed over the network because of QueueManager. Since QueueManager manages more than one Queue, the network call interface of each Queue needs a name, such as get_task_queue.
What is the use of authkey?
- It ensures that the two machines communicate normally without malicious interference from other machines. If the authkey in task_worker.py does not match the authkey in task_master.py, the connection will fail.
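The authkey check can be tried out on a single machine. In the sketch below, the helpers try_connect and demo are hypothetical, and port 0 is an assumption that lets the operating system pick a free port. A manager is started with authkey b'abc'; one client connects with the same key and another with a different key, whose connect() call is expected to raise multiprocessing.AuthenticationError.

```python
import multiprocessing
from multiprocessing.managers import BaseManager

class QueueManager(BaseManager):
    pass

def try_connect(address, authkey):
    """Return True if a client using this authkey can connect, else False."""
    client = QueueManager(address=address, authkey=authkey)
    try:
        client.connect()
        return True
    except multiprocessing.AuthenticationError:
        return False

def demo():
    # Port 0: let the OS pick a free port (an assumption for this sketch).
    server = QueueManager(address=('127.0.0.1', 0), authkey=b'abc')
    server.start()
    try:
        matching = try_connect(server.address, b'abc')
        mismatched = try_connect(server.address, b'wrong')
        return matching, mismatched
    finally:
        server.shutdown()

if __name__ == '__main__':
    print(demo())
```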
3. Summary
- Python's distributed process interface is simple and well encapsulated, making it suitable for environments where heavy tasks need to be distributed across multiple machines.
- Note that the role of the Queue is to pass tasks and receive results, so the data describing each task should be as small as possible. For example, when sending a task to process a log file, do not send the log file itself, which may be hundreds of megabytes. Instead, send the full path where the log file is stored, and let the Worker process read the file from a shared disk.