[Python] From entry to advanced—multi-process and distributed process (10)

Preface

In the previous article, [Python] From Entry to Advanced—Multi-Threading (9), I already covered the difference between Python threads and processes and the fact that Python threads are subject to the GIL: every thread must acquire the lock before it can execute, so Python multi-threading cannot effectively use multiple cores to run tasks in parallel.


1. Multiple processes

1. The fork() system call

To write multi-process (multiprocessing) programs in Python, we first need to understand some operating-system background.

  • Unix/Linux operating systems provide a fork() system call, and it is very special. An ordinary function is called once and returns once, but fork() is called once and returns twice: the operating system automatically copies the current process (called the parent process) into a new process (called the child process), and then returns in both the parent process and the child process.

    • The child process always gets a return value of 0, while the parent process gets the ID of the child process. The reason is that a parent process can fork many child processes, so it has to keep track of each child process's ID, whereas a child process only needs to call getppid() to get its parent's ID.

2. The os module

Python's os module wraps common system calls, including fork:

import os

print('Process (%s) start...' % os.getpid())
# Only works on Unix/Linux/Mac:
pid = os.fork()
if pid == 0:
    print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
else:
    print('I (%s) just created a child process (%s).' % (os.getpid(), pid))


Since Windows has no fork call, the code above cannot run on Windows. macOS is based on a BSD (a flavor of Unix) kernel, so it runs without problems on a Mac. Using a Mac to learn Python is recommended!

  • With the fork call, a process that receives a new task can copy itself into a child process to handle that task. The most common example is the Apache server: the parent process listens on a port and, whenever a new HTTP request arrives, forks a child process to handle it.
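
As a rough illustration of this pattern, here is a minimal fork-per-connection echo server sketch (not from the original article; the port number is arbitrary, and it only works on Unix/Linux/Mac):

import os
import signal
import socket

# Let the kernel reap finished children so they do not become zombies.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('127.0.0.1', 8888))
server.listen(5)

while True:
    conn, addr = server.accept()
    pid = os.fork()
    if pid == 0:
        # Child process: handle this one connection, then exit.
        server.close()
        conn.sendall(conn.recv(1024))
        conn.close()
        os._exit(0)
    else:
        # Parent process: close its copy of the connection and keep listening.
        conn.close()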

3. The multiprocessing module

If you plan to write a multi-process service program, Unix/Linux is undoubtedly the right choice. But since Windows has no fork call, is it impossible to write multi-process programs in Python on Windows?

  • Since Python is cross-platform, it naturally provides cross-platform multi-process support: the multiprocessing module is the cross-platform version of the multi-process module.

    • The multiprocessing module provides a Process class to represent a process object.

For example: start a child process and wait for it to end:

from multiprocessing import Process
import os

# Code to be executed by the child process
def run_proc(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))

if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()
    p.join()
    print('Child process end.')


  • To create a child process, you only need to create a Process instance, pass in the function to execute and its arguments, and start it with the start() method. This makes creating a process even simpler than fork().
    • The join() method waits for the child process to finish before continuing, and is usually used for synchronization between processes.
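
A minimal sketch of that usage (not from the original article): start several child processes and use join() to wait for all of them before continuing:

from multiprocessing import Process
import os

def worker(n):
    print('Worker %s running in process %s' % (n, os.getpid()))

if __name__ == '__main__':
    # Start three child processes, then join() each one so that the parent
    # only continues after every child has finished.
    procs = [Process(target=worker, args=(i,)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print('All child processes done.')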

4. Process pool (multiprocessing.Pool)

If you want to start a large number of child processes, you can use a process pool to create them in batches:

from multiprocessing import Pool
import os, time, random


def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))


if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    # Create a process pool of size 4
    p = Pool(4)
    
    # Submit tasks in a loop, passing the function to call and its arguments
    for i in range(13):
        p.apply_async(long_time_task, args=(i,))
        
    print('Waiting for all subprocesses done...')
    
    # Close the pool, then wait for all its child processes to finish
    p.close()
    p.join()
    print('All subprocesses done.')

The execution results show the behavior described below.


Code interpretation:

Calling join() on a Pool object waits for all of the pool's child processes to finish; close() must be called before join(). After close() is called, no new tasks can be submitted to the pool.

  • Note the output: tasks 0 through 3 start immediately, while the remaining tasks have to wait for an earlier task to finish before they run, because the pool here is created with size 4 and therefore runs at most 4 tasks at the same time. Created without an argument, a Pool defaults to the number of CPU cores (6 on my machine), in which case you need to submit at least 7 child processes to see the waiting effect.
    • So although 13 child processes are created, only a limited number execute simultaneously. This is an intentional design limit of Pool, not a limit of the operating system. If the Pool is created with a larger size, for example Pool(13), all 13 tasks can run at once.
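
A minimal sketch (an illustration, not from the original article) that checks the default size and creates a pool large enough to run all 13 tasks at once:

from multiprocessing import Pool, cpu_count
import time

def square(n):
    time.sleep(0.5)
    return n * n

if __name__ == '__main__':
    # The default pool size would be cpu_count(); ask for 13 workers explicitly
    # so that no task has to wait for a free slot.
    print('CPU count (default Pool size): %d' % cpu_count())
    with Pool(13) as p:
        results = p.map(square, range(13))  # blocks until all tasks finish
    print(results)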

5. Subprocess (subprocess module)

Very often the child process is not the program itself but an external process. After creating the subprocess, we also need to control its input and output.

  • The subprocess module lets us start a subprocess very conveniently and then control its input and output.

For example, run the nslookup command in Python code to query the DNS records of a domain; the effect is exactly the same as running it directly on the command line:

import subprocess

print('$ nslookup www.baidu.com')
r = subprocess.call(['nslookup', 'www.baidu.com'])
print('Exit code:', r)


If the child process also needs input, it can be provided through the communicate() method:

import subprocess

print('$ nslookup')
p = subprocess.Popen(['nslookup'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = p.communicate(b'set q=mx\nbaidu.com\nexit\n')
print(output.decode('gbk'))  # decode with GBK because the Chinese Windows console outputs GBK
print('Exit code:', p.returncode)

The above code is equivalent to executing the command nslookup on the command line and then manually entering:

set q=mx
baidu.com
exit

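On Python 3.7 and later, the same interaction can also be written with subprocess.run; a minimal sketch (the GBK decoding, as above, assumes a Chinese-language Windows console):

import subprocess

# Pass the interactive input through the `input` argument and capture the output.
p = subprocess.run(['nslookup'],
                   input=b'set q=mx\nbaidu.com\nexit\n',
                   capture_output=True)
print(p.stdout.decode('gbk', errors='replace'))
print('Exit code:', p.returncode)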

6. Inter-process communication

Processes certainly need to communicate with each other, and the operating system provides many mechanisms for inter-process communication. Python's multiprocessing module wraps the underlying mechanisms and provides several ways to exchange data, such as Queue and Pipe.

  • Taking Queue as an example, create two child processes in the parent process: one writes data into the Queue, and the other reads data from the Queue:
from multiprocessing import Process, Queue
import os, time, random


# Code executed by the writer process:
def write(q):
    print('Process to write: %s' % os.getpid())
    # Write values to the end of the queue in a loop
    for value in ['1', '2', '3', '4', '5']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random())


# Code executed by the reader process:
def read(q):
    print('Process to read: %s' % os.getpid())
    # Read values from the head of the queue in a loop
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)


if __name__ == '__main__':
    # The parent process creates the Queue and passes it to each child process:
    q = Queue()
    # Writer process
    pw = Process(target=write, args=(q,))
    # Reader process
    pr = Process(target=read, args=(q,))

    # Start the child process pw to write:
    pw.start()
    # Start the child process pr to read:
    pr.start()

    # Wait for pw to finish:
    pw.join()
    # pr runs an infinite loop, so we cannot wait for it to finish; we have to terminate it forcibly:
    pr.terminate()

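Besides Queue, the multiprocessing module also provides Pipe for exchanging data between two processes. A minimal sketch (not from the original article):

from multiprocessing import Process, Pipe
import os

def sender(conn):
    # The child sends one message through its end of the pipe, then closes it.
    conn.send('hello from process %s' % os.getpid())
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=sender, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # receive the child's message in the parent
    p.join()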

7. Summary

7.1. Learning summary

  • Under Unix/Linux, you can use the fork() system call to create multiple processes.

  • To write cross-platform multi-process programs, you can use the multiprocessing module.

    • Under Unix/Linux, the multiprocessing module encapsulates the fork() system call, so we do not need to care about the details of fork().
      • Since Windows has no fork call, multiprocessing has to "simulate" the effect of fork(): all Python objects of the parent process must be serialized with pickle and passed to the child process. So if a multiprocessing call fails on Windows, first consider whether pickling failed.
  • Inter-process communication is implemented through Queue, Pipe, and similar mechanisms.

7.2. Python distributed process error: pickle module cannot serialize lambda function


The error is caused by how Windows creates processes: the pickle module cannot serialize lambda functions (the callable passed to QueueManager.register), so we need to define our own named functions to make serialization work.

  • Modify the code slightly: define two functions, return_task_queue and return_result_queue, to make serialization possible.
# task_master.py

import random, time, queue
from multiprocessing.managers import BaseManager

# Queue for sending tasks:
task_queue = queue.Queue()
# Queue for receiving results:
result_queue = queue.Queue()


# On Windows we have to define our own functions for serialization and register them with QueueManager.register; Unix/Linux does not need this
def return_task_queue():
    global task_queue
    return task_queue

def return_result_queue():
    global result_queue
    return result_queue

# QueueManager derived from BaseManager:
class QueueManager(BaseManager):
    pass

if __name__ == '__main__':
    # Register both Queues on the network; the callable argument associates them with the Queue objects
    QueueManager.register('get_task_queue', callable=return_task_queue)
    QueueManager.register('get_result_queue', callable=return_result_queue)

    # Bind to port 5000 and set the authkey to abc
    manager = QueueManager(address=('127.0.0.1', 5000), authkey=b'abc')

    # Start the manager
    manager.start()

    # Obtain the Queue objects that are accessible over the network
    task = manager.get_task_queue()
    result = manager.get_result_queue()

    # Put a few tasks in
    for i in range(10):
        n = random.randint(0, 1000)
        print('Put task %d' % n)
        task.put(n)

    # Read results from the result queue
    print('Try get results...')
    for i in range(10):
        r = result.get(timeout=10)
        print('Result: %s' % r)

    # Shut down
    manager.shutdown()
    print('master exit')


2. Distributed process

1. What is Python's distributed process?

In the earlier summary of threads versus processes, the conclusion was that Python should prefer Process: processes are more stable, and processes can be distributed across multiple machines, whereas threads can at most be distributed across the CPUs of a single machine.

  • Python's multiprocessing module not only supports multiple processes; its managers sub-module also supports distributing those processes across multiple machines. A service process can act as a scheduler and distribute tasks to several other processes, relying on network communication.
      • Because the managers module is well encapsulated, distributed multi-process programs can be written easily without knowing the details of the network communication.

For example: suppose we already have a multi-process program that communicates through a Queue and runs on a single machine. Now, because the process that handles the tasks is overloaded, we want to put the process that sends tasks and the process that handles tasks onto two separate machines. How do we implement that with a distributed process?

  • The original Queue can continue to be used; by exposing the Queue over the network through the managers module, processes on other machines can access the same Queue.

2. How to implement distributed processes

Write service process

  • The service process is responsible for starting the Queue, registering the Queue on the network, and then writing tasks into the Queue.

    # task_master.py
    import random, time, queue
    from multiprocessing.managers import BaseManager
    
    # Queue for sending tasks:
    task_queue = queue.Queue()
    # Queue for receiving results:
    result_queue = queue.Queue()
    
    
    # On Windows we have to define our own functions for serialization and register them with QueueManager.register; Unix/Linux does not need this
    def return_task_queue():
        global task_queue
        return task_queue
    
    
    def return_result_queue():
        global result_queue
        return result_queue
    
    
    # QueueManager derived from BaseManager:
    class QueueManager(BaseManager):
        pass
    
    
    if __name__ == '__main__':
        # Register both Queues on the network; the callable argument associates them with the Queue objects:
        QueueManager.register('get_task_queue', callable=return_task_queue)
        QueueManager.register('get_result_queue', callable=return_result_queue)
        
        # Bind to port 5000, set the authkey to 'abc':
        manager = QueueManager(address=('127.0.0.1', 5000), authkey=b'abc')
        
        # Start the manager:
        manager.start()
        
        # Obtain the Queue objects that are accessible over the network:
        task = manager.get_task_queue()
        result = manager.get_result_queue()
        
        # Put a few tasks in:
        for i in range(10):
            n = random.randint(0, 10000)
            print('Put task %d...' % n)
            task.put(n)
            
        # Read results from the result queue:
        print('Try get results...')
        for i in range(10):
            r = result.get(timeout=10)
            print('Result: %s' % r)
            
        # Shut down:
        manager.shutdown()
        print('master exit.')
    

Important!

  • When we write a multi-process program on a single machine, the Queue we create can be used directly.
    • In a distributed multi-process environment, however, tasks must not be added by operating on the original task_queue directly, because that bypasses the encapsulation of QueueManager; they must be added through the Queue interface obtained from manager.get_task_queue().

Write task process

  • Start the task process on another machine (it can also be started on this machine):
# task_worker.py

import time, sys, queue
from multiprocessing.managers import BaseManager


# Create a similar QueueManager:
class QueueManager(BaseManager):
    pass


if __name__ == '__main__':
    # Since this QueueManager only obtains the Queues from the network, only the names are provided when registering:
    QueueManager.register('get_task_queue')
    QueueManager.register('get_result_queue')

    # Connect to the server, i.e. the machine running task_master.py:
    server_addr = '127.0.0.1'
    print('Connect to server %s...' % server_addr)

    # Note: the port and authkey must exactly match the settings in task_master.py:
    m = QueueManager(address=(server_addr, 5000), authkey=b'abc')
    # Connect over the network:
    m.connect()

    # Get the Queue objects:
    task = m.get_task_queue()
    result = m.get_result_queue()

    # Take tasks from the task queue and write results into the result queue:
    for i in range(10):
        try:
            n = task.get(timeout=1)
            print('run task %d * %d...' % (n, n))
            r = '%d * %d = %d' % (n, n, n * n)
            time.sleep(1)
            result.put(r)
        except queue.Empty:
            print('task queue is empty.')

    # Processing finished:
    print('worker exit.')
  • The task process needs to connect to the service process through the network, so the IP of the service process must be specified.

Start the service process first; it begins executing:

Put task 0...
Put task 1...
Put task 2...
Put task 3...
Put task 4...
Put task 5...
Put task 6...
Put task 7...
Put task 8...
Put task 9...
Try get results...    # waiting for the worker process to write results into the queue
  • After the task_master.py process has sent the tasks, it starts waiting for results from the result queue.

Now start the task_worker.py process:

Connect to server 127.0.0.1...
run task 0 * 0...
run task 1 * 1...
run task 2 * 2...
run task 3 * 3...
run task 4 * 4...
run task 5 * 5...
run task 6 * 6...
run task 7 * 7...
run task 8 * 8...
run task 9 * 9...
worker exit.
  • When the task_worker.py process ends, the task_master.py process continues and prints a Result: line (in the form n * n = ...) for each task.


  • What is the use of this simple Master/Worker model?
    • In fact, this is a simple but genuine form of distributed computing. By slightly modifying the code and starting several workers, the tasks can be distributed to a few or even dozens of machines; for example, replace the code that computes n*n with code that sends email, and you have asynchronous sending through a mail queue.

Where are Queue objects stored?

  • Notice that task_worker.py contains no code at all that creates a Queue, so the Queue objects are stored in the task_master.py process.


  • The reason the Queue can be accessed over the network is the QueueManager. Since the QueueManager manages more than one Queue, each Queue's network interface has to be given a name, such as get_task_queue.

What is the use of authkey?

  • It ensures that the two machines communicate properly and are not maliciously interfered with by other machines. If task_worker.py's authkey does not match task_master.py's authkey, the connection is guaranteed to fail.

3. Summary

  • Python's distributed process interface is simple and well encapsulated, making it suitable for environments where heavy tasks need to be distributed to multiple machines.

  • Note that the role of the Queue is to pass tasks and receive results, so the data describing each task should be as small as possible. For example, to send a task that processes a log file, do not send the log file itself, which may be several hundred megabytes; send the full path where the log file is stored, and let the Worker process read the file from a shared disk.
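
A minimal sketch of this idea (the queue below is a local stand-in for the networked task queue, and the file path is purely illustrative):

import queue

task = queue.Queue()

# Master side: put only the path of the (possibly huge) log file on the queue.
task.put('/shared/logs/access.log')

# Worker side: take the path and read the file itself from the shared disk.
path = task.get(timeout=1)
try:
    with open(path) as f:
        for line in f:
            pass  # process each log line here
except FileNotFoundError:
    print('Log file %s not found (illustrative path).' % path)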

Origin: blog.csdn.net/qq877728715/article/details/132830457