[Python] Multi-process return value comparison


0. Preface

Articles on getting return values from multiple processes are scattered across the Internet; this article gives a brief summary of the common approaches.
Description:

  • The return value of the tested function is passed back in as its argument, so each call depends on the previous one; using multiple processes therefore does not reduce the running time and is actually slower. Running times below are for rough reference only and are not the focus of this article.
  • For a running-time comparison of the apply and apply_async methods, see Reference [1].

1. Main text

A `count_time` decorator is used to time the tested functions.
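The original post links to a separate article for `count_time` and does not show its implementation. A minimal sketch of what such a timing decorator presumably looks like (the exact implementation is an assumption):

```python
import time
from functools import wraps

def count_time(func):
    """Print how long the decorated function takes to run."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print('{} took {:.4f} s'.format(func.__name__, time.time() - start))
        return result
    return wrapper
```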


Import the module:

import multiprocessing
from multiprocessing import Manager
import time

loop_number = 30000  # number of loop iterations

1.1 No multi-process situation

def fun(k):
    """Function under test"""
    print('-----fun, argument is {}----'.format(k))
    m = k + 10
    return m
@count_time
def call_fun():
    """Without multiprocessing"""
    number = 0

    for i in range(loop_number):
        print('number:{}'.format(number))
        number = fun(number)

def main():
    """Main program"""

    # 1. Without multiprocessing
    call_fun()


if __name__ == '__main__':
    main()

Result: (output screenshot omitted)

1.2 Multi-process return value

1.2.1 Method 1: Process pool (Pool)

For a running-time comparison of the pool's apply and apply_async methods, see Reference [1].

1. apply

def fun(k):
    """Function under test"""
    print('-----fun, argument is {}----'.format(k))
    m = k + 10
    return m

@count_time
def my_process():
    """Multiprocessing"""

    # Method 1: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 processes
    k = 0
    for i in range(loop_number):
        k = pool.apply(fun, args=(k,))  # apply blocks until the result is ready
        print('Return value:', k)

 
def main():
    """Main program"""

    # Use multiprocessing
    my_process()


if __name__ == '__main__':
    main()

Result: (output screenshot omitted)

2. apply_async

def fun(k):
    """Function under test"""
    print('-----fun, argument is {}----'.format(k))
    m = k + 10
    return m

@count_time
def my_process():
    """Multiprocessing"""

    # Method 1: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 processes
    k = 0
    for i in range(loop_number):
        res = pool.apply_async(fun, args=(k,))  # returns an AsyncResult object
        k = res.get()  # get() blocks until the result is ready
        print('Return value:', k)

 
def main():
    """Main program"""

    # Use multiprocessing
    my_process()


if __name__ == '__main__':
    main()

Result: (output screenshot omitted)
Note: the running times above are for rough reference only and are not the focus of this article; see Reference [1] for details.

Summary:

  1. Because of the peculiarity of the test function (its return value is passed back in as its argument), both methods are effectively serial, so there is no difference in running time.
  2. apply_async returns the result as an AsyncResult object, so you need to call its get method to obtain the actual value.

Where get is called on the apply_async result also matters:

1. Getting each result inside the loop:

@count_time
def my_process():
    """Multiprocessing"""

    # Method 1: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 processes
    k = 0
    for i in range(loop_number):
        t = pool.apply_async(fun, args=(k,))
        print('Return value:', t.get())  # get() inside the loop blocks each iteration

Result: (output screenshot omitted)

2. Getting all the results at the end:

@count_time
def my_process():
    """Multiprocessing"""

    # Method 1: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 processes
    k = 0
    li = []  # list to hold the AsyncResult objects
    for i in range(loop_number):
        t = pool.apply_async(fun, args=(k,))
        li.append(t)
    for i in li:
        print('Return value:', i.get())

Result: (output screenshot omitted)
Why the times differ:

In the second version, every task is submitted first and each AsyncResult is appended to a list; the results are fetched afterwards. Calling get inside the submission loop blocks on each result, which effectively turns the multi-process run into a single-process one. When fetching results from the list, you can also pass a timeout to get, e.g. get(timeout=5).

Summary:

  • For apply_async, fetching all results at the end gives a shorter total running time.
  • For apply, the two positions make no obvious difference; readers can verify this themselves.
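The collect-then-get pattern summarized above can be sketched with an independent worker (the function name and task count here are illustrative, not from the original):

```python
import multiprocessing

def add_ten(k):
    """Illustrative worker: returns its argument plus 10."""
    return k + 10

if __name__ == '__main__':
    with multiprocessing.Pool(4) as pool:
        # Submit every task first so the workers can run concurrently...
        results = [pool.apply_async(add_ten, args=(i,)) for i in range(8)]
        # ...then fetch; a timeout avoids blocking forever on a stuck task.
        values = [r.get(timeout=5) for r in results]
    print(values)  # results come back in submission order
```

Because the results are read from the list in submission order, the output order is deterministic even though the workers finish in arbitrary order.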

1.2.2 Method 2: Manager

from multiprocessing import Manager, Process

def fun(k, result_dict):
    """Function under test"""
    print('-----inside fun, argument is {}----'.format(k))
    m = k + 10

    result_dict[k] = m  # Method 2: store the result in a Manager dict

@count_time
def my_process():
    """Multiprocessing"""

    # Method 2: Manager
    manager = Manager()
    result_dict = manager.dict()  # shared dictionary
    jobs = []

    for i in range(10):
        p = Process(target=fun, args=(i, result_dict))
        jobs.append(p)
        p.start()

    for pr in jobs:
        pr.join()
    print('Returned results:', result_dict.values())

def main():
    """Main program"""

    # Use multiprocessing
    my_process()


if __name__ == '__main__':
    main()

Result: (output screenshot omitted)

Description:

  • The example above uses a dictionary; a list works just as well.
  • The results only become available after all child processes have finished. In that respect this method does not meet our requirement of getting results as they are produced, but it is included as another way to collect return values from multiple processes.
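As noted above, a Manager list works as well as a dict; a minimal sketch (the worker name is illustrative):

```python
from multiprocessing import Manager, Process

def add_ten(k, result_list):
    """Illustrative worker: appends k + 10 to a shared Manager list."""
    result_list.append(k + 10)

if __name__ == '__main__':
    manager = Manager()
    result_list = manager.list()  # proxy list shared across processes
    jobs = [Process(target=add_ten, args=(i, result_list)) for i in range(10)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()
    # Processes finish in arbitrary order, so sort before displaying.
    print(sorted(result_list))
```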

1.2.3 Method 3: Pipe

Define the class:

class MyProcess(multiprocessing.Process):
    def __init__(self, name, func, args):
        super(MyProcess, self).__init__()
        self.name = name
        self.func = func
        self.args = args
        self.res = ''

    def run(self):
        self.res = self.func(*self.args)

def fun(k, conn):
    """Function under test"""
    print('-----inside fun, argument is {}----'.format(k))
    m = k + 10

    conn.send(m)  # send the result through the pipe

@count_time
def my_process():
    """Multiprocessing"""

    # Method 3: Pipe
    process_li = []
    parent_con, child_con = multiprocessing.Pipe()
    for i in range(30):
        p = MyProcess('proce', fun, (1, child_con))
        process_li.append(p)
        p.start()
    for i in process_li:
        i.join()
    for i in process_li:
        print(parent_con.recv())

Result: (output screenshot omitted)
Description:

  • With loop_number = 30000, spawning that many processes exhausted memory and froze the machine, so the loop here only runs 30 times.
  • The results can only be read out together at the end, which does not meet the requirement.

1.2.4 Method 4: Queue

Define the class:

class MyProcess(multiprocessing.Process):
    def __init__(self, name, func, args):
        super(MyProcess, self).__init__()
        self.name = name
        self.func = func
        self.args = args
        self.res = ''

    def run(self):
        self.res = self.func(*self.args)

def fun(k, q):
    """Function under test"""
    print('-----inside fun, argument is {}----'.format(k))
    m = k + 10

    q.put(m)  # put the result on the shared queue

@count_time
def my_process():
    """Multiprocessing"""

    # Method 4: Queue
    process_li = []
    q = multiprocessing.Queue()
    k = 1
    for i in range(30):
        p = MyProcess('proce', fun, (k, q))
        p.start()
        process_li.append(p)
    for i in process_li:
        i.join()

    while q.qsize() > 0:
        print(q.get())

Result: (output screenshot omitted)
Description:

  • As with Pipe above, a large loop_number exhausts memory. The results are placed on the queue and can only be retrieved at the end, which does not meet the requirement.
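One caveat worth noting: `Queue.qsize()` is not implemented on every platform (it raises `NotImplementedError` on macOS), so draining the queue with `get` plus a timeout is a safer pattern. A minimal sketch (the worker name is illustrative):

```python
import multiprocessing
import queue  # only for the queue.Empty exception

def add_ten(k, q):
    """Illustrative worker: puts k + 10 on the shared queue."""
    q.put(k + 10)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    jobs = [multiprocessing.Process(target=add_ten, args=(i, q)) for i in range(5)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()
    results = []
    while True:
        try:
            # Stop once nothing new arrives within the timeout.
            results.append(q.get(timeout=1))
        except queue.Empty:
            break
    print(sorted(results))  # arrival order is not guaranteed
```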

Summary:

  • With a process pool, apply and apply_async let you retrieve each call's return value as you go; for apply_async, collect the AsyncResult objects in a list and call get at the end.
  • Manager, Pipe, and Queue only deliver their results after all child processes have finished.

References

[1] https://blog.csdn.net/weixin_39190382/article/details/107865552
[2] https://zhidao.baidu.com/question/246068963774450124.html
[3] https://blog.csdn.net/sunt2018/article/details/85336408
[4] https://blog.csdn.net/ztf312/article/details/80337255
[5] https://www.jb51.net/article/86412.htm
[6] https://blog.csdn.net/littlehaes/article/details/102626610

Origin blog.csdn.net/weixin_39190382/article/details/107864274