0. Preface
Articles on getting return values from multiple processes are scattered across the Internet; this article gives a simple summary.
Description:
- In the function under test, the return value is fed back in as the next call's argument, so using multiple processes does not reduce the running time; it is actually slower. Running times are shown only as general results and are not the focus of this article.
- For a comparison of the running times of the apply and apply_async methods, see the time comparison in Reference [1].
1. Text
The functions under test are timed with a decorator, count_time; see the decorator article for details.
Import the modules:
import threading
from queue import Queue
import multiprocessing
from multiprocessing import Manager
import time
loop_number = 30000  # number of loop iterations
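The count_time decorator used below is defined in the linked decorator article; a minimal sketch of such a timing decorator (an assumption, not necessarily the author's exact version) might be:

```python
import time
from functools import wraps

def count_time(func):
    """Print the wall-clock time the wrapped function takes to run."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print('{} took {:.4f}s'.format(func.__name__, time.time() - start))
        return result
    return wrapper
```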
1.1 No multi-process situation
def fun(k):
    """Function under test."""
    print('-----fun, parameter {}----'.format(k))
    m = k + 10
    return m

@count_time
def call_fun():
    """Without multithreading/multiprocessing."""
    number = 0
    for i in range(loop_number):
        print('number:{}'.format(number))
        number = fun(number)

def main():
    """Main program."""
    # 1. No multithreading/multiprocessing
    call_fun()

if __name__ == '__main__':
    main()
Result:
1.2 Multi-process return value
1.2.1 Method One: Process pool (Pool)
For a running-time comparison of the process pool's apply and apply_async, see Reference [1].
1. apply
def fun(k):
    """Function under test."""
    print('-----fun, parameter {}----'.format(k))
    m = k + 10
    return m

@count_time
def my_process():
    """Multiprocessing."""
    # Method one: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 worker processes
    k = 0
    for i in range(loop_number):
        k = pool.apply(fun, args=(k,))  # apply blocks until the result is ready
    print('Return value:', k)

def main():
    """Main program."""
    # 3. Use multiprocessing
    my_process()

if __name__ == '__main__':
    main()
Result:
2. apply_async
def fun(k):
    """Function under test."""
    print('-----fun, parameter {}----'.format(k))
    m = k + 10
    return m

@count_time
def my_process():
    """Multiprocessing."""
    # Method one: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 worker processes
    k = 0
    for i in range(loop_number):
        k = pool.apply_async(fun, args=(k,))  # returns an AsyncResult object
        k = k.get()  # get the actual return value
    print('Return value:', k)

def main():
    """Main program."""
    # 3. Use multiprocessing
    my_process()

if __name__ == '__main__':
    main()
Result:
Note: the running times above are for general reference only; they are not the focus of this article. For a detailed comparison, see Reference [1]. Summary:
- Because of the peculiarity of the test function (its return value is passed back in as the next call's argument), both methods are effectively serial, so there is no time difference between them.
- apply_async returns an AsyncResult object, so you need to call its get method to obtain the actual result.
Where the apply_async result is retrieved also matters:
1. Retrieving each result inside the loop:
@count_time
def my_process():
    """Multiprocessing."""
    # Method one: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 worker processes
    k = 0
    for i in range(loop_number):
        t = pool.apply_async(fun, args=(k,))
        print('Return value:', t.get())  # get inside the loop blocks each iteration
Result:
2. Collecting the results and retrieving them all at the end:
@count_time
def my_process():
    """Multiprocessing."""
    # Method one: apply/apply_async
    pool = multiprocessing.Pool(4)  # create a pool of 4 worker processes
    k = 0
    li = []  # empty list to collect the AsyncResult objects
    for i in range(loop_number):
        t = pool.apply_async(fun, args=(k,))
        li.append(t)
    for i in li:
        print('Return value:', i.get())
Result:
Why the times differ: in the second version each AsyncResult is appended to a list while tasks keep being submitted, so the pool stays busy in parallel. In the first version, calling get inside the loop blocks until that result is ready, effectively turning the multi-process run into a serial one. When retrieving the collected results you can also set a timeout, e.g. get(timeout=5).
Summary:
- For apply_async, retrieving the results at the end gives the shorter total time.
- With apply there is no obvious difference between the two positions; readers can verify this themselves.
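The get(timeout=...) behavior mentioned above can be sketched as follows; slow is a hypothetical function introduced here just to make the timeout observable:

```python
import time
import multiprocessing

def slow(k):
    """Simulate a long-running task."""
    time.sleep(1)
    return k + 10

if __name__ == '__main__':
    with multiprocessing.Pool(2) as pool:
        t = pool.apply_async(slow, args=(0,))
        try:
            # raises multiprocessing.TimeoutError if no result within 0.1s
            print(t.get(timeout=0.1))
        except multiprocessing.TimeoutError:
            print('result not ready yet')
        print(t.get())  # blocks until the real result arrives
```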
1.2.2 Method Two: Manager
from multiprocessing import Manager, Process

def fun(k, result_dict):
    """Function under test."""
    print('-----inside fun, parameter {}----'.format(k))
    m = k + 10
    result_dict[k] = m  # method two: store the result in a managed dict

@count_time
def my_process():
    """Multiprocessing."""
    # Method two: Manager
    manager = Manager()
    result_dict = manager.dict()  # use a managed dictionary
    jobs = []
    for i in range(10):
        p = Process(target=fun, args=(i, result_dict))
        jobs.append(p)
        p.start()
    for pr in jobs:
        pr.join()
    var = result_dict
    print('Returned results:', var.values())

def main():
    """Main program."""
    # 3. Use multiprocessing
    my_process()

if __name__ == '__main__':
    main()
Result:
Description:
- A dictionary is used above; a managed list works as well.
- The returned results can only be read after all child processes have finished. In that respect this does not meet our requirement, but it is included as another way to get return values from multiple processes.
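As noted, a managed list works the same way as the dictionary; a minimal sketch, mirroring the dictionary version (the order in which results are appended is not deterministic):

```python
from multiprocessing import Manager, Process

def fun(k, result_list):
    """Append the result to a shared, managed list."""
    result_list.append(k + 10)

if __name__ == '__main__':
    with Manager() as manager:
        result_list = manager.list()
        jobs = [Process(target=fun, args=(i, result_list)) for i in range(10)]
        for p in jobs:
            p.start()
        for p in jobs:
            p.join()
        # sort because the child processes finish in arbitrary order
        print(sorted(result_list))
```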
1.2.3 Method Three: Pipe
Define the class:
class MyProcess(multiprocessing.Process):
    """Process subclass that runs func(*args) in the child process."""
    def __init__(self, name, func, args):
        super(MyProcess, self).__init__()
        self.name = name
        self.func = func
        self.args = args
        self.res = ''

    def run(self):
        self.res = self.func(*self.args)
def fun(k, p):
    """Function under test."""
    print('-----inside fun, parameter {}----'.format(k))
    m = k + 10
    p.send(m)  # send the result through the pipe

@count_time
def my_process():
    """Multiprocessing."""
    # Method three: Pipe
    process_li = []
    parent_con, child_con = multiprocessing.Pipe()
    for i in range(30):
        p = MyProcess('proce', fun, (1, child_con))
        process_li.append(p)
        p.start()
    for i in process_li:
        i.join()
    for i in process_li:
        print(parent_con.recv())
Result:
Description:
- With loop_number = 30000 the machine ran out of memory and froze, so the loop runs only 30 times here.
- The results can only be read out together at the end, which does not meet the requirement.
1.2.4 Method Four: Queue
Define the class:
class MyProcess(multiprocessing.Process):
    """Process subclass that runs func(*args) in the child process."""
    def __init__(self, name, func, args):
        super(MyProcess, self).__init__()
        self.name = name
        self.func = func
        self.args = args
        self.res = ''

    def run(self):
        self.res = self.func(*self.args)

def fun(k, p):
    """Function under test."""
    print('-----inside fun, parameter {}----'.format(k))
    m = k + 10
    p.put(m)  # put the result on the queue

@count_time
def my_process():
    """Multiprocessing."""
    # Method four: Queue
    process_li = []
    q = multiprocessing.Queue()
    k = 1
    for i in range(30):
        p = MyProcess('proce', fun, (k, q))
        p.start()
        process_li.append(p)
    for i in process_li:
        i.join()
    # note: qsize() is unreliable and unimplemented on some platforms (e.g. macOS)
    while q.qsize() > 0:
        print(q.get())
Result:
Description:
- As with Pipe above, a large loop count exhausts memory, and the results sit in the queue until they are all retrieved at the end, which does not meet the requirement.
2. Summary
- A process pool with apply/apply_async is recommended.
- For a comparison and analysis of apply/apply_async running times, see Reference [1]; under ThreadPool, apply_async is recommended.
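A minimal sketch of the recommended ThreadPool + apply_async pattern, reusing the same fun as above:

```python
from multiprocessing.pool import ThreadPool

def fun(k):
    return k + 10

if __name__ == '__main__':
    with ThreadPool(4) as pool:
        # submit all tasks first, then collect, so the workers run concurrently
        results = [pool.apply_async(fun, args=(i,)) for i in range(10)]
        print([r.get() for r in results])  # [10, 11, ..., 19]
```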
References
[1] https://blog.csdn.net/weixin_39190382/article/details/107865552
[2] https://zhidao.baidu.com/question/246068963774450124.html
[3] https://blog.csdn.net/sunt2018/article/details/85336408
[4] https://blog.csdn.net/ztf312/article/details/80337255
[5] https://www.jb51.net/article/86412.htm
[6] https://blog.csdn.net/littlehaes/article/details/102626610