Python concurrency: usage examples for the concurrent.futures module

In addition to the multithreading module threading and the multiprocessing module, Python actually provides a simpler and easier-to-use module: concurrent.futures.

This module provides two wrapper classes, ThreadPoolExecutor and ProcessPoolExecutor, which are convenient to use and also make programs look more concise.

Personally, I think it is one of the modules most worth learning and using, as it can handle most day-to-day multithreading scenarios.

This article walks through the concurrent.futures module with several examples.

Environment of this article

  • Python 3.7

thread pool executor

Let's start with ThreadPoolExecutor.

ThreadPoolExecutor, as the name suggests, is an Executor that creates multiple Threads to execute and work through multiple tasks.

For example, the following code creates a ThreadPoolExecutor that runs at most 5 Threads in parallel; each name is passed as a parameter to say_hello_to by calling submit, which hands the task to the Executor for processing:

from concurrent.futures import ThreadPoolExecutor

def say_hello_to(name):
    print(name)

names = ['John', 'Ben', 'Bill', 'Alex', 'Jenny']

with ThreadPoolExecutor(max_workers=5) as executor:
    for n in names:
        executor.submit(say_hello_to, n)

The execution result of the above example:

John
Ben
Bill
Alex
Jenny

If the above example is run several times, you may encounter output where the strings are mixed together, similar to the sample below. This happens because multiple Threads try to write output at the same time; it is nothing mysterious, and a later example in this article will address it.

John
BenBill
Alex
Jenny

future object

Next, let's talk about a very important role in the concurrent.futures module: Future.

In fact, what submit returns is not the result of the executed function, but an instance of Future. This instance is a proxy for the execution result, so we can query the state of the task running in the Thread through methods such as done, running, and cancelled. Once the task reaches the done state, we can call result to get the return value.
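The relationship between submit, the Future it returns, and result can be shown with a minimal sketch (square here is a made-up example task, not part of the original article):

```python
from concurrent.futures import ThreadPoolExecutor

# square is a hypothetical task used only for illustration
def square(x):
    return x * x

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(square, 4)  # returns a Future, not 16
    value = future.result()              # blocks until the task finishes
    finished = future.done()             # True once the result is available

print(value, finished)  # 16 True
```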

But Python also provides a simpler way to check the status, as_completed, so you can write less code.

So the previous example can be further modified as follows:

from concurrent.futures import ThreadPoolExecutor, as_completed

def say_hello_to(name):
    return f'Hi, {name}'

names = ['John', 'Ben', 'Bill', 'Alex', 'Jenny']

with ThreadPoolExecutor(max_workers=5) as executor:
    futures = []
    for n in names:
        future = executor.submit(say_hello_to, n)
        print(type(future))
        futures.append(future)
    for future in as_completed(futures):
        print(future.result())

In the above example, each future instance returned by submit is appended to the futures list; then as_completed(futures) yields the completed future instances one by one, and result() retrieves each return value so it can be printed.

Its execution results are as follows:

<class 'concurrent.futures._base.Future'>
<class 'concurrent.futures._base.Future'>
<class 'concurrent.futures._base.Future'>
<class 'concurrent.futures._base.Future'>
<class 'concurrent.futures._base.Future'>
Hi, Jenny
Hi, Bill
Hi, Ben
Hi, John
Hi, Alex

Since we moved the printing out of the Threads, this also solves the problem of printed text sticking together.

In addition to using submit() to obtain Future instances and then checking their status and collecting the results one by one, you can also use the map() method to obtain the Threads' execution results directly, as in the following example:

from concurrent.futures import ThreadPoolExecutor

def say_hello_to(name):
    for i in range(100000):  # simulate some work
        pass
    return f'Hi, {name}'

names = ['John', 'Ben', 'Bill', 'Alex', 'Jenny']

with ThreadPoolExecutor(max_workers=5) as executor:
    results = executor.map(say_hello_to, names)
    for r in results:
        print(r)
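A difference worth noting between the two approaches: map() yields results in the order of the input iterable, while as_completed() yields futures in the order they finish. A small sketch to illustrate (slow_hello is a made-up task where 'John' is deliberately delayed so it finishes last):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

# hypothetical task: 'John' sleeps longer, so it completes last
def slow_hello(name):
    time.sleep(0.2 if name == 'John' else 0.01)
    return f'Hi, {name}'

names = ['John', 'Ben', 'Bill']

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in input order, even if 'John' finishes last
    ordered = list(executor.map(slow_hello, names))

    # as_completed() yields futures in completion order instead
    futures = [executor.submit(slow_hello, n) for n in names]
    completed = [f.result() for f in as_completed(futures)]

print(ordered)    # ['Hi, John', 'Hi, Ben', 'Hi, Bill']
print(completed)  # 'Hi, John' comes last here, since it finishes last
```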

process pool executor

ProcessPoolExecutor is used in exactly the same way as ThreadPoolExecutor, so you can basically choose between the two according to your needs.

However, it is worth noting that since Python 3.5, map() accepts an additional chunksize parameter, which is only effective for ProcessPoolExecutor. This parameter can improve the performance of ProcessPoolExecutor when processing a long iterable.

When using ProcessPoolExecutor, this method chops the iterable into a number of chunks and submits them to the pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer. For very long iterables, using a large value for chunksize can significantly improve performance compared to the default size of 1. For ThreadPoolExecutor, chunksize has no effect.

We can multiply the length of the names list from the previous example by 1000, and then test the performance with different chunksize settings:

from concurrent.futures import ProcessPoolExecutor

def say_hello_to(name):
    return f'Hi, {name}'

names = ['John', 'Ben', 'Bill', 'Alex', 'Jenny'] * 1000

with ProcessPoolExecutor(max_workers=4) as executor:
    results = executor.map(say_hello_to, names)

The following uses %timeit in Jupyter to test its performance:

%timeit with ProcessPoolExecutor(max_workers=4) as executor: executor.map(say_hello_to, names, chunksize=6)

The timing results show that as chunksize increases, the program's average execution time gets shorter and shorter. But the improvement is not unlimited: beyond a certain value, the speedup starts to level off, so setting chunksize still takes some thought.


Origin blog.csdn.net/qq_41221596/article/details/131603069