Python concurrent.futures thread pools and process pools

        concurrent.futures is a high-level library that operates at the "task" level, meaning you do not need to manage synchronization yourself or handle threads and processes directly. A Future is essentially an extension of the producer-consumer model: in that model, the producer does not care when the consumer finishes processing the data, and likewise does not care about the result of the processing. You only need to specify a pool of at most "max_workers" threads/processes, submit tasks to it, and collect the results. Another benefit, compared with using the threading and multiprocessing modules directly in multi-threaded/multi-process scenarios: frequently creating and destroying threads or processes is very resource-intensive, while concurrent.futures maintains its own thread pool/process pool, trading space for time.

        concurrent.futures provides two executor classes: concurrent.futures.ThreadPoolExecutor (thread pool), commonly used for IO-bound workloads, and concurrent.futures.ProcessPoolExecutor (process pool), commonly used for CPU-bound workloads. The reason for this division of usage scenarios is Python's GIL lock: within one process, multiple threads can only use one CPU at a time, which will not be repeated here. The two classes are used in the same way.

        The commonly used methods of ThreadPoolExecutor / ProcessPoolExecutor are as follows:

1. When constructing a ThreadPoolExecutor / ProcessPoolExecutor, pass the max_workers parameter to set the maximum number of threads/processes running concurrently in the pool.

2. submit(fn, *args, **kwargs) submits a task (a function name and its arguments) to the pool for execution by a worker thread, and returns a handle for the task (similar to a file handle). Note that submit() does not block; it returns immediately.

3. The done() method reports whether the task has finished.

4. The cancel() method cancels a submitted task; if the task is already running in the pool, it cannot be cancelled.

5. The result() method gets the return value of the task. Looking at the source code, I found that this method blocks until the result is available.
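The interplay of submit(), done(), cancel(), and result() can be shown in a minimal sketch; the slow_square helper and the sleep duration are made up for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    time.sleep(0.2)
    return x * x

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(slow_square, 3)   # returns a Future immediately, no blocking
    print(future.done())                       # almost certainly False: the task is still running
    queued = executor.submit(slow_square, 4)   # waits in the queue behind the first task
    print(queued.cancel())                     # True: still queued, so it can be cancelled
    print(future.result())                     # blocks until the task finishes, then prints 9
```

With max_workers=1, the second task cannot start while the first is sleeping, so cancel() succeeds; a task that has already begun running cannot be cancelled.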
6. wait(fs, timeout=None, return_when=ALL_COMPLETED) accepts three parameters: fs is the sequence of futures to wait on; timeout is the maximum time to wait, after which wait() returns even if some tasks have not finished; return_when is the condition on which to return, by default ALL_COMPLETED, i.e. return after all tasks complete.
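A small sketch of wait() with the FIRST_COMPLETED condition; the napper helper and the sleep durations are illustrative assumptions:

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def napper(seconds):
    time.sleep(seconds)
    return seconds

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(napper, s) for s in (0.1, 0.5)]
    # Return as soon as one future finishes instead of waiting for all of them.
    done, not_done = wait(futures, timeout=2, return_when=FIRST_COMPLETED)
    print(len(done), len(not_done))  # typically "1 1": only the short task has finished
```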
7. map(fn, *iterables, timeout=None, chunksize=1): the first parameter fn is the function the worker threads execute; the second is an iterable of arguments; the third, timeout, works like in wait(), except that map() returns the results of the calls, and if a result takes longer than timeout it raises TimeoutError.
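A minimal map() sketch (the double helper is just for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def double(x):
    return x * 2

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, regardless of completion order;
    # each result must arrive within `timeout` seconds or TimeoutError is raised.
    results = list(executor.map(double, [1, 2, 3], timeout=5))
print(results)  # [2, 4, 6]
```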
8. The as_completed(fs, timeout=None) method retrieves the results of all tasks as they finish. Its docstring:
    An iterator over the given futures that yields each as it completes.

    Args:
        fs: The sequence of Futures (possibly created by different Executors) to
            iterate over.
        timeout: The maximum number of seconds to wait. If None, then there
            is no limit on the wait time.

    Returns:
        An iterator that yields the given Futures as they complete (finished or
        cancelled). If any given Futures are duplicated, they will be returned
        once.

    Raises:
        TimeoutError: If the entire result iterator could not be generated
            before the given timeout.
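A minimal sketch showing that as_completed() yields futures in completion order, not submission order; the napper helper and sleep durations are illustrative assumptions:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def napper(seconds):
    time.sleep(seconds)
    return seconds

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(napper, s) for s in (0.3, 0.1)]
    # The 0.1s task was submitted second but finishes first, so it is yielded first.
    finished = [f.result() for f in as_completed(futures, timeout=5)]
print(finished)  # [0.1, 0.3]
```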

        The following compares the efficiency of ThreadPoolExecutor and ProcessPoolExecutor on a CPU-bound workload:

import time
from concurrent.futures import ThreadPoolExecutor, as_completed, ProcessPoolExecutor

def get_fib(num):
    if num < 3:
        return 1
    return get_fib(num - 1) + get_fib(num - 2)

def run_thread_pool(workers, fib_num):
    start_time = time.time()
    with ThreadPoolExecutor(workers) as thread_executor:
        tasks = [thread_executor.submit(get_fib, num) for num in range(fib_num)]
        results = [task.result() for task in as_completed(tasks)]
        print(results)
        print("ThreadPoolExecutor spend time: {}s".format(time.time() - start_time))

def run_process_pool(workers, fib_num):
    start_time = time.time()
    with ProcessPoolExecutor(workers) as process_executor:
        tasks = [process_executor.submit(get_fib, num) for num in range(fib_num)]
        results = [task.result() for task in as_completed(tasks)]
        print(results)
        print("ProcessPoolExecutor spend time: {}s".format(time.time() - start_time))

if __name__ == '__main__':
    # run_thread_pool(6, 38)  # run the thread and process versions in turn to compare
    run_process_pool(6, 38)

The results are as follows:

[5, 2, 1, 1, 3, 1, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 28657, 17711, 121393, 75025, 46368, 317811, 196418, 514229, 1346269, 832040, 2178309, 3524578, 9227465, 5702887, 14930352, 24157817]
ThreadPoolExecutor spend time: 24.460843086242676s


[1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 144, 89, 377, 610, 987, 233, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 75025, 46368, 121393, 196418, 514229, 317811, 1346269, 832040, 2178309, 5702887, 3524578, 9227465, 14930352, 24157817]
ProcessPoolExecutor spend time: 15.908910274505615s


Origin blog.csdn.net/Gordennizaicunzai/article/details/104380281