Foreword
This article continues our discussion of Python's multi-process concurrent programming (Part 2), using the built-in multiprocessing module to achieve concurrency. We first walk through the basics of this module to lay a solid foundation, so that you have a working grasp of its basic usage; follow-up articles will cover it in more detail.
This article is the sixth in the Python concurrent programming series. The previous article can be found here:
Python: Concurrent Programming (5)_Lion King's Blog-CSDN Blog
The next article can be found here:
Python: Concurrent Programming (7)_Lion King's Blog-CSDN Blog
1. Hands-on examples
1. Use multiple processes to download multiple files concurrently, and report the total download time once all downloads complete
import multiprocessing
import requests
import time

def download_file(url, output_file):
    response = requests.get(url)
    with open(output_file, 'wb') as file:
        file.write(response.content)

if __name__ == '__main__':
    urls = [
        'https://so.gushiwen.cn/gushi/sanbai.aspx',
        'https://so.gushiwen.cn/gushi/tangshi.aspx',
        'https://so.gushiwen.cn/gushi/songsan.aspx',
        # Add more download links as needed
    ]

    start_time = time.time()

    # Create a process pool
    pool = multiprocessing.Pool()

    # Download the files concurrently
    for i, url in enumerate(urls):
        output_file = f'file{i+1}.txt'
        pool.apply_async(download_file, (url, output_file))

    # Close the pool and wait for all processes to finish
    pool.close()
    pool.join()

    end_time = time.time()
    total_time = end_time - start_time

    print(f'Downloaded {len(urls)} files in {total_time:.2f} seconds.')
In this example, we create a process pool with multiprocessing.Pool and then use the apply_async method to submit download tasks asynchronously to the worker processes in the pool. Each worker process downloads one file, so multiple files are downloaded concurrently, improving download efficiency. The code is explained below:
(1) import multiprocessing: imports the multiprocessing module for creating and managing processes.
(2) import requests: imports the requests module for sending HTTP requests and downloading files.
(3) import time: imports the time module for measuring the program's execution time.
(4) def download_file(url, output_file): defines a function download_file that downloads the file at the given URL and saves it to the specified output file.
(5) if __name__ == '__main__':: the main program entry point, which ensures the code runs only in the main process.
(6) urls = [...]: defines a list of URLs of the files to download.
(7) start_time = time.time(): records the time at which the program starts.
(8) pool = multiprocessing.Pool(): creates a process pool.
(9) for i, url in enumerate(urls): iterates over the list of download URLs.
(10) output_file = f'file{i+1}.txt': generates the output filename from the file's index.
(11) pool.apply_async(download_file, (url, output_file)): uses the apply_async method to submit the download task asynchronously to a worker process in the pool.
(12) pool.close(): closes the pool so that it accepts no new tasks.
(13) pool.join(): waits for all worker processes to finish.
(14) end_time = time.time(): records the time at which the program finishes.
(15) total_time = end_time - start_time: computes the total download time.
(16) print(f'Downloaded {len(urls)} files in {total_time:.2f} seconds.'): prints the number of downloaded files and the total download time.
Note that this is a simple example that omits exception handling and further optimizations. In practice, you may need to adapt and extend it to your specific needs.
2. Use multiple processes to compute the first n numbers of the Fibonacci sequence
import multiprocessing

def fibonacci(n):
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    elif n == 2:
        return [0, 1]
    fib = [0, 1]
    for i in range(2, n):
        fib.append(fib[-1] + fib[-2])
    return fib

if __name__ == '__main__':
    n = 100  # The first n numbers of the Fibonacci sequence

    # Create a process pool; the number of processes defaults to the CPU core count
    pool = multiprocessing.Pool()

    # Run the Fibonacci computation in the process pool
    result = pool.map(fibonacci, [n])

    # Close the pool
    pool.close()
    pool.join()

    print(f'The first {n} numbers of the Fibonacci sequence are: {result[0]}')
The program creates a process pool with multiprocessing.Pool and uses the map method to run the Fibonacci computation in the pool. The pool automatically sizes itself to the number of CPU cores. Note, however, that map is called here with a single-element list, so the whole sequence is computed by one worker process; to actually benefit from multiple cores, the work must be split into several independent tasks.
3. Use multiple processes to compute the sum of squares of all elements in a given list
import multiprocessing

def square_sum(numbers):
    total = 0
    for num in numbers:
        total += num**2
    return total

if __name__ == '__main__':
    numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # The given list

    # Create a process pool; the number of processes defaults to the CPU core count
    pool = multiprocessing.Pool()

    # Compute the sum of squares in the process pool
    result = pool.apply(square_sum, (numbers,))

    # Close the pool
    pool.close()
    pool.join()

    print(f'The sum of squares of all elements in the given list is: {result}')
The program creates a process pool with multiprocessing.Pool and uses the apply method to compute the sum of squares in the pool. Note, however, that apply is a blocking call that hands the entire list to a single worker, so this version runs the computation in one child process rather than in parallel; to parallelize it, split the list into chunks and distribute them with map or apply_async.
2. The main functions of the multiprocessing module
1. Use Pool objects to create process pools
Pool objects provide a simple way to manage and schedule worker processes; methods such as apply, map, and imap execute tasks concurrently.
2. Use Process objects to create individual processes
If you need finer-grained control over process creation and management, you can use Process objects to create and manage individual processes.
3. Use Queue for inter-process communication
Queue is a process-safe queue that can safely transfer data between multiple processes; it is used for inter-process communication and data sharing.
4. Use Lock for inter-process synchronization
Lock objects synchronize processes, preventing multiple processes from accessing a shared resource at the same time and ensuring data correctness.
5. Use Value and Array to share state between processes
Value and Array are objects used to share data between multiple processes; they implement shared state between processes.
6. Use Manager objects to manage shared state
Manager objects provide a higher-level way to manage shared state: they can create and manage shared objects such as lists and dictionaries, and ensure safe access to them from multiple processes.
7. Use Event for inter-process synchronization
Event objects implement inter-process synchronization and can be used for event signaling and waiting between multiple processes.
8. Use Semaphore for inter-process synchronization
A Semaphore object is a counting semaphore that controls access to shared resources; it can limit the number of processes that access a shared resource simultaneously.
9. Use timeout parameters to set timeouts
Many blocking operations in multiprocessing (for example AsyncResult.get, Queue.get, and Process.join) accept a timeout argument, which bounds how long the caller waits and avoids blocking or waiting indefinitely.
10. Use Pipe for inter-process communication
Pipe provides a two-way communication channel between processes and can transfer data between two endpoints.