Python project practice: multi-threaded parallel computing + multi-process parallel computing



1. Introduction: [Process + Multi-process] + [Thread + Multi-thread]

  • Process (进程): A process is an execution unit in an operating system. Each process has its own independent memory space, containing code, data, and resources.
    • Processes are independent of each other, and the crash of one process will not affect other processes.
    • Communication between processes is relatively complex and usually requires the use of specific mechanisms, such as pipes, message queues, or shared memory.
    • Creating and destroying processes is expensive.
    • Processes are suitable for scenarios where multiple tasks need to run completely independently and do not share resources.
  • Multiprocessing (多进程): Multiprocessing is a mechanism for running multiple processes at the same time. Each process executes independently and has its own resources, such as memory space and file handles.
    • Multi-process can make full use of multi-core processors to achieve parallel computing, thereby improving computing performance and system response speed.
    • Multiple processes are independent of each other, and data exchange and coordination can be achieved through inter-process communication.
  • Thread (线程): A thread is an execution unit within a process. A process can contain multiple threads, which share the same memory space and resources.
    • Threads within a process can share data and resources conveniently, but this sharing can easily lead to problems such as data races.
    • Switching between threads is less expensive because they share the same process context.
    • Threads are suitable for scenarios where resources need to be shared between multiple tasks and frequent switching is required.
  • Multithreading (多线程): Multithreading is running multiple threads concurrently within a single process.
    • Multithreading can increase the concurrency of a program, thereby utilizing computer resources more efficiently.
    • However, multi-threaded programming needs to pay attention to thread synchronization and data sharing issues to avoid bugs such as race conditions and deadlocks.

1.1. CPU cores and threads supported by the system

The number of threads the system can execute in parallel depends on the hardware and the operating system, that is, on the number of processor cores and the operating system configuration.

  • Number of processor cores: Modern computers are usually equipped with multi-core processors, and each core can execute one thread at a time. The number of cores therefore determines how many threads the system can execute truly simultaneously.
  • Hyper-Threading Technology: Some processors support Hyper-Threading Technology, which allows one physical core to execute two threads simultaneously. This means that a processor can have twice as many threads as cores.
  • Operating system: Different operating systems have different support for the number of threads.

Generally speaking, modern desktop and server computers can execute anywhere from a handful to a few dozen hardware threads at once. For example, a 4-core, 8-thread processor supports 8 threads executing simultaneously.

1.2. Reading a processor specification: 12th Gen Intel(R) Core(TM) i7-12700 2.10 GHz

  • Generation: "12th Gen" indicates that the processor belongs to Intel's 12th-generation Core series. Each generation brings higher performance, more features, and better energy efficiency.
  • Model name: "Intel(R) Core(TM) i7-12700" is the full model name. "i7" indicates that the processor belongs to the high-performance desktop series, and "12700" identifies the specific model within the 12th Gen Core i7 line.
  • Base frequency: "2.10 GHz" is the processor's base clock frequency, i.e. its default operating frequency. During actual operation the clock speed may be adjusted dynamically based on load and power management.
  • Number of cores and threads: the Core i7-12700 is a multi-core processor and supports Hyper-Threading, which lets a physical core execute two threads at the same time and increases the processor's concurrent processing capability. The exact core and thread counts should be checked in the model's specification sheet.
  • Architecture and manufacturing process: the Core i7-12700 belongs to Intel's 12th Gen "Alder Lake" architecture, which combines two core designs: Performance cores for heavy compute tasks and Efficiency cores for light-load, power-sensitive tasks, giving better overall energy efficiency.
  • Supported technologies: the Core i7-12700 supports Intel features such as Hyper-Threading, Turbo Boost (dynamic clock-speed boosting), cache, and virtualization. These technologies improve processor performance and energy efficiency and support a wider range of computing and application scenarios.

2. Detailed explanation of functions

2.0. Calculate the number of CPU cores: os.cpu_count() + mp.cpu_count()

# Method 1
import os

num_cores = os.cpu_count()
print("Number of CPU cores supported by the system:", num_cores)

# Method 2
import multiprocessing as mp

num_cores = mp.cpu_count()
print("Number of CPU cores supported by the system:", num_cores)

2.1. Executor for [multi-threaded parallel computing]: concurrent.futures.ThreadPoolExecutor()

import concurrent.futures

"""
函数说明:concurrent.futures.ThreadPoolExecutor(max_workers):
输入参数:		max_workers		指定最大线程数。默认使用系统的CPU核心数作为最大线程数。
"""
#############################################################################	
# 使用举例:
	# (0)使用ThreadPoolExecutor来创建一个线程池。
	# (1)使用executor.submit将所有参数只应用一次到函数calculate(),完成并行化计算。
	# (2)使用executor.map将列表的每个参数循环应用到函数calculate(),完成并行化计算。
with concurrent.futures.ThreadPoolExecutor() as executor:
	future = executor.submit(my_function, arg1, arg2)  # 其中:my_function是执行函数,arg1和arg2是函数的参数。
	results = executor.map(my_function, [arg1, arg2, arg3])  # 其中:my_function是执行函数,[arg1, arg2, arg3]是一个包含函数参数的列表。
	result = future.result()  # 获取任务的执行结果
	executor.shutdown()  # 等待所有任务完成并关闭线程池
#############################################################################		
"""
使用方式:可以使用submit()、map()、shutdown()方法分别用于提交任务、并行计算、关闭线程池。
	(1)并行计算一个任务:submit()方法提交一个任务(函数)给ThreadPoolExecutor进行并行计算,并返回一个concurrent.futures.Future对象,可以用于获取任务的执行结果。	
	(2)并行计算多个任务:map()方法接收一个函数和可迭代的参数,并将函数应用于每个参数,实现并行计算。	
	(3)获取任务的执行结果:concurrent.futures.Future对象表示一个尚未完成的任务。如果任务尚未完成,result()方法会阻塞直到任务完成并返回结果。
	(4)等待所有任务完成并关闭线程池:shutdown()方法。如果不调用shutdown(),程序可能会在所有任务完成之前提前结束,导致一些任务未能执行完毕。
"""

2.2. Executor for [multi-process parallel computing]: concurrent.futures.ProcessPoolExecutor()

Application: allows tasks to be executed in multiple processes, enabling true parallel computing; it is especially suitable for CPU-intensive tasks.

import concurrent.futures

"""
函数说明:concurrent.futures.ProcessPoolExecutor(max_workers):
输入参数:		max_workers		指定最大进程数。默认使用系统的CPU核心数作为最大进程数。
"""
#############################################################################	
# 使用举例:
	# (0)使用ProcessPoolExecutor来创建一个进程池。
	# (1)使用executor.submit将所有参数只应用一次到函数calculate(),完成并行化计算。
	# (2)使用executor.map将列表的每个参数循环应用到函数calculate(),完成并行化计算。
with concurrent.futures.ProcessPoolExecutor() as executor:
	future = executor.submit(my_function, arg1, arg2)  # 其中:my_function是执行函数,arg1和arg2是函数的参数。
	future = executor.map(my_function, [arg1, arg2, arg3])  # 其中:my_function是执行函数,[arg1, arg2, arg3]是一个包含函数参数的列表。
	result = future.result()  # 获取任务的执行结果
	executor.shutdown()  # 等待所有任务完成并关闭进程池
#############################################################################		
"""
使用方式:可以使用submit()、map()、shutdown()方法分别用于提交任务、并行计算、关闭进程池。
	(1)并行计算一个任务:submit()方法提交一个任务(函数)给ThreadPoolExecutor进行并行计算,并返回一个concurrent.futures.Future对象,可以用于获取任务的执行结果。	
	(2)并行计算多个任务:map()方法接收一个函数和可迭代的参数,并将函数应用于每个参数,实现并行计算。	
	(3)获取任务的执行结果:concurrent.futures.Future对象表示一个尚未完成的任务。如果任务尚未完成,result()方法会阻塞直到任务完成并返回结果。
	(4)等待所有任务完成并关闭进程池:shutdown()方法。如果不调用shutdown(),程序可能会在所有任务完成之前提前结束,导致一些任务未能执行完毕。
"""

2.3. Application areas of multi-threading and multi-processing

  • Multithreading (多线程): suitable for IO-intensive tasks; it lets the CPU switch to other threads during IO waits to improve system efficiency.
    • IO-intensive tasks: the main bottleneck is input/output (IO) operations rather than computation. Such tasks involve a large amount of file reading and writing, network communication or other IO, and most of the execution time is spent waiting for IO operations to complete.
    • In Python, multi-threading cannot achieve true parallel computing due to the existence of the Global Interpreter Lock (GIL).
    • A single thread handling IO operations reduces system efficiency, because the thread blocks while waiting for IO and cannot handle other tasks in the meantime.
    • Multiple threads improve system efficiency: the CPU switches to other tasks during IO waits, making fuller use of computing resources (see the sketch after the task list below).

Typical IO-intensive tasks include: (1) File reading and writing: a large number of file reading and writing operations, such as reading large data files, writing log files, etc. (2) Network communication: tasks involving network requests and responses, such as downloading files, sending and receiving network requests, etc. (3) Database operations: Perform a large number of reading and writing operations on the database, such as querying the database, writing data, etc. (4) Image/audio and video processing: IO operations in image, audio or video processing tasks, such as loading images, saving processed images, etc. (5) Concurrent network server: A server that handles a large number of concurrent client connections, where the main delay comes from network IO.
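
As a hedged illustration of the IO-bound case, the sketch below only simulates IO with time.sleep(), but it shows how a thread pool overlaps the waiting time of several tasks instead of paying for it sequentially (roughly 1 second total instead of 5):

import time
import concurrent.futures

def fake_io_task(i):
    # Simulated IO wait (stand-in for a network request or disk read)
    time.sleep(1)
    return i

if __name__ == "__main__":
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        list(executor.map(fake_io_task, range(5)))
    print(f"5 simulated IO tasks with threads: {time.time() - start:.1f} s")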

  • Multiprocessing (多进程): suitable for CPU-intensive tasks; it can make full use of multi-core processors to achieve true parallel computing (see the sketch after the task list below).
    • CPU-intensive tasks: the main bottleneck is computation rather than input/output (IO). Such tasks involve a large amount of calculation and processing, and most of the execution time is spent on CPU work.

Typical CPU-intensive tasks include: (1) Large-scale data processing: performing complex calculations, statistics, analysis and other operations on large amounts of data. (2) Numerical calculation: Perform large-scale numerical calculations, such as matrix operations, image processing, signal processing, etc. (3) Encryption and decryption: Perform a large number of data encryption or decryption operations. (4) 3D rendering: Perform complex three-dimensional graphics rendering, such as rendering operations in video games or animation production. (5) Parallel algorithms: Execute algorithms that require a large amount of parallel calculations, such as parallel sorting, parallel search, etc.
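
A hedged sketch of the CPU-bound case: a pure-Python sum of squares run once serially and once in a process pool. On a multi-core machine the process-pool version is typically faster, although the exact speed-up depends on the core count and the process start-up overhead:

import time
import concurrent.futures

def sum_of_squares(n):
    # Pure-Python CPU-bound work (no IO involved)
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    workloads = [10_000_000] * 4

    start = time.time()
    serial = [sum_of_squares(n) for n in workloads]
    print(f"serial:    {time.time() - start:.1f} s")

    start = time.time()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        parallel = list(executor.map(sum_of_squares, workloads))
    print(f"processes: {time.time() - start:.1f} s")

    assert serial == parallel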

4. Project practice

4.1. Run the same tasks at the same time

4.1.1. Multi-thread parallel computing

"""
# 将下述算法优化为并行计算
for ii in range(image_raw.shape[0]):  	# 遍历3D的每个slice
    M = phase(image_median[ii], 4, 6)  	# 调用函数
    image_final_median[ii] = M  		# 保存计算结果
"""
import napari
import tifffile
import numpy as np
import concurrent.futures
from skimage.filters import median


def phase(img, param1, param2):
	# 定义phase函数。示例:需要根据实际情况实现该函数
	return calculated_phase

def calculate_phase(ii):
	# 定义calculate函数。
	M = phase(image_median[ii], 4, 6)
    image_final_median[ii] = M
    
if __name__ == "__main__":
	# 1、加载图像 + 图像处理
	image_path = r'D:\downSampleImage.tif'
	image_raw = tifffile.imread(image_path)  # 3D灰度图像:100x110x120
	image_median = median(image_raw)  # 中值滤波
	image_final_median = np.zeros_like(image_median)  # 新建数组
	
	# 2、并行计算
	with concurrent.futures.ThreadPoolExecutor() as executor:
	    executor.map(calculate_phase, range(image_raw.shape[0]))

	# 3、在napari中显示图像
	viewer = napari.Viewer()  # 创建napari视图
	viewer.layers.clear()  # 清空图层
	viewer.add_image(image_median, name="image_median")  # 添加图像
	viewer.add_image(image_final_median, name="image_final_median")  # 添加图像
	napari.run()  # 显示napari图形界面
	

4.1.2. Multi-process parallel computing

"""
# 将下述算法优化为并行计算
for ii in range(image_raw.shape[0]):  	# 遍历3D的每个slice
    M = phase(image_median[ii], 4, 6)  	# 调用函数
    image_final_median[ii] = M  		# 保存计算结果
"""
import napari
import tifffile
import numpy as np
import concurrent.futures
from skimage.filters import median


def phase(img, param1, param2):
	# 定义phase函数。示例:需要根据实际情况实现该函数
    return calculated_phase

def calculate_phase(ii):
	# 定义calculate函数。
    M = phase(image_median[ii], 4, 6)  # 归一化结果:M = [0~1]
    # image_final_median[ii] = M
    return M
    
"""
一、多进程并行计算:子进程中的变量必须是全局变量。
	举例说明:在calculate函数中,image_median将提示未定义。
	
二、多进程并行计算:子进程无法直接修改主进程的变量。(若调用,系统不提示且不报错)
	解决方法一:可以通过返回(子进程)计算结果,然后在(主进程)遍历获取。
	解决方法二:可以在调用多进程时,将所需要的变量传给子进程。
	
	举例说明:在calculate函数中,对(主进程变量)image_final_median的赋值操作失败,最终得到的image_final_median为空。
	具体做法:results = executor.map(calculate, range(image_raw.shape[0]))
"""


# 1、定义全局变量(多进程)
image_path = r'D:\downSampleImage.tif'
image_raw = tifffile.imread(image_path)  # 3D灰度图像:100x110x120
image_median = median(image_raw)  # 中值滤波   
image_final_median = np.zeros_like(image_median)  # 新建数组

if __name__ == "__main__":
	
	# 2、并行计算
	with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(calculate_phase, range(200, 202))  # image_raw.shape[0]
    for ii, result in enumerate(results):  # 备注:ii从0开始,而不是200
        image_final_median[ii] = result
	
	# 3、在napari中显示图像
	viewer = napari.Viewer()  # 创建napari视图
	viewer.layers.clear()  # 清空图层
	viewer.add_image(image_median, name="image_median")  # 添加图像
	viewer.add_image(image_final_median, name="image_final_median")  # 添加图像
	napari.run()  # 显示napari图形界面
	

4.2. Run different tasks at the same time

  • Multithreading is suitable for I/O-intensive tasks, because the overhead of thread switching is small and multiple I/O operations, such as file reading and writing, network requests, etc., can be effectively executed in parallel. However, due to Python's Global Interpreter Lock (GIL), multi-threading has limited performance on CPU-intensive tasks.

  • Multi-process is suitable for CPU-intensive tasks, because each process has an independent Python interpreter and memory space, is not restricted by GIL, and can make full use of multi-core processors. For CPU-intensive tasks, multiple processes are usually faster than multiple threads.

  • Coroutines are suitable for high-concurrency I/O-intensive tasks. Coroutines allow multiple tasks to be executed in a single thread, avoiding the overhead of thread switching, but asynchronous code needs to be designed appropriately. Coroutines can achieve very high concurrency performance, but may perform poorly on CPU-intensive tasks.

  • Parallel computing libraries such as concurrent.futures, joblib, dask, etc. can provide simple interfaces to manage parallel tasks. Performance depends on the underlying parallel execution strategy and hardware resources.
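
As a hedged example of one such library, the sketch below uses joblib (assuming it is installed, e.g. via pip install joblib; the work() function is hypothetical). joblib.Parallel manages the worker pool and input chunking for you; n_jobs=-1 uses all available cores with the default process-based "loky" backend:

from joblib import Parallel, delayed

def work(x):
    # Hypothetical worker function
    return x * x

if __name__ == "__main__":
    results = Parallel(n_jobs=-1)(delayed(work)(x) for x in range(8))
    print(results)  # -> [0, 1, 4, 9, 16, 25, 36, 49]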

4.2.1. Multi-threading

import threading
import time

# Define task 1
def task1():
    for i in range(5):
        print("Task 1 - Step", i + 1)
        time.sleep(1)  # simulate a time-consuming operation

# Define task 2
def task2():
    for i in range(3):
        print("Task 2 - Step", i + 1)
        time.sleep(1)  # simulate a time-consuming operation

if __name__ == "__main__":
    # Create two threads
    thread1 = threading.Thread(target=task1)
    thread2 = threading.Thread(target=task2)

    # Start the threads
    thread1.start()
    thread2.start()

    # Wait for the threads to finish
    thread1.join()
    thread2.join()

    print("All tasks are completed.")

"""
Task 1 - Step 1
Task 2 - Step 1
Task 2 - Step 2
Task 1 - Step 2
Task 1 - Step 3
Task 2 - Step 3
Task 1 - Step 4
Task 1 - Step 5
All tasks are completed.
"""

4.2.2. Multi-process

import multiprocessing
import time

# Define task 1
def task1():
    for i in range(5):
        print("Task 1 - Step", i + 1)
        time.sleep(1)  # simulate a time-consuming operation

# Define task 2
def task2():
    for i in range(3):
        print("Task 2 - Step", i + 1)
        time.sleep(1)  # simulate a time-consuming operation

if __name__ == "__main__":
    # Create two processes
    process1 = multiprocessing.Process(target=task1)
    process2 = multiprocessing.Process(target=task2)

    # Start the processes
    process1.start()
    process2.start()

    # Wait for the processes to finish
    process1.join()
    process2.join()

    print("All tasks are completed.")

"""
Task 1 - Step 1
Task 2 - Step 1
Task 2 - Step 2
Task 1 - Step 2
Task 2 - Step 3
Task 1 - Step 3
Task 1 - Step 4
Task 1 - Step 5
All tasks are completed.
"""

4.2.3. Coroutine (using asyncio)

import asyncio

# Define task 1
async def task1():
    for i in range(5):
        print("Task 1 - Step", i + 1)
        await asyncio.sleep(1)  # simulate an asynchronous operation

# Define task 2
async def task2():
    for i in range(3):
        print("Task 2 - Step", i + 1)
        await asyncio.sleep(1)  # simulate an asynchronous operation

async def main():
    # Run task1 and task2 concurrently
    await asyncio.gather(task1(), task2())

if __name__ == "__main__":
    asyncio.run(main())

"""
Task 1 - Step 1
Task 2 - Step 1
Task 1 - Step 2
Task 2 - Step 2
Task 1 - Step 3
Task 2 - Step 3
Task 1 - Step 4
Task 1 - Step 5
"""

4.2.4. Parallel computing library (using concurrent.futures): finishes almost instantly because these tasks contain no sleeps

import concurrent.futures

# Define task 1
def task1():
    for i in range(5):
        print("Task 1 - Step", i + 1)

# Define task 2
def task2():
    for i in range(3):
        print("Task 2 - Step", i + 1)

if __name__ == "__main__":
    # Create a thread pool with ThreadPoolExecutor
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit task1 and task2 to the thread pool
        future1 = executor.submit(task1)
        future2 = executor.submit(task2)

        # Get the results of task1 and task2 (both return None; result() just waits for completion)
        result1 = future1.result()
        result2 = future2.result()

    # Any follow-up work that requires the thread pool to have finished goes here

"""
Task 1 - Step 1
Task 1 - Step 2
Task 1 - Step 3
Task 1 - Step 4
Task 1 - Step 5
Task 2 - Step 1
Task 2 - Step 2
Task 2 - Step 3
"""
