[Python Parallel 3] Processes

Processes

Basic usage

#coding=utf-8

import multiprocessing
import os       # for getting the pid
import time     # for sleeping

# function to be executed by the child process
def child_proc(name):
    print(f'child process {name} pid: {os.getpid()}')
    time.sleep(3)
    print(f'{name} finish')

# main process; must run in the main module
if __name__ == '__main__':
    print(f'parent process {os.getpid()} is running')

    # spawn the child processes
    p1 = multiprocessing.Process(target=child_proc, args=('child-1',))
    p2 = multiprocessing.Process(target=child_proc, args=('child-2',))
    p1.start()
    p2.start()

    print(f'parent process {os.getpid()} is end')

Output

parent process 20114 is running
parent process 20114 is end
child process child-1 pid: 20115
child process child-2 pid: 20116
child-1 finish
child-2 finish

Note! The official Python documentation explains why `if __name__ == '__main__'` is needed: child processes must be able to safely import the main module, so the process-creating code has to be guarded; an interactive `__main__` module (for example in IDLE) cannot be imported by child processes, so such programs should be saved to a file and run from there.
More: multiprocessing - Process-based parallelism


multiprocessing module

Note: if you look at the multiprocessing module's source, the first-line comment reads # Package analogous to 'threading.py' but using processes, so processes here are operated on much like threads (the multiprocessing module essentially mirrors the threading module's API).

Function declaration: class multiprocessing.Process(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)
group: reserved by the official API; should always be None
target: the callable to be invoked by the child process
name: the name of the child process
args: tuple of positional arguments passed to the target
kwargs: dict of keyword arguments passed to the target
daemon: whether the process is a daemon (background) process

The Process class provides the following methods and attributes:

  • start(): start the process; it arranges for the object's run() method to be invoked in a separate process (calling it more than once raises AssertionError)
  • run(): may be overridden in a subclass; the standard run() method invokes the target callable passed to the constructor
  • join(timeout=None): block the calling process until the process whose join() was called terminates, or until the optional timeout elapses
  • name: the process name
  • is_alive(): whether the process is running
  • daemon: the daemon (background process) flag
  • pid: the process id
  • terminate(): terminate the process
    (On Unix this is done with the SIGTERM signal; on Windows with TerminateProcess())
    (Note that exit handlers and finally clauses, etc., will not be executed, and descendant processes of the terminated process will not be terminated - they simply become orphaned.)
    (Note: if this method is used while the process is using a pipe or queue, the pipe or queue may be corrupted and become unusable by other processes. Likewise, if the process has acquired a lock or semaphore, terminating it may cause other processes to deadlock.)
  • kill(): kill the process; same as terminate(), but uses the SIGKILL signal on Unix
  • close(): close the Process object, releasing all resources associated with it. ValueError is raised if the process is still running; after the first successful call, further method calls raise ValueError
  • exitcode: the child's exit code. It is None if the process has not yet terminated; if the child was terminated by signal N, it is -N
  • authkey: the process's authentication key (a byte string). When multiprocessing is initialized, the main process is assigned a random string via os.urandom(). When a Process object is created, it inherits the key from its parent process, although it may be changed to another byte string
  • sentinel: a numeric handle of a system object that becomes ready when the process ends. You can pass this value to multiprocessing.connection.wait() if you want to wait on several events at once; otherwise calling join() is simpler
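
The lifecycle methods above can be sketched as follows; this is a minimal illustration (the worker function and its name are made up for the demo):

```python
# A minimal sketch of the lifecycle methods listed above (start, is_alive,
# join, exitcode); the worker function here is illustrative.
import multiprocessing
import time

def worker():
    time.sleep(0.2)

if __name__ == '__main__':
    p = multiprocessing.Process(target=worker, name='demo-worker')
    p.start()
    print(p.name, p.is_alive())      # the child is running at this point
    p.join()                         # block until the child exits
    print(p.is_alive(), p.exitcode)  # False 0
```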

Pool (process pool)

If there are more processes than CPU cores, the constant switching between processes hurts performance. By creating a process pool and submitting tasks to it, a new worker process is created to execute a request while the pool is not full; once the pool has reached its specified maximum, the request waits until a worker becomes free.
Function declaration: class multiprocessing.pool.Pool([processes[, initializer[, initargs[, maxtasksperchild[, context]]]]])
processes: the number of worker processes; if None (the default), os.cpu_count() is used
initializer: if not None, each worker process calls initializer(*initargs) when it starts
maxtasksperchild: the number of tasks a worker process can complete before it exits and is replaced by a fresh worker, in order to free up resources. The default is None, meaning workers live as long as the pool
context: the context used to start the worker processes
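
A hedged sketch of the constructor parameters above (init_worker and square are illustrative names, not part of the library):

```python
# Hedged sketch of the processes / initializer / initargs /
# maxtasksperchild parameters; init_worker and square are illustrative.
import multiprocessing
import os

def init_worker(tag):
    # runs once when each worker process starts
    print(f'{tag}: worker {os.getpid()} ready')

def square(x):
    return x * x

if __name__ == '__main__':
    with multiprocessing.Pool(processes=2,
                              initializer=init_worker,
                              initargs=('pool',),
                              maxtasksperchild=10) as p:
        print(p.apply(square, (4,)))  # 16
```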

The following two methods submit tasks to the process pool: apply_async and apply.

apply_async

apply_async() submits tasks asynchronously, allowing multiple processes in the pool to run at the same time; it is non-blocking.
Function declaration: apply_async(func[, args[, kwds[, callback[, error_callback]]]])
callback: if specified, it should be a callable that accepts a single argument (the task's return value)

#coding=utf-8

import multiprocessing
import time     # for sleeping

# function to be executed by the child process
def child_proc(index):
    print(f'{index} process is running')
    time.sleep(3)
    print(f'{index} process is end')

# main process; must run in the main module
if __name__ == '__main__':
    print(f'all process start')

    # create the process pool
    p = multiprocessing.Pool()
    for i in range(5):
        p.apply_async(func=child_proc, args=(i,))
    p.close()
    p.join()    # Note: without this line the main process exits immediately

    print(f'all process done!')

Output

all process start
0 process is running
1 process is running
2 process is running
3 process is running
4 process is running
0 process is end
3 process is end
2 process is end
4 process is end
1 process is end
all process done!
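
The callback parameter mentioned above can be sketched like this: the pool invokes the callback in the parent process with each task's return value (here child_proc is changed to return a value, and results is an illustrative list):

```python
# Hedged sketch of apply_async's callback parameter: the pool calls the
# callback in the parent process with each task's return value.
import multiprocessing

def child_proc(index):
    return index * 10

if __name__ == '__main__':
    results = []
    p = multiprocessing.Pool()
    for i in range(5):
        p.apply_async(child_proc, args=(i,), callback=results.append)
    p.close()
    p.join()
    print(sorted(results))  # [0, 10, 20, 30, 40]
```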

apply

apply() submits only one task to the pool at a time; the next task enters only after the previous one finishes. It is blocking.
Function declaration: apply(func[, args[, kwds]])

p.apply(func=child_proc, args=(i,)) # just replace apply_async in the code above with apply

Output

all process start
0 process is running
0 process is end
1 process is running
1 process is end
2 process is running
2 process is end
3 process is running
3 process is end
4 process is running
4 process is end
all process done!

Analysis: comparing the outputs of apply_async and apply, you can see that apply_async runs tasks concurrently (asynchronously), whereas apply lets only one task into the pool at a time and executes them one by one.


Data sharing

Reference: Python multi-process programming - sharing data between processes
(Pipe and Queue can also share some data, but they block the process)
Queue: shares data between processes through a shared queue
Note: there are two kinds of queue, queue.Queue and multiprocessing.Queue
queue.Queue: an in-process queue, private to each process
multiprocessing.Queue: a cross-process communication queue, shared among the child processes

Shared memory: use multiprocessing's Value and Array classes to share data via shared memory
Server process: use multiprocessing's Manager to share many supported data types via a server process

Getting the return value (only the main process receives the data)

When using a process pool, the return value of the child process can be obtained directly through the result object

#coding=utf-8

import multiprocessing

# function to be executed by the child process
def child_proc(x, y):
    return x + y

# main process; must run in the main module
if __name__ == '__main__':
    # create the process pool
    p = multiprocessing.Pool()
    z = p.apply(func=child_proc, args=(1, 2))
    print(z)
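
With apply_async the same return value arrives asynchronously: a minimal sketch, where the result is wrapped in an AsyncResult object and fetched with get():

```python
# Hedged sketch: with apply_async, the return value arrives wrapped in an
# AsyncResult object and is fetched with get().
import multiprocessing

def child_proc(x, y):
    return x + y

if __name__ == '__main__':
    with multiprocessing.Pool() as p:
        r = p.apply_async(child_proc, args=(1, 2))
        print(r.get(timeout=5))  # 3
```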

Queue

Use multiprocessing's Queue class to share data between processes.

#coding=utf-8

from multiprocessing import Process, Queue

# function to be executed by the child process
def child_proc(queue):
    num = queue.get()
    num += 10
    queue.put(num)

# main process; must run in the main module
if __name__ == '__main__':
    # create the shared data
    queue = Queue()
    queue.put(1000)

    # create the processes
    p1 = Process(target=child_proc, args=(queue,))
    p2 = Process(target=child_proc, args=(queue,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    # print the result
    print(queue.get())

Value and Array

There are two shared-memory structures, Value and Array; both carry an internal lock and are therefore process-safe (note, however, that a compound update such as num.value += 100 is a separate read and write, so it still needs explicit locking to be atomic).

#coding=utf-8

from multiprocessing import Process, Value, Array

# function to be executed by the child process
def child_proc(num, li):
    num.value += 100
    for i in range(len(li)):
        li[i] += 10

# main process; must run in the main module
if __name__ == '__main__':
    # create the shared data
    num = Value('d', 0.0)
    li = Array('i', range(10))

    # create the processes
    p1 = Process(target=child_proc, args=(num, li))
    p2 = Process(target=child_proc, args=(num, li))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    # print the result
    print(num.value)
    for x in li:
        print(x)

Note: Value and Array need a typecode for the value stored in them. d: double; i: int; c: char, etc.
Details: see multiprocessing.sharedctypes for the definitions of the various typecode strings
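
To make compound updates safe, a Value or Array created with the default lock exposes get_lock(); a minimal sketch (add_100 is an illustrative name):

```python
# Hedged sketch: Value and Array created with the default lock expose
# get_lock(); holding it makes a compound update like += atomic.
from multiprocessing import Process, Value

def add_100(num):
    with num.get_lock():   # acquire the shared value's lock
        num.value += 100   # read-modify-write, now atomic

if __name__ == '__main__':
    num = Value('i', 0)    # 'i': C int
    procs = [Process(target=add_100, args=(num,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(num.value)  # 400
```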

Manager

(The shared memory above is implemented with the Value and Array structures, whose values are managed in the main process, which is rather scattered.)
Manager instead shares data through a server process and supports many more data types.
Details: multiprocessing.managers

#coding=utf-8

from multiprocessing import Process, Manager

# function to be executed by the child process
def child_proc(dict1, list1):
    dict1['yourname'] += ' snow'
    list1[3] += 10

# main process; must run in the main module
if __name__ == '__main__':
    # create the shared data
    manager = Manager()
    dict1 = manager.dict()
    list1 = manager.list(range(4))

    dict1['yourname'] = 'youmux'
    list1[3] = 0

    # create the processes
    p1 = Process(target=child_proc, args=(dict1, list1))
    p2 = Process(target=child_proc, args=(dict1, list1))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    # print the result
    print(dict1['yourname'])
    for x in list1:
        print(x)

1. Reference book: "Python Parallel Programming Manual"

2. Daemon (background) processes: daemon processes are stopped abruptly as soon as the main process exits. Their resources (open files, database transactions, etc.) may not be released properly. If you want processes to stop gracefully, make them non-daemonic and use a suitable signalling mechanism, such as an Event.
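
A hedged sketch of that signalling idea: a non-daemon worker polls a multiprocessing.Event and exits cleanly when the parent sets it (the worker function is illustrative):

```python
# Hedged sketch: a non-daemon worker polls a multiprocessing.Event and
# exits cleanly when the parent sets it.
import multiprocessing
import time

def worker(stop_event):
    while not stop_event.is_set():
        time.sleep(0.05)   # do a small unit of work, then re-check
    print('worker stopped cleanly')

if __name__ == '__main__':
    stop_event = multiprocessing.Event()
    p = multiprocessing.Process(target=worker, args=(stop_event,))
    p.start()
    time.sleep(0.2)
    stop_event.set()       # ask the worker to finish up
    p.join()
    print(p.exitcode)      # 0: the worker exited on its own
```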

Origin www.cnblogs.com/maplesnow/p/12044387.html