Pitfalls of Python's multiprocessing

1 Even when there is only one argument, it must be written as a tuple: (arg,)
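A minimal sketch of the one-element tuple rule, assuming Python 3. The names `square` and `run` are made up for illustration. The point is that `(3)` is just the int 3, while `(3,)` is a tuple, and the pool unpacks whatever you pass as `args`.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical worker: runs in a child process.
    return x * x


def run():
    with mp.Pool(2) as pool:
        # (3,) is a one-element tuple. Writing (3) would pass the bare int,
        # and a string such as ("abc") would be unpacked character by character.
        result = pool.apply_async(square, (3,))
        return result.get()


if __name__ == "__main__":
    print(run())
```

The same rule applies to `Process(target=..., args=(arg,))`.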

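2 Inside a class, declare the method that runs in the child process as a static method, so that it can be pickled without the instance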
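A small sketch of the static-method layout, assuming Python 3 (where a static method is pickled by its qualified name). The class `Squarer` and its methods are hypothetical names, not from the original notes.

```python
import multiprocessing as mp


class Squarer:
    @staticmethod
    def work(x):
        # No `self` here, so nothing about the instance has to be pickled.
        return x * x

    def run(self, values):
        with mp.Pool(2) as pool:
            # Refer to the worker through the class, not through `self`.
            return pool.map(Squarer.work, values)


if __name__ == "__main__":
    print(Squarer().run([1, 2, 3]))
```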

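3 The function to be run in the child process cannot be defined inside the function that creates the Pool or Process; define it at module level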
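One way to see why, sketched under the assumption that the pool sends the function to its workers with pickle: a module-level function can be pickled by name, a nested one cannot. `top_level`, `make_nested` and `can_pickle` are illustrative names only.

```python
import pickle


def top_level(x):
    # Defined at module level: pickle can find it by name.
    return x + 1


def make_nested():
    def nested(x):
        # Defined inside another function: there is no importable name for it.
        return x + 1
    return nested


def can_pickle(obj):
    # Returns True if pickle accepts the object, False otherwise.
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    print("top_level:", can_pickle(top_level))
    print("nested:", can_pickle(make_nested()))
```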

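4 With a process pool's map, the worker function must accept exactly one argument, so for multiple arguments bundle them with zip (each item then arrives as one tuple). With apply, the arguments are passed as separate elements of the args tuple, not merged into a single value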
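A sketch of the three call styles, assuming Python 3.3 or later for `starmap` (which is not mentioned in the notes above but does the tuple unpacking for you). The functions `add`, `add_packed` and `run` are hypothetical.

```python
import multiprocessing as mp


def add(a, b):
    return a + b


def add_packed(pair):
    # map() passes exactly one item per call, so unpack the tuple here.
    a, b = pair
    return add(a, b)


def run():
    xs = [1, 2, 3]
    ys = [10, 20, 30]
    with mp.Pool(2) as pool:
        via_map = pool.map(add_packed, zip(xs, ys))
        # starmap unpacks each tuple into separate arguments.
        via_starmap = pool.starmap(add, zip(xs, ys))
        # apply takes the arguments as separate elements of one tuple.
        via_apply = pool.apply(add, (1, 10))
    return via_map, via_starmap, via_apply


if __name__ == "__main__":
    print(run())
```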

5 Objects that cannot be pickled cannot be passed as arguments
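A quick way to check an argument before handing it to a pool is to try pickling it yourself. This is only a sketch; `is_picklable` and the sample objects are made up for illustration.

```python
import pickle
import threading


def is_picklable(obj):
    # Try the same serialization step the pool would need.
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


def check_samples():
    samples = {
        "list of ints": [1, 2, 3],
        "lambda": lambda x: x,
        "generator": (i for i in range(3)),
        "threading.Lock": threading.Lock(),
    }
    return {name: is_picklable(obj) for name, obj in samples.items()}


if __name__ == "__main__":
    for name, ok in check_samples().items():
        print(name, "->", "ok" if ok else "cannot be pickled")
```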

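6 A lock can be passed to a child process, but a Queue object cannot be passed directly to pool workers as a task argument. Either use the Manager's objects (list and so on), or hand the queue over when the pool is instantiated, through its initializer. For example: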

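import multiprocessing as MP


def f(msg):
    # f.q is set in each worker process by q_init
    f.q.put(msg)


def q_init(q):
    # Pool initializer: runs once in every worker and stores the queue
    f.q = q


if __name__ == "__main__":
    q = MP.Queue()
    # The queue is handed over when the pool is created, not per task
    p = MP.Pool(None, q_init, (q,))
    for i in range(0, 8):
        p.apply_async(f, (i,))
    p.close()
    p.join()
    while not q.empty():
        print(q.get())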
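The Manager route mentioned in point 6 could look roughly like this. It is a sketch assuming Python 3; `produce` and `run` are hypothetical names. A queue created by the Manager is a proxy object, so it can be pickled and passed as an ordinary task argument.

```python
import multiprocessing as mp


def produce(q, msg):
    # q is a Manager queue proxy, received as a normal argument.
    q.put(msg)


def run():
    with mp.Manager() as manager:
        q = manager.Queue()
        with mp.Pool(2) as pool:
            results = [pool.apply_async(produce, (q, i)) for i in range(8)]
            for r in results:
                r.get()  # wait for each task and surface any error
        items = []
        while not q.empty():
            items.append(q.get())
    return sorted(items)


if __name__ == "__main__":
    print(run())
```

The trade-off is that every operation on a Manager object goes through the manager's server process, so it is slower than a plain Queue.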

... to be continued

7 In Python 2, multiprocessing cannot run a method defined inside a class; the function must be moved out of the class to work

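8 Do not write to the same h5py file from several child processes and close the file only after join(). The written data is wiped out the moment the file is closed; I do not know why. The only thing that works is writing the data inside a single child process and closing the file right there.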

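9 Call result.get() before close() to catch errors from child processes; otherwise a failing child process may be skipped silently without any error being reported. This works for both map and apply; I have tested it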
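A sketch of point 9 with `apply_async`, assuming Python 3. The worker `risky` and its failure on input 2 are invented for the example. Without the `get()` call the ValueError would never reach the parent process.

```python
import multiprocessing as mp


def risky(x):
    # Hypothetical worker that fails for one particular input.
    if x == 2:
        raise ValueError("bad input: %d" % x)
    return x * 10


def run():
    pool = mp.Pool(2)
    results = [pool.apply_async(risky, (i,)) for i in range(4)]
    outcomes = []
    for r in results:
        try:
            # get() re-raises in the parent whatever the child raised.
            outcomes.append(r.get())
        except ValueError as exc:
            outcomes.append("error: %s" % exc)
    pool.close()
    pool.join()
    return outcomes


if __name__ == "__main__":
    print(run())
```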

10 The multiprocessing module has a major limitation: it only accepts certain functions, and in certain situations. For instance any class methods, lambdas, or functions defined in __main__ won't work. This is due to the way Python “pickles” (read: serializes) data and sends it to the worker processes. “Pickling” simply can't handle a lot of different types of Python objects.

Fortunately, there is a fork of the multiprocessing module called multiprocess that works just fine (pip install --user multiprocess). multiprocess uses dill instead of pickle to serialize Python objects (read: send your data and functions to the Python workers), and does not suffer the same issues. Usage is identical:

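# shut down the old workers
pool.close()

import time
from multiprocess import Pool
pool = Pool(8)
# %timeit is an IPython/Jupyter magic, so this line only runs in a notebook or IPython shell
%timeit pool.map(lambda x: time.sleep(0.1), range(24))
pool.close()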
