1. Multi-process
Why use multiple processes?
Because of the GIL, multi-threading in Python is not true parallelism. To make full use of a multi-core CPU, in most cases you need multiple processes in Python.
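A minimal sketch of this point (the function name `cpu_bound` and the worker counts are illustrative): on a CPU-bound task, a thread pool gains little because of the GIL, while a process pool can actually use several cores.

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    # pure-Python busy loop, so the GIL limits threads here
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    tasks = [2_000_000] * 4

    t0 = time.time()
    with ThreadPoolExecutor(max_workers=4) as ex:
        thread_results = list(ex.map(cpu_bound, tasks))
    print("threads:   %.2fs" % (time.time() - t0))

    t0 = time.time()
    with ProcessPoolExecutor(max_workers=4) as ex:
        process_results = list(ex.map(cpu_bound, tasks))
    print("processes: %.2fs" % (time.time() - t0))

    # both compute the same answers; only the wall-clock time differs
    assert thread_results == process_results
```

On a multi-core machine the process version usually finishes noticeably faster; exact timings depend on the machine.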
Why is creating processes expensive?
It was said earlier that the cost comes from context switching, but that is only the superficial reason. The fundamental reason is that every time a new process is started, the parent process's state is copied into it.
multiprocessing.Process vs. threading.Thread
The multiprocessing package is Python's process-management package. Similar to threading.Thread, a multiprocessing.Process object is used to create a process, and that process can run a function written inside a Python program. A Process object is used the same way as a Thread object and likewise has start(), run(), and join() methods. In addition, the multiprocessing package provides Lock/Event/Semaphore/Condition classes (these objects can be passed to each process as arguments, just as in multithreading) for synchronizing processes; their usage matches the classes of the same name in the threading package. So a large part of multiprocessing uses the same API as threading, just in a multi-process context.
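As a small sketch of that last point (the function name `printer` is illustrative): a multiprocessing.Lock is passed to each child process as an argument and used exactly like a threading.Lock would be between threads.

```python
from multiprocessing import Process, Lock

def printer(lock, n):
    # the lock serializes access to stdout across processes,
    # just as a threading.Lock would across threads
    with lock:
        print("process", n, "holds the lock")

if __name__ == "__main__":
    lock = Lock()
    procs = [Process(target=printer, args=(lock, i)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```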
2. The Process class
Construction method:
Process([group [, target [, name [, args [, kwargs]]]]])
group: process group, not yet implemented; the library reference says it must be None;
target: the method to be executed;
name: the process name;
args/kwargs: the parameters to be passed into the method.
Instance method:
is_alive(): Returns whether the process is running.
join([timeout]): Block the calling process until the process whose join() method is called terminates, or until the optional timeout expires.
start(): Start the process; it enters the ready state and waits for CPU scheduling.
run(): start() calls the run() method. If no target was specified when the process was instantiated, start() executes the default run() method.
terminate(): Stop the worker process immediately, whether or not its task has finished.
Attributes:
daemon: the same as the thread's setDaemon function (set it before start()).
name: Process name.
pid: Process ID.
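A minimal sketch exercising the methods and attributes listed above (the name "demo-worker" and the sleep length are arbitrary): start(), is_alive(), name, pid, daemon, terminate(), and join().

```python
import time
from multiprocessing import Process

def worker():
    time.sleep(5)  # a long task, so terminate() has something to cut short

if __name__ == "__main__":
    p = Process(target=worker, name="demo-worker")
    p.daemon = True            # an attribute, not setDaemon() as in old threading
    p.start()
    print(p.name, p.pid, p.is_alive())   # e.g. demo-worker 12345 True
    p.terminate()              # stop immediately, task unfinished
    p.join()                   # reap the terminated process
    print(p.is_alive())        # False
```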
3. Multi-process call
General call
```python
from multiprocessing import Process
import time

def f(name):
    time.sleep(1)
    print("hello", name, time.ctime())

if __name__ == "__main__":
    p_list = []
    for i in range(3):
        p = Process(target=f, args=("ljy",))
        p_list.append(p)
        p.start()
    for p in p_list:
        p.join()
    print("end")
```
Inheritance call
```python
from multiprocessing import Process
import time

class MyProcess(Process):
    def run(self):
        time.sleep(1)
        print("hello", self.name, time.ctime())

if __name__ == "__main__":
    p_list = []
    for i in range(5):
        p = MyProcess()
        p.daemon = True  # once "end" is printed the main process exits and the daemons die
        p.start()
        p_list.append(p)
    # for p in p_list:
    #     p.join()
    print("end")
```
Viewing process IDs
```python
from multiprocessing import Process
import time, os

def info(title):
    print("title", title)
    print("parent process:", os.getppid())  # for the main call: the launcher (e.g. PyCharm)
    print("process id:", os.getpid())

if __name__ == "__main__":
    info("main process line")
    time.sleep(1)
    print("------")
    p = Process(target=info, args=("ljy",))
    p.start()
    p.join()
    # The output shows the launcher's (e.g. PyCharm's) PID,
    # this program's PID, and its child process's PID.
```
4. Process communication
Queue
Thread queues use queue.Queue; process queues use multiprocessing.Queue. A process queue copies data from one process to another. That is enough to understand for now.
```python
import multiprocessing
import time

def foo(q, n):
    time.sleep(1)
    print("son process", id(q))
    q.put(n)

if __name__ == "__main__":
    q = multiprocessing.Queue()
    for i in range(3):
        p = multiprocessing.Process(target=foo, args=(q, i))
        p.start()
    print(q.get())
    print(q.get())
    print(q.get())
```
Pipe
Pipe() returns two connection objects, the two ends of a bidirectional pipe; data can travel both ways.
```python
import multiprocessing

def f(conn):
    conn.send([12, {"name": "yuan"}, "hello"])
    response = conn.recv()
    print("response", response)
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = multiprocessing.Pipe()  # bidirectional channel
    p = multiprocessing.Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    parent_conn.send("hello son!")
    p.join()
```
The recv and send here look much like the recv and send of sockets, but they are actually different; keep the distinction in mind.
So far, Queue and Pipe both implement inter-process communication (one side sends, the other receives), just in different ways, and neither of them implements data sharing.
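To see why data sharing needs something more, here is a minimal sketch (the function name `append_item` is illustrative) showing that an ordinary list passed to a child process is copied, not shared: the child's append is invisible to the parent. This is exactly the gap that Manager, described next, fills.

```python
from multiprocessing import Process

def append_item(lst):
    lst.append("from child")  # mutates the child's private copy only

if __name__ == "__main__":
    data = []
    p = Process(target=append_item, args=(data,))
    p.start()
    p.join()
    print(data)  # [] -- the parent's list is unchanged
```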
Manager
To let multiple processes operate on the same data, use a Manager. The supported data types are:
list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value, Array
Example

```python
from multiprocessing import Process, Manager

def f(d, l, n):
    d[n] = "1"      # e.g. {0: "1"}
    d["2"] = 2      # e.g. {0: "1", "2": 2}
    l.append(n)     # finally [0, 1, 2, 3, 4] plus 0..9 in completion order
    # print(l)

if __name__ == "__main__":
    with Manager() as manager:
        d = manager.dict()          # {}, a manager-backed (shared) dict
        l = manager.list(range(5))  # [0, 1, 2, 3, 4], a manager-backed (shared) list
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l, i))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()
        print(d)
        print(l)
```
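The list above also mentions Value and Lock; a short sketch of those (the function name `increment` and the counts are illustrative): a manager.Value counter incremented by several processes, guarded by a manager.Lock because the read-modify-write is not atomic.

```python
from multiprocessing import Process, Manager

def increment(counter, lock):
    for _ in range(100):
        with lock:              # serialize the read-modify-write on the shared value
            counter.value += 1

if __name__ == "__main__":
    with Manager() as manager:
        counter = manager.Value("i", 0)   # shared integer, initially 0
        lock = manager.Lock()
        procs = [Process(target=increment, args=(counter, lock)) for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(counter.value)    # 400
```

Without the lock, concurrent increments can be lost and the final count comes out below 400.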