Synchronous and Asynchronous
1. Synchronous
Synchronous means that when a process issues a request, if the request takes some time to return a result, the process waits until it receives that result and only then continues executing.
```python
from concurrent.futures import ProcessPoolExecutor
import os
import random
import time

def task(i):
    print(f'{os.getpid()} started the task')
    time.sleep(random.randint(1, 3))
    print(f'{os.getpid()} finished the task')
    return i

if __name__ == '__main__':
    p = ProcessPoolExecutor()
    for i in range(10):
        obj = p.submit(task, i)
        # result() blocks until this task finishes -> synchronous execution
        print(obj.result())
    p.shutdown(wait=True)
```
2. Asynchronous
Asynchronous means the process does not have to wait: it continues with the work that follows, regardless of the state of the submitted tasks. When a task is done, the system notifies the process with the result, which improves execution efficiency.
```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import os
import random
import time

def task(i):
    print(f'{os.getpid()} started the task')
    time.sleep(random.randint(1, 3))
    print(f'{os.getpid()} finished the task')
    return i

if __name__ == '__main__':
    p = ThreadPoolExecutor()
    li = []
    for i in range(10):
        obj = p.submit(task, i)  # submit without waiting -> asynchronous
        li.append(obj)
    p.shutdown()  # wait for all submitted tasks to finish
    for i in li:
        print(i.result())

# Using the map function instead of submit
if __name__ == '__main__':
    p = ProcessPoolExecutor()
    obj = p.map(task, range(10))  # returns an iterator of results, in order
    p.shutdown()
    print(list(obj))
```
Callback
The difference between process pools and thread pools
- Process pool: the main process calls the callback function with the task's result.
- Thread pool: an idle worker thread calls the callback function with the task's result.
Callbacks are most commonly seen in web crawlers: producing the data (downloading pages) is very time-consuming, while processing the data is not.
Callback scenario: as soon as any task in the process pool completes, the pool immediately notifies the main process, and the main process calls a function to handle the result. That function is the callback function.
We can put the time-consuming (blocking, I/O-bound) tasks into the pool and then attach a callback function (executed by the main process). That way the main process never blocks on the I/O itself; when it runs the callback it gets each task's result directly.
Asynchronous calls plus a callback to process the pages:
```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import os

import requests

def url_page(url):
    response = requests.get(url)
    print(f'{os.getpid()} is getting {url}')
    if response.status_code == 200:
        return {'url': url, 'text': response.text}

def parse_page(res):
    res = res.result()  # the callback receives the Future, not the result
    with open('url.text', 'a', encoding='utf-8') as f:
        parse_res = f"url:{res['url']} size:{len(res['text'])}"
        f.write(parse_res + '\n')

if __name__ == '__main__':
    p = ProcessPoolExecutor(4)
    # p = ThreadPoolExecutor(4)
    l = [
        'http://www.baidu.com',
        'http://www.baidu.com',
        'http://www.baidu.com',
        'http://www.baidu.com',
        'http://www.JD.com',
        'http://www.JD.com',
        'http://www.JD.com',
        'http://www.JD.com',
        'http://www.JD.com',
        'http://www.JD.com',
    ]
    for i in l:
        # res = p.submit(url_page, i).add_done_callback(parse_page)
        ret = p.submit(url_page, i)
        ret.add_done_callback(parse_page)  # parse_page runs when the download finishes
    p.shutdown()
```