High-performance approaches for IO-bound work (web crawlers)

Question: given the URLs of 10 images, how would you download all 10?
The straightforward approach:

  

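import requests

urls = ['http://www.xx1.png','http://www.xx1.png','http://www.xx10.png',]
for index, url in enumerate(urls):
    response = requests.get(url)
    # A URL contains '/' and ':', so it cannot be used directly as a file name.
    with open('image_%d.png' % index, 'wb') as f:
        f.write(response.content)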

  

This works, but it is inefficient: the requests run one after another, so if the IO for each URL takes 2 s, the three URLs above take 6 s in total (and ten images would take 20 s).

 

Here are several approaches that achieve higher performance.

1. Multithreading:

  Cons: thread utilization is low; each thread sends one request and then sits idle while it waits for the response.

"""
Multithreading

"""
import requests
import threading


urls = [
    'http://www.baidu.com/',
    'https://www.cnblogs.com/',
    'https://www.cnblogs.com/news/',
    'https://cn.bing.com/',
    'https://stackoverflow.com/',
]

def task(url):
    response = requests.get(url)
    print(response)

for url in urls:
    t = threading.Thread(target=task,args=(url,))
    t.start()
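
To see the gain without touching the network, here is a minimal sketch that compares a sequential loop with one thread per URL. It assumes a hypothetical fake_download function that only sleeps (0.2 s stands in for the 2 s of IO mentioned above); the URLs and all names are placeholders, not part of the original example.

```python
import threading
import time

IO_DELAY = 0.2  # assumed stand-in for the 2 s of network IO per URL

# Placeholder URLs; nothing is actually requested.
urls = ['http://example.com/%d.png' % i for i in range(1, 4)]


def fake_download(url, results):
    # Sleeping releases the GIL, much like waiting on a real socket does.
    time.sleep(IO_DELAY)
    results.append(url)


def run_sequential():
    results = []
    start = time.perf_counter()
    for url in urls:
        fake_download(url, results)
    return time.perf_counter() - start, results


def run_threaded():
    results = []
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_download, args=(url, results))
               for url in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every simulated download to finish
    return time.perf_counter() - start, results


sequential_time, sequential_results = run_sequential()
threaded_time, threaded_results = run_threaded()
print('sequential: %.2fs, threaded: %.2fs' % (sequential_time, threaded_time))
```

The sequential run costs roughly the sum of the delays, while the threaded run costs roughly one delay, because all threads wait at the same time.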

2. Coroutines:

  A coroutine switches to another task (an internal switch) whenever it hits IO, so efficiency is high.

  

"""
Coroutines + switching on IO
pip3 install gevent
gevent uses greenlet internally (to implement coroutines).
"""
from gevent import monkey; monkey.patch_all()
import gevent
import requests


def func(url):
    response = requests.get(url)
    print(response)

urls = [
    'http://www.baidu.com/',
    'https://www.cnblogs.com/',
    'https://www.cnblogs.com/news/',
    'https://cn.bing.com/',
    'https://stackoverflow.com/',
]
spawn_list = []
for url in urls:
    # No request is sent here; the greenlets are only collected in a list.
    spawn_list.append(gevent.spawn(func, url))

# Start the requests; gevent switches whenever IO is encountered.
gevent.joinall(spawn_list)

 

3. An asynchronous, non-blocking module based on an event loop (it uses IO multiplexing internally): Twisted

"""
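Asynchronous non-blocking module based on an event loop: Twisted
"""
# Note: getPage has been deprecated since Twisted 16.7 in favor of Agent, so newer releases may not provide it.
from twisted.web.client import getPage
from twisted.internet import defer, reactor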

def stop_loop(arg):
    reactor.stop()


def get_response(contents):
    print(contents)

deferred_list = []

url_list = [
    'http://www.baidu.com/',
    'https://www.cnblogs.com/',
    'https://www.cnblogs.com/news/',
    'https://cn.bing.com/',
    'https://stackoverflow.com/',
]

for url in url_list:
    deferred = getPage(bytes(url, encoding='utf8'))
    deferred.addCallback(get_response)
    deferred_list.append(deferred)


dlist = defer.DeferredList(deferred_list)
dlist.addBoth(stop_loop)

reactor.run()
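
The same pattern (start every request, attach a callback, stop the loop once all are done) can be sketched with the standard library's asyncio instead of Twisted. This is a swapped-in illustration, not the Twisted API: fake_get_page is a hypothetical coroutine in which asyncio.sleep stands in for non-blocking network IO, and the URLs are placeholders.

```python
import asyncio

# Placeholder URLs; nothing is actually requested.
url_list = [
    'http://example.com/a',
    'http://example.com/b',
    'http://example.com/c',
]
responses = []


async def fake_get_page(url):
    # asyncio.sleep stands in for non-blocking network IO (an assumption of this sketch).
    await asyncio.sleep(0.1)
    return 'content of ' + url


def get_response(task):
    # Callback: runs automatically once the task has finished.
    responses.append(task.result())


async def main():
    tasks = []
    for url in url_list:
        task = asyncio.ensure_future(fake_get_page(url))
        task.add_done_callback(get_response)
        tasks.append(task)
    # Roughly the role DeferredList plays above: wait until every task is done.
    await asyncio.gather(*tasks)


asyncio.run(main())  # the event loop stops once main() returns
print(responses)
```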

 

Question: how is an event-loop-based asynchronous non-blocking module implemented internally? With IO multiplexing.
- In essence, a crawler is a program that writes to and reads from sockets.
- The benefit of non-blocking? No waiting when a request is issued.

Internal implementation mechanism: IO multiplexing.

import socket
import select


class ChunSheng(object):

	def __init__(self):
		self.socket_list = []
		self.conn_list = []

		self.conn_func_dict = {}

	# url_func is a (url, callback) tuple; url_func[0] is the host to connect to
	def add_request(self, url_func):
		conn = socket.socket()
		conn.setblocking(False)
		try:
			conn.connect((url_func[0], 80))
		except BlockingIOError as e:
			pass
		self.conn_func_dict[conn] = url_func[1]

		self.socket_list.append(conn)
		self.conn_list.append(conn)

	def run(self):
		"""
		Check whether the 5 socket objects in self.socket_list have connected successfully
		:return:
		"""
		while True:
			# select.select
			# first argument: sockets to check for whether response data has arrived
			# second argument: sockets to check for whether the connection has succeeded

			# first return value r: the sockets that have received data
			# second return value w: the sockets that have connected successfully
			r, w, e = select.select(self.socket_list, self.conn_list, [], 0.05)
			for sock in w:  # [socket1, socket2]
				sock.send(b'GET / HTTP/1.1\r\nHost: xxxx.com\r\n\r\n')
				self.conn_list.remove(sock)

			for sock in r:
				data = sock.recv(8096)
				func = self.conn_func_dict[sock]
				func(data)
				sock.close()
				self.socket_list.remove(sock)

			if not self.socket_list:
				break

  

- What is asynchronous?
  It shows up as callbacks: a function is executed automatically when a task completes.
  Examples we have already met:
  - crawlers: the callback function runs automatically once the download completes
  - ajax: a request is sent to the backend, and the callback function runs once the request completes.
- What is non-blocking?
  It simply means not waiting; once setblocking(False) is called on a socket, that socket no longer blocks.
- What does IO multiplexing do?
  It monitors socket state:
    - whether the connection has succeeded
    - whether a result has arrived
- Implementations of IO multiplexing:
    - select: monitors at most 1024 sockets by default; internally loops over all sockets to check them;
    - poll: no limit on the number; internally loops over all sockets to check them;
    - epoll: no limit on the number; uses callbacks.
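
The non-blocking and multiplexing points above can be tried without any network access. The following is a minimal sketch, assuming a local socket.socketpair() in place of real crawler connections; the variable names are illustrative only.

```python
import select
import socket

# socketpair() returns two already-connected sockets, so no network is needed.
left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)

# Non-blocking: recv() on a socket with no data raises instead of waiting.
try:
    left.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# Nothing has been sent yet, so select() reports no readable socket,
# while right is already connected and therefore writable.
readable, writable, _ = select.select([left], [right], [], 0.05)
readable_before = list(readable)
writable_before = list(writable)

right.send(b'hello')

# Now data has arrived, and select() reports left as readable.
readable, _, _ = select.select([left], [], [], 1.0)
received = left.recv(1024) if left in readable else None

left.close()
right.close()
print(would_block, len(readable_before), len(writable_before), received)
```

select() plays the same role here as in the run() loop above: the writable list signals that a connection is ready for sending, and the readable list signals that a result can be fetched.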

 

Approaches 2 and 3 give similar results, but the switching happens from different vantage points. Coroutines switch internally, from inside the task; an event-loop-based asynchronous non-blocking module schedules the tasks from outside, from a god's-eye view.

 


Origin www.cnblogs.com/zenghui-python/p/11653385.html