python-- asynchronous: gevent, yield, coroutine, aiohttp

2019-7-18:

 

I need to use an asynchronous library at work, so here is a summary of what I have learned about asynchronous programming.

 

Why use asynchronous?

- The bottleneck of a web application is usually at the I/O level. Reducing the time spent waiting on reads and writes is more cost-effective than speeding up text parsing.

- For this kind of workload the advantage of multiple processes (more CPU processing power) does not show, while the cost of creating and scheduling processes can outweigh the benefit and hold the program back. Even so, a multi-process model is still much better than the earlier single-process, single-threaded model. (Besides the high creation overhead, multi-process and multi-threaded programs have a problem that is hard to cure: coordination. Without locks, the order in which processes or threads run is usually not controllable. Coroutines solve the coordination problem neatly, because scheduling between coroutines is decided by the user.)

Reference: https://zhuanlan.zhihu.com/p/25228075

 


First, asynchronous:

  An asynchronous I/O framework uses non-blocking sockets to perform concurrent operations on a single thread. Because of the GIL (Global Interpreter Lock), threads in CPython cannot execute Python code truly in parallel, so multiprocessing is used for concurrency in many cases. Multi-threaded Python code also has to synchronize access to critical sections, which is not very convenient. Using coroutines instead of threads and processes is therefore a good choice, thanks to their attractive features: they are entered and suspended explicitly, they avoid CPU context switches, and so on.
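As a rough sketch of what "non-blocking sockets on a single thread" means, the following standard-library example shows a non-blocking recv() that refuses to wait, and a selector that reports when data is actually ready. It is not taken from any particular framework, and the names left, right and would_block are made up for illustration.

```python
import selectors
import socket

# A connected pair of sockets stands in for a client and a server.
left, right = socket.socketpair()
right.setblocking(False)

# With nothing to read, a non-blocking recv() raises instead of waiting.
try:
    right.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# Register interest in "readable" and let the selector say when to read.
selector = selectors.DefaultSelector()
selector.register(right, selectors.EVENT_READ)

left.sendall(b"ping")
ready = selector.select(timeout=1.0)
received = [key.fileobj.recv(1024) for key, _mask in ready]

selector.close()
left.close()
right.close()
```

An event loop is, roughly speaking, this select-then-dispatch step repeated over many sockets, with a coroutine resumed for each socket that becomes ready.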

 

A vivid example:

Zhang loves tea. Without further ado, he boils some water.
Cast: Zhang, and two kettles (an ordinary kettle, and a whistling kettle).
1 Zhang puts the ordinary kettle on the fire and stands there waiting for the water to boil. (Synchronous blocking)
Zhang feels a little silly.

2 Zhang puts the ordinary kettle on the fire, goes to the living room to watch TV, and from time to time goes to the kitchen to check whether the water has boiled. (Synchronous non-blocking)

3 Zhang still feels a bit silly, so he goes high-end and buys a kettle that whistles. When the water boils, it makes a loud beep~~~~ noise.
Zhang puts the whistling kettle on the fire and stands there waiting for the water to boil. (Asynchronous blocking)
Zhang feels there is little point in standing and waiting like this.
 
4 Zhang puts the whistling kettle on the fire and goes to the living room to watch TV. He does not check on it until it whistles, and then he goes to fetch the kettle. (Asynchronous non-blocking)
Zhang feels smart.

Synchronous versus asynchronous applies only to the kettle.
The ordinary kettle is synchronous; the whistling kettle is asynchronous.
Both can do the job, but the whistling kettle can notify Zhang by itself once the water has boiled. An ordinary kettle cannot do that.
A synchronous kettle forces the caller to poll it himself (case 2), which makes Zhang inefficient.

Blocking versus non-blocking applies only to Zhang.
Zhang standing and waiting is blocking; Zhang watching TV is non-blocking.
In cases 1 and 3 Zhang is blocked, and he would not notice even if his wife called him. Although the kettle in case 3 is asynchronous, that means little to a Zhang who stands there waiting. Asynchronous is therefore generally used together with non-blocking, so that it can actually pay off.


Author: Yu copied
Link: https://www.zhihu.com/question/19732473/answer/23434554
Source: Zhihu
Copyright belongs to the author. For commercial reprints please contact the author for authorization; for non-commercial reprints please credit the source.
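The difference between case 2 (polling) and case 4 (being notified) in the kettle story can be sketched in Python. This is only an illustration under assumptions: a worker thread from concurrent.futures stands in for the kettle, and the function names boil_water, ordinary_kettle and whistling_kettle are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def boil_water(seconds):
    """The kettle: takes a while, then reports that the water is ready."""
    time.sleep(seconds)
    return "water is boiling"

def ordinary_kettle():
    """Case 2, synchronous non-blocking: Zhang keeps checking the kettle."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        kettle = pool.submit(boil_water, 0.05)
        checks = 0
        while not kettle.done():      # polling: Zhang has to ask every time
            checks += 1
            time.sleep(0.01)          # a bit of TV between two checks
        return kettle.result(), checks

def whistling_kettle():
    """Case 4, asynchronous non-blocking: the kettle notifies Zhang."""
    heard = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        kettle = pool.submit(boil_water, 0.05)
        # The callback is the whistle: it fires when the water is ready.
        kettle.add_done_callback(lambda done: heard.append(done.result()))
        # Zhang is free to do anything here; he never looks at the kettle.
    # Leaving the with-block waits for the worker, so the whistle has sounded.
    return heard

polled_result, checks = ordinary_kettle()
whistle_result = whistling_kettle()
```

In the polling version the caller pays for every check; in the callback version the work of noticing completion moves to the kettle.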

 

Editor: What is blocking, non-blocking, synchronous, asynchronous?

 

 

Second, early asynchronous:

  gevent with monkey patching: gevent lets us write asynchronous programs using synchronous-style logic. When we apply the monkey patch to a program, some standard modules (for example socket and thread) are dynamically replaced at runtime with asynchronous versions. Network operations in the program then work asynchronously, and efficiency naturally improves a lot.

  gevent is a third-party library that implements coroutines through greenlet. The basic idea is:

  When a greenlet encounters an I/O operation, such as network access, it automatically switches to another greenlet, and switches back at an appropriate time once the I/O operation has completed. I/O operations are very time-consuming and often leave a program waiting; with gevent switching coroutines for us automatically, there is always a greenlet running instead of waiting for I/O. Because the switching happens automatically during I/O, gevent needs to modify parts of the standard library that ships with Python. This is done by the monkey patch at program startup.

  The Python community also realized that it needed a standard library of its own to support coroutines, which is how asyncio later came about.
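To make the term "monkey patch" concrete, here is a minimal sketch that uses only the standard library. It is not how gevent is implemented (gevent's real entry point is gevent.monkey.patch_all(), which is not used here); it only shows the underlying trick of replacing a module attribute at runtime so that unmodified code picks up the new behaviour. The names fake_sleep, legacy_download and calls are hypothetical.

```python
import time

calls = []

def fake_sleep(seconds):
    """Stand-in for a cooperative sleep: records the request, does not block."""
    calls.append(seconds)

def legacy_download():
    """Plain synchronous-style code; it knows nothing about the patch."""
    time.sleep(5)
    return "done"

original_sleep = time.sleep
time.sleep = fake_sleep              # the monkey patch
try:
    result = legacy_download()       # returns at once: sleep was replaced
finally:
    time.sleep = original_sleep      # always restore the real function
```

legacy_download() was not edited, yet it no longer blocks. gevent applies the same idea to blocking calls in modules such as socket, replacing them with versions that switch to another greenlet instead of waiting.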

 

Third, the coroutine:

  A coroutine, read literally, is a routine that cooperates with others. It is a user-level unit of execution that is finer-grained than a thread. Its characteristic is that the user explicitly calls into it and explicitly exits from it: it suspends the current routine, returns a value or lets other tasks run, and later resumes from the point where it stopped. The yield statement implements exactly this: the function returns halfway, and the next call continues from the same place. (At a lower level, functions such as getcontext and swapcontext make it possible to save the current context and state and switch to another context, and the operating system's interrupt and trap mechanisms provide the foundation underneath. Features like these are what make coroutine implementations possible.)

 

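yield:

  yield is similar to return: it hands the value back to whoever called next() or send(), and thereby gives up the CPU. When the caller calls next() or send() again, execution resumes at the point where yield paused. If send() was given an argument, that argument becomes the value of the yield expression and is assigned to the variable on its left; otherwise, as with next(), the variable is assigned None.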
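The behaviour of next() and send() described above can be seen in a small sketch. The generator name accumulator and the variable names are made up for illustration.

```python
def accumulator():
    total = 0
    while True:
        value = yield total      # hand `total` to the caller and pause here
        if value is None:        # resumed by next(): nothing was sent
            continue
        total += value

gen = accumulator()
first = next(gen)        # runs to the first yield, which hands back 0
second = gen.send(10)    # value becomes 10, the next yield hands back 10
third = gen.send(5)      # value becomes 5, the next yield hands back 15
fourth = next(gen)       # value becomes None, total is unchanged: 15
```

Each call returns whatever the generator yields next, and the argument of send() shows up inside the generator as the result of the paused yield expression.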

 

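First, some background on generators:

  Anyone who has studied generators and iterators knows that Python has the yield keyword. yield turns a function into a generator. Unlike return, when yield hands back a value it saves the function's state, so the next call resumes from that state, that is, from the statement after the yield. This has many benefits. For example, if we want to produce a sequence that would take too much storage but we only need its first few elements, yield is the right tool: it computes values as the loop goes, which saves storage and improves efficiency.

"""
Example: the Fibonacci sequence
yield turns this function into a generator; whenever I need the next value, I call next() on it

"""

def fib(max):
    # n counts the values yielded so far; a and b are consecutive Fibonacci numbers
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1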
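The point about needing only the first few elements of a large sequence can be sketched with an endless variant of the generator. The name fib_forever is hypothetical; itertools.islice is the standard-library way to take a prefix of an iterator.

```python
from itertools import islice

def fib_forever():
    """Endless Fibonacci generator: values are computed only on demand."""
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b

# Only six values are ever computed, although the sequence has no end.
first_six = list(islice(fib_forever(), 6))
print(first_six)   # [1, 1, 2, 3, 5, 8]
```

A list-based version could not represent this sequence at all, whereas the generator stores only the two numbers a and b.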

 

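Solving the producer-consumer model with coroutines:

# -*- coding: utf-8 -*-
def consumer():
    status = True
    while True:
        n = yield status
        print("I got {}!".format(n))
        if n == 3:
            status = False

def producer(consumer):
    n = 5
    while n > 0:
        # The producer calls the consumer's send() method to pass n to the consumer (c).
        # Inside consumer, n = yield status receives the number sent by the producer,
        # and the consumer hands its status back to the producer through yield.
        # That status is the return value of consumer.send() here. The producer
        # immediately yields it to the main program that drives it; the main program
        # checks the status and, if it is False, stops the loop and ends the program.
        yield consumer.send(n)
        n -= 1

if __name__ == '__main__':
    # consumer contains a yield statement, so Python treats it as a generator function
    # (note: generators and coroutines are different concepts, do not confuse them).
    # Calling it does not run the body; it returns a generator object.
    c = consumer()
    # Advance consumer (the generator c) to its first yield. status = True and
    # while True: have been executed and the program is paused at n = yield status
    # (the assignment to n has not happened yet). This send(None) is essential:
    # without it the first send(n) raises an error.
    c.send(None)
    # Create the producer generator in the same way, passing in the consumer
    # generator so that producer can communicate with consumer.
    p = producer(c)
    # Run producer step by step and read the status it yields back.
    for status in p:
        if not status:
            print("I only need 3, 4 and 5")
            break
    print("Program finished")

"""
Here the producer sends the numbers 5, 4, 3, 2, 1 to the consumer in that order. The consumer receives each number and returns a status to the producer. Our consumer only needs 3, 4 and 5: once the number equals 3 it returns a False status. The main program monitors the state of the producer-consumer process and ends the program.
"""

  send(n) passes n into the generator as the value of its paused yield expression, and returns the value that the generator yields next.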
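Why the initial send(None) matters can be checked directly. The sketch below uses a stripped-down consumer; the variable names unprimed, primed and error are made up for illustration.

```python
def consumer():
    while True:
        n = yield "ready"

# Without priming, the generator has not reached its first yield yet,
# so there is no paused yield expression that could receive a value.
unprimed = consumer()
try:
    unprimed.send(1)
    error = None
except TypeError as exc:
    error = str(exc)

# With priming, send(None) (the same as next()) runs to the first yield.
primed = consumer()
first = primed.send(None)
second = primed.send(1)    # now a real value can be delivered
```

Sending a non-None value to a generator that has just been created raises TypeError, which is the error the priming call avoids.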

 

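Difference between coroutines and generators:

Some people confuse the concepts of generator and coroutine, but the two are quite different.

The most important differences first:

  • A generator always produces values, usually a sequence to iterate over

  • A coroutine is about consuming values; it is a consumer of data

  • A coroutine is not tied to iteration, while a generator is

  • A coroutine emphasizes cooperative control of program flow; a generator emphasizes saving state and producing data

What they have in common: both are functions/objects that can be re-entered repeatedly without going through return, and both are implemented with yield (suspend/resume).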
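The contrast between producing and consuming can be sketched side by side. Both functions below use yield, but countdown is driven by iteration while collector is driven by send(). The names countdown, collector and sink are hypothetical.

```python
def countdown(n):
    """Generator: produces values and is consumed by iteration."""
    while n > 0:
        yield n
        n -= 1

def collector(sink):
    """Coroutine: consumes values that are pushed in with send()."""
    while True:
        item = yield             # wait here until a value is sent
        sink.append(item * 2)

produced = list(countdown(3))    # pulled out by iterating

sink = []
co = collector(sink)
next(co)                         # advance to the first yield
for item in countdown(3):
    co.send(item)                # pushed in, one value at a time
co.close()
```

Here the generator decides what comes out, and the coroutine decides what happens to whatever comes in.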

 

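Fourth, asyncio:

  asyncio is an asynchronous I/O library introduced in Python 3.4. To keep things simple, you need to understand two things: coroutines and the event loop. Coroutines are like functions, but they can be paused and resumed at certain points in the code. This is used to pause a coroutine while it waits for I/O (for example an HTTP request) and to run another coroutine in the meantime. We use the await keyword (equivalent to yield from) to state that we want the return value of a coroutine. The event loop coordinates the execution of coroutines.

  asyncio is a module added in Python 3.4. It provides a mechanism for writing concurrent code in a single thread using coroutines and I/O multiplexing.

According to the official documentation, the asyncio module mainly includes:

    • an event loop with system-specific implementations;

    • transport and protocol abstractions (similar to those in Twisted);

    • support for TCP, UDP, SSL, subprocess pipes, delayed calls and others;

    • the Future class;

    • support for yield from;

    • synchronization primitives;

    • an interface for handing work off to a thread pool;

  Python 3.5 introduced the async and await syntax. "async def" defines a coroutine, playing a role similar to the @asyncio.coroutine decorator in Python 3.4 (that decorator was removed in Python 3.11), and await is the equivalent of "yield from".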

 

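Here is an example:

"""
  When the event loop starts running, it looks among its tasks for a coroutine to run. print_sum() was handed to the event loop, so print_sum() is called and executes result = await compute(x, y) (equivalent to result = yield from compute(x, y)).
  compute() is itself a coroutine, so print_sum() is suspended and control passes into compute(), which executes its print statement and prints "Compute %s + %s ...", then executes await asyncio.sleep(1.0).
  asyncio.sleep() is also a coroutine, so compute() is suspended in turn while the timer counts down. During this 1 second the event loop looks for other coroutines that can be scheduled, but print_sum() and compute() are both suspended, so the event loop simply waits.
  When the timer expires, control returns to compute(), which executes its return statement. The result is delivered to result in print_sum() and printed. There is nothing left to schedule, so loop.close() closes the event loop and the program ends.
"""
import asyncio

async def compute(x, y):
    print("Compute %s + %s ..." % (x, y))
    await asyncio.sleep(1.0)
    return x + y

async def print_sum(x, y):
    result = await compute(x, y)
    print("%s + %s = %s" % (x, y, result))

# new_event_loop() creates the loop explicitly; recent Python versions no longer
# create one implicitly in get_event_loop(). asyncio.run() is the modern shortcut.
loop = asyncio.new_event_loop()
loop.run_until_complete(print_sum(1, 2))
loop.close()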

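The benefit of the event loop only shows when more than one coroutine is waiting. As a sketch under assumptions (the coroutine name fetch, the log list and the delays are made up, and asyncio.sleep stands in for real I/O), two coroutines run concurrently with asyncio.gather:

```python
import asyncio

async def fetch(name, delay, log):
    """Pretend to do I/O: record start and end, sleeping in between."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)       # suspended here, the loop runs others
    log.append(f"{name} done")
    return name

async def main():
    log = []
    # Both coroutines are started before either one finishes.
    results = await asyncio.gather(
        fetch("slow", 0.05, log),
        fetch("fast", 0.01, log),
    )
    return results, log

results, log = asyncio.run(main())
```

The log shows "slow" and "fast" both starting before either is done, and "fast" finishing first, while gather still returns the results in the order the coroutines were passed in.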
 

 


 

 

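Fifth, AIOHTTP: an asynchronous HTTP client/server for asyncio and Python

https://aiohttp.readthedocs.io/en/stable/

Optional cchardet as a faster replacement for chardet

Optional aiodns for fast DNS resolution

A library that supports non-blocking asynchronous I/O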

 


Origin www.cnblogs.com/marvintang1001/p/11227756.html