Notes: Python coroutines

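1.      Coroutines

1.1.    The concept of a coroutine

A coroutine, also known as a micro-thread or fiber, is a lightweight thread that lives entirely in user space.

Threads exist at the system level and are scheduled by the operating system. Coroutines exist at the program level and are scheduled by the programmer as needed. If we call each function running inside a thread a subroutine, then a subroutine can be interrupted partway through so that another subroutine runs, and that other subroutine can in turn be interrupted so that the first one resumes. That is a coroutine. In other words, a block of code <1> within a single thread can stop in the middle, jump to another block of code, and, when control returns to block <1>, continue from exactly where it stopped.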

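A more formal description:

  A coroutine has its own register context and stack. When the scheduler switches coroutines, the register context and stack are saved elsewhere, and they are restored when control switches back. A coroutine can therefore keep the state of its previous invocation (that is, one particular combination of all its local state). Each re-entry resumes that state, or, put differently, resumes at the position in the logic flow where the coroutine last left off.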
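The state-preserving behaviour described above can be seen with a plain Python generator. This is a minimal sketch; the names running_total, acc, first and second are made up for illustration.

```python
def running_total():
    # `total` is local state; it survives every pause at `yield`.
    total = 0
    while True:
        value = yield total   # pause here, resume when send() is called
        total += value

acc = running_total()
next(acc)               # run up to the first yield
first = acc.send(10)    # resumes after the yield; total is now 10
second = acc.send(5)    # resumes again with total still 10, so it becomes 15
print(first, second)
```

Each send() re-enters the function at the yield where it last stopped, with its local variables intact.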

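My own understanding:

A program is a combination of tasks. Processes and threads are resource units at the operating-system level; they are the standard way of dividing tasks, and programs are designed and scheduled around them.

Sometimes we need more flexible scheduling between functions without passing a pile of variables back and forth. For that, the language runtime provides context management: a function can be suspended, control can switch to another function, and control can later switch back so the first function continues. This mechanism is what we call coroutines.

In theory the same scheduling could be done with ordinary functions, but that would mean copying a lot of data around. It is simpler to build it into the language: the context is saved when switching away, and execution continues with the next statement when switching back.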

1.2.    Implementing coroutines with yield

Code:

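def consumer(name):
    print(name)
    r = ''
    while True:
        # Pause here: hand r back to the caller and wait for the next send().
        n = yield r
        if not n:
            return
        print('consumer:%s %s ' % (n, name))
        r = '200 OK'

def produce(c, d):
    # Prime both generators so that each runs up to its first yield.
    c.send(None)
    d.send(None)
    n = 0
    while n < 5:
        n = n + 1
        print('producer:%s' % n)
        r = c.send(n)
        d.send(n)
        print('producer: consumer return %s' % r)
    # Shut both consumers down once production is finished.
    c.close()
    d.close()

c = consumer('B')
d = consumer('A')
produce(c, d)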
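produce() ends by calling close() on each consumer. As a small sketch of what that call does: close() raises GeneratorExit at the yield where the generator is paused, so a try/finally block inside the generator gets a chance to clean up. The names logging_consumer and log are invented for this illustration.

```python
def logging_consumer(log):
    try:
        while True:
            item = yield          # paused here between send() calls
            log.append("got %s" % item)
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        log.append("cleanup")

log = []
c = logging_consumer(log)
next(c)       # prime the generator
c.send(1)
c.send(2)
c.close()     # triggers the finally block
print(log)
```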

2.      greenlet

2.1.    Getting started

Official documentation: https://greenlet.readthedocs.io/en/latest/

The “greenlet” package is a spin-off of Stackless, a version of CPython that supports micro-threads called “tasklets”.

Greenlets are provided as a C extension module for the regular unmodified interpreter.

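Usage example 1:

from greenlet import greenlet

def test1():
    print(12)
    gr2.switch()   # jump to test2
    print(34)

def test2():
    print(56)
    gr1.switch()   # jump back into test1, right after gr2.switch()
    print(78)      # never reached: nothing switches back to gr2

gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch()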

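Execution switches to gr1, then to gr2, then back to gr1, which runs to completion. When gr1 finishes, control returns to its parent (the main greenlet) rather than to test2, so print(78) is never executed.

A greenlet can be thought of as an environment wrapped around the running code. These environments are nested, and execution can switch between them.

Official explanation: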

A “greenlet” is a small independent pseudo-thread. Think about it as a small stack of frames; the outermost (bottom) frame is the initial function you called, and the innermost frame is the one in which the greenlet is currently paused. You work with greenlets by creating a number of such stacks and jumping execution between them. Jumps are never implicit: a greenlet must choose to jump to another greenlet, which will cause the former to suspend and the latter to resume where it was suspended. Jumping between greenlets is called “switching”.

When you create a greenlet, it gets an initially empty stack; when you first switch to it, it starts to run a specified function, which may call other functions, switch out of the greenlet, etc. When eventually the outermost function finishes its execution, the greenlet’s stack becomes empty again and the greenlet is “dead”. Greenlets can also die of an uncaught exception.

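Since greenlets are nested, there must be parent, child and sibling relationships, and the order of execution becomes an important question.

Remember, switches are not calls, but transfer of execution between parallel “stack containers”, and the “parent” defines which stack logically comes “below” the current one.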
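A minimal sketch of the parent relationship, assuming the greenlet package is installed. The names child, parent_fn and log are made up for illustration. The child is created with an explicit parent, so when it finishes, execution goes to that parent and not to the greenlet that happened to create it.

```python
from greenlet import greenlet

log = []

def child():
    log.append("child runs")
    return "child result"        # sent to the parent when this greenlet dies

def parent_fn():
    log.append("parent starts")
    value = gr_child.switch()    # child finishes, so control comes back here
    log.append("parent got: " + value)
    return "parent result"       # sent on to the main greenlet

gr_parent = greenlet(parent_fn)                 # its parent is the main greenlet
gr_child = greenlet(child, parent=gr_parent)    # explicit parent
result = gr_parent.switch()
print(log, result)
```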

2.2.    Important attributes

dir(greenlet) lists the following attributes:

['GREENLET_USE_GC', 'GREENLET_USE_TRACING', 'GreenletExit', '_C_API', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__version__', 'error', 'getcurrent', 'gettrace', 'greenlet', 'settrace']

greenlet.greenlet is the greenlet type, which supports the following operations:

greenlet(run=None, parent=None)

Create a new greenlet object (without running it). run is the callable to invoke, and parent is the parent greenlet, which defaults to the current greenlet.

greenlet.getcurrent()

Returns the current greenlet (i.e. the one which called this function).

greenlet.GreenletExit

This special exception does not propagate to the parent greenlet; it can be used to kill a single greenlet.

The greenlet type can be subclassed, too. A greenlet runs by calling its run attribute, which is normally set when the greenlet is created; but for subclasses it also makes sense to define a run method instead of giving a run argument to the constructor.

Now look at greenlet.greenlet:

>>> dir(greenlet.greenlet)

['GreenletExit', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_stack_saved', 'dead', 'error', 'getcurrent', 'gettrace', 'gr_frame', 'parent', 'run', 'settrace', 'switch', 'throw']

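  1. run: the callable invoked when the greenlet starts; the attribute no longer exists once the greenlet has started;
  2. switch(*args, **kwargs): switches execution to this greenlet;
  3. parent: the parent greenlet; readable and writable, but cycles are not allowed;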
  4. gr_frame:the current top frame, or None;
  5. dead: True if greenlet is dead(it finished its execution).
  6. bool(g): True if g is active, False if it is dead or not yet started.
  7. throw([typ, [val,[tb]]]):

Switches execution to the greenlet g, but immediately raises the given exception in g. If no argument is provided, the exception defaults to greenlet.GreenletExit. The normal exception propagation rules apply, as described above. Note that calling this method is almost equivalent to the following:

def raiser():
    # Python 3 spelling of the Python 2 statement `raise typ, val, tb`
    raise typ(val).with_traceback(tb)

g_raiser = greenlet(raiser, parent=g)
g_raiser.switch()

except that this trick does not work for the greenlet.GreenletExit exception, which would not propagate from g_raiser to g.
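A small sketch of throw() in use, assuming the greenlet package is installed. The names worker, first and second are made up for illustration. The exception is raised inside the target greenlet at the point where it was suspended, and throw() itself returns whatever is eventually sent back, just like switch().

```python
from greenlet import greenlet, getcurrent

main = getcurrent()

def worker():
    try:
        main.switch("ready")            # suspend and hand control back to main
    except ValueError as exc:
        return "caught: %s" % exc       # return value goes to the parent
    return "no exception"

g = greenlet(worker)
first = g.switch()                      # worker runs until main.switch("ready")
second = g.throw(ValueError("boom"))    # raised inside worker at the paused switch
print(first, second, g.dead)
```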

2.3.    switch

There is not much to say about a plain switch, but a switch that carries arguments is a different matter.

Switches between greenlets occur when the method switch() of a greenlet is called, in which case execution jumps to the greenlet whose switch() is called, or when a greenlet dies, in which case execution jumps to the parent greenlet. During a switch, an object or an exception is “sent” to the target greenlet; this can be used as a convenient way to pass information between greenlets. For example:

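from greenlet import greenlet

def test1(x, y):
    # switch() returns whatever is later sent back to gr1 (42 here).
    z = gr2.switch(x+y)
    print(z)

def test2(u):
    print(u)          # prints "hello world"
    gr1.switch(42)

gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch("hello", " world")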

Rules for passing values when switching:

g.switch(*args, **kwargs)

Switches execution to the greenlet g, sending it the given arguments. As a special case, if g did not start yet, then it will start to run now.

Dying greenlet

If a greenlet’s run() finishes, its return value is the object sent to its parent. If run() terminates with an exception, the exception is propagated to its parent (unless it is a greenlet.GreenletExit exception, in which case the exception object is caught and returned to the parent).

Note the statement z = gr2.switch(x+y): it works much like send() on a generator, in that the call receives the value passed back when control returns.

Note that any attempt to switch to a dead greenlet actually goes to the dead greenlet’s parent, or its parent’s parent, and so on. (The final parent is the “main” greenlet, which is never dead.)

In short, an attempt to switch to a dead greenlet ends up in that greenlet's parent (or a further ancestor).

2.4.    greenlets

Greenlets live inside threads. Each thread can have one main greenlet and a tree of descendant greenlets. Greenlets cannot be switched across threads.
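A minimal sketch of the per-thread main greenlet, assuming the greenlet package is installed. The names worker and results are made up for illustration. The code only compares identities and does not attempt a cross-thread switch.

```python
import threading
from greenlet import getcurrent

main_greenlet = getcurrent()   # main greenlet of the thread running this code
results = {}

def worker():
    current = getcurrent()     # main greenlet of the worker thread
    results["same_as_main"] = current is main_greenlet
    results["parent_is_none"] = current.parent is None

t = threading.Thread(target=worker)
t.start()
t.join()
print(results)
```

The worker thread gets its own main greenlet, which is a different object from the one in the starting thread and has no parent.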

2.5.    Garbage collection

If all the references to a greenlet object go away (including the references from the parent attribute of other greenlets), then there is no way to ever switch back to this greenlet. In this case, a GreenletExit exception is generated into the greenlet. This is the only case where a greenlet receives the execution asynchronously. This gives try:finally: blocks a chance to clean up resources held by the greenlet. 

When all references to a greenlet are gone, a GreenletExit exception is raised inside it.

Code example:

from greenlet import greenlet, GreenletExit

huge = []

def show_leak():
    def test1():
        gr2.switch()

    def test2():
        huge.extend([x * x for x in range(100)])
        try:
            gr1.switch()   # gr2 stays suspended here
        finally:
            # Runs when GreenletExit is raised into the suspended greenlet.
            print('finish switch del huge')
            del huge[:]

    gr1 = greenlet(test1)
    gr2 = greenlet(test2)
    gr1.switch()
    gr1 = gr2 = None       # drop the last references to both greenlets
    print('length of huge is zero ? %s' % len(huge))

if __name__ == '__main__':
    show_leak()

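In the code above, test2 does not finish normally: gr2 stays suspended at the gr1.switch() call inside the try block. After gr1 = gr2 = None, the reference counts of gr1 and gr2 both drop to zero, so a GreenletExit exception is raised at that suspended gr1.switch() call, which gives the finally clause a chance to run. As noted earlier in the description of the greenlet module, GreenletExit is not propagated to the parent, so the main greenlet does not see an exception either.

  This looks like it solves the problem, but it asks too much of the programmer, and something will eventually be overlooked. The best approach is still to make sure every greenlet finishes normally.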

Greenlets do not participate in garbage collection; cycles involving data that is present in a greenlet’s frames will not be detected. Storing references to other greenlets cyclically may lead to leaks.

Again, making sure greenlets finish normally is the more reliable approach.
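One way to end a suspended greenlet deliberately, instead of relying on reference counts, is to call throw() with no arguments, which raises GreenletExit inside it. This is a minimal sketch assuming the greenlet package is installed. The names worker and events are made up for illustration.

```python
from greenlet import greenlet, getcurrent, GreenletExit

events = []
main = getcurrent()

def worker():
    events.append("acquired")
    try:
        main.switch("paused")        # suspend and hand control back to main
    finally:
        events.append("released")    # runs when GreenletExit arrives

g = greenlet(worker)
status = g.switch()    # worker runs until it switches back to main
result = g.throw()     # no arguments: raises GreenletExit inside worker
print(status, events, g.dead)
```

Because GreenletExit is caught at the greenlet boundary, the exception object is returned from throw() rather than raised in the main greenlet.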

2.6.    tracing support

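Debugging greenlets requires untangling how they relate to one another. Fortunately you do not have to trace this by hand every time, because the greenlet module provides support for it.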

greenlet.gettrace()

Returns a previously set tracing function, or None.

greenlet.settrace(callback)

Sets a new tracing function and returns a previous tracing function, or None. The callback is called on various events and is expected to have the following signature:

Code example:

def callback(event, args):
    if event == 'switch':
        origin, target = args
        # Handle a switch from origin to target.
        # Note that callback is running in the context of target
        # greenlet and any exceptions will be passed as if
        # target.throw() was used instead of a switch.
        return
    if event == 'throw':
        origin, target = args
        # Handle a throw from origin to target.
        # Note that callback is running in the context of target
        # greenlet and any exceptions will replace the original, as
        # if target.throw() was used with the replacing exception.
        return

Code example:

from greenlet import greenlet, getcurrent, settrace

def callback_t(event, args):
    # args is (origin, target) for both 'switch' and 'throw' events.
    print('{} from {} to {}'.format(event, id(args[0]), id(args[1])))

def test1(x, y):
    z = gr2.switch(x+y)
    print(z)

def test2(u):
    print(u)
    gr1.switch('wefweeger')

main = getcurrent()
gr1 = greenlet(test1)
gr2 = greenlet(test2)
print('main is {}, gr1 is {}, gr2 is {}.'.format(id(main), id(gr1), id(gr2)))

oldtrace = settrace(callback_t)
gr1.switch("hell", " world")
settrace(oldtrace)   # restore the previous tracing function

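2.7.    Summary

  1. A greenlet is similar to a stack;
  2. Once created, a greenlet must be allowed to finish. Do not switch out of it and never come back, or memory leaks become likely;
  3. Greenlets cannot cross threads;
  4. Circular references must be avoided, as the official documentation states explicitly.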

3.      gevent

doc address: http://www.gevent.org/contents.html

gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libev or libuv event loop.

This topic is fairly long, so it gets its own chapter.

Reposted from www.cnblogs.com/wodeboke-y/p/10035638.html