Detailed explanation of Python generators and coroutines

Starting today, we move on to one of the harder parts of Python: coroutines.

To understand coroutines, I looked through a lot of material online, and found it hard to locate a single systematic and comprehensive article. As a result, when we study the topic we often end up with only half an understanding and are still confused afterwards.

The first step in learning coroutines is to understand generators. Only with a solid grounding in generators can we understand coroutines properly.

If you are a novice, you may have heard of iterators, but generators are probably less familiar. That doesn't matter: after reading this series of articles, you too can make the transition from beginner to Python master.

1. Iterable, iterator, generator

When I first learned Python, I was really confused about these three things. I even thought they were equivalent.

In fact, they are not the same.

Iterable objects are easy to understand, and we already know many of them: strings, lists, dicts, tuples, deques, and so on.

To verify this, we can use isinstance() together with the collections.abc module (not available in Python 2) to check whether an object is an iterable (Iterable), an iterator (Iterator), or a generator (Generator).

These checks work for the examples here, but they are not reliable in every case. The reason is given in the supplementary notes below.

import collections
from collections.abc import Iterable, Iterator, Generator

# String
astr = "XiaoMing"
print("String: {}".format(astr))
print(isinstance(astr, Iterable))
print(isinstance(astr, Iterator))
print(isinstance(astr, Generator))

# List
alist = [21, 23, 32, 19]
print("List: {}".format(alist))
print(isinstance(alist, Iterable))
print(isinstance(alist, Iterator))
print(isinstance(alist, Generator))

# Dict
adict = {"name": "Xiao Ming", "gender": "male", "age": 18}
print("Dict: {}".format(adict))
print(isinstance(adict, Iterable))
print(isinstance(adict, Iterator))
print(isinstance(adict, Generator))

# deque
adeque = collections.deque('abcdefg')
print("deque: {}".format(adeque))
print(isinstance(adeque, Iterable))
print(isinstance(adeque, Iterator))
print(isinstance(adeque, Generator))

Output result

String: XiaoMing
True
False
False

List: [21, 23, 32, 19]
True
False
False

Dict: {'name': 'Xiao Ming', 'gender': 'male', 'age': 18}
True
False
False

deque: deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
True
False
False

As the results show, these iterable objects are neither iterators nor generators. What they have in common is that they can all be traversed with a for loop. Everyone knows this, so we will not verify it here.

Regarding iterable objects, a few points need to be added:

  1. You can inspect an object with dir(). If __iter__ is listed, the object is iterable; if it is not listed, you still cannot conclude that the object is not iterable. For the reason, see point 2.
  2. Do not decide whether an object is iterable just by looking for __iter__, because an object that implements only __getitem__ can also be iterable. When there is no __iter__, the Python interpreter falls back to __getitem__ and tries to fetch the elements in order, starting from index 0. If that works without an exception, the object is iterable.
  3. Therefore, the most reliable check is to actually run the object through a for loop or call iter() on it.
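Points 1 to 3 can be seen in a small sketch. The class name Squares is made up for this illustration; it defines only __getitem__, so the isinstance() check says "not iterable" even though a for loop or iter() works fine.

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical class that defines __getitem__ but not __iter__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # Raising IndexError tells the interpreter the sequence has ended
        if index >= self.n:
            raise IndexError(index)
        return index * index


squares = Squares(4)
print("__iter__" in dir(squares))     # False
print(isinstance(squares, Iterable))  # False
print(list(squares))                  # [0, 1, 4, 9]
print(type(iter(squares)).__name__)   # iter() succeeds and returns an iterator
```

So the isinstance() check is a convenient shortcut, while iter() is the check that matches what a for loop really does.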

Next is the iterator. Compared with an iterable object, an iterator has just one more method: __next__(). With it, we no longer depend on a for loop to fetch elements; we can get the values one at a time by calling next() directly.

An iterator is built on top of an iterable. To create an iterator, we first need an iterable object. Let's see how to create an iterable object and then create an iterator from it.

from collections.abc import Iterable, Iterator

class MyList(object):  # an iterable class

    def __init__(self, num):
        self.end = num  # upper bound

    # Return an instance of an iterator class that implements __iter__ and __next__
    def __iter__(self):
        return MyListIterator(self.end)


class MyListIterator(object):  # the iterator class

    def __init__(self, end):
        self.data = end  # upper bound
        self.start = 0

    # Return the iterator for this object; it is already an iterator, so return self
    def __iter__(self):
        return self

    # Every iterator class must implement this method; in Python 2 it is named next()
    def __next__(self):
        if self.start < self.data:
            self.start += 1
            return self.start - 1
        raise StopIteration


if __name__ == '__main__':
    my_list = MyList(5)  # an iterable object
    print(isinstance(my_list, Iterable))  # True
    print(isinstance(my_list, Iterator))  # False
    # Iterate with a for loop
    for i in my_list:
        print(i)

    my_iterator = iter(my_list)  # an iterator
    print(isinstance(my_iterator, Iterable))  # True
    print(isinstance(my_iterator, Iterator))  # True

    # Iterate by calling next()
    print(next(my_iterator))
    print(next(my_iterator))
    print(next(my_iterator))
    print(next(my_iterator))
    print(next(my_iterator))

Output

True
False

0
1
2
3
4

True
True

0
1
2
3
4

If the code above is too long, the shorter example below may be easier to follow.

from collections.abc import Iterator

aStr = 'abcd'  # a string, which is an iterable object
aIterator = iter(aStr)  # iter() turns the iterable into an iterator
print(isinstance(aIterator, Iterator))  # True
print(next(aIterator))  # a
print(next(aIterator))  # b
print(next(aIterator))  # c
print(next(aIterator))  # d

Supplementary notes:

  1. An iterator implements the __next__ magic method internally (Python 3.x).
  2. You can use dir() to check whether __next__ is present and so determine whether a variable is an iterator.
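To make the two notes concrete, here is a short sketch (the variable names are only for illustration). It checks for __next__ with dir(), and also shows what happens when the iterator runs out of elements.

```python
letters = "ab"
letters_iter = iter(letters)

# The string itself has no __next__, the iterator made from it does
print("__next__" in dir(letters))       # False
print("__next__" in dir(letters_iter))  # True

print(next(letters_iter))  # a
print(next(letters_iter))  # b

# Once the elements are used up, next() raises StopIteration
try:
    next(letters_iter)
except StopIteration:
    print("exhausted")

# next() also accepts a default that is returned instead of raising
print(next(letters_iter, "no more"))  # no more
```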

Next is our main focus: the generator.

The concept of generators first appeared in Python 2.2. Generators were introduced to provide a structure that computes the next value on demand without wasting space.

We said earlier that an iterator builds on the iterable by adding a next() method. A generator, in turn, builds on the iterator (it can be used in a for loop and with next()) and adds yield.

What is yield? It plays a role similar to return in a function. On every next() call or for iteration, yield hands back a new value and the function pauses at that point, waiting for the next call. It is precisely this mechanism that makes generators shine in Python programming: it enables memory savings and asynchronous programming.

There are two main ways to create a generator:
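The pause-and-resume behaviour can be observed directly. In this rough sketch (the function name two_steps and the log list are made up for the demonstration), the function body records each stage it reaches, so we can see that nothing runs until next() is called and that execution stops at each yield.

```python
def two_steps(log):
    log.append("body started")
    yield "first"
    log.append("resumed after first yield")
    yield "second"
    log.append("resumed after second yield")


log = []
gen = two_steps(log)
print(log)        # [] : calling the function runs none of its body yet

print(next(gen))  # first
print(log)        # ['body started']

print(next(gen))  # second
print(log)        # ['body started', 'resumed after first yield']
```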

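  • Use a generator expression
from collections.abc import Generator

# A generator expression looks like a list comprehension, but uses () instead of []
L = (x * x for x in range(10))
print(isinstance(L, Generator))  # True
  • Write a function that uses yield
from collections.abc import Generator

# A function that contains yield is a generator function
def mygen(n):
    now = 0
    while now < n:
        yield now
        now += 1

if __name__ == '__main__':
    gen = mygen(10)
    print(isinstance(gen, Generator))  # True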
Containers such as lists build all of their values up front and keep them in memory, whereas a generator produces each element only at the moment it is requested, which saves both time and space.
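The space saving is easy to check with sys.getsizeof(). This is only a rough sketch: the exact byte counts depend on the Python version and platform, and getsizeof() does not count the integers stored in the list, so the real gap is even larger than the numbers suggest.

```python
import sys

n = 100_000

squares_list = [x * x for x in range(n)]  # all values built immediately
squares_gen = (x * x for x in range(n))   # values produced on demand

print(sys.getsizeof(squares_list))  # grows with n
print(sys.getsizeof(squares_gen))   # small, and does not depend on n

# Both give the same values when consumed
print(sum(squares_list) == sum(squares_gen))  # True
```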

2. How to run/activate the generator

Since a generator does not produce all of its elements at once, but runs and returns one element at a time, how do we start (or activate) it?

There are two main ways to activate a generator:

  • use next()
  • use generator.send(None)

An example of each makes this clear.

def mygen(n):
    now = 0
    while now < n:
        yield now
        now += 1

if __name__ == '__main__':
    gen = mygen(4)

    # Alternate between the two calls to show that they are equivalent.
    print(gen.send(None))
    print(next(gen))
    print(gen.send(None))
    print(next(gen))

Output

0
1
2
3
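One detail worth adding: the value passed to the very first send() must be None, because a generator that has not started yet is not paused at any yield that could receive a value. A small sketch, reusing mygen from above:

```python
def mygen(n):
    now = 0
    while now < n:
        yield now
        now += 1


fresh = mygen(3)

# Sending a real value before the generator has started is rejected
try:
    fresh.send(1)
except TypeError as exc:
    print("TypeError:", exc)

# The generator is still unstarted, so it can be activated normally
print(fresh.send(None))       # 0
# Now it is paused at a yield; the sent value is simply discarded by mygen
print(fresh.send("ignored"))  # 1
```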

3. The execution status of the generator

A generator goes through the following four states in its life cycle:

  • GEN_CREATED: waiting to start execution
  • GEN_RUNNING: currently being executed by the interpreter (this state is visible only while the generator body is actually running, for example from inside the generator itself or from another thread)
  • GEN_SUSPENDED: paused at a yield expression
  • GEN_CLOSED: execution has ended

Let's get a feel for these states through code. To keep the code easy to follow, I will not give an example of the GEN_RUNNING state here. Interested readers can try it with multiple threads. If you have any questions, feel free to send me a message.

from inspect import getgeneratorstate

def mygen(n):
    now = 0
    while now < n:
        yield now
        now += 1

if __name__ == '__main__':
    gen = mygen(2)
    print(getgeneratorstate(gen))

    print(next(gen))
    print(getgeneratorstate(gen))

    print(next(gen))
    gen.close()  # manually close/end the generator
    print(getgeneratorstate(gen))

Output

GEN_CREATED
0
GEN_SUSPENDED
1
GEN_CLOSED
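For readers curious about GEN_RUNNING, it can be observed without threads by asking for the state from inside the generator body while it is executing. This is only a sketch; the function report_own_state and the one-element list holder are hypothetical names, used so the generator can get a reference to itself.

```python
from inspect import getgeneratorstate


def report_own_state(holder):
    # holder[0] is this very generator, filled in by the caller before next()
    yield getgeneratorstate(holder[0])


holder = []
gen = report_own_state(holder)
holder.append(gen)

print(getgeneratorstate(gen))  # GEN_CREATED
print(next(gen))               # GEN_RUNNING (measured from inside the body)
print(getgeneratorstate(gen))  # GEN_SUSPENDED
next(gen, None)                # run to the end
print(getgeneratorstate(gen))  # GEN_CLOSED
```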

4. Generator exception handling

While a generator is running, if the condition for producing another element is no longer satisfied, it will (and should) raise an exception: StopIteration.

A generator built with a generator expression already raises this exception for us automatically. If you don't believe it, try it and see.
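A minimal check, using a two-element generator expression (the variable name squares is only for illustration):

```python
squares = (x * x for x in range(2))

print(next(squares))  # 0
print(next(squares))  # 1

# No elements are left, so the third call raises StopIteration
try:
    next(squares)
except StopIteration:
    print("StopIteration raised")
```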

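When we define a generator function ourselves, the same exception must reach the caller when no more elements can be produced. We do not raise it by hand, though: once the function body finishes (or executes return), Python raises StopIteration for the caller. Writing raise StopIteration inside the generator body, as older tutorials did, is an error since Python 3.7 (PEP 479), where it is converted into a RuntimeError. So the code above only needs a return:

def mygen(n):
    now = 0
    while now < n:
        yield now
        now += 1
    # Returning ends the generator; Python raises StopIteration for the caller.
    # An explicit "raise StopIteration" here would become RuntimeError (PEP 479).
    return

if __name__ == '__main__':
    gen = mygen(2)
    next(gen)
    next(gen)
    next(gen)  # raises StopIteration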
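A word of caution for anyone running this on Python 3.7 or later: under PEP 479, a StopIteration raised explicitly inside a generator body does not reach the caller as StopIteration but is converted into RuntimeError. The sketch below compares the two endings; the names explicit_raise, plain_return, and drain are made up for this comparison.

```python
def explicit_raise(n):
    now = 0
    while now < n:
        yield now
        now += 1
    raise StopIteration  # converted to RuntimeError on Python 3.7+


def plain_return(n):
    now = 0
    while now < n:
        yield now
        now += 1
    return  # Python raises StopIteration for the caller


def drain(gen):
    """Collect all values and report how the generator ended."""
    values = []
    try:
        for value in gen:
            values.append(value)
    except RuntimeError:
        return values, "RuntimeError"
    return values, "finished normally"


print(drain(explicit_raise(2)))  # ([0, 1], 'RuntimeError')
print(drain(plain_return(2)))    # ([0, 1], 'finished normally')
```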

5. Transition from generator to coroutine: yield

From the introduction above, we know that generators gave us the ability to pause a function's execution (yield). Once pausing was possible, people wondered whether something could be sent into the generator while it is paused (in fact, this was already hinted at above with send(None)). The ability to send information to a paused generator entered Python 2.5 through PEP 342, and it gave birth to coroutines in Python. According to the definition on Wikipedia:

Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations.

Note that, in essence, a coroutine is not a language-level concept but a programming-model concept.

Coroutines and threads have similarities: like threads, multiple coroutines only run in an interleaved, serial fashion. They also have differences: threads must frequently switch, lock, and unlock, which, in terms of complexity and efficiency, is a real pain point compared with coroutines. By using yield to pause a generator, a coroutine can hand the flow of execution over to other subroutines, so that different subroutines take turns running.
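The idea of subroutines taking turns can be sketched with plain generators and a tiny round-robin loop. This is only an illustration under my own assumptions, not a real scheduler; the names worker and round_robin are hypothetical.

```python
from collections import deque


def worker(name, steps):
    # Each yield hands control back to the loop that is driving the workers
    for step in range(steps):
        yield "{}:{}".format(name, step)


def round_robin(*generators):
    """Resume each generator in turn until all of them are finished."""
    queue = deque(generators)
    trace = []
    while queue:
        gen = queue.popleft()
        try:
            trace.append(next(gen))
        except StopIteration:
            continue  # this worker is done, do not put it back
        queue.append(gen)
    return trace


print(round_robin(worker("A", 2), worker("B", 3)))
# ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Everything runs in a single thread, so no locks are needed: a worker gives up control only at its own yield.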

Let's take a look at a concise demonstration of how to send messages to the generator.

def jumping_range(N):
    index = 0
    while index < N:
        # The value passed in through send() is assigned to jump
        jump = yield index
        if jump is None:
            jump = 1
        index += jump

if __name__ == '__main__':
    itr = jumping_range(5)
    print(next(itr))
    print(itr.send(2))
    print(next(itr))
    print(itr.send(-1))

Output.

0
2
3
2

Here is an explanation of why we get this output. The key is the statement jump = yield index.

It can be split into two parts:

  • yield index returns index to the external caller.
  • jump = yield receives the value that the external caller sends through send() and assigns it to jump.
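The same two-part pattern (hand a value out, take a value in) is what lets a generator act as a coroutine. As a further sketch, here is a running-average coroutine; the name averager is my own choice for the example, and it assumes the caller primes it with next() before the first send().

```python
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand the current average out, then wait for the next value to come in
        value = yield average
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)            # prime the coroutine: run it up to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0
avg.close()          # the loop never ends by itself, so close it explicitly
```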

All of the above is the basic knowledge needed before we can talk about coroutine concurrency. Please be sure to practice it and understand it yourself. Otherwise, the content that follows will feel dry and obscure.

Origin blog.csdn.net/weixin_36338224/article/details/109231279