Detailed explanation of Iterable, Iterator, Generator in Python

This article is mainly based on a video by Gao Tian, a programmer who posts on Bilibili.

This article discusses three concepts in Python: iterables, iterators, and generators. Most Python programmers have heard of all three, but many lack a deep understanding of them.

for loop

lst = [1, 2, 3]
for item in lst:
  # do something
  pass

The for loop is one of the most common constructs in Python. Even beginners who have studied Python for half a day can use it fluently, because its semantics are so easy to grasp: for item in lst takes the items out of lst one by one and processes them. For a list, whose elements are stored one after another in order, this is very natural. But what about a dictionary, which is a mapping rather than a sequence? Or a more complex object such as a file? In Python all of these can be traversed with a for loop. How is this done? What is going on behind the for loop?

In fact, what happens behind the for loop (at a certain level of abstraction) is not complicated, and understanding it goes a long way toward understanding iterables, iterators, and generators in Python.

The action behind the for loop

It can be viewed as two steps:

  1. First, call iter on lst to obtain an iterator:

    iterator = iter(lst)
    
  2. Then call next on the iterator repeatedly to take out the elements one by one:

    item0 = next(iterator)
    item1 = next(iterator)
    # ...
    

    This continues until next raises a StopIteration exception, which ends the loop.
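
The two steps can be sketched as a hand-written equivalent of the for loop. This is a simplified model rather than the interpreter's actual implementation, and the name collected is only an illustrative stand-in for the loop body.

```python
lst = [1, 2, 3]

# Step 1: obtain an iterator from the iterable.
iterator = iter(lst)

collected = []
# Step 2: call next() repeatedly until StopIteration is raised.
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break  # a real for loop also swallows this exception and exits
    collected.append(item)  # stands in for the loop body

print(collected)  # [1, 2, 3]
```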

Here iter and next are two built-in Python functions. They act on an iterable and an iterator respectively: iter obtains an iterator from an iterable, and next fetches elements from the iterator one at a time. In our example, lst is the iterable and iterator is the iterator it produces.

So, back to the original question: how does the for loop traverse complex data structures such as dictionaries and file objects? How does Python know how to get the next element of the data structure? This comes down to magic methods: __iter__ (or __getitem__), which an iterable must implement, and __next__, which an iterator must implement.

Iterable

As just mentioned, in for xxx in yyy, yyy must be an iterable. Passing an iterable to the iter function gives us an iterator. So how does iter get an iterator from an iterable? The answer lies in a magic method implemented on the iterable: __iter__ or __getitem__.

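Take __iter__ as an example: this method must return an iterator. If only __getitem__ is defined, iter falls back to an iterator that calls it with the indices 0, 1, 2, ... until IndexError is raised.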
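
As a minimal sketch of the two routes, the classes below are each iterable in a different way. Countdown and Squares are hypothetical names invented for this illustration; they do not come from the original article.

```python
class Countdown:
    """Iterable via __iter__: returns a new iterator on every call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Reuse the iterator of a range object instead of writing our own.
        return iter(range(self.start, 0, -1))


class Squares:
    """Iterable via __getitem__ only: iter() indexes from 0 until IndexError."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)
        return index * index


print(list(Countdown(3)))  # [3, 2, 1]
print(list(Squares(4)))    # [0, 1, 4, 9]
```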

Iterator

Once iter has given us an iterator, we can pass it to the next function to take elements out of it one after another. How are the elements extracted? That depends on the __next__ method implemented by the iterator object.

Note that the official Python documentation requires an iterator to be an iterable as well, that is, it must also implement the __iter__ method. This ensures that if we explicitly call iter on an iterable and get an iterator, that iterator can itself be traversed with a for loop or passed to iter again. For example, in this situation:

lst = [1, 2, 3]
ite = iter(lst)
next(ite)
for item in ite:
  # do sth
  pass

If the iterator were not itself iterable, putting it into the for loop would raise a TypeError, because it does not implement __iter__. Fortunately, the __iter__ method needed to make an iterator iterable is usually trivial. In most cases it simply returns the iterator itself:

def __iter__(self):
    return self
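
To see the difference in practice, here is a small sketch with two hypothetical classes, BareIter and FullIter, invented for this illustration. The first implements only __next__; the second adds the one-line __iter__.

```python
class BareIter:
    """Hypothetical iterator with __next__ but no __iter__."""

    def __init__(self, n):
        self.n = n

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


class FullIter(BareIter):
    """The same iterator, made iterable by returning itself."""

    def __iter__(self):
        return self


print(next(BareIter(2)))  # 2: next() only needs __next__

error_raised = False
try:
    for _ in BareIter(2):  # the for loop calls iter() first, which fails
        pass
except TypeError as exc:
    error_raised = True
    print("TypeError:", exc)

print(list(FullIter(2)))  # [2, 1]
```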

At this point we have seen which magic methods iterables and iterators each need to implement, and how the two differ and relate.

Iterator example

Let's take a linked list as an example and implement both the iterable and its iterator:

class NodeIter:
    def __init__(self, node):
        self.curr_node = node

    def __next__(self):
        if self.curr_node is None:
            raise StopIteration
        node, self.curr_node = self.curr_node, self.curr_node.next
        return node

    def __iter__(self):
        return self


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __iter__(self):
        return NodeIter(self)

Here, Node is an iterable: it can be traversed in a for loop, or we can obtain its iterator directly with iter. Its iterator is NodeIter, whose __next__ method hands out the nodes one by one and raises StopIteration once none are left. Note that to make the iterator NodeIter iterable as well, we also give it an __iter__ method that simply returns itself.
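
A short usage sketch follows. It repeats the two classes so that it runs on its own, and the node names a, b, c are made up for the illustration. It shows that the iterator returns itself from iter, that a for loop picks up where a manual next left off, and that the iterable hands out a new iterator every time.

```python
class NodeIter:
    def __init__(self, node):
        self.curr_node = node

    def __next__(self):
        if self.curr_node is None:
            raise StopIteration
        node, self.curr_node = self.curr_node, self.curr_node.next
        return node

    def __iter__(self):
        return self


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __iter__(self):
        return NodeIter(self)


# Build a three-node list: a -> b -> c
a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.next = b, c

it = iter(a)
print(iter(it) is it)        # True: the iterator returns itself
print(next(it).name)         # a
print([n.name for n in it])  # ['b', 'c'], continuing after the manual next()
print([n.name for n in a])   # ['a', 'b', 'c'], a new iterator each time
```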

Generator

Generators may be relatively unfamiliar to many Python beginners. In fact, a generator is just a special kind of iterator.

from collections.abc import Iterator

def gen(num):
    while num > 0:
        yield num
        num -= 1
    return

g = gen(5)

print(isinstance(g, Iterator))
first = next(g)
print(first)

print('in for loop: ')
for i in g:
    print(i)
# Output:
# True
# 5
# in for loop:
# 4
# 3
# 2
# 1

The code above traverses a generator. Here gen is called a generator function and g a generator object. g can be used with next, for loops, and everything else introduced in the iterator section, because a generator is also an iterator.

The following mainly introduces the differences between generators and general iterators.

It is easy to spot the yield keyword in the generator function. Note that we also deliberately wrote a return statement. In an ordinary function, gen would obviously return None. However, when Python compiles a function whose body contains the yield keyword, it marks the function as a generator function. Calling a generator function does not run its body; instead, the call immediately returns a generator object (g in this case).
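
The following sketch makes this observable. The log list is an assumption added purely for the illustration: it records when the function body actually starts running.

```python
import inspect

log = []


def gen():
    log.append("body started")
    yield 1


g = gen()  # calling the generator function does not run its body

print(inspect.isgeneratorfunction(gen))  # True
print(type(g).__name__)                  # generator
print(log)                               # []

print(next(g))  # 1
print(log)      # ['body started']
```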

The body of the generator function only runs when the generator object is passed to next. The function then executes until it reaches a yield statement and hands the value after yield back to the caller of next. The statements after the yield have not been executed at that point, yet the call to next has already returned. It is as if a pause button had been pressed on the generator function: on the following call to next, execution resumes right after the yield statement where it stopped. This is why num in our example is decremented by one on each iteration.
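
The pausing and resuming can be traced with a small sketch. The events list is a hypothetical helper added for the illustration; it records the order in which the caller and the generator body run.

```python
events = []


def gen(num):
    while num > 0:
        events.append(f"before yield {num}")
        yield num
        events.append(f"after yield {num}")
        num -= 1


g = gen(2)

events.append("caller: next #1")
next(g)  # runs the body up to the first yield, then pauses
events.append("caller: next #2")
next(g)  # resumes after the first yield and runs to the second one

for event in events:
    print(event)
# caller: next #1
# before yield 2
# caller: next #2
# after yield 2
# before yield 1
```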

Once num has decreased to 0, the function leaves the while loop and executes return. In a generator function, a return statement is equivalent to raising StopIteration in an iterator. Note that whether the return gives back None or some other value, next never returns that value; next only returns values produced by yield. If you really need the return value of a generator function, catch the StopIteration exception and read it from the exception's value attribute.
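
A minimal sketch of retrieving the return value follows. The string "done" and the variable name result are chosen only for this illustration.

```python
def gen(num):
    while num > 0:
        yield num
        num -= 1
    return "done"  # becomes the value attribute of StopIteration


g = gen(2)
print(next(g))  # 2
print(next(g))  # 1

result = None
try:
    next(g)  # the generator returns, so StopIteration is raised
except StopIteration as exc:
    result = exc.value

print(result)  # done
```

A for loop over the same generator would simply end after 2 and 1, and the return value would be lost.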

From the consumer's point of view, generators are used in almost exactly the same way as ordinary iterators. From the implementation's point of view, an ordinary iterator stores its iteration state in instance attributes, whereas a generator's state lives in the function's stack frame and is preserved by suspending the function. Generators are usually more concise to implement than ordinary iterators. Compare the generator example below with the iterator example above.
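
The state kept in the stack frame can be inspected directly. The sketch below relies on the gi_frame attribute of generator objects, which is a CPython implementation detail, so treat it as an illustration rather than something to depend on in production code.

```python
def gen(num):
    while num > 0:
        yield num
        num -= 1


g = gen(3)

next(g)  # paused at the first yield
print(g.gi_frame.f_locals["num"])  # 3

next(g)  # resumed, decremented num, paused at the second yield
print(g.gi_frame.f_locals["num"])  # 2

list(g)  # exhaust the generator
print(g.gi_frame)  # None: the frame is released once the generator finishes
```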

We said that generators are used in almost exactly the same way as ordinary iterators. So what is different? Here we introduce an advanced feature of generators: send. Like next, send resumes the generator function and returns the next yielded value; in addition, the argument passed to send becomes the value of the yield expression (the yield xxx) inside the generator function, where it can be received and processed. This lets us change the generator's internal state by sending values in while iterating, and so interact with the generator.

def gen(num):
    while num > 0:
        tmp = yield num
        if tmp is not None:
            num = tmp
        num -= 1

g = gen(5)

first = next(g)  # equivalent to: first = g.send(None)
print(f"first: {first}")

print(f"send: {g.send(10)}")

for i in g:
    print(i)
# Output:
# first: 5
# send: 9
# 8
# 7
# 6
# 5
# 4
# 3
# 2
# 1

Calling next(g) is equivalent to g.send(None). And if the generator function does not assign the value of the yield expression to a variable and act on it, whatever is passed to send is simply discarded. In that case g.send(xxx) behaves like next(g) no matter what xxx is. The one exception is a generator that has not started yet: its first resumption must be next(g) or g.send(None), since no yield is waiting to receive a value.
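
Both points can be checked with a short sketch. It uses the original gen without the tmp variable; the names values and error_raised are assumptions added for the illustration.

```python
def gen(num):
    while num > 0:
        yield num  # the value sent in is not captured, so it is discarded
        num -= 1


g = gen(3)
values = [
    g.send(None),       # same as next(g) on a generator that has not started
    g.send("ignored"),  # the argument is thrown away
    next(g),
]
print(values)  # [3, 2, 1]

# Sending a non-None value before the first yield is an error.
fresh = gen(3)
error_raised = False
try:
    fresh.send(10)
except TypeError as exc:
    error_raised = True
    print("TypeError:", exc)
```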

Generator example

Let's take the linked list as an example again. Previously, we traversed Node through a separate NodeIter class: Node is the iterable, and it returns a NodeIter iterator when it is passed to iter.

Here, we implement the __iter__ method of Node directly as a generator function. When iter calls it, a generator object is returned. Since a generator object is a special kind of iterator, it can of course be traversed normally. In this way the generator gives us a more concise traversal of the linked list Node. Moreover, the change is completely transparent to the user: calling and traversing work exactly as they did with the NodeIter implementation.

class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __iter__(self):
        node = self
        while node is not None:
            yield node
            node = node.next

node1 = Node('node1')
node2 = Node('node2')
node3 = Node('node3')

node1.next = node2
node2.next = node3

for node in node1:
    print(node.name)

Origin blog.csdn.net/weixin_44966641/article/details/131501576