Understanding Python's yield

Introduction to Generators

With a list comprehension, we can create a list directly. However, memory is limited, so list capacity is necessarily limited. Moreover, creating a list with 1 million elements not only takes up a lot of storage space: if we only need to access the first few elements, the space occupied by all the later elements is wasted.

So, if the elements of a list can be computed by some algorithm, can we compute the later elements as we loop? Then we would not have to create the complete list, saving a lot of space. In Python, this mechanism of computing while looping is called a generator.
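As a minimal sketch of the memory difference (exact byte counts vary by Python build), you can compare a full list with a generator using the standard sys.getsizeof:

import sys

squares_list = [x * x for x in range(1000000)]  # all one million elements stored
squares_gen = (x * x for x in range(1000000))   # nothing computed yet

print(sys.getsizeof(squares_list))  # several megabytes, just for the list structure
print(sys.getsizeof(squares_gen))   # on the order of 100 bytes, regardless of range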

Basic application of generators

There are many ways to create a generator. The first way is very simple: just change a list comprehension's [] to () and you get a generator:

>>> L = [x * x for x in range(10)]
>>> L
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g = (x * x for x in range(10))
>>> g
<generator object <genexpr> at 0x1022ef630>

The only difference between creating L and g is the outermost [] versus (): L is a list, while g is a generator.

We can directly print out each element of the list, but how do we print out each element of the generator?

If you want to print them out one by one, you can get the generator's next return value with the next() function:

>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> next(g)
25
>>> next(g)
36
>>> next(g)
49
>>> next(g)
64
>>> next(g)
81
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As we said, a generator stores the algorithm. Each time next(g) is called, the value of g's next element is computed, until the last element has been computed; once there are no more elements, a StopIteration error is raised.

Of course, calling next(g) over and over like this is impractical. The correct way is to use a for loop, because a generator is also an iterable object:

>>> g = (x * x for x in range(10))
>>> for n in g:
...     print(n)
...
0
1
4
9
16
25
36
49
64
81

So, after we create a generator, we almost never call next() on it; instead, we iterate over it with a for loop, with no need to worry about StopIteration errors.
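Note that a generator can only be traversed once. Continuing the session above (a small demo, not part of the original text), g is now exhausted:

>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> list(g)
[]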

Generators are very powerful. If the algorithm is complex and cannot be expressed with the for-loop style of a list comprehension, it can also be implemented with a function.

For example, in the famous Fibonacci sequence, every number except the first and second can be obtained by adding the two numbers before it:

1, 1, 2, 3, 5, 8, 13, 21, 34, …

The Fibonacci sequence cannot be written as a list comprehension, but it is easy to print it with a function:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        print(b)
        a, b = b, a + b
        n = n + 1
    return 'done'

The above function can output the first N numbers of the Fibonacci sequence:

>>> fib(6)
1
1
2
3
5
8
'done'

If you look closely, you can see that the fib function actually defines the computation rule of the Fibonacci sequence: it can start from the first element and derive every subsequent element. This logic is very similar to a generator.

In other words, the function above is only one step away from a generator. To turn fib into a generator function, just change print(b) to yield b:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'                

yield can be understood as a kind of return: it hands the value of the expression after yield back to the caller, while also preserving the function's local state so execution can resume there later. If you write

m = yield b

then b is yielded to the caller, and when execution resumes, m receives whatever value the caller passed in via send(); if the caller used plain next() instead (i.e. there is no corresponding send operation), m is None.
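A small demo of that rule (a hypothetical echo generator, not from the original text), showing the difference between next() and send():

def echo():
    m = None
    while True:
        m = yield m        # yield the current m, then pause until resumed

g = echo()
print(next(g))     # None: runs to the first yield; nothing was sent yet
print(g.send(10))  # 10: send(10) is assigned to m, which is yielded back
print(next(g))     # None: plain next() passes None, so m becomes None again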

This is another way to define a generator: if a function definition contains the yield keyword, the function is no longer a normal function but a generator function:

>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>

Here, the hardest part to understand is that the execution flow of a generator is different from that of a function. A function executes sequentially and returns when it reaches a return statement or its last line. A function that has become a generator, by contrast, executes only when next() is called, pauses and returns at each yield statement, and when called again continues from the yield statement it last returned from.

As a simple example, define a generator that returns the numbers 1, 3, and 5 in turn:

def odd():
    print('step 1')
    yield 1
    print('step 2')
    yield 3
    print('step 3')
    yield 5

To use the generator, first create a generator object, then use the next() function to repeatedly get its next return value:

>>> o = odd()
>>> next(o)
step 1
1
>>> next(o)
step 2
3
>>> next(o)
step 3
5
>>> next(o)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, odd is not an ordinary function but a generator function. During execution, it pauses whenever it encounters a yield, and the next call continues from there. After yield has executed three times, there is no more yield to execute, so the fourth call to next(o) raises an error.

Back to the fib example: as the loop keeps hitting yield, it keeps pausing. Of course, the loop must be given an exit condition; otherwise it will generate an infinite sequence of numbers.
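That said, an unbounded generator can be useful if the caller limits consumption instead. A sketch using itertools.islice (the name fib_all is an assumption, not from the original text):

from itertools import islice

def fib_all():
    a, b = 0, 1
    while True:        # no exit condition: the caller decides when to stop
        yield b
        a, b = b, a + b

print(list(islice(fib_all(), 6)))  # [1, 1, 2, 3, 5, 8]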

Similarly, after changing the function into a generator, we basically never use next() to get the next return value; instead, we iterate over it directly with a for loop:

>>> for n in fib(6):
...     print(n)
...
1
1
2
3
5
8

But when the generator is called with a for loop, you will find that the value of the generator's return statement cannot be obtained. If you want that return value, you must catch the StopIteration error; the return value is contained in its value attribute:

>>> g = fib(6)
>>> while True:
...     try:
...         x = next(g)
...         print('g:', x)
...     except StopIteration as e:
...         print('Generator return value:', e.value)
...         break
...
g: 1
g: 1
g: 2
g: 3
g: 5
g: 8
Generator return value: done

Summary: From the above you can see the basic usage of yield. Adding yield to a function turns it into a generator function, which can be iterated with a for loop like any other iterator. Generators can, to a certain extent, avoid the excessive memory usage that comes with processing big data: with yield, items are produced and processed one at a time, so not much memory is held at once.
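As a minimal sketch of that memory benefit (the file name data.log and the filtering rule are assumptions for illustration), a generator can process a large file line by line instead of loading it whole:

def error_lines(path):
    # Read one line at a time; the whole file is never held in memory.
    with open(path) as f:
        for line in f:
            if 'ERROR' in line:
                yield line.rstrip()

for line in error_lines('data.log'):
    print(line)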

Coroutine application of generators

Coroutines, also known as micro-threads or fibers (English: coroutine), have existed as a concept for a long time, but only in the last few years have they become widely used in some languages (such as Lua).

Subroutines, i.e. functions, are called hierarchically in all languages. For example, A calls B, B calls C during its execution, C returns when it finishes, then B returns, and finally A finishes. Subroutine calls are therefore implemented via a stack, and one thread executes one subroutine at a time. A subroutine call always has one entry and one return, and the calling order is unambiguous.

Coroutine calls are different. A coroutine looks like a subroutine, but during execution it can be interrupted inside the subroutine, switch to executing other code, and then return to continue execution at an appropriate time. Note that this interruption to execute something else is not a function call; it is somewhat similar to a CPU interrupt.

Let's look at a common producer-consumer model:

def consumer():
    r = ''
    while True:
        n = yield r
        if not n:
            return
        print('[CONSUMER] Consuming %s...' % n)
        r = '200 OK'

def produce(c):
    c.send(None)
    n = 0
    while n < 5:
        n = n + 1
        print('[PRODUCER] Producing %s...' % n)
        r = c.send(n)
        print('[PRODUCER] Consumer return: %s' % r)
    c.close()

c = consumer()
produce(c)
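Running this should print the producer and consumer messages interleaved:

[PRODUCER] Producing 1...
[CONSUMER] Consuming 1...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 2...
[CONSUMER] Consuming 2...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 3...
[CONSUMER] Consuming 3...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 4...
[CONSUMER] Consuming 4...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 5...
[CONSUMER] Consuming 5...
[PRODUCER] Consumer return: 200 OK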

The above is the common producer-consumer model. In the past, implementing this in C under Linux meant dealing with multi-threading and lock mechanisms to ensure correct operation, and the slightest mistake in applying those locks could lead to deadlock. Using coroutines avoids such problems entirely.

Note (1): parsing c.send(None)

After c = consumer() runs, all we have is an initialized generator object; the body of the consumer function has not started running yet. That is, none of the numbered lines below have executed:

def consumer():
    r = ''                                       #1
    while True:                                  #2
        n = yield r                              #3
        if not n:                                #4
            return                               #5
        print('[CONSUMER] Consuming %s...' % n)  #6
        r = '200 OK'                             #7

This is why the code above calls c.send(None): it is equivalent to next(c) (c.next() in Python 2) and starts the generator, running it up to the first yield. You may wonder why no other value can be sent at this point. The generator has not started running, so it is not paused at any yield and would not know what to do with the value; Python raises an error, which is why you must send None first.
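A quick demo of that rule (an interactive session, not from the original text):

>>> c = consumer()
>>> c.send(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't send non-None value to a just-started generator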

Note (2):

for i in consumer():
    print(i)

Note that this is not the same as the situation described above: there is no explicit send(None) or next() call here. But the for loop calls the next() function automatically, so we don't have to worry about priming the generator ourselves.
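Roughly, the for loop above expands to something like this sketch:

it = iter(consumer())   # the for loop obtains an iterator...
while True:
    try:
        i = next(it)    # ...and calls next() on it automatically each pass
    except StopIteration:
        break
    print(i)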

Reference: https://www.deeplearn.me/231.html
