2020-12-11 Analysis of Python yield usage

Analysis of Python yield

You may have heard that in Python, a function containing yield is called a generator. So what exactly is a generator?

Let's set generators aside for the moment and illustrate the concept behind yield with a common programming exercise.


How to generate the Fibonacci sequence

The Fibonacci sequence is a very simple recursive sequence: apart from the first two numbers, every number is the sum of the two numbers before it. Writing a program that outputs the first N Fibonacci numbers is a classic beginner's problem, and many newcomers can easily write a function like the following:

Listing 1. Simply output the first N numbers of the Fibonacci sequence

#!/usr/bin/python
# -*- coding: UTF-8 -*-
 
def fab(max): 
    n, a, b = 0, 0, 1 
    while n < max: 
        print b 
        a, b = b, a + b 
        n = n + 1
fab(5)

Executing the code above produces the following output:

1 
1 
2 
3 
5

The result is correct, but an experienced developer will point out that printing the numbers directly inside fab hurts the function's reusability: fab returns None, so no other function can obtain the sequence it produces.

To improve the reusability of fab, it is better not to print the sequence directly but to return a list. Here is the second version of the rewritten fab function:

Listing 2. Output the first N numbers of the Fibonacci sequence, second version

#!/usr/bin/python
# -*- coding: UTF-8 -*-
 
def fab(max): 
    n, a, b = 0, 0, 1 
    L = [] 
    while n < max: 
        L.append(b) 
        a, b = b, a + b 
        n = n + 1 
    return L
 
for n in fab(5): 
    print n

Iterating over the list returned by fab prints the following:

1 
1 
2 
3 
5

The rewritten fab meets the reusability requirement by returning a list, but a more experienced developer will point out that the memory the function uses grows as the parameter max grows. To keep memory usage under control, it is better not to store intermediate results in a list, but to iterate over an iterable object instead. For example, in Python 2.x, this code:

Listing 3. Iterating through iterable objects

for i in range(1000): pass

builds a full list of 1000 elements, while this code:

for i in xrange(1000): pass

does not build a 1000-element list; it produces the next value on each iteration, so its memory footprint stays tiny. That is because xrange does not return a list but an iterable object.
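
The same idea survives in Python 3, where range itself became a lazy sequence and Python 2's xrange was removed. A small sketch (Python 3 syntax, not from the original article) comparing the memory footprint of a materialized list with the lazy object:

```python
import sys

# A materialized list holds all 1000 elements at once.
eager = list(range(1000))

# In Python 3, range() is already lazy, like Python 2's xrange():
# it computes each value on demand instead of storing them all.
lazy = range(1000)

# The lazy object is far smaller than the full list.
print(sys.getsizeof(eager) > sys.getsizeof(lazy))  # True
```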

Using the iterable protocol, we can rewrite fab as a class that supports iteration. Here is the third version:

Listing 4. The third version

#!/usr/bin/python
# -*- coding: UTF-8 -*-
 
class Fab(object): 
 
    def __init__(self, max): 
        self.max = max 
        self.n, self.a, self.b = 0, 0, 1 
 
    def __iter__(self): 
        return self 
 
    def next(self): 
        if self.n < self.max: 
            r = self.b 
            self.a, self.b = self.b, self.a + self.b 
            self.n = self.n + 1 
            return r 
        raise StopIteration()
 
for n in Fab(5): 
    print n

The Fab class returns the next number in the sequence on each call to next(), and its memory usage stays constant:

1 
1 
2 
3 
5
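
For readers on Python 3, the iterator protocol renamed next() to __next__(); aside from that, the class is unchanged. A sketch (not part of the original article):

```python
class Fab(object):
    """Iterator over the first `max` Fibonacci numbers (Python 3 version)."""

    def __init__(self, max):
        self.max = max
        self.n, self.a, self.b = 0, 0, 1

    def __iter__(self):
        return self

    def __next__(self):  # Python 3 spells this __next__, not next
        if self.n < self.max:
            r = self.b
            self.a, self.b = self.b, self.a + self.b
            self.n = self.n + 1
            return r
        raise StopIteration()

print(list(Fab(5)))  # [1, 1, 2, 3, 5]
```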

However, this class-based version is far less concise than the original fab function. If we want to keep the conciseness of the first version while gaining the iteration behaviour, yield comes in handy:

Listing 5. The fourth version, using yield

#!/usr/bin/python
# -*- coding: UTF-8 -*-
 
def fab(max): 
    n, a, b = 0, 0, 1 
    while n < max: 
        yield b      # use yield
        # print b 
        a, b = b, a + b 
        n = n + 1
 
for n in fab(5): 
    print n

Compared with the first version, the fourth version of fab only changes print b to yield b, yet it achieves the same iteration behaviour while staying just as simple.

Calling the fourth version of fab produces exactly the same output as the second version:

1 
1 
2 
3 
5

Simply put, the effect of yield is to turn a function into a generator. A function containing yield is no longer an ordinary function; the Python interpreter treats it as a generator function. Calling fab(5) does not execute the function body. Instead, it returns an iterable object. When the for loop runs, each iteration executes the code inside fab: when execution reaches yield b, fab returns one iteration value; on the next iteration, execution resumes at the statement after yield b, with all local variables exactly as they were before the interruption, and the function runs on until it hits yield again.
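
This laziness is easy to observe. In the sketch below (Python 3 syntax; the events list is our own instrumentation, not part of the original fab), a side effect placed before the first yield does not run when the generator is created, only when the first value is requested:

```python
events = []

def fab(max):
    events.append('body started')  # runs on the first next(), not at call time
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1

g = fab(5)        # creates the generator; the body has NOT run yet
print(events)     # []
print(next(g))    # 1 -- the body runs only up to the first yield
print(events)     # ['body started']
```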

You can also call the next() method of fab(5) manually (fab(5) is a generator object, which has a next() method), which makes fab's execution flow easier to see:

Listing 6. Execution process

>>> f = fab(5) 
>>> f.next() 
1 
>>> f.next() 
1 
>>> f.next() 
2 
>>> f.next() 
3 
>>> f.next() 
5 
>>> f.next() 
Traceback (most recent call last): 
 File "<stdin>", line 1, in <module> 
StopIteration

When the function body finishes executing, the generator automatically raises StopIteration, signalling that the iteration is complete. Inside a for loop there is no need to handle StopIteration; the loop simply ends.
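
In other words, a for loop over a generator is roughly equivalent to calling next() by hand and stopping on StopIteration. A sketch of that equivalence (Python 3, where the built-in next(g) replaces g.next()):

```python
def fab(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1

# What the for loop does under the hood:
g = fab(5)
result = []
while True:
    try:
        result.append(next(g))
    except StopIteration:   # raised when the function body finishes
        break
print(result)  # [1, 1, 2, 3, 5]
```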

We can draw the following conclusions:

A function containing yield is a generator function. It differs from an ordinary function: calling it looks like a function call, but none of its body executes until next() is called (in a for loop, next() is called automatically). Execution still follows the function's normal flow, but it is suspended at every yield statement, returning one iteration value; the next call resumes from the statement after the yield. It is as if the function is interrupted by yield several times during normal execution, and each interruption hands back the current value through yield.

The benefit of yield is obvious: rewriting a function as a generator gives it the ability to iterate. Compared with using a class instance to store state and compute the next value in next(), the code is not only more concise, the execution flow is also much clearer.

How can you tell whether a function is an ordinary function or a generator function? Use isgeneratorfunction from the inspect module:

Listing 7. Using isgeneratorfunction to determine

>>> from inspect import isgeneratorfunction 
>>> isgeneratorfunction(fab) 
True

Note the distinction between fab and fab(5): fab is a generator function, while fab(5) is a generator returned by calling fab, much like the difference between a class definition and a class instance:

Listing 8. Class definition and class instance

>>> import types 
>>> isinstance(fab, types.GeneratorType) 
False 
>>> isinstance(fab(5), types.GeneratorType) 
True

fab is not iterable, but fab(5) is iterable:

>>> from collections import Iterable 
>>> isinstance(fab, Iterable) 
False 
>>> isinstance(fab(5), Iterable) 
True

Each call to fab produces a new, independent generator instance; the instances do not affect each other:

>>> f1 = fab(3) 
>>> f2 = fab(5) 
>>> print 'f1:', f1.next() 
f1: 1 
>>> print 'f2:', f2.next() 
f2: 1 
>>> print 'f1:', f1.next() 
f1: 1 
>>> print 'f2:', f2.next() 
f2: 1 
>>> print 'f1:', f1.next() 
f1: 2 
>>> print 'f2:', f2.next() 
f2: 2 
>>> print 'f2:', f2.next() 
f2: 3 
>>> print 'f2:', f2.next() 
f2: 5

The role of return

In a generator function, if there is no return statement, the generator runs by default until the function body completes. If a return is executed partway through, it raises StopIteration immediately and terminates the iteration. (In Python 2, a return inside a generator cannot carry a value.)
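
A small sketch of this behaviour (Python 3; early_fab is our own illustrative name, not from the article). The return statement ends the iteration immediately, even though the while loop could continue:

```python
def early_fab(max):
    # Like fab, but bails out as soon as a value exceeds 10.
    n, a, b = 0, 0, 1
    while n < max:
        if b > 10:
            return          # raises StopIteration inside the generator
        yield b
        a, b = b, a + b
        n = n + 1

print(list(early_fab(100)))  # [1, 1, 2, 3, 5, 8] -- stops at 13, not after 100 values
```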


Another example

Another use of yield arises in file reading. Calling the read() method directly on a file object can lead to unpredictable memory usage. A better approach is to read the file repeatedly into a fixed-size buffer. With yield, we no longer need to write an iterator class for reading the file; it becomes straightforward:

Listing 9. Another yield example

def read_file(fpath): 
    BLOCK_SIZE = 1024 
    with open(fpath, 'rb') as f: 
        while True: 
            block = f.read(BLOCK_SIZE) 
            if block: 
                yield block 
            else: 
                return
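
As a usage sketch (Python 3; the temporary file and the 3000-byte payload are our own test fixture, not from the article), read_file works unchanged and never holds more than one block in memory at a time:

```python
import os
import tempfile

def read_file(fpath):
    BLOCK_SIZE = 1024
    with open(fpath, 'rb') as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if block:
                yield block
            else:
                return

# Write a 3000-byte temporary file, then read it back block by block.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(b'x' * 3000)
    blocks = list(read_file(path))
    print([len(b) for b in blocks])  # [1024, 1024, 952]
finally:
    os.remove(path)
```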


Origin blog.csdn.net/qingfengxd1/article/details/111032043