An in-depth look at Python generators and iterators

Generators

Sometimes we build a list with a list comprehension. The code below produces a list of the odd numbers below 10:

print([i for i in range(10) if i % 2 == 1])

Without the square brackets, i for i in range(10) is a generator expression. Printing it shows a generator object rather than a list:

print(i for i in range(10))
# Result
<generator object <genexpr> at 0x000002967560D6D0>

A list comprehension creates the whole list at once. Memory is finite, so the capacity of a list is limited too. A list of 1 million elements takes up a lot of storage, and if we only need the first few elements, the space occupied by all the rest is wasted.

If the elements can be computed by some algorithm, can we compute each subsequent element as the loop goes along? That removes the need to build the complete list and saves a great deal of space. In Python, this compute-as-you-loop mechanism is called a generator.
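The space saving described above can be checked with a small sketch. The names squares_list and squares_gen are made up for this illustration, and sys.getsizeof measures only the container object itself, so the numbers are a rough indication rather than a full memory profile.

```python
import sys
from itertools import islice

# A list comprehension builds and stores every element up front.
squares_list = [x * x for x in range(1_000_000)]

# A generator expression stores only the recipe for producing elements.
squares_gen = (x * x for x in range(1_000_000))

# getsizeof reports the container only, not the integers it refers to.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# Only the first five values are ever computed here.
first_five = list(islice(squares_gen, 5))
print(first_five)  # [0, 1, 4, 9, 16]
```

islice stops pulling from the generator after five values, so the remaining 999,995 squares are never calculated.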

A generator is a special kind of routine that can be used to control the iteration behaviour of a loop. In Python, a generator is a kind of iterator: it produces its values with the yield keyword, pauses at each yield, and can be resumed with the next() and send() functions.

A generator resembles a function that returns an array. It can take arguments and can be called. Unlike an ordinary function, which returns an array holding all the values at once, a generator produces only one value at a time, so the memory it consumes is greatly reduced, and the caller can start processing the first few values right away. A generator therefore looks like a function but behaves like an iterator.
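As a rough illustration of that contrast, the sketch below puts an ordinary function next to a generator function that computes the same values. Both function names are invented for this example.

```python
def squares_as_list(n):
    """Ordinary function: builds the whole list, then returns it."""
    result = []
    for x in range(n):
        result.append(x * x)
    return result


def squares_as_generator(n):
    """Generator function: hands back one value per request."""
    for x in range(n):
        yield x * x


all_at_once = squares_as_list(5)
one_at_a_time = squares_as_generator(5)

print(all_at_once)          # [0, 1, 4, 9, 16]
print(next(one_at_a_time))  # 0
print(next(one_at_a_time))  # 1
print(list(one_at_a_time))  # [4, 9, 16]
```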

Creating a generator

There are several ways to create a generator. The simplest is to change the square brackets [] of a list comprehension to parentheses ():

alist = [x for x in range(10)]
print(alist)
# generator
generator_ex = (x for x in range(10))
print(generator_ex)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
<generator object <genexpr> at 0x0000018B2DBBD660>

What is the difference between alist and generator_ex? On the surface it is only [] versus (), but the results differ: the first print shows a list (because it is a list comprehension), while the second shows <generator object <genexpr> at 0x0000018B2DBBD660>. So how do we print each element of generator_ex?

To get the values one by one, call the built-in next() function or the generator's __next__() method:

generator_ex = (x for x in range(3))
print(next(generator_ex))
print(next(generator_ex))
print(generator_ex.__next__())
# Result
0
1
2

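# Calling next four times on a generator that holds three values
generator_ex = (x for x in range(3))
print(next(generator_ex))
print(next(generator_ex))
print(generator_ex.__next__())
print(generator_ex.__next__())
# Result: prints 0, 1, 2, then the fourth call raises
StopIteration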

As we can see, a generator stores an algorithm. Each call to next(generator_ex) computes the next element until the last one has been produced; when no elements remain, StopIteration is raised. Calling next() over and over like this is a bad habit. The right approach is a for loop, because a generator is also an iterable object:

generator_ex = (x for x in range(3))
for i in generator_ex:
    print(i)
# Result
0
1
2

So once we have created a generator, we almost never call next() ourselves. We iterate over it with a for loop and do not need to worry about StopIteration. Generators are very powerful: if the algorithm is too complex to express as a comprehension-style for loop, it can be written as a function instead.
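For the occasional case where next() is called by hand, the sketch below shows two ways of dealing with the end of the generator: catching StopIteration, or giving next() a default value as its second argument. The variable names are only for illustration.

```python
gen = (x for x in range(3))

values = []
while True:
    try:
        values.append(next(gen))
    except StopIteration:
        # Raised once the generator has no more values to produce.
        break
print(values)  # [0, 1, 2]

# next() also accepts a default that is returned instead of raising.
leftover = next(gen, None)
print(leftover)  # None
```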

Generator function

A generator function is defined with def and uses the yield keyword to return one result at a time, suspending and then resuming where it left off. The Fibonacci sequence illustrates what a generator function does:

# Fibonacci sequence
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        a, b = b, a + b
        n = n + 1
        print(a)
    return 'done'

print(fib(10))  # prints the first 10 numbers, then 'done'

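# The version above prints every value inside the function, so the caller never receives the numbers.
# To hand them out one at a time without storing them all, use a generator, written with yield.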

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'

a = fib(10)
print(a)  # a generator object; no Fibonacci number has been computed yet

A note on how execution differs between a generator and an ordinary function. A function runs straight through and returns when it reaches a return statement or its last line. A generator function runs each time next() is called and returns when it reaches a yield statement. On the following next() call it resumes from the yield where it last stopped. Values are produced only as they are requested, so the full sequence never has to sit in memory.
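The pause-and-resume behaviour can be made visible by recording what the generator body does. The sketch below reuses the fib generator from this section and adds an events list, which is an addition made purely for this illustration. It also shows that the return value 'done' travels on the StopIteration exception.

```python
events = []  # records what the generator body does, in order

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        events.append(f"step {n}: yield {b}")
        yield b
        events.append(f"step {n}: resumed")
        a, b = b, a + b
        n = n + 1
    return 'done'

gen = fib(2)
print(events)          # [] because calling fib() runs none of the body

first = next(gen)      # runs up to the first yield
print(first, events)   # 1 ['step 0: yield 1']

second = next(gen)     # resumes after the yield, loops, pauses again
print(second)          # 1

try:
    next(gen)          # the loop ends and the function returns
except StopIteration as stop:
    result = stop.value
print(result)          # done
```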

Using yield to simulate concurrency in a single thread

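import time

def consumer(name):
    print("%s is ready to study!" % name)
    while True:
        lesson = yield  # pause until send() delivers a value

        print("Lesson [%s] has started, teacher [%s] is here!" % (lesson, name))


def producer(name):
    c = consumer('A')
    c2 = consumer('B')
    c.__next__()  # run each consumer up to its first yield
    c2.__next__()
    print("Class is starting, everyone!")
    for i in range(10):
        time.sleep(1)
        print("Two students have arrived!")
        c.send(i)  # send resumes the generator and passes a value into it
        c2.send(i)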
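To try the same pattern without waiting on time.sleep, here is a trimmed-down sketch. It keeps the consumer and producer names from the code above, but it collects messages in a log list instead of printing them and it takes the number of lessons as a parameter; both changes are assumptions made for this illustration.

```python
log = []

def consumer(name):
    log.append(f"{name} ready")
    while True:
        lesson = yield  # pause here until send() delivers a value
        log.append(f"{name} got lesson {lesson}")

def producer(lessons):
    c1 = consumer('A')
    c2 = consumer('B')
    next(c1)  # run each consumer up to its first yield
    next(c2)
    for i in range(lessons):
        c1.send(i)
        c2.send(i)
    c1.close()  # shut the consumers down cleanly
    c2.close()

producer(2)
print(log)
# ['A ready', 'B ready', 'A got lesson 0', 'B got lesson 0', 'A got lesson 1', 'B got lesson 1']
```

The two consumers take turns inside one thread: each runs only while the producer is inside a send() call and is suspended the rest of the time.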

Why is it called a generator function? Because it generates a sequence of values over time. An ordinary function returns a value when it finishes and then exits, but a generator function suspends itself automatically and later picks up where it left off. The yield keyword pauses the function and returns a value to the caller while retaining enough state for the function to continue. Generators are closely tied to the iterator protocol: an iterator has a __next__() method that either returns the next item of the iteration or raises an exception (StopIteration) to end it.

Summary of yield:
(1) In an ordinary for ... in ... loop, what follows in is an iterable such as a list, a string, or a file. It could be a = [1,2,3] or a = [x * x for x in range(3)].

The drawback is obvious: all the data sits in memory, so a very large data set consumes a great deal of memory.

(2) A generator is iterable, but it can be read only once, because its values are produced only when they are used. For example, a = (x * x for x in range(3)). Note the parentheses rather than square brackets.

(3) What makes a generator iterable is its __next__() method. Iteration works by calling next() repeatedly until an exception (StopIteration) is caught.

(4) A function that contains yield is no longer an ordinary function but a generator function. Calling it returns a generator, which can be iterated over.

(5) yield is a keyword similar to return. When iteration first reaches a yield, the value after (to the right of) yield is returned. On the next iteration, execution resumes from the code after the yield reached last time.

(6) In other words, yield returns a value and remembers the position it returned from. The next iteration starts from that position.

(7) A function containing yield is not limited to for loops. Its result can also be passed as an argument to another function, as long as that function accepts an iterable.

(8) send() differs from next() in that send can pass an argument to the yield expression. The argument becomes the value of the yield expression inside the generator, while the value after yield is what gets returned to the caller. In other words, send can forcibly set the value of the previous yield expression.

(9) send() and next() both have return values: the value of the expression after the yield reached in the current step, that is, the operand of that yield.

(10) The first call must be next() (or send(None)) before send() can pass a value; otherwise a TypeError is raised. The first send must be None because at that point there is no previous yield expression to receive the value. next() can therefore be regarded as equivalent to send(None).
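Points (8) to (10) can be seen together in a small sketch. The running_total generator is a made-up example, not part of the original article: each value passed to send() becomes the result of the yield expression, and the value after yield is what send() returns.

```python
def running_total():
    total = 0
    while True:
        # The value passed to send() becomes the result of this yield;
        # the current total is what send() returns to the caller.
        amount = yield total
        total += amount

# Sending a non-None value before the first yield is reached fails.
fresh = running_total()
try:
    fresh.send(10)
    error = None
except TypeError as exc:
    error = exc

acc = running_total()
first = next(acc)       # same as acc.send(None); runs to the first yield
second = acc.send(10)
third = acc.send(5)
print(first, second, third)  # 0 10 15
```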

Generator expressions

Generator expressions combine iteration with list comprehension syntax. A generator expression looks like a list comprehension but uses () instead of [].

list_comprehension = [x**2 for x in range(10)]
list_generator = (x**2 for x in range(10))
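A short sketch of the practical difference between the two objects, reusing the names above. It shows that a list can be traversed repeatedly while a generator is used up by the first pass, and that a generator expression can be handed directly to a function such as sum().

```python
list_comprehension = [x**2 for x in range(10)]
list_generator = (x**2 for x in range(10))

# A list can be traversed as many times as needed.
print(sum(list_comprehension), sum(list_comprehension))  # 285 285

# A generator is consumed by the first pass; the second pass finds nothing.
first_pass = sum(list_generator)
second_pass = sum(list_generator)
print(first_pass, second_pass)  # 285 0

# As the only argument of a call, the expression needs no extra parentheses.
total = sum(x**2 for x in range(10))
print(total)  # 285
```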

As we have seen, a for loop can work directly on the following kinds of data:

  • Collection data types such as list, tuple, dict, set, and str
  • Generators, including generator expressions and generator functions that use yield

Objects that a for loop can work on directly are called iterables: Iterable. Use isinstance() to check whether an object is Iterable:

from collections.abc import Iterable  # importing from plain collections fails on Python 3.10+

list_comprehension = [x**2 for x in range(10)]
list_generator = (x**2 for x in range(10))
print(isinstance(list_comprehension, Iterable))
print(isinstance(list_generator, Iterable))

True
True

A generator can be used in a for loop, and it can also be passed to next() repeatedly to get the next value, until StopIteration is raised to signal that no more values are available.

try:
    for i in list_generator:
        print(i)
except StopIteration:
    # Never reached: the for loop handles StopIteration itself
    pass

0
1
4
9
16
25
36
49
64
81

Iterators

What is an iterator

An object that implements the __iter__ method is iterable. An object that also implements the __next__ method is an iterator. An object that can be passed to next() repeatedly to return the next value is called an iterator: Iterator. In short, an object that implements both __iter__ and __next__ is an iterator.

Use isinstance() to check whether an object is an Iterator.

from collections.abc import Iterable, Iterator

list_comprehension = [x**2 for x in range(10)]
list_generator = (x**2 for x in range(10))
print(isinstance(list_comprehension, Iterable))  # True
print(isinstance(list_comprehension, Iterator))  # False
print(isinstance(list_generator, Iterable))      # True
print(isinstance(list_generator, Iterator))      # True

Generators are all Iterator objects. list, dict, and str are Iterable, but they are not Iterators.
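To make the definition concrete, here is a minimal hand-written iterator. The Countdown class is a hypothetical example; the only thing that matters is that it implements both __iter__ and __next__.

```python
from collections.abc import Iterable, Iterator

class Countdown:
    """Hypothetical example class: counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

countdown = Countdown(3)
print(isinstance(countdown, Iterable))  # True
print(isinstance(countdown, Iterator))  # True
print(list(countdown))                  # [3, 2, 1]
```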

Why list, dict, str and other data types are not Iterators

This is because a Python Iterator object represents a stream of data. An Iterator can be passed to next() again and again to return the next item, until no data is left and StopIteration is raised. The stream can be thought of as an ordered sequence whose length we cannot know in advance; the only way to get the next item is to ask for it with next(). The computation of an Iterator is therefore lazy: a value is computed only when it has to be returned.

An Iterator can even represent an infinite stream of data, such as all the natural numbers, which a list could never store. A list, dict, or str has its memory allocated when it is defined, whereas an Iterator produces each value only when next() is called.
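An infinite stream of natural numbers can be sketched with a generator function. The name natural_numbers is made up for this example, and itertools.islice is used to take a finite slice of the endless stream.

```python
from itertools import islice

def natural_numbers():
    """Yield 0, 1, 2, ... without ever finishing."""
    n = 0
    while True:
        yield n
        n += 1

numbers = natural_numbers()
first_five = list(islice(numbers, 5))
print(first_five)     # [0, 1, 2, 3, 4]
print(next(numbers))  # 5
```

Calling list(numbers) directly would never return, because the stream has no end.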

Note: a file object is both an iterable and an iterator.

from collections.abc import Iterable, Iterator

f = open('housing.csv')  # any existing text file will do
print(isinstance(f, Iterator)) # True
print(isinstance(f, Iterable)) # True
f.close()

In Python 3, a for loop is essentially implemented by calling next() repeatedly. For example:

for x in [1, 2, 3, 4, 5]:
    pass

################ is actually fully equivalent to ##################

# First obtain an Iterator object:
it = iter([1, 2, 3, 4, 5])
# Loop:
while True:
    try:
        # Get the next value:
        x = next(it)
    except StopIteration:
        # Leave the loop when StopIteration is raised
        break

To sum up:

  • Any object that can be used in a for loop is of type Iterable;
  • Any object that can be passed to next() is of type Iterator, which represents a lazily evaluated sequence;
  • Collection data types such as list, dict, and str are Iterable but not Iterator; an Iterator object can be obtained from them with the iter() function.
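The last point can be checked with a few lines. This is only a sketch; the variable names are illustrative.

```python
from collections.abc import Iterator

numbers = [1, 2, 3]
print(isinstance(numbers, Iterator))  # False

it = iter(numbers)
print(isinstance(it, Iterator))       # True
print(next(it), next(it))             # 1 2

# Each iter() call on the list returns a new, independent iterator.
another = iter(numbers)
print(next(another))                  # 1

# An iterator returns itself from iter().
print(iter(it) is it)                 # True
```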


Origin blog.csdn.net/DlMmU/article/details/105040369