Many Python programmers confuse the concepts and functions of iterators and generators, and can't tell the difference between the two. Today, let's talk about these two concepts.
Iterator
Iterator Pattern
Iterator is a design pattern that provides a way to sequentially access the elements of an aggregate object without exposing its internal implementation. It is a lazy way to get data: we don't need to load all the data into memory at once, which avoids the problem of a data set that is too large to fit in memory.
A typical application scenario is reading a large file and analyzing the keywords on each line.
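The file-reading scenario can be sketched as follows. This is only an illustration: count_keyword is a hypothetical helper, and a small temporary file stands in for the large one. A Python file object yields its lines one at a time, so only the current line is held in memory.

```python
import os
import tempfile


def count_keyword(path, keyword):
    """Return {line_number: occurrences} for lines that contain keyword."""
    counts = {}
    with open(path, encoding="utf-8") as f:
        # Iterating over the file object reads one line at a time.
        for line_number, line in enumerate(f, start=1):
            hits = line.count(keyword)
            if hits:
                counts[line_number] = hits
    return counts


# A tiny temporary file stands in for a large log file (illustrative data).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("error: disk full\nall good\nerror after error\n")
    sample_path = tmp.name

results = count_keyword(sample_path, "error")
os.remove(sample_path)
print(results)  # {1: 1, 3: 2}
```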
One of the simplest iterator patterns, represented as an interface, contains two methods:
- next(): returns the next element
- hasNext(): returns whether there is a next element
An object that implements these two methods is an iterator.
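As a rough sketch, the classic two-method interface could look like this in Python. ListIterator and the snake_case name has_next are made up for illustration; this is the textbook pattern, not Python's own iterator protocol.

```python
class ListIterator:
    """Textbook iterator pattern: has_next() plus next()."""

    def __init__(self, items):
        self._items = items
        self._position = 0

    def has_next(self):
        # True while there is still an element left to return.
        return self._position < len(self._items)

    def next(self):
        # Return the current element and advance the position.
        item = self._items[self._position]
        self._position += 1
        return item


it = ListIterator(["a", "b", "c"])
collected = []
while it.has_next():
    collected.append(it.next())
print(collected)  # ['a', 'b', 'c']
```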
Iterators in Python
Python programmers often overlook the difference between an Iterator and an Iterable Object.
In fact, the two need to be distinguished carefully.
Iterable Object
An iterable is an object that has the ability to return one of its own data elements at a time.
For example:
In [1]: a = [1, 2, 3, 4, 5]
In [2]: for i in a:
   ...:     print(i)
   ...:
1
2
3
4
5
In [3]: b = {"first":1, "second":2, "third":3}
In [4]: for i in b:
   ...:     print(i)
   ...:
second
third
first
The above code outputs all the elements in the list and all the keys in the dict by iteration. So, we call lists and dicts iterables (not iterators).
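One way to check the distinction is with the abstract base classes in the standard collections.abc module. A minimal sketch:

```python
from collections.abc import Iterable, Iterator

a = [1, 2, 3]

list_is_iterable = isinstance(a, Iterable)       # a list can be iterated
list_is_iterator = isinstance(a, Iterator)       # but it is not an iterator
iter_is_iterator = isinstance(iter(a), Iterator) # iter() returns an iterator

print(list_is_iterable, list_is_iterator, iter_is_iterator)  # True False True
```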
In Python, all collections are iterable. Inside the language, iterators support the operations listed below:
- for loop
- Traverse files and directories
- List comprehensions, dictionary comprehensions, and set comprehensions
- Tuple unpacking
- When calling a function, use * to unpack the arguments
- Building and extending collection types
So you can see that iteration is important in many places in Python.
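A few of these operations, sketched on a small list. The variable names are chosen only for illustration.

```python
numbers = [3, 1, 2]

squares = [n * n for n in numbers]       # list comprehension
by_value = {n: str(n) for n in numbers}  # dictionary comprehension
parities = {n % 2 for n in numbers}      # set comprehension
first, second, third = numbers           # tuple unpacking
largest = max(*numbers)                  # * unpacks the iterable into arguments
extended = list(numbers)                 # building a collection from an iterable
extended.extend(iter("ab"))              # extending a collection from an iterator

print(squares)   # [9, 1, 4]
print(extended)  # [3, 1, 2, 'a', 'b']
```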
Reasons why sequences can be iterated
This relies on the built-in function iter(). If the interpreter wants to iterate over the object x, it calls iter(x) to obtain an iterator.
The built-in iter function does the following:
- Checks whether the object implements the __iter__ method, and if so calls it to obtain an iterator.
- If the __iter__ method is not implemented but the __getitem__ method is, Python creates an iterator that tries to get the elements in order (starting at index 0).
- If that attempt fails, Python raises a TypeError exception, usually saying "X object is not iterable".
In [8]: x = 2
In [9]: iter(x)
-----------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-128770259dbe> in <module>()
----> 1 iter(x)
TypeError: 'int' object is not iterable
Standard sequences all implement the __getitem__ method. In fact, they also implement the __iter__ method, and your own sequences should too. The __getitem__ fallback exists for backward compatibility, and it may be removed in the future.
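The __getitem__ fallback can be seen with a class that defines no __iter__ at all. Letters is a hypothetical class written only for this sketch.

```python
class Letters:
    """Hypothetical class with __getitem__ only, no __iter__."""

    def __init__(self, text):
        self._text = text

    def __getitem__(self, index):
        # str indexing raises IndexError past the end,
        # which is what ends the fallback iteration.
        return self._text[index]


letters = Letters("abc")
has_iter = hasattr(letters, "__iter__")
items = list(iter(letters))

print(has_iter)  # False
print(items)     # ['a', 'b', 'c']
```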
How to implement iterable objects
How does an object you create yourself become an iterable object? How do you create your own iterator? It's actually very simple.
An iterable object needs to meet either of the following two requirements (see above for the reasons):
- It has a __getitem__ method that accepts an index parameter
- It has an __iter__ method that returns an Iterator
Example:
#!/usr/bin/env python
class MyIterableObject():
    def __init__(self, s):
        self.seq = s.split(' ')
    def __getitem__(self, index):
        return self.seq[index]
    def __iter__(self):
        return MyIterator(self.seq)  # see below for the MyIterator implementation
if __name__ == '__main__':
    mio = MyIterableObject("a b c d e f g")
    for i in mio:
        print(i)
Iterator
Once an iterator is obtained with the iter function, it can be used to fetch the object's data.
Use the next() function to get elements one by one. If next() is called again after all elements have been returned, a StopIteration exception is raised, as follows:
In [13]: a = [1, 2, 3, 4, 5]
In [14]: i = iter(a)
In [15]: while True:
    ...:     print(next(i))
    ...:
1
2
3
4
5
-----------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-15-ac43f8f9aeeb> in <module>()
1 while True:
----> 2 print(next(i))
3
StopIteration:
Python's iterators are deliberately simple: they do not support operations such as resetting to the beginning. Once an iterator has been consumed, the only way to read from the beginning again is to create a new iterator.
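A short sketch of this one-shot behavior, using a plain list as the iterable:

```python
a = [1, 2, 3]
it = iter(a)

first_pass = list(it)       # consumes the iterator
second_pass = list(it)      # already exhausted, nothing is left
fresh_pass = list(iter(a))  # a new iterator starts from the beginning

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3]
```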
How to implement an iterator
Standard Python iterators need to implement two methods:
- __iter__: returns the iterator itself
- next(): returns the next element in the dataset; if there is no next element, raises StopIteration
TIPS:
In Python 3, the next() method was renamed to __next__. The built-in next() function still works in both Python 2 and Python 3.
Example:
class MyIterator():
    def __init__(self, s):
        self.seq = s
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            n = self.seq[self.index]
        except IndexError:
            # No more elements: signal the end of iteration
            raise StopIteration
        self.index += 1
        return n
    next = __next__  # Python 2 looks up next() instead of __next__()
One thing to note here: the iterator pattern described earlier needs a method to determine whether there is a next element, and in Python that role is played by an exception. When using an iterator directly, we can catch this exception ourselves. If you use the built-in for ... in statement, it catches the exception for us automatically.
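Roughly speaking, a for loop behaves like the hand-written loop below. manual_for is a hypothetical helper used only to make the StopIteration handling visible.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, with explicit next() calls."""
    collected = []
    iterator = iter(iterable)  # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next element
        except StopIteration:
            break  # step 3: the exception marks the end of the data
        collected.append(item)
    return collected


result = manual_for("abc")
print(result)  # ['a', 'b', 'c']
```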
Generator
First of all, when we talk about a Generator, we generally mean one of two things:
- Generator Function: a function that uses the yield keyword is a generator function.
- Generator Object: produced by calling a Generator Function, it is a special Iterator. It wraps the body of the generator function and implements the __iter__ and next (__next__ in Python 3) methods, conforming to the Iterator protocol.
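The two meanings can be told apart with the standard inspect module. countdown is a made-up generator function for this sketch.

```python
import inspect


def countdown(n):
    """Generator function: the body contains yield."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)  # calling it returns a generator object; the body has not run yet

is_gen_function = inspect.isgeneratorfunction(countdown)
is_gen_object = inspect.isgenerator(gen)
returns_itself = iter(gen) is gen  # an iterator's __iter__ returns itself
values = list(gen)

print(is_gen_function, is_gen_object, returns_itself)  # True True True
print(values)  # [3, 2, 1]
```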
What is the biggest difference between generators and iterators?
The main difference lies in how the values are produced. An iterator like the one above walks over elements that already exist in a collection. With a generator, the values do not have to exist in advance; each one can be calculated (generated) during execution.
For example, use a generator to produce an arithmetic sequence:
def arithmetic_progression(base, dif, count):
    for n in range(count):
        yield base + dif * n
if __name__ == '__main__':
    for i in arithmetic_progression(1, 3, 10):
        print(i)
As you can see, the arithmetic sequence is never stored anywhere; each term is calculated when yield runs during iteration.
This is possible thanks to the yield keyword. It suspends the execution of the function, returns a value, and resumes where it left off the next time. The execution flow is as follows:
- Call next on the generator object
- The function runs up to a yield, returns the value, and suspends
- Repeat the two steps above until all values have been returned
- StopIteration is raised if next is called again
The code verification is as follows:
In [21]: def test():
    ...:     yield 1
    ...:     yield 2
    ...:     yield 3
    ...:
In [22]: gen = test()
In [23]: next(gen)
Out[23]: 1
In [24]: next(gen)
Out[24]: 2
In [25]: next(gen)
Out[25]: 3
In [26]: next(gen)
-----------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-26-8a6233884a6c> in <module>()
----> 1 next(gen)
StopIteration:
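To make the suspend-and-resume behavior visible, the sketch below records the order in which things happen. The events list and the traced function are illustrative names, not part of the original example.

```python
events = []


def traced():
    events.append("start")    # runs only on the first next() call
    yield 1
    events.append("between")  # runs when execution resumes
    yield 2
    events.append("end")      # runs on the call that exhausts the generator


gen = traced()
events.append("created")      # the function body has not started yet
first = next(gen)
events.append("got 1")
second = next(gen)
events.append("got 2")
leftover = next(gen, "done")  # a default value avoids the StopIteration

print(events)
# ['created', 'start', 'got 1', 'between', 'got 2', 'end']
```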
Use generators instead of iterators
Now we replace the iterator-based MyIterableObject above with a generator-based version.
class MyGenerator():
    def __init__(self, s):
        self.seq = s.split(' ')
    def __iter__(self):
        for s in self.seq:
            yield s
The code is much simpler: we don't need to write the Iterator class ourselves, because yield creates the iterator for us.
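One consequence worth checking: because __iter__ is a generator function here, each call produces a fresh generator object, so the same MyGenerator instance can be iterated more than once. A small sketch, repeating the class so the snippet runs on its own:

```python
class MyGenerator:
    def __init__(self, s):
        self.seq = s.split(' ')

    def __iter__(self):
        # Each call creates a new generator object over self.seq.
        for s in self.seq:
            yield s


words = MyGenerator("a b c")
first_pass = list(words)
second_pass = list(words)  # works again: a new generator is created
separate_iterators = iter(words) is not iter(words)

print(first_pass)          # ['a', 'b', 'c']
print(second_pass)         # ['a', 'b', 'c']
print(separate_iterators)  # True
```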
Iterator toolset (itertools)
Generators are simple enough to use, but a language like Python, which tries to save your time, packages them even further.
Python has many built-in functions that return generators or other lazy iterators, such as os.walk for traversing folders, and tools such as map and enumerate.
Python also has a standard library module called itertools, which contains around 20 generator functions (the exact number depends on the Python version) that can be combined to perform various tasks.
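As a small illustration of combining them, the sketch below chains four itertools functions. The particular pipeline is invented for this example.

```python
import itertools

evens = itertools.count(start=0, step=2)              # 0, 2, 4, ... without end
small = itertools.takewhile(lambda n: n < 10, evens)  # stops before 10
combined = itertools.chain(small, ["x", "y"])         # then continues with a list
result = list(itertools.islice(combined, 6))          # take only the first six items

print(result)  # [0, 2, 4, 6, 8, 'x']
```

Nothing is computed until list() pulls values through the pipeline, so the endless count never becomes a problem.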
Conclusion
The above is the difference between iterators and generators. In fact, these two things are not difficult to understand. However, there are several concepts that are easily confused here. As long as you understand these concepts, you can distinguish them clearly!
Author and source ( reposkeeper ) authorized to share By CC BY-SA 4.0