Iterators and Generators¶
The Tools of Pythonicity
What goes on in those for loops?
Iterators and Iterables¶
Iteration is one of the main reasons Python code is so readable:
for x in just_about_anything:
do_stuff(x)
An iterable is anything that can be looped over sequentially, so it does not have to be a “sequence”: list, tuple, etc. For example, a string is iterable.
An iterator is an iterable that remembers state. All sequences are iterable, but not all sequences are iterators. To make a sequence an iterator, you can call it with iter:
my_iter = iter(my_sequence)
Iterator Types:
https://docs.python.org/3/library/stdtypes.html#iterator-types
Iterables¶
To make an object iterable, you simply have to implement the __getitem__ method.
class T:
def __getitem__(self, position):
if position > 5:
raise IndexError
return position
Demo
iter()
¶
How do you get the iterator object from an “iterable”?
The iter function will make any iterable an iterator. It first looks for the __iter__ method, and if none is found, uses get_item to create the iterator.
The iter()
function:
In [20]: iter([2,3,4])
Out[20]: <listiterator at 0x101e01350>
In [21]: iter("a string")
Out[21]: <iterator at 0x101e01090>
In [22]: iter( ('a', 'tuple') )
Out[22]: <tupleiterator at 0x101e01710>
List as an Iterator:¶
In [10]: a_list = [1,2,3]
In [11]: list_iter = iter(a_list)
In [12]: next(list_iter)
Out[12]: 1
In [13]: next(list_iter)
Out[13]: 2
In [14]: next(list_iter)
Out[14]: 3
In [15]: next(list_iter)
--------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-15-1a7db9b70878> in <module>()
----> 1 next(list_iter)
StopIteration:
Using iterators when you can¶
Example: trigrams:
triplets = zip(words, words[1:], words[2:])
zip() returns an iterable – it does not build up the whole list. So this is quite efficient.
but slicing: ([1:]) produces a copy – so this does use three copies of the list – not so good if memory is tight. Note that they are shallow copies, so not that bad.
Nevertheless, we can do better:
from itertools import islice
In [68]: triplets = zip(words, islice(words, 1, None), islice(words, 2, None))
In [69]: for triplet in triplets:
...: print(triplet)
...:
('this', 'that', 'the')
('that', 'the', 'other')
('the', 'other', 'and')
('other', 'and', 'one')
('and', 'one', 'more')
The Iterator Protocol¶
The main thing that differentiates an iterator from an iterable (sequence) is that an iterator saves state.
An iterable must have the following methods:
an_iterator.__iter__()
Usually returns the iterator object itself.
an_iterator.__next__()
Returns the next item from the container. If there are no further items,
raises the StopIteration
exception.
Making an Iterator¶
A simple version of range()
class IterateMe_1:
def __init__(self, stop=5):
self.current = 0
self.stop = stop
def __iter__(self):
return self
def __next__(self):
if self.current < self.stop:
self.current += 1
return self.current
else:
raise StopIteration
(demo: iterator_1.py
)
What does for
do?¶
Now that we know the iterator protocol, we can write something like a for loop:
my_for.py
def my_for(an_iterable, func):
"""
Emulation of a for loop.
func() will be called with each item in an_iterable
"""
# equiv of "for i in l:"
iterator = iter(an_iterable)
while True:
try:
i = next(iterator)
except StopIteration:
break
func(i)
Itertools¶
itertools
is a collection of utilities that make it easy to
build an iterator that iterates over sequences in various common ways
http://docs.python.org/3/library/itertools.html
NOTE:
iteratables are not only for for
They can be used with anything that expects an iterable:
sum
, tuple
, sorted
, and list
Generators¶
Generators
- give you an iterator object
- no access to the underlying data ... if it even exists
- Conceptually:
Iterators are about various ways to loop over data.
Generators can generate the data on the fly.
- Practically:
You can use either one either way (and a generator is one type of iterator).
Generators do some of the book-keeping for you – simpler syntax.
yield¶
yield
is a way to make a quickie generator with a function:
def a_generator_function(params):
some_stuff
yield something
Generator functions “yield” a value, rather than returning a value.
State is preserved in between yields.
A function with yield
in it is a “factory” for a generator
Each time you call it, you get a new generator:
gen_a = a_generator()
gen_b = a_generator()
Each instance keeps its own state.
Really just a shorthand for an iterator class that does the book keeping for you.
An example: like range()
def y_range(start, stop, step=1):
i = start
while i < stop:
yield i
i += step
Real World Example from FloatCanvas:
Note:
In [164]: gen = y_range(2,6)
In [165]: type(gen)
Out[165]: generator
In [166]: dir(gen)
Out[166]:
...
'__iter__',
...
'__next__',
So the generator is an iterator
Note: A generator function can also be a method in a class
More about iterators and generators:
http://www.learningpython.com/2009/02/23/iterators-iterables-and-generators-oh-my/
yield_example.py
generator comprehension¶
yet another way to make a generator:
>>> [x * 2 for x in [1, 2, 3]]
[2, 4, 6]
>>> (x * 2 for x in [1, 2, 3])
<generator object <genexpr> at 0x10911bf50>
>>> for n in (x * 2 for x in [1, 2, 3]):
... print n
... 2 4 6
More interesting if [1, 2, 3] is also a generator
Note that map and filter produce iterators.