Python Iterators

We go in depth on Python iterators and generators.

We discussed basic use of the for loop in a previous article. Here we cover the internals of how for loops work in Python using iterators, and how you can use the iterator pattern to create your own objects that can be consumed in loops and throughout the language.

The Iterator Pattern in Python

These are the steps that take place when consuming and then exhausting an iterator. Each is explained below:

  • Call the __iter__() method to receive an iterator.

  • Call the __next__() method to receive individual items from the iterator.

  • Catch a StopIteration exception to end iteration.
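
The three steps above can be written out by hand. The following is a rough sketch of what a for loop does on our behalf, not the interpreter's actual implementation. The helper name manual_for and the sample list are chosen purely for illustration.

```python
def manual_for(iterable):
    """Collect the items of `iterable` the way a for loop would visit them."""
    results = []
    iterator = iter(iterable)          # step 1: calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # step 2: calls iterator.__next__()
        except StopIteration:          # step 3: the iterator is exhausted
            break
        results.append(item)
    return results


print(manual_for(["a", "b", "c"]))
```

The built-in functions iter() and next() are the usual way to call __iter__() and __next__() from ordinary code.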

Returning an iterator with __iter__()

For an object var1, Python calls var1.__iter__() and uses the iterator that this method returns. An iterator is an object that implements the __next__() method. A skeleton of this is shown below:

class NextMethod:
    def __next__(self):
        ...

class ReturnIter:
    def __iter__(self):
        return NextMethod()

We discuss __next__() below.

This example has two classes: one with an __iter__() method and a second with a __next__() method. In practice we usually implement both __iter__() and __next__() in the same class, and that is the approach we use from here on.
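
For comparison, here is a runnable sketch of the two-class arrangement. The names Countdown and CountdownIterator are hypothetical, and the countdown behaviour is only an illustration. The iterator class here also defines __iter__() returning self, which Python's iterator protocol expects of iterators.

```python
class CountdownIterator:
    """Iterator: holds the iteration state and implements __next__()."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Iterators conventionally return themselves from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class Countdown:
    """Iterable: hands out a fresh iterator each time __iter__() is called."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


countdown = Countdown(3)
print(list(countdown))  # first pass
print(list(countdown))  # second pass gets a new iterator, so it works again
```

One reason to keep the two roles separate is that each loop over the outer object starts from the beginning, because every call to __iter__() creates new state.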

An iterator can return itself from __iter__() by returning self.

class CompleteIter:
    def __iter__(self):
        return self
    def __next__(self):
        ...

Implementing the __next__() method

The __next__() method is called repeatedly, once at the beginning of each iteration. The result returned by __next__() is the next output of the iterator.

In a for loop, each result of __next__() is assigned to the loop variable for the next iteration. To stop iterating we raise a StopIteration exception. This example simply counts to 3 and stops.

class CompleteIter:
    def __init__(self):
        self.counter = 0
    def __iter__(self):
        return self
    def __next__(self):
        self.counter += 1
        # Use > rather than == so that every call after exhaustion also stops.
        if self.counter > 3:
            raise StopIteration
        return self.counter

for i in CompleteIter():
    print(i)

# Output:
# 1
# 2
# 3

We see above a complete example of a working iterator.

An iterator is an ordinary class that implements certain methods, so it can use the __init__() method for one-time setup.

The __iter__() method simply returns self. CompleteIter features both __iter__(), to return an iterator (itself), and __next__(), to produce the values.

Each call to __next__() provides the value for the next iteration. In this case the numbers 1, 2 and 3 are returned over three iterations. A StopIteration exception is raised on the fourth call, stopping the loop without producing any further results.
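
One consequence is worth seeing in code: because this kind of iterator returns itself and keeps its own counter, it can only be consumed once. The sketch below uses a hypothetical CountTo class with an illustrative limit parameter. It is a variation on the example above, not part of it.

```python
class CountTo:
    """Hypothetical variant of the counting iterator with a configurable limit."""

    def __init__(self, limit):
        self.limit = limit
        self.counter = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Using >= keeps raising StopIteration on every call after exhaustion.
        if self.counter >= self.limit:
            raise StopIteration
        self.counter += 1
        return self.counter


counter = CountTo(3)
first_pass = list(counter)   # consumes the iterator
second_pass = list(counter)  # nothing left to produce
print(first_pass, second_pass)
```

To iterate again we would create a new instance, which runs __init__() and resets the counter.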

The Sequence to Initialise an Iterator

We saw how we create iterators by implementing __iter__() and __next__(). We now focus on what the Python runtime does with our class and how the iterator is initialised. The session below assumes each method has been modified to print its own name when it is called.

>>> instance = CompleteIter()
__init__
>>> iterator = iter(instance)
__iter__
>>> first = next(iterator)
__next__
>>> second = next(iterator)
__next__
>>> third = next(iterator)
__next__
>>> fourth = next(iterator)
__next__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in __next__
StopIteration
>>>

In the example above we first create an instance of CompleteIter. This calls the __init__() method, which initialises the instance and sets its initial variables. Note that iter() must be called on an instance that has the required methods, not on the class itself.

The first iterator-related step is the call to iter() on the instance. This calls __iter__(), which returns the iterator.

Since CompleteIter returns itself, this step is superfluous here (we could continue to use instance in place of iterator). We see this below, where instance and iterator both have the same address[1], demonstrating that they are the same object. We keep the explicit iter() call in the example above to show the general case.

Each call to next() executes the __next__() method on the object. The last call to next() raises the StopIteration exception, so the variable fourth is never set.

>>> instance
<__main__.CompleteIter object at 0x7f8679c13e20>
>>> iterator
<__main__.CompleteIter object at 0x7f8679c13e20>
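
A class instrumented to produce a trace like the one in the session above might look like the following sketch. The class name TracedIter and the print calls are assumptions made for illustration. The sketch also checks object identity with the is operator, which avoids comparing addresses by eye.

```python
class TracedIter:
    """Counting iterator that prints each special method as it is called."""

    def __init__(self):
        print("__init__")
        self.counter = 0

    def __iter__(self):
        print("__iter__")
        return self

    def __next__(self):
        print("__next__")
        if self.counter >= 3:
            raise StopIteration
        self.counter += 1
        return self.counter


instance = TracedIter()
iterator = iter(instance)
same_object = iterator is instance  # True: __iter__() returned self
values = [next(iterator) for _ in range(3)]
print(same_object, values)
```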

Generators and the yield statement

There is another way to create iterators that has a much simpler syntax. We define a function as usual, except that we use yield where we would otherwise use return. That’s it. A function written this way is called a generator. Each time the function yields, it produces the output for that iteration. In a for loop, each yielded value is assigned to the loop variable for the next iteration.

We recreate our example as a generator.

def as_generator():
    yield 1
    yield 2
    yield 3

for i in as_generator():
    print(i)

The output from this example is the same as before. Note how much clearer and less verbose this is. The function pauses at each yield, and on the next iteration execution resumes from just after that yield. Note the example below:

>>> def as_generator():
...     print("Before first yield.")
...     yield 1
...     print("After 1")
...     yield 2
...     print("After 2")
...     yield 3
...     print("After 3")
...
>>> gen = as_generator()
>>> next(gen)
Before first yield.
1
>>> next(gen)
After 1
2
>>> next(gen)
After 2
3
>>> next(gen)
After 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

When a generator function is first called it does not run any of the code defined in the function. It merely returns an iterator. We see this with gen = as_generator(). The first result is yielded when next() is first run on this instance of the generator: execution begins at the top of the function and runs up to the first yield. Each later call resumes from just after the previous yield. When the function body finishes, as it does after printing "After 3", StopIteration is raised.
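
The lazy start can be checked without reading printed output. In this sketch the generator records its progress in a list. The names lazy_generator and events are hypothetical.

```python
events = []


def lazy_generator():
    events.append("body started")
    yield 1
    events.append("resumed after first yield")
    yield 2


gen = lazy_generator()
events_after_call = list(events)       # still empty: no body code has run

first = next(gen)
events_after_first = list(events)      # body ran up to the first yield

second = next(gen)
events_after_second = list(events)     # resumed just after the first yield
print(events_after_call, events_after_first, events_after_second)
```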

We need not write a separate yield for each result; a single yield inside a loop works too. The following wraps the range() function to return only even values.

def even_range(end):
    for i in range(0, end, 2):
        yield i

list(even_range(10))
# Output:
# [0, 2, 4, 6, 8]

Using "yield from"

Notice in the example above that we loop over the iterable created by range() and yield each result, which is then consumed by list(). When a generator simply yields every item from another iterable, we can use the yield from statement directly, reducing the amount of code to write.

The example above is repeated using yield from.

def even_range(end):
    yield from range(0, end, 2)

list(even_range(10))
# [0, 2, 4, 6, 8]

The result is the same with slightly less code. This solution is cleaner when we want to yield from an iterator.
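
yield from accepts any iterable, and a generator may use it more than once. As a sketch, the hypothetical helper evens_then_odds below delegates to two range() objects in turn. A version with explicit loops is included for comparison.

```python
def evens_then_odds(end):
    """Yield the even numbers below `end`, then the odd ones."""
    yield from range(0, end, 2)
    yield from range(1, end, 2)


def evens_then_odds_loops(end):
    """The same generator written with explicit loops."""
    for i in range(0, end, 2):
        yield i
    for i in range(1, end, 2):
        yield i


print(list(evens_then_odds(6)))
```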

Sending values to a generator using the yield expression

Throughout this discussion of generators we have only treated them as objects that produce values. It is also possible to send values back into a generator. This is done by using yield as an expression and assigning its value to a variable.

def add_to_counter():
    counter = 0
    input_value = 0
    while True:
        counter += 1
        input_value = yield input_value+counter

In this example we have a generator, add_to_counter(). On each call to .send() it will:

  • receive a value at each iteration,

  • assign that value to input_value,

  • then increment a counter and yield the sum of that counter and the input value.

>>> gen = add_to_counter()
>>> gen.send(None)
1
>>> gen.send(5)
7
>>> gen.send(10)
13
>>> gen.send(16)
20
>>> gen.send(16)
21

We start the first iteration using .send(None), which is equivalent to next(). A newly created generator starts at the top of the function, where no yield expression is yet waiting to receive a value, so we can only send None. Any other value would raise a TypeError.
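
The restriction on the first send can be seen with a small sketch. The echo generator here is hypothetical and exists only to show the error and the usual way of starting (priming) a generator.

```python
def echo():
    """Yield back whatever was last sent in (None at first)."""
    received = None
    while True:
        received = yield received


gen = echo()
try:
    gen.send("too early")      # no yield is waiting to receive a value yet
except TypeError as error:
    early_send_failed = True
    print("TypeError:", error)
else:
    early_send_failed = False

# Start again with a fresh generator and prime it before sending a value.
gen = echo()
primed = next(gen)             # runs to the first yield, which produces None
reply = gen.send("hello")      # "hello" becomes the value of the yield expression
print(primed, reply)
```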

When Python arrives at a line like variable = yield value:

  • value is yielded first,

  • the iteration ends and the generator pauses,

  • the new value sent to the generator using .send(new_value) at the start of the next iteration is assigned to variable.
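
The sequence above can be traced with a small accumulator. This is a sketch only, and the name running_total is hypothetical.

```python
def running_total():
    """Yield the sum of all values sent in so far."""
    total = 0
    while True:
        received = yield total   # yield total, pause, then receive a value
        total += received


gen = running_total()
start = next(gen)        # run to the first yield; produces 0
after_5 = gen.send(5)    # received = 5, the loop then yields 5
after_10 = gen.send(10)  # received = 10, the loop then yields 15
print(start, after_5, after_10)
```

Each .send() call both delivers a value to the paused yield expression and returns the next value the generator yields.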

This behaviour is similar to the way regular functions receive parameters that change their behaviour. Here, we can send a value to change the behaviour of a single iteration. It is key to how coroutines are implemented in Python.


1. The addresses will differ from those shown here if you try this yourself, but the two will still match each other.