Difference between iterators and generators in Python
Table of contents
- Introduction
- What is an iterator?
- Iterator example
- What is an iterable?
- Iterable example – separate class
- Iterable example – single class
- Python containers
- What are generators?
- Generator types
- Generator function example
- Generator comprehension example
- Generator advantages
- Summary
Introduction
Today, we are going to discuss a slightly confusing topic in the Python programming language: the difference between iterators and generators. Before we start, I advise beginners to read slowly until the confusion goes away. We will start with basic concepts, then build on top of them. If things are still unclear at the end of the article, you may start over while taking notes. This topic is not straightforward, but it is not that hard either. Just keep track of what is being explained.
Let us get started…
Python is a scripting language, but it is also object oriented, so I assume you are familiar with basic object-oriented programming concepts such as classes, objects and methods. To refresh your memory, a class is simply an object type or definition, and an object is an instance of a class. Classes (and therefore objects) encapsulate data members and the methods that operate on them. That is all we need to recall from OOP for now.
What is an iterator?
In plain English, an iterator is an object that keeps track of some state and returns the next state. For example, an odd number iterator currently at 7 returns 9, the next odd number. Yes, it is that simple! Programmatically speaking, it is an instance of a class that implements a next method (spelled __next__ in Python 3), which operates on a state data member. How the next value is calculated is up to the class.
Take a look at the following example…
Iterator example
As an example, we are going to implement the odd number iterator we just mentioned. Here is the code in Python 3 syntax:

    # Iterator class definition
    class OddIterator:
        # Iterator class constructor
        # We initialize it with a max
        # to set the maximum odd number
        # we are going to generate
        def __init__(self, max):
            # Current odd number
            self.curr = -1
            # Maximum odd number to generate
            self.max = max

        # Recall that an iterator
        # should implement a next method
        def __next__(self):
            # Add 2 to the current value
            # to generate the next odd number
            self.curr += 2
            # If we reach the max, stop iteration
            # otherwise return the current odd
            # number
            if self.curr > self.max:
                raise StopIteration()
            else:
                return self.curr

    # Create an odd iterator object
    # Set the maximum to 21
    od = OddIterator(21)

    # Print the first 11 odd numbers
    # starting at 1. The next() function
    # is a Python built in function that
    # takes an iterator as input and returns
    # the iterator's next value
    for i in range(1, 12):
        print(i, next(od))

As you can see, the odd iterator is a Python class named OddIterator. The state data member is the variable curr which keeps track of the current odd number. The next method returns the next odd number by adding 2 to the current value. If you run the code above, you should get the following output:

    1 1
    2 3
    3 5
    4 7
    5 9
    6 11
    7 13
    8 15
    9 17
    10 19
    11 21

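One thing the output above does not show is what happens once the iterator runs out. The sketch below restates the OddIterator class in compact form (same logic as the listing above) and assumes a smaller maximum of 5 purely to keep the demo short. One more call to next() raises StopIteration, and the built-in next() also accepts an optional default value to return instead of raising.

```python
class OddIterator:
    """Compact restatement of the odd number iterator from the listing above."""

    def __init__(self, max):
        self.curr = -1
        self.max = max

    def __next__(self):
        self.curr += 2
        if self.curr > self.max:
            raise StopIteration()
        return self.curr


od = OddIterator(5)  # maximum of 5 is an assumption, chosen only for brevity
first_three = [next(od), next(od), next(od)]
print(first_three)

# The iterator is now exhausted, so one more call raises StopIteration
try:
    next(od)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)

# next() takes an optional default that is returned instead of raising
fallback = next(od, None)
print(fallback)
```

This is the same signal a for loop relies on to know when to stop.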
What is an iterable?
Now that we know what an iterator is, what is an iterable? An iterable is an object that can hand out an iterator, which is then used to go through all of its elements. To do so, an iterable implements the iter method (spelled __iter__ in Python), which returns an iterator object. The returned iterator can be an instance of a separate class, or we can use a single class for both the iterable and the iterator it returns.
Take a look at the following examples…
Iterable example – separate class
Here is the code in Python 3 syntax:

    # Iterator class definition
    # It is a separate class
    class OddIterator:
        # Constructor
        def __init__(self, max):
            self.curr = -1
            self.max = max

        # Since it is an iterator it
        # has to implement a next method
        def __next__(self):
            self.curr += 2
            if self.curr > self.max:
                raise StopIteration()
            else:
                return self.curr

    # The iterable class definition
    class OddIterable:
        # Constructor
        def __init__(self, max):
            self.max = max

        # Recall an iterable implements the
        # iter method. This method has to
        # return an iterator. In this case the
        # iterator it returns is an external class
        def __iter__(self):
            return OddIterator(self.max)

    # Odd number iterable. This means we
    # can run a for loop against it
    od = OddIterable(21)

    # Print all odd numbers
    for i in od:
        print(i)

If you run the code above, you should get the following output:

    1
    3
    5
    7
    9
    11
    13
    15
    17
    19
    21

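A practical consequence of the separate-class design is worth sketching: because __iter__ builds a fresh OddIterator on every call, the same OddIterable object can be looped over more than once. The classes are restated compactly below, and the maximum of 7 is an arbitrary value picked for the demo.

```python
class OddIterator:
    def __init__(self, max):
        self.curr = -1
        self.max = max

    def __next__(self):
        self.curr += 2
        if self.curr > self.max:
            raise StopIteration()
        return self.curr


class OddIterable:
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        # A new iterator, with its own state, on every call
        return OddIterator(self.max)


od = OddIterable(7)  # 7 is an assumed demo value
first_pass = [n for n in od]
second_pass = [n for n in od]
print(first_pass)
print(second_pass)
```

Both passes produce the same numbers because each loop works on its own iterator.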
Iterable example – single class
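For convenience, we can have a single class that implements both methods (iter and next). This makes the class both an iterable and its own iterator. Strictly speaking, Python's iterator protocol expects every iterator to implement iter as well and return itself, which is exactly what this single class does. Let us see how we can do that. Here is the code in Python 3 syntax: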

    # Single class
    class OddNumber:
        # Constructor
        def __init__(self, max):
            self.curr = -1
            self.max = max

        # Class implements next
        # so it is an iterator
        def __next__(self):
            self.curr += 2
            if self.curr > self.max:
                raise StopIteration()
            else:
                return self.curr

        # Class implements iter so it is an
        # iterable. Since the same class
        # implements the iterator we return self
        def __iter__(self):
            return self

    # Create an iterable object
    od = OddNumber(21)

    # Run a for loop against it
    for i in od:
        print(i)

When you run a for loop against an iterable, the iter method is called, which returns an iterator. The loop then calls next on that iterator to get each element until StopIteration is raised. The output of the above code should be the following:

    1
    3
    5
    7
    9
    11
    13
    15
    17
    19
    21

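To make the mechanics concrete, here is a rough sketch of what a for loop does behind the scenes: it calls iter() once, then calls next() repeatedly until StopIteration is raised. The OddNumber class is restated in compact form and the maximum of 7 is an arbitrary choice. The last lines also show a side effect of the single-class design: since __iter__ returns self, the object is used up after one pass.

```python
class OddNumber:
    def __init__(self, max):
        self.curr = -1
        self.max = max

    def __iter__(self):
        return self

    def __next__(self):
        self.curr += 2
        if self.curr > self.max:
            raise StopIteration()
        return self.curr


od = OddNumber(7)  # 7 is an assumed demo value

# Roughly equivalent to: for i in od: collected.append(i)
collected = []
it = iter(od)            # calls od.__iter__()
while True:
    try:
        i = next(it)     # calls it.__next__()
    except StopIteration:
        break            # a for loop ends silently at this point
    collected.append(i)
print(collected)

# iter() returned the same object, so a second pass yields nothing
same_object = it is od
second_pass = list(od)
print(same_object, second_pass)
```

If an object needs to be looped over several times, the separate-class design from the previous example is the safer choice.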
Python containers
Since we are talking about iterables, let us quickly touch on a related topic: Python containers. Python provides built-in container data types such as lists, sets and dictionaries, to mention a few. These types are iterable, which means you can run a for loop against them and access their elements. Note that a container is an iterable, not an iterator: each for loop obtains a fresh iterator from it. You may want to check this article for more information on how to iterate through containers using a Python for loop.
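As a small illustration, built-in containers are iterables rather than iterators: iter() hands back a separate iterator object that tracks the position. The sketch uses the abstract base classes from the standard collections.abc module to check this, and the sample values are made up.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 3, 5]                # made-up sample data
ages = {"alice": 30, "bob": 25}    # made-up sample data

# Containers are iterable, but they are not iterators themselves
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
print(list_is_iterable, list_is_iterator)

# iter() returns a separate iterator object that tracks the position
it = iter(numbers)
it_is_iterator = isinstance(it, Iterator)
first = next(it)
print(it_is_iterator, first)

# Iterating over a dictionary yields its keys
keys = [k for k in ages]
print(keys)
```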
What are generators?
Now that we have explained what iterators and iterables are, we can define what a Python generator is. In short, a generator is a special kind of iterator that is implemented in an elegant way. It is a powerful programming construct that enables us to write iterators without the need to use classes or implement the iter and next methods ourselves.
Generator types
In Python, we can write generators in two different ways…
- Generator function: it looks like a regular function, but it uses a yield statement instead of a return statement to produce values. A regular function returns a value, whereas a generator function returns a generator object. Calling a generator function does not run its body or change any state. The body only starts executing when next() is first called on the generator object, and it pauses at each yield.
- Generator expression: it looks like a list comprehension, but it is written with parentheses ( ) instead of square brackets [ ].
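The two forms can be sketched side by side. The function name count_odds and the log list are hypothetical, added only to make visible when the function body actually runs.

```python
import types

log = []  # hypothetical helper: records when the generator body executes

def count_odds(limit):
    log.append("body started")
    n = 1
    while n <= limit:
        yield n
        n += 2

gen = count_odds(5)
# Calling the function only creates a generator object; no body code has run
is_generator = isinstance(gen, types.GeneratorType)
log_before = list(log)

first = next(gen)  # runs the body up to the first yield
log_after = list(log)
print(is_generator, log_before, first, log_after)

# Same expression, different brackets
as_list = [n for n in range(1, 6, 2)]   # list, built immediately
as_gen = (n for n in range(1, 6, 2))    # generator, evaluated lazily
print(type(as_list).__name__, type(as_gen).__name__)
```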
Let us look at some examples…
Generator function example
We want to implement the odd iterator using a generator function. Here is the code:

    # Odd number generator function definition.
    def OddNumber(max):
        o = -1
        for i in range(max):
            o += 2
            # max is the maximum odd number
            # we are going to generate
            if o > max:
                # End the generator with a plain return.
                # Raising StopIteration inside a generator
                # is turned into a RuntimeError since
                # Python 3.7 (PEP 479)
                return
            else:
                yield o

    # Create a generator object
    # Recall that od is going to be an
    # object not a value
    od = OddNumber(21)

    # This is going to print 1, 3, 5
    print(next(od))
    print(next(od))
    print(next(od))

    # This is going to print all odd
    # numbers between 1 and 21
    for i in OddNumber(21):
        print(i)

In the code above, calling the generator function OddNumber returns an OBJECT, not a value. This is what you need to keep in mind. If you think about it, this object is no different from an instance of the iterator class that we implemented earlier. The only difference is the syntax, which is more elegant.
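A quick way to confirm that a generator object really is an iterator is to check for the two protocol methods directly. The function name odd_numbers below is a hypothetical stand-in for the generator function above, written with a while loop for brevity.

```python
def odd_numbers(max):
    # Hypothetical compact variant of the odd number generator
    o = 1
    while o <= max:
        yield o
        o += 2

od = odd_numbers(5)

# The generator object provides both protocol methods for free
has_iter = hasattr(od, "__iter__")
has_next = hasattr(od, "__next__")
is_own_iterator = iter(od) is od
print(has_iter, has_next, is_own_iterator)

values = list(od)
print(values)

# Once the function body finishes, next() raises StopIteration as usual
try:
    next(od)
    finished = False
except StopIteration:
    finished = True
print(finished)
```

In other words, Python writes the iter and next methods for us and raises StopIteration automatically when the function body ends.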
Let us now implement it using the generator comprehension…
Generator comprehension example

    # range takes up to 3 arguments (start, stop, step)
    # so range(1, 22, 2) produces the odd numbers from
    # 1 to 21 for us. Putting this inside a generator
    # comprehension ( ) returns an OBJECT. Note that if
    # we use brackets [ ] instead, it returns a list
    # (which is not what we want)
    od = (o for o in range(1, 22, 2))

    # Run a loop against the generator
    for o in od:
        print(o)

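Generator expressions are often passed straight to a function that consumes an iterable, in which case the extra pair of parentheses can be dropped. The short sketch below shows this with the built-in sum(), and also shows that a generator can only be consumed once.

```python
# Sum the odd numbers from 1 to 21 without building a list first
total = sum(o for o in range(1, 22, 2))
print(total)

od = (o for o in range(1, 22, 2))
first_pass = list(od)
second_pass = list(od)  # the generator is already exhausted
print(len(first_pass), second_pass)
```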
That is all for today. I hope I was able to help you understand the difference between iterators and generators in Python. In the next section, I will end by mentioning some advantages of using generators followed by a summary.
Generator advantages
We can certainly achieve the same results using regular Python code without iterators or generators. However, using generators has some advantages.
- The code using generators is compact and elegant, with fewer intermediate data structures
- The code is memory efficient because generators run lazily: values are produced one at a time, so there is no need to buffer everything in memory (e.g. when reading a large file). This can also save CPU time when only part of the sequence is consumed
- Generators work just fine with infinite streams, as in the case of reading from a network socket
- The caller decides how to use the results, for example whether to collect them in a list, set or dictionary
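These points can be illustrated with a short sketch. The generator name all_odds is hypothetical, and the exact byte counts reported by sys.getsizeof depend on the Python version and platform, so only the comparison matters here.

```python
import sys
from itertools import islice

# Memory: the list stores every element, the generator stores only its state
as_list = [n for n in range(1, 1_000_000, 2)]
as_gen = (n for n in range(1, 1_000_000, 2))
list_is_bigger = sys.getsizeof(as_list) > sys.getsizeof(as_gen)
print(list_is_bigger)

# Infinite stream: this hypothetical generator never finishes on its own
def all_odds():
    n = 1
    while True:
        yield n
        n += 2

# islice takes a finite slice of the infinite stream
first_five = list(islice(all_odds(), 5))
print(first_five)

# The caller picks the container
as_set = set(islice(all_odds(), 3))
as_dict = {n: n * n for n in islice(all_odds(), 3)}
print(as_set, as_dict)
```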
Summary
- An iterator is an object that keeps track of some state and returns the next state
- An iterable implements the iter method and returns an iterator object
- We can have a single class that implements both methods (iter and next). This makes the class both an iterable and its own iterator
- A generator is a programming construct that enables us to write iterators without the need to use classes or implement the iter and next methods
- We can write generators in two different ways: using a generator function or a generator comprehension
- Using generators we can write compact and memory-efficient code
- To see why iterators and generators are handy, consider reading a file line by line: it takes only a few lines of code and never buffers the entire file in memory
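As a minimal sketch of that file-reading use case: the helper name read_lines is hypothetical, and a small temporary file stands in for a large input so that the example is self-contained.

```python
import os
import tempfile

def read_lines(path):
    """Yield one stripped line at a time instead of loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:  # file objects are themselves lazy iterators
            yield line.rstrip("\n")

# Create a small sample file to read back (stand-in for a large file)
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

lines = []
for line in read_lines(sample_path):
    lines.append(line)
print(lines)

os.remove(sample_path)
```

Only one line is held in memory at a time inside read_lines, regardless of how large the file is.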