
A Hundred Days of Code, Day 019 - Python Iterators and Generators, Done!

Starting on the last of the Lerner courses I got: Iterators and Generators.
Hopefully I get done with this and bring the same intensity to actually writing code.
What shape will that take and how will I write about it? I have no clue.
For now, notes follow.


Part 1, Iterators

  • If we run the iter function on something and it comes back with an iterator object, then it’s iterable; I can iterate on it.
  • Behind the scenes in a for loop, Python does something similar. It calls iter on the object to be looped over to ask if it is iterable, and then keeps calling next on the resulting iterator to get the next item, until it comes across a StopIteration, which lets it know that the values are exhausted and it can stop looping.
  • The for loop is a very thin shell that asks the appropriate questions of the iterator. But it is always the iterator that determines what exactly is returned.
  • iterable vs iterator
    • iterable means that it can be put into a “for” loop in Python
      • things like strings, lists, dictionaries, sets, files and objects (that I will soon create) that implement the iterator protocol.
      • it responds positively to the iter function’s question, “Are you iterable?” It returns an iterator object instead of raising an error (a TypeError in this case).
    • an iterator is the object returned by the iterable, the thing on which the next function is run.
      • it could be the original object (the iterable) itself.
      • it could also be a separate object returned by the iterable.
      • when the iterator object is exhausted, it raises StopIteration
      • these are what give us the values.
  • How do I make my objects iterable?
    • By implementing the iterator protocol
      • the object must respond to iter, by returning an iterator object
      • it should respond to the next function with values
      • it should raise a StopIteration error, when it’s done.
  • Best to create a helper class, pass the data to it and return an instance of it as the iterator.
    • helps with state, since each call to iter gets a fresh iterator, so I can iterate over the same object multiple times independently.
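
A minimal sketch of the protocol and the helper-class approach described above. The names Countdown, CountdownIterator and manual_loop are made up for illustration and are not from the course; manual_loop only approximates what a for loop does.

```python
# Hypothetical example names; an illustration of the iterator protocol.

class Countdown:
    """Iterable: hands out a fresh iterator every time iter() is called."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Return a separate helper object that carries the iteration state.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: holds the state for one pass over the data."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # By convention an iterator returns itself from iter().
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that the values are exhausted.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def manual_loop(iterable):
    """Roughly what a for loop does: iter() once, then next() until StopIteration."""
    results = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break
        results.append(item)
    return results


countdown = Countdown(3)
first_pass = manual_loop(countdown)   # [3, 2, 1]
second_pass = list(countdown)         # [3, 2, 1] again, from a fresh iterator
```

Because Countdown returns a new CountdownIterator on every iter call, the two passes do not interfere with each other. Calling iter on something that is not iterable, such as iter(5), raises a TypeError.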

Part 2, Generators

  • Creating a whole class just to implement an iterator might be overkill.
  • Generators are a solution that might come in handy.
  • They look like functions, but calling one gives me back a generator object that implements the iterator protocol.
  • instead of using return, we use yield.
    • this lets us return multiple items.
  • I can use them to chunk large items, buffer uncertain stuff, and wrap complex stuff.
  • Rule of thumb,
    • if I already have a class, just add functionality to it using an iterator class.
    • but if I need something standalone, just use a generator function
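
A small sketch of a standalone generator function, using the "chunk large items" use case from the notes. The function name chunked and its parameters are assumptions made for this example.

```python
# Hypothetical helper; illustrates yield handing back several values over time.

def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    chunk = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk      # hand back one chunk, then pause here
            chunk = []
    if chunk:
        yield chunk          # whatever is left over at the end


gen = chunked(range(7), 3)
is_own_iterator = iter(gen) is gen   # a generator object is its own iterator
first_chunk = next(gen)              # [0, 1, 2]
remaining = list(gen)                # [[3, 4, 5], [6]]
```

No class, no explicit StopIteration: when the function body finishes, the generator raises StopIteration on its own.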

Part 3, Coroutines

  • Little helper functions.
    • With such a function, I assign the result of the yield expression to a variable.
    • This makes the function sit and wait for me to send it something.
    • Once it receives data, it ingests it, processes it and poops out, yields an answer :)
    • x = yield x
    • Instead of sending data to a coroutine with send, I could send it an exception using throw.
    • I use close to shut the generator down and break out (inside the generator, I can trap GeneratorExit to clean up)
    • I can just say yield from some_generator if I am calling another generator from my generator.
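
A sketch of send, throw, close and yield from working together. The coroutine running_total, its log parameter, the choice of ValueError as a "reset" signal, and the flatten helper are all assumptions made for this example, not something from the course.

```python
# Hypothetical coroutine; shows send(), throw(), close() and yield from.

def running_total(log):
    """Keep a running sum of whatever is sent in."""
    total = 0
    try:
        while True:
            try:
                value = yield total   # pause here until send() or throw()
            except ValueError:
                total = 0             # throw(ValueError) resets the sum
                continue
            total += value
    except GeneratorExit:
        # close() raises GeneratorExit at the paused yield.
        log.append("closed")
        raise


def flatten(*iterables):
    """Delegate to each iterable in turn with yield from."""
    for iterable in iterables:
        yield from iterable


log = []
acc = running_total(log)
primed = next(acc)                    # run to the first yield; gives 0
after_ten = acc.send(10)              # 10
after_five = acc.send(5)              # 15
after_reset = acc.throw(ValueError)   # handled inside; gives 0
acc.close()                           # log is now ["closed"]

flat = list(flatten([1, 2], "ab"))    # [1, 2, 'a', 'b']
```

The next call before the first send matters: a freshly created generator has not reached a yield yet, so it has nothing waiting to receive a value.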

Part 4, Generator Comprehensions

  • Create one just by using (): genvar = (x*x for x in range(-5,5)) will make genvar a generator object.
  • The general idea I am getting behind generators and iterators is that we use them on really large objects, so that we don’t run out of memory.
    • e.g. joining a list of 3 elements is easy-peasy. A list of 3 billion elements less so, and ergo, generators and iterators.
    • the ultimate idea is just like an oil pipeline. We ought not to store it in tanks (lists, dictionaries etc.) We ought to strive as much as possible to make it flow from upstream, right through to our taps, in a single unbroken stream. Just pass the oil along, using various t-junctions and pipes and motors and stuff. Keep it flowing. Minimise storage.
    • it also spreads wait times across, well, time. So I never have to wait for something to gather up and collect and then display. The generator comes back instantly, and when it has to get a value, I just have to wait for that one value.
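
A sketch of the pipeline idea, starting from the genvar expression in the notes. The second half (numbers, evens, squares) is a made-up pipeline for illustration, and the exact sizes reported by sys.getsizeof vary between Python versions, so only the comparison is shown.

```python
import sys

# The generator expression from the notes: nothing is computed yet.
genvar = (x * x for x in range(-5, 5))

first = next(genvar)       # computes only the first square: 25
rest_total = sum(genvar)   # consumes the remaining squares one at a time: 60
leftover = list(genvar)    # [] because the generator is now exhausted

# Hypothetical pipeline: each stage pulls one value at a time from the previous one.
numbers = range(1_000_000)
evens = (n for n in numbers if n % 2 == 0)
squares = (n * n for n in evens)
first_three = [next(squares), next(squares), next(squares)]   # [0, 4, 16]

# A list stores every element; a generator object stores only its current state.
as_list = [x * x for x in range(100_000)]
as_gen = (x * x for x in range(100_000))
list_is_bigger = sys.getsizeof(as_list) > sys.getsizeof(as_gen)
```

Only three squares were ever computed in the pipeline, even though numbers spans a million values.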

Part 5, itertools

  • itertools is a collection of ready-made iterators and generator-like building blocks that I can combine with my existing iterables and functions. Looking through these feels like looking at specialty carpentry tools, like a biscuit joiner.
  • lets me take advantage of other smarter folks’ work and optimisations
    • itertools.chain walks through several iterables one after another, as if they were a single iterable
    • itertools.count counts upwards from a given number, in steps if we want.
    • itertools.cycle endlessly cycles through the elements of an iterable
    • itertools.repeat repeats a value for the number of times we tell it (or forever, if we don’t)
    • itertools.filterfalse takes a function and an iterable, applies the function to each element, and only passes the element through if the function returns false. (the inverse, in which only true elements are returned, is a built-in in Python, aptly called filter)
    • itertools.takewhile and itertools.dropwhile are opposites of each other
      • takewhile takes a function and an iterable, and returns elements for as long as the function returns true; it stops for good the first time it hits false
      • dropwhile inverts that. It skips elements for as long as the function returns true, and once it hits its first false, it returns that element and everything after it.
    • itertools.compress takes two iterables (if they are of unequal length, it’ll stop when it reaches the end of the shorter one) and if the element in the second iterable rings true, it returns the corresponding element from the first one.
    • itertools.accumulate adds the first element to the second element, then adds that to the third element and so on, returning each running total. (if I want to use some other function instead of add, I can do that too)
    • itertools.tee lets me split one iterator into multiple independent iterators, each with its own state (memory intensive.)
    • permutations and combinations
      • itertools.product takes an element from one iterable, runs it across all the elements of the second (or third and fourth) and returns a cartesian product, an x,y (or x,y,z) coordinate tuple.
      • itertools.permutations does permutations of the elements of a single iterable (optionally of a given length). returns tuples.
      • itertools.combinations does combinations.
      • itertools.combinations_with_replacement does combinations allowing each element to repeat. (a combination might not allow (10, 10, 10) for example; this helps with that)
    • more-itertools is an external package that gives me more complex itertools. (pip install more-itertools) need to look at them later.
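
A quick tour of the tools listed above on tiny inputs. itertools.islice is not in the notes; it is used here only to cut the infinite count and cycle iterators short. The is_even and is_small helpers are made up for this example.

```python
import itertools
import operator


def is_even(n):
    return n % 2 == 0


def is_small(n):
    return n < 3


chained = list(itertools.chain("ab", [1, 2]))                  # ['a', 'b', 1, 2]
counted = list(itertools.islice(itertools.count(10, 5), 3))    # [10, 15, 20]
cycled = list(itertools.islice(itertools.cycle("xy"), 5))      # ['x', 'y', 'x', 'y', 'x']
repeated = list(itertools.repeat("hi", 3))                     # ['hi', 'hi', 'hi']

odds = list(itertools.filterfalse(is_even, range(6)))          # [1, 3, 5]
taken = list(itertools.takewhile(is_small, [1, 2, 5, 1]))      # [1, 2]
dropped = list(itertools.dropwhile(is_small, [1, 2, 5, 1]))    # [5, 1]
picked = list(itertools.compress("abcd", [1, 0, 1]))           # ['a', 'c']

sums = list(itertools.accumulate([1, 2, 3, 4]))                # [1, 3, 6, 10]
products = list(itertools.accumulate([1, 2, 3, 4], operator.mul))  # [1, 2, 6, 24]

# tee: two independent iterators over the same underlying data.
left, right = itertools.tee([1, 2, 3])
left_items = list(left)                                        # [1, 2, 3]
right_items = list(right)                                      # [1, 2, 3]

pairs = list(itertools.product([1, 2], "ab"))
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
perms = list(itertools.permutations("abc", 2))
# [('a','b'), ('a','c'), ('b','a'), ('b','c'), ('c','a'), ('c','b')]
combos = list(itertools.combinations("abc", 2))
# [('a','b'), ('a','c'), ('b','c')]
combos_rep = list(itertools.combinations_with_replacement("ab", 2))
# [('a','a'), ('a','b'), ('b','b')]
```

Note how the last 1 in the takewhile input is never returned even though it is small: once the function has returned false, takewhile is done.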

Learning / Feedback / Experiences from doing exercises

  • Reuven tells stories “Deep in the recesses of Python …”
  • Reuven’s talking Python interpreter, “Hey Object, are you iterable?”
  • I am slowly anticipating the problems in Reuven’s code as he keeps iterating on them.
    • ‘Will this work?’, asks Reuven.
      ‘NO!’, I yell back at the screen. ‘This function does not do anything, will return None and cause an error!’
      I am learning! This is fun! :)
  • Been going, “Ooooh! So that’s how they do it!” lots of times in this class.
  • Ok, have now dropped from a 30,000 foot view of Python to a 10,000 foot view. Will hit ground level, starting tomorrow, when I start doing exercises.

Read all about my Python Iterators & Generators journey here