Generator

Recursion

Consider computing the Fibonacci number of order \(n\):

\[\begin{split} F_n := \begin{cases} F_{n-1}+F_{n-2} & n>1 \kern1em \text{(recurrence)}\\ 1 & n=1 \kern1em \text{(base case)}\\ 0 & n=0 \kern1em \text{(base case)}. \end{cases}\end{split}\]

Fibonacci numbers have practical applications in generating pseudorandom numbers.

Can we define the function by calling the function itself?

Try stepping through such a function below to see how it works:

%%mytutor -r -h 450
def fibonacci(n):
    if n > 1:
        return fibonacci(n - 1) + fibonacci(n - 2)  # recursion
    elif n == 1:
        return 1
    else:
        return 0


print(fibonacci(2))
1

A function that calls itself (recurs) is called a recursive function, and the technique is called recursion.

Exercise Write a function gcd that implements the Euclidean algorithm for the greatest common divisor:

\[\begin{split}\operatorname{gcd}(a,b)=\begin{cases}a & b=0\\ \operatorname{gcd}(b, a\operatorname{mod}b) & \text{otherwise.} \end{cases}\end{split}\]
%%mytutor -r -h 550
def gcd(a, b):
    ### BEGIN SOLUTION
    return gcd(b, a % b) if b else a
    ### END SOLUTION


print(gcd(3 * 5, 5 * 7)) # gcd = ?
5
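
To see the recurrence at work on the call above, note that \(15 \operatorname{mod} 35 = 15\), \(35 \operatorname{mod} 15 = 5\), and \(15 \operatorname{mod} 5 = 0\), so the arguments evolve as follows:

\[\operatorname{gcd}(15,35)=\operatorname{gcd}(35,15)=\operatorname{gcd}(15,5)=\operatorname{gcd}(5,0)=5.\]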

Is recursion strictly necessary?

E.g., the following computes the Fibonacci number using a while loop instead.

%%mytutor -r -h 550
def fibonacci_iteration(n):
    if n > 1:
        _, F = 0, 1  # next two Fibonacci numbers
        while n > 1:
            _, F, n = F, F + _, n - 1
        return F
    elif n == 1:
        return 1
    else:
        return 0


fibonacci_iteration(3)
2
# more tests
for n in range(5):
    assert fibonacci(n) == fibonacci_iteration(n)

Exercise We can always convert a recursion to an iteration. Why?

Hint: See Wikipedia.

The step-by-step execution of the recursion above shows how Python carries out a recursion iteratively: each pending call is kept as a frame on a call stack until it returns.
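
As a rough illustration of this idea, the sketch below replaces the recursive calls of fibonacci with an explicit stack of pending orders. The name fibonacci_stack is hypothetical, and the stack is much simpler than the frames Python actually keeps: it stores only the orders still to be evaluated and sums the base-case values.

```python
def fibonacci_stack(n):
    """Compute the Fibonacci number of order n with an explicit stack
    instead of recursive calls (hypothetical helper, for illustration)."""
    total = 0
    stack = [n]  # pending "calls": orders still to be evaluated
    while stack:
        k = stack.pop()
        if k > 1:
            # recurrence: F_k = F_{k-1} + F_{k-2}
            stack.append(k - 1)
            stack.append(k - 2)
        else:
            total += k  # base cases: F_1 = 1 and F_0 = 0
    return total


print([fibonacci_stack(n) for n in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```

Like the recursive version, this sketch still repeats work, so it is no faster; it only shows that a loop with a stack can stand in for the recursive calls.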

Exercise Implement gcd_iteration using a while loop instead of a recursion.

Hint: See tail recursion.

%%mytutor -r -h 550
def gcd_iteration(a, b):
    ### BEGIN SOLUTION
    while b:
        a, b = b, a % b
    return a
    ### END SOLUTION


gcd_iteration(3 * 5, 5 * 7)
5
# test
for a in range(5):
    for b in range(5):
        assert gcd(a, b) == gcd_iteration(a, b)

What are the benefits of recursion?

Is recursion more efficient than iteration?

Exercise Find the smallest values of n for fibonacci(n) and fibonacci_iteration(n) respectively to run for more than a second.

%%timeit -n 1
# Assign n the appropriate value
### BEGIN SOLUTION
n = 34
### END SOLUTION
fib_recursion = fibonacci(n)
%%timeit -n 1
# Assign n
### BEGIN SOLUTION
n = 400000
### END SOLUTION
fib_iteration = fibonacci_iteration(n)

To see why recursion is often slower, we will modify fibonacci to print each function call as follows.

def fibonacci(n):
    """Returns the Fibonacci number of order n."""
    print(f"fibonacci({n})")
    return fibonacci(n - 1) + fibonacci(n - 2) if n > 1 else 1 if n == 1 else 0


fibonacci(5)
fibonacci(5)
fibonacci(4)
fibonacci(3)
fibonacci(2)
fibonacci(1)
fibonacci(0)
fibonacci(1)
fibonacci(2)
fibonacci(1)
fibonacci(0)
fibonacci(3)
fibonacci(2)
fibonacci(1)
fibonacci(0)
fibonacci(1)
5
  • fibonacci(5) calls fibonacci(4) and fibonacci(3).

  • fibonacci(4) then calls fibonacci(3) and fibonacci(2).

  • fibonacci(3) is called twice.
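
The duplication can be quantified by tallying the calls. The following is a minimal sketch, assuming a hypothetical variant fibonacci_counted that takes an extra argument calls (a collections.Counter) and otherwise follows the same recurrence:

```python
from collections import Counter


def fibonacci_counted(n, calls):
    """Same recurrence as fibonacci, but tally each call in calls
    (hypothetical helper, for illustration)."""
    calls[n] += 1
    if n > 1:
        return fibonacci_counted(n - 1, calls) + fibonacci_counted(n - 2, calls)
    return n  # base cases: F_1 = 1 and F_0 = 0


calls = Counter()
fibonacci_counted(5, calls)
print(sorted(calls.items()))  # [(0, 3), (1, 5), (2, 3), (3, 2), (4, 1), (5, 1)]
print(sum(calls.values()))  # 15
```

The tally agrees with the trace above: 15 calls in total, with order 3 evaluated twice and order 1 evaluated five times.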

Global Variables and Closures

Consider generating a sequence of Fibonacci numbers:

for n in range(5):
    print(fibonacci_iteration(n))
0
1
1
2
3

Exercise Is the above loop efficient?

No. Each call to fibonacci_iteration(n) recomputes the last two Fibonacci numbers \(F_{n-1}\) and \(F_{n-2}\) for \(n\geq 2\).

How to avoid redundant computations?

One way is to store the last two computed Fibonacci numbers.

%%mytutor -r -h 600
Fn, Fnn, n = 0, 1, 0  # global variables


def print_fibonacci_state():
    print(
        f"""Global states:
    Fn  : Next Fibonacci number      = {Fn}
    Fnn : Next next Fibonacci number = {Fnn}
    n   : Next order                 = {n}"""
    )


def next_fibonacci():
    """Returns the next Fibonacci number."""
    global Fn, Fnn, n  # global declaration
    value, Fn, Fnn, n = Fn, Fnn, Fn + Fnn, n + 1
    return value


for i in range(5):
    print(next_fibonacci())
print_fibonacci_state()
0
1
1
2
3
Global states:
    Fn  : Next Fibonacci number      = 5
    Fnn : Next next Fibonacci number = 8
    n   : Next order                 = 5

Rules for global/local variables:

  1. A local variable must be defined within a function.

  2. An assignment defines a local variable except after a global statement.

Why is global NOT needed in print_fibonacci_state?

Without ambiguity, Fn, Fnn, n in print_fibonacci_state are not local variables by Rule 1 because they are not defined within the function.

Why is global needed in next_fibonacci?

What happens otherwise:

def next_fibonacci():
    """Returns the next Fibonacci number."""
    # global Fn, Fnn, n
    value = n
    n, Fnn, n = Fnn, n + Fnn, n + 1
    return value


next_fibonacci()
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-13-dd2215d04815> in <module>
      7 
      8 
----> 9 next_fibonacci()

<ipython-input-13-dd2215d04815> in next_fibonacci()
      2     """Returns the next Fibonacci number."""
      3     # global Fn, Fnn, n
----> 4     value = n
      5     n, Fnn, n = Fnn, n + Fnn, n + 1
      6     return value

UnboundLocalError: local variable 'n' referenced before assignment

Why is there an UnboundLocalError?

  • The assignment defines n as a local variable by Rule 2.

  • However, the assignment requires first evaluating n, which is not yet defined.
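
The two rules can be checked on a smaller case. This is a minimal sketch with hypothetical names (counter, read_counter, broken_increment, increment) that are not part of the notes above. The wording of the error message may differ between Python versions, so only the exception type is printed.

```python
counter = 0  # hypothetical global variable


def read_counter():
    return counter  # no assignment here, so counter refers to the global


def broken_increment():
    counter = counter + 1  # the assignment makes counter local (Rule 2)
    return counter


def increment():
    global counter  # the assignment below now targets the global
    counter = counter + 1
    return counter


print(read_counter())  # 0
try:
    broken_increment()
except UnboundLocalError as error:
    print(type(error).__name__)  # UnboundLocalError
print(increment())  # 1
```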

Are global variables preferred over local ones?

Consider rewriting the for loop as a while loop:

%%mytutor -h 600
Fn, Fnn, n = 0, 1, 0  # global variables


def print_fibonacci_state():
    print(
        f"""Global states:
    Fn  : Next Fibonacci number      = {Fn}
    Fnn : Next next Fibonacci number = {Fnn}
    n   : Next order                 = {n}"""
    )


def next_fibonacci():
    """Returns the next Fibonacci number."""
    global Fn, Fnn, n  # global declaration
    value, Fn, Fnn, n = Fn, Fnn, Fn + Fnn, n + 1
    return value


n = 0
while n < 5:
    print(next_fibonacci())
    n += 1
print_fibonacci_state()

Exercise Why does the while loop print only 3 numbers instead of 5 Fibonacci numbers?

There is a name collision. n is also incremented by next_fibonacci(), and so the while loop is only executed 3 times in total.

To avoid such errors, a convention in Python is to use a leading underscore for variable names that are private (for internal use):

_single_leading_underscore: weak “internal use” indicator. E.g. from M import * does not import objects whose names start with an underscore.

%%mytutor -h 600
_Fn, _Fnn, _n = 0, 1, 0  # global variables


def print_fibonacci_state():
    print(
        f"""Global states:
    _Fn  : Next Fibonacci number      = {_Fn}
    _Fnn : Next next Fibonacci number = {_Fnn}
    _n   : Next order                 = {_n}"""
    )


def next_fibonacci():
    """Returns the next Fibonacci number."""
    global _Fn, _Fnn, _n  # global declaration
    value, _Fn, _Fnn, _n = _Fn, _Fnn, _Fn + _Fnn, _n + 1
    return value


n = 0
while n < 5:
    print(next_fibonacci())
    n += 1
print_fibonacci_state()

With global variables:

  • code is less predictable and more difficult to reuse/extend, and

  • tests cannot be isolated, making debugging difficult.

Is it possible to store the function states without using global variables?

We can use nested functions and nonlocal variables.

def fibonacci_sequence(Fn, Fnn):
    def next_fibonacci():
        """Returns the next (generalized) Fibonacci number starting with
        Fn and Fnn as the first two numbers."""
        nonlocal Fn, Fnn, n  # declare nonlocal variables
        value = Fn
        Fn, Fnn, n = Fnn, Fn + Fnn, n + 1
        return value

    def print_fibonacci_state():
        print(
            """States:
        Next Fibonacci number      = {}
        Next next Fibonacci number = {}
        Next order                 = {}""".format(
                Fn, Fnn, n
            )
        )

    n = 0  # Fn and Fnn specified in the function arguments
    return next_fibonacci, print_fibonacci_state


next_fibonacci, print_fibonacci_state = fibonacci_sequence(0, 1)
n = 0
while n < 5:
    print(next_fibonacci())
    n += 1
print_fibonacci_state()
0
1
1
2
3
States:
        Next Fibonacci number      = 5
        Next next Fibonacci number = 8
        Next order                 = 5

The state variables Fn, Fnn, n are now encapsulated, and the functions returned by fibonacci_sequence no longer depend on any global variables.

Another benefit of using nested functions is that we can also create different Fibonacci sequences with different base cases.

my_next_fibonacci, my_print_fibonacci_state = fibonacci_sequence("cs", "1302")
for n in range(5):
    print(my_next_fibonacci())
my_print_fibonacci_state()
cs
1302
cs1302
1302cs1302
cs13021302cs1302
States:
        Next Fibonacci number      = 1302cs1302cs13021302cs1302
        Next next Fibonacci number = cs13021302cs13021302cs1302cs13021302cs1302
        Next order                 = 5

next_fibonacci and print_fibonacci_state are local functions of fibonacci_sequence.

  • They can access (capture) the other local variables of fibonacci_sequence by forming the so-called closures.

  • Similar to the global statement, a nonlocal statement is needed to assign to nonlocal variables.

Each local function has an attribute named __closure__ that stores the captured local variables.

def print_closure(f):
    """Print the closure of a function."""
    print("closure of ", f.__name__)
    for cell in f.__closure__:
        print("    {} content: {!r}".format(cell, cell.cell_contents))


print_closure(next_fibonacci)
print_closure(print_fibonacci_state)
closure of  next_fibonacci
    <cell at 0x7f6ca428b1f0: int object at 0x55827033bd80> content: 5
    <cell at 0x7f6ca428bd90: int object at 0x55827033bde0> content: 8
    <cell at 0x7f6ca428b580: int object at 0x55827033bd80> content: 5
closure of  print_fibonacci_state
    <cell at 0x7f6ca428b1f0: int object at 0x55827033bd80> content: 5
    <cell at 0x7f6ca428bd90: int object at 0x55827033bde0> content: 8
    <cell at 0x7f6ca428b580: int object at 0x55827033bd80> content: 5
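
To check that each call to the enclosing function creates its own captured variables, here is a minimal sketch using a hypothetical make_counter function (not part of the notes above) with a single captured variable count:

```python
def make_counter(start=0):
    """Return a function that counts up from start (hypothetical example)."""
    count = start

    def increment():
        nonlocal count  # assign to the captured variable, not a new local
        count += 1
        return count

    return increment


first, second = make_counter(), make_counter(10)
print(first(), first(), second())  # 1 2 11
print(first.__closure__[0].cell_contents)  # 2
print(second.__closure__[0].cell_contents)  # 11
```

The two returned functions hold separate cells, so advancing one counter does not affect the other.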

Generator

An easier way to generate a sequence of objects one-by-one is to write a generator.

fibonacci_generator = (fibonacci_iteration(n) for n in range(3))
fibonacci_generator
<generator object <genexpr> at 0x7f6ca422a120>

The above uses a generator expression to define fibonacci_generator.

How to obtain items from a generator?

We can use the next function.

while True:
    print(next(fibonacci_generator))  # raises StopIteration eventually
0
1
1
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-20-03c6ef5c98a9> in <module>
      1 while True:
----> 2     print(next(fibonacci_generator))  # raises StopIteration eventually

StopIteration: 

A generator object is an iterator, i.e., it implements both the __iter__ and __next__ methods. A for loop calls them automatically, and the next function calls __next__.
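
As a rough picture of what a for loop does with these methods, the sketch below spells out the calls by hand. The generator squares and the list items are hypothetical names used only for illustration.

```python
squares = (k * k for k in range(3))  # hypothetical example generator

iterator = iter(squares)  # a for loop first calls iter(...)
print(iterator is squares)  # True: a generator is its own iterator

items = []
while True:
    try:
        item = next(iterator)  # then it calls next(...) repeatedly
    except StopIteration:  # and stops quietly once the generator is exhausted
        break
    items.append(item)
print(items)  # [0, 1, 4]
```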

fibonacci_generator = (fibonacci_iteration(n) for n in range(5))
for fib in fibonacci_generator:  # StopIteration handled by for loop
    print(fib)
0
1
1
2
3

Is fibonacci_generator efficient?

No, again due to redundant computations. A better way to define the generator is to use the keyword yield:

%%mytutor -h 450
def fibonacci_sequence(Fn, Fnn, stop):
    """Return a generator that generates Fibonacci numbers
    starting from Fn and Fnn until stop (exclusive)."""
    while Fn < stop:
        yield Fn  # return Fn and pause execution
        Fn, Fnn = Fnn, Fnn + Fn


for fib in fibonacci_sequence(0, 1, 5):
    print(fib)
  1. yield causes the function to return a generator without executing the function body.

  2. Calling __next__ resumes the execution, which

    • pauses at the next yield expression, or

    • raises StopIteration at the end.
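
These two points can be traced with a small experiment. The following is a minimal sketch, assuming a hypothetical generator function countdown that records its progress in a list log passed as an argument:

```python
def countdown(start, log):
    """Yield start, start - 1, ..., 1 and record progress in log
    (hypothetical helper, for illustration)."""
    log.append("body started")
    while start > 0:
        yield start  # pause here until the next call to __next__
        start -= 1
    log.append("body finished")


log = []
generator = countdown(2, log)
print(log)  # []: calling the function has not run the body yet
print(next(generator), log)  # 2 ['body started']
print(next(generator), log)  # 1 ['body started']
# A default value makes next return it instead of raising StopIteration.
print(next(generator, "exhausted"), log)
```

The last line prints exhausted together with both log entries, because the body runs to its end before StopIteration is raised internally.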

Exercise yield can be both a statement and an expression. As an expression:

  • The value of a yield expression is None by default, but

  • it can be set by the generator.send method.

Add a docstring to the following function. In particular, explain the effect of calling the method send on the returned generator.

def fibonacci_sequence(Fn, Fnn, stop):
    ### BEGIN SOLUTION
    """Return a generator that generates Fibonacci numbers
    starting from Fn and Fnn to stop (exclusive).
    generator.send(value) sets and returns the next number as value."""
    ### END SOLUTION
    while Fn < stop:
        value = yield Fn
        if value is not None:
            Fnn = value  # set next number to the value of yield expression
        Fn, Fnn = Fnn, Fnn + Fn


fibonacci_generator = fibonacci_sequence(0, 1, 5)
print(next(fibonacci_generator))
print(fibonacci_generator.send(2))
for fib in fibonacci_generator:
    print(fib)
0
2
2
4
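
For a second look at send on a simpler generator, here is a minimal sketch with a hypothetical running_total function (not part of the exercise). Note that the generator must first be advanced to a yield expression with next before a value can be sent to it.

```python
def running_total():
    """Yield the sum of all values sent so far (hypothetical example)."""
    total = 0
    while True:
        value = yield total  # value is None unless set by send
        if value is not None:
            total += value


accumulator = running_total()
print(next(accumulator))  # 0: run up to the first yield
print(accumulator.send(3))  # 3
print(accumulator.send(4))  # 7
print(next(accumulator))  # 7: the yield expression evaluates to None
```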