CS1302 Introduction to Computer Programming
import random
%reload_ext divewidgetsMotivation of composite data type¶
The following code calculates the average of five numbers:
def average_five_numbers(n1, n2, n3, n4, n5):
return (n1 + n2 + n3 + n4 + n5) / 5
average_five_numbers(1, 2, 3, 4, 5)What about using the above function to compute the average household income in Hong Kong.
The labor size in Hong Kong is close to 4 million.
- Should we create a variable to store the income of each individual?
- Should we recursively apply the function to groups of five numbers?
What we need is
- a composite data type that can keep a variable number of items, so that
- we can then define a function that takes an object of the composite data type,
- and returns the average of all items in the object.
How to store a sequence of items in Python?
We learned a composite data type that stores a sequence of characters. What is it?
tuple and list are two other built-in sequence types for ordered collections of objects. Unlike string, they can store items of possibly different types.
Indeed, we have already used tuples and lists before.
%%optlite -h 400
a_list = "1 2 3".split()
a_tuple = (lambda *args: args)(1, 2, 3)
a_list[0] = 0
a_tuple[0] = 0What is the difference between tuple and list?
Constructing sequences¶
How to create tuple/list?
Mathematicians often represent a set of items in two different ways:
- Roster notation, which enumerates the elements in the sequence, e.g.,
- Set-builder notation, which describes the content using a rule for constructing the elements, e.g.,
namely the set of perfect squares less than 100.
How to create a tuple/list by enumerating its items?
To create a tuple, we enclose a comma separated sequence by parentheses:
%%optlite -h 450
empty_tuple = ()
singleton_tuple = (0,) # why not (0)?
heterogeneous_tuple = (singleton_tuple, (1, 2.0), print)
enclosed_starred_tuple = (*range(2), *"23")To create a list, we use square brackets to enclose a comma separated sequence of objects.
%%optlite -h 400
empty_list = []
singleton_list = [0] # no need to write [0,]
heterogeneous_list = [singleton_list, (1, 2.0), print]
enclosed_starred_list = [*range(2), *"23"]We can also create a tuple/list from other iterables using the constructors tuple/list as well as addition and multiplication similar to str.
%%optlite -l -h 900
str2list = list("Hello")
str2tuple = tuple("Hello")
range2list = list(range(5))
range2tuple = tuple(range(5))
tuple2list = list((1, 2, 3))
list2tuple = tuple([1, 2, 3])
concatenated_tuple = (1,) + (2, 3)
concatenated_list = [1, 2] + [3]
duplicated_tuple = (1,) * 2
duplicated_list = 2 * [1]print((1 + 2) * 2, (1 + 2,) * 2, sep="\n")Solution to Exercise 1
(1+2)*2 evaluates to 6 but (1+2,)*2 evaluates to (3,3).
- The parentheses in
(1+2)indicate the addition needs to be performed first, but - the parentheses in
(1+2,)creates a tuple.
Hence, singleton tuple must have a comma after the item to differentiate these two use cases.
How to use a rule to construct a tuple/list?
We can specify the rule using a comprehension,
which we have used in a generator expression.
E.g., the following is a python one-liner that returns a generator for prime numbers.
all?
prime_sequence = lambda stop: (
x for x in range(2, stop) if all(x % divisor for divisor in range(2, x))
)
print(*prime_sequence(100))There are two comprehensions used:
- In
all(x % divisor for divisor in range(2, x)), the comprehension creates a generator of remainders to the functionall, which returnsTrueif all the remainders are non-zero elseFalse. - In the return value
(x for x in range(2, stop) if ...)of the anonymous function, the comprehension creates a generator of numbers from 2 tostop-1that satisfy the condition of theifclause.
any?
### BEGIN SOLUTION
composite_sequence = lambda stop: (
x for x in range(2, stop) if any(x % divisor == 0 for divisor in range(2, x))
)
### END SOLUTION
print(*composite_sequence(100))We can construct a list instead of a generator using list comprehension:
[x ** 2 for x in range(10)] # Enclose comprehension by bracketsIs the list comprehension the same as applying list to a generator expression?
list(x ** 2 for x in range(10)) # Enclose comprehension by bracketsList comprehension is more efficient as it does not need to create generator first:
%%timeit
[x ** 2 for x in range(10)]%%timeit
list(x ** 2 for x in range(10))%%timeit
tuple(x for x in range(100))%%timeit
tuple([x for x in range(100)])Solution to Exercise 3
The second method is often faster because the list of items can be created faster with list comprehension instead of generator expression. This benefits appear to out-weigh the cost in converting a list to a tuple.
With list comprehension, we can simulate a sequence of biased coin flips.
from random import random as rand
p = rand() # unknown bias
coin_flips = ["H" if rand() <= p else "T" for i in range(1000)]
print("Chance of head:", p)
print("Coin flips:", *coin_flips)We can then estimate the bias by the fraction of heads coming up.
def average(seq):
return sum(seq) / len(seq)
head_indicators = [1 if outcome == "H" else 0 for outcome in coin_flips]
fraction_of_heads = average(head_indicators)
print("Fraction of heads:", fraction_of_heads)def variance(seq):
### BEGIN SOLUTION
return sum(i ** 2 for i in seq) / len(seq) - average(seq) ** 2
### END SOLUTION
delta = (variance(head_indicators) / len(head_indicators)) ** 0.5
print("95% confidence interval: [{:.2f},{:.2f}]".format(p - 2 * delta, p + 2 * delta))Selecting items in a sequence¶
How to traverse a tuple/list?
Instead of calling the dunder method directly, we can use a for loop to iterate over all the items in order.
a = (*range(5),)
for item in a:
print(item, end=" ")To do it in reverse, we can use the reversed function.
reversed?
a = [*range(5)]
for item in reversed(a):
print(item, end=" ")We can also traverse multiple tuples/lists simultaneously by zipping them.
zip?
a = (*range(5),)
b = reversed(a)
for item1, item2 in zip(a, b):
print(item1, item2)How to select an item in a sequence?
We can select an item of a sequence a by subscription
a[i]where a is a list and i is an integer index.
A non-negative index indicates the distance from the beginning.
%%optlite -h 500
a = (*range(10),)
print(a)
print("Length:", len(a))
print("First element:", a[0])
print("Second element:", a[1])
print("Last element:", a[len(a) - 1])
print(a[len(a)]) # IndexErrorA negative index represents a negative offset from an imaginary element one past the end of the sequence.
%%optlite -h 500
a = [*range(10)]
print(a)
print("Last element:", a[-1])
print("Second last element:", a[-2])
print("First element:", a[-len(a)])
print(a[-len(a) - 1]) # IndexErrorHow to select multiple items?
We can use slicing to select a range of items as follows:
a[start:stop]
a[start:stop:step]The selected items corresponds to those indexed using range:
(a[i] for i in range(start, stop))
(a[i] for i in range(start, stop, step))a = (*range(10),)
print(a[1:4])
print(a[1:4:2])Unlike range, the parameters for slicing take their default values if missing or equal to None:
a = [*range(10)]
print(a[:4]) # start defaults to 0
print(a[1:]) # stop defaults to len(a)
print(a[1:4:]) # step defaults to 1The parameters can also take negative values:
print(a[-1:])
print(a[:-1])
print(a[::-1]) # What are the default values used here?A mixture of negative and postive values are also okay:
print(a[-1:1]) # equal [a[-1], a[0]]?
print(a[1:-1]) # equal []?
print(a[1:-1:-1]) # equal [a[1], a[0]]?
print(a[-100:100]) # result in IndexError like subscription?def sss(a, i=None, j=None, k=None):
### BEGIN SOLUTION
l = len(a)
step = 1 if k is None else k
m = l if step > 0 else l - 1
start = 0 if i is None else min(i if i > 0 else max(i + l, 0), m)
stop = l if j is None else min(j if j > 0 else max(j + l, 0), m)
### END SOLUTION
return start, stop, step
a = [*range(10)]
assert sss(a, -1, 1) == (9, 1, 1)
assert sss(a, 1, -1) == (1, 9, 1)
assert sss(a, 1, -1, -1) == (1, 9, -1)
assert sss(a, -100, 100) == (0, 10, 1)With slicing, we can now implement a practical sorting algorithm called quicksort to sort a sequence. Explain how the code works:
def quicksort(seq):
"""Return a sorted list of items from seq."""
if len(seq) <= 1:
return list(seq)
i = random.randint(0, len(seq) - 1)
pivot, others = seq[i], [*seq[:i], *seq[i + 1 :]]
left = quicksort([x for x in others if x < pivot])
right = quicksort([x for x in others if x >= pivot])
return [*left, pivot, *right]
seq = [random.randint(0, 99) for i in range(10)]
print(seq, quicksort(seq), sep="\n")Solution to Exercise 6
The above recursion creates a sorted list as [*left, pivot, *right] where
pivotis a randomly selected item inseq,leftis the sorted list of items smaller thanpivot, andrightis the sorted list of items no smaller thanpivot.
The base case happens when seq contains at most one item, in which case seq is already sorted.