Operations on Sequences

City University of Hong Kong

CS1302 Introduction to Computer Programming


import random
%reload_ext divewidgets

Mutating a list

%%optlite -h 350
b = [*range(10)]  # aliasing
b[::2] = b[:5]
b[0:1] = b[:5]
b[::2] = b[:5]  # fails

Last assignment fails because [::2] with step size not equal to 1 is an extended slice, which can only be assigned to a list of equal size.

What is the difference between mutation and aliasing?

In the previous code:

  • The first assignment b = [*range(10)] is aliasing, which gives the list the target name/identifier b.
  • Other assignments such as b[::2] = b[:5] are mutations that calls __setitem__ because the target b[::2] is not an identifier.
list.__setitem__?
# %%optlite -l -h 400
a = b = [0]
b[0] = a[0] + 1
print(a[0] < b[0])
Solution to Exercise 1
  • The first line a = b makes a an alias of the same object b points to, and so
  • the second line mutates the same object a and b point to.
  • Hence, a[0] == b[0].
a = [0, 1]
i = 0
a.__setitem__(i := i + 1, i)
print(a)
a = [0, 1]
i = 0
a[i := i + 1] = a[i]
print(a)
Solution to Exercise 2

a[i := i + 1] = a[i] is not the same as calling a.__setitem__(i := i + 1, i). According to the python documentation,

  • the expression to be assigned, i.e., a[i], is first evaluated to a[0] and therefore 0;
  • since the target a[i := i + 1] is a user defined object, it continue to evaluate the target reference, i.e., the address of a, which corresponds to the list [0, 1],
  • followed by the subscription i:=i+1, which evaluates to 1;
  • Finally, a.__setitem__ is called with the subscription, 1, and expression to be assigned, 0, and so the list a points to is mutuated to [0, 1].

In comparison, directly calling a.__setitem__(i := i + 1, i)

  • first evaluates the first argument i := i + 1, which gives 1 that is assigned to i, and
  • then evaluates the second argument i to 1, and so
  • a.__setitem__(1, 1) is called instead, which does not change the list a points to.

Why mutate a list?

The following is another implementation of composite_sequence that takes advantage of the mutability of list.

%%optlite -r
def sieve_composite_sequence(stop):
    is_composite = [False] * stop  # initialization
    for factor in range(2, stop):
        if is_composite[factor]:
            continue
        for multiple in range(factor * 2, stop, factor):
            is_composite[multiple] = True
    return (x for x in range(4, stop) if is_composite[x])


for x in sieve_composite_sequence(100):
    print(x, end=" ")

The algorithm

  1. changes is_composite[x] from False to True if x is a multiple of a smaller number factor, and
  2. returns a generator that generates composite numbers according to is_composite.
composite_sequence = lambda stop: (
    x for x in range(2, stop) if any(x % divisor == 0 for divisor in range(2, x))
)
%%timeit
for x in composite_sequence(10000): pass
%%timeit
for x in sieve_composite_sequence(10000): pass
for x in sieve_composite_sequence(10000000): pass
Solution to Exercise 3

The line if is_composite[factor]: continue avoids the redundant computations of checking composite factors.

%%optlite -h 300
a = [[0] * 2] * 2
a[0][0] = a[1][1] = 1
print(a)
### BEGIN SOLUTION
a = [[0] * 2 for i in range(2)]
### END SOLUTION
a[0][0] = a[1][1] = 1
print(a)

Different methods to operate on a sequence

Recall the quicksort algorithm:

def quicksort(seq):
    '''Return a sorted list of items from seq.'''
    if len(seq) <= 1:
        return list(seq)
    i = random.randint(0, len(seq) - 1)
    pivot, others = seq[i], [*seq[:i], *seq[i + 1:]]
    left = quicksort([x for x in others if x < pivot])
    right = quicksort([x for x in others if x >= pivot])
    return [*left, pivot, *right]


seq = [random.randint(0, 99) for i in range(10)]
print(seq, quicksort(seq), sep='\n')

There is also a built-in function sorted for sorting a sequence:

sorted?
sorted(seq)

Is quicksort quicker?

%%timeit
quicksort(seq)
%%timeit
sorted(seq)

Python implements the Timsort algorithm, which is very efficient.

What are other operations on sequences?

The following compares the lists of public attributes for tuple and list.

list_attributes = dir(list)
tuple_attributes = dir(tuple)

print(
    'Common attributes:', ', '.join([
        attr for attr in list_attributes
        if attr in tuple_attributes and attr[0] != '_'
    ]))

print(
    'Tuple-specific attributes:', ', '.join([
        attr for attr in tuple_attributes
        if attr not in list_attributes and attr[0] != '_'
    ]))

print(
    'List-specific attributes:', ', '.join([
        attr for attr in list_attributes
        if attr not in tuple_attributes and attr[0] != '_'
    ]))
  • There are no public tuple-specific attributes, and
  • all the list-specific attributes are methods that mutate the list, except copy.

The common attributes

  • count method returns the number of occurrences of a value in a tuple/list, and
  • index method returns the index of the first occurrence of a value in a tuple/list.
%%optlite -l -h 450
a = (1,2,2,4,5)
count_of_2 = a.count(2)
index_of_1st_2 = a.index(2)

reverse method reverses the list instead of returning a reversed list.

%%optlite -h 300
a = [*range(10)]
print(reversed(a))
print(*reversed(a))
print(a.reverse())
  • copy method returns a copy of a list.
  • tuple does not have the copy method but it is easy to create a copy by slicing.
%%optlite -h 400
a = [*range(10)]
b = tuple(a)
a_reversed = a.copy()
a_reversed.reverse()
b_reversed = b[::-1]

sort method sorts the list in place instead of returning a sorted list.

%%optlite -h 300
import random
a = [random.randint(0,10) for i in range(10)]
print(sorted(a))
print(a.sort())
  • extend method that extends a list instead of creating a new concatenated list.
  • append method adds an object to the end of a list.
  • insert method insert an object to a specified location.
%%optlite -h 300
a = b = [*range(5)]
print(a + b)
print(a.extend(b))
print(a.append('stop'))
print(a.insert(0,'start'))
  • pop method deletes and return the last item of the list.
  • remove method removes the first occurrence of a value in the list.
  • clear method clears the entire list.

We can also use the function del to delete a selection of a list.

%%optlite -h 300
a = [*range(10)]
del a[::2]
print(a.pop())
print(a.remove(5))
print(a.clear())