# Compound Data Types

## Iteration

Last week we introduced `for` loops.

```
for var_name in iterable:
    statement  # presumably using var_name
```

What is an **iterable**? Why not just say **sequence**?

What **sequences** have we seen?

### More Iterables

#### range

Another iterable!

`range(stop)` # goes from 0 to (stop-1)

`range(start, stop)` # goes from start to (stop-1)

Same rules as slicing: always **inclusive** of start, **exclusive** of stop.

Or, in interval notation: `[start, stop)`. We've seen this before with slicing.
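A small sketch of the half-open rule, using a made-up list: the numbers produced by `range(start, stop)` are the same positions that a slice `[start:stop]` selects.

```python
letters = ["a", "b", "c", "d", "e", "f"]  # illustrative data

# range(2, 5) produces 2, 3, 4: start is included, stop is not
indices = list(range(2, 5))
print(indices)

# a slice with the same bounds picks the elements at exactly those indices
picked = [letters[i] for i in indices]
print(picked == letters[2:5])

# the number of values is always stop - start
print(len(range(2, 5)))
```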

In [None]:
for x in range(12):
    print(x)

In [None]:
for x in range(8, 12):
    print(x)

In [None]:
z = range(12) # hmm
print(type(z))

In [None]:
i = 0
for x in ["A", "B", "C"]:
    print(i, x)
    i += 1

#### `enumerate`

Another function that returns an iterable, for when we need the index along with the object.

`enumerate(original_iterable)` yields two-element tuples: `(index, element)` for every item in the original.
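To make the tuples visible, here is a minimal sketch with a made-up list. Wrapping `enumerate` in `list()` shows every pair at once, and the optional `start` argument changes the first index.

```python
colors = ["red", "green", "blue"]  # illustrative data

# each item produced by enumerate is a two-element tuple
pairs = list(enumerate(colors))
print(pairs)

# the optional start argument changes the first index
numbered = list(enumerate(colors, start=1))
print(numbered)
```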

In [None]:
# "incorrect" example
# find using range/len - as you might think to write it based on past experience
def find_r(s, letter_to_find):
    for i in range(len(s)):
        if s[i] == letter_to_find:
            return i
    return -1

In [None]:
find_r("Hello World", "W")

In [None]:
# find using enumerate - Pythonic, more efficient
def find_e(s, letter_to_find):
    for i, letter in enumerate(s):  # tuple unpacking
        print(i, letter)
        if letter == letter_to_find:
            return i
    return -1

In [None]:
find_e("Hello world", "w")

In [None]:
find_r("Hello world", "?")

In [None]:
s = "Hello world"
s.find("w") # built-ins are best

Note: For HW#0 it is OK to use `range` for iteration. For future HWs, if you are using both the index & value, `enumerate` is the Pythonic way to do this.

### aside: sequence unpacking

When you know exactly how many elements are in a sequence, you can use this syntax to "unpack" them into variables:

In [None]:
tup = (1, 2, 3)
lst = ["a", "b", "c"]

x, y, z = tup
print(x, y, z)

In [None]:
for idx, elem in enumerate(lst):
    print(idx, elem)

In [1]:
x = 7
y = 8

In [2]:
x, y = y, x

In [3]:
print(x, y)

8 7


## `dict`

A collection of key-value pairs. (aka map/hashmap in other languages)

- Keys must be hashable: `str`, scalars, and `tuple` (when its elements are hashable) -- why?
- Values are references, can be any type.
- Dynamically resizable
- Implemented using a hash table; lookup is constant-time on average, **O(1)**.

- Iterable? Yes
- Mutable? Yes
- Sequence? No. (Why not?)
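One way to approach the sequence question, sketched with a made-up dictionary: a `dict` is indexed by key rather than by position, so an integer inside the brackets is treated as just another key. The same sketch shows the hashable-key rule in action.

```python
ages = {"Anna": 42, "James": 37}  # illustrative data

# lookup is by key
print(ages["Anna"])

# 0 is not a position here; it is a key that does not exist
try:
    ages[0]
except KeyError as err:
    missing_key = err.args[0]
    print("no such key:", missing_key)

# keys must be hashable, so a list cannot be a key but a tuple can
try:
    ages[["Anna", "Smith"]] = 1
except TypeError as err:
    print("TypeError:", err)

ages[("Anna", "Smith")] = 1
print(len(ages))
```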

In [10]:
record1 = {
    "name": "Anna",
    2024: 42,
    2023: 12,
}
print(record1)

{'name': 'Anna', 2024: 42, 2023: 12}


In [2]:
# declaration
record1 = {
    "name": "Anna",
    "age": 42,
}
record1["name"] = "James"

empty = {}

# alternate form
record2 = dict(age=42, name="Anna")
# list("a", "b")

# can also construct from sequence of tuples

record3 = dict(
    [
        ("name", "Anna"),
        ("age", 42),
    ]
)

# can compare for equality
record1 == record2

False

In [12]:
print(record1, record2)

{'name': 'James', 'age': 42} {'age': 42, 'name': 'Anna'}


In [5]:
# indexing by key
print(record1["name"])

James


In [6]:
record1["name"] = "Anne"

In [7]:
# 'in' tests if a key exists (not a value!)
print(record1)
print("name" in record1)
print(42 in record1)

{'name': 'Anne', 'age': 42}
True
False


In [18]:
# keys, values, items
print(record1.keys())
print(record1.values())
print(record1.items())

dict_keys(['name', 'age'])
dict_values(['Anne', 42])
dict_items([('name', 'Anne'), ('age', 42)])


In [11]:
for k, v in record1.items():
    print(k, v)

name Anne
age 42


In [16]:
hash({})

TypeError: unhashable type: 'dict'

In [14]:
hash(1)

1

In [19]:
d = {}
d[(1, 2, 3)] = 4
print(d)

{(1, 2, 3): 4}


In [25]:
## hashable?

print(f"{hash('abc')=}")
print(f"{hash(1234.3)=}")
print(f"{hash((1,2,3))=}")

print(f"{hash([1,2,3,4])=}")

hash('abc')=-7376796221354515387
hash(1234.3)=691752902764004562
hash((1,2,3))=529344067295497451


TypeError: unhashable type: 'list'

In [1]:
hash("abc")

-4894370073748428294

In [29]:
hash("abd")

8446955659539365509

In [30]:
d2 = {}
d2[[1, 2, 3]] = "OK"

TypeError: unhashable type: 'list'

In [None]:
hash("Python")

### Mutability

Dictionaries are *mutable*, you can change, expand, and shrink them in place.

This means we aren't copying/creating new dictionaries on every edit.
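A consequence worth seeing in code, sketched with made-up names: assigning a dictionary to a second variable does not copy it, so an edit made through one name is visible through the other. An independent dictionary needs an explicit `.copy()`.

```python
order_a = {"spam": 1}   # illustrative data
order_b = order_a       # a second name for the same dictionary, not a copy

order_b["eggs"] = 2     # the change is visible through both names
print(order_a)
print(order_a is order_b)

order_c = order_a.copy()  # an independent (shallow) copy
order_c["toast"] = 1
print("toast" in order_a)
```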

In [20]:
order = {"spam": 1, "eggs": 2, "coffee": 1}

order["sausage"] = 1
print(order)

{'spam': 1, 'eggs': 2, 'coffee': 1, 'sausage': 1}


In [3]:
del order["eggs"]
print(order)

{'spam': 1, 'coffee': 1, 'sausage': 1}


In [4]:
order["bagel"] = 1
print(order)

{'spam': 1, 'coffee': 1, 'sausage': 1, 'bagel': 1}


In [5]:
hash("bagel"), hash("Bagel")

(3611625396340438220, -2119394878459364811)

In [6]:
## dictionaries are iterable

for key in order:
    print(key)

spam
coffee
sausage
bagel


In [None]:
# can use .items() or .values() to loop over non-keys
for key, value in order.items():
    print(f"{key=} {value=}")


print(order.items())

In [None]:
# without unpacking, each item is a (key, value) tuple
for a_tuple in order.items():
    print(a_tuple[0], a_tuple[1])

### common dictionary methods

| Operation | Meaning |
|-----------|---------|
| `d.keys()` | View of all keys. |
| `d.values()` | View of all values. |
| `d.items()` | View of key, value tuples. |
| `d.copy()` | Make a (shallow) copy. |
| `d.clear()` | Remove all items. |
| `d.get(key, default=None)` | Same as `d[key]`, except that `default` is returned if the key isn't present. |
| `d.pop(key[, default])` | Remove the key and return its value. If the key is missing, return `default`, or raise `KeyError` when no default is given. |
| `len(d)` | Number of stored entries. |

See all at https://docs.python.org/3/library/stdtypes.html#dict
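As a sketch of where `get` earns its place, here is a small counting helper. The function name `count_letters` and the sample text are made up for illustration. The default of `0` avoids a separate "is the key there yet?" check.

```python
def count_letters(text):
    """Return a dict mapping each character in text to its count."""
    counts = {}
    for ch in text:
        # get() supplies 0 the first time a character is seen
        counts[ch] = counts.get(ch, 0) + 1
    return counts


letter_counts = count_letters("banana")
print(letter_counts)

# pop() with a default never raises, even for a missing key
removed = letter_counts.pop("z", 0)
print(removed)
```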

In [12]:
d = order
#print(order)
key = "fish"

print("james ordered", d.get(key, 0), key)

james ordered 0 fish


In [13]:
d

{'spam': 1, 'coffee': 1, 'sausage': 1, 'bagel': 1}

In [17]:

print(d.pop("coffee"))  # popping the same key again would raise KeyError: 'coffee'

1

In [15]:
d

{'spam': 1, 'sausage': 1, 'bagel': 1}

In [None]:
len(record1)

In [None]:
record1

In [None]:
order

number_ordered = order.pop("spam", 0)
print(number_ordered)

In [None]:
print(order)

### Dictionary View Objects

As noted above, `keys()`, `values()`, and `items()` return "view objects."

The returned object is a dynamic view, so when the dictionary changes, the view changes.
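To contrast a live view with a one-time snapshot, here is a minimal sketch using a made-up dictionary. Wrapping the view in `list()` copies the keys as they are at that moment, while the view itself keeps tracking the dictionary.

```python
stock = {"eggs": 2, "bacon": 1}  # illustrative data

live_keys = stock.keys()        # a view: tracks the dictionary
snapshot = list(stock.keys())   # a list: fixed at this moment

stock["spam"] = 500

print(list(live_keys))
print(snapshot)
```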

In [None]:
dishes = {"eggs": 2, "sausage": 1, "bacon": 1, "spam": 500}

# Keys is a view object of the keys from the dishes dictionary
keys = dishes.keys()
values = dishes.values()
items = dishes.items()

print(keys)
print(values)
print(items)

In [None]:
# View objects are dynamic and reflect dictionary changes

# Let's delete the 'eggs' entry
del dishes["eggs"]

# Notice that all three views have dropped the key and its value
print(keys)
print(values)
print(items)

In [21]:
# Nested Dictionaries Example

menu = {
    "Breakfast": {"Eggs": 2.19, "Toast": 0.99, "Orange Juice": 1.99},
    "Lunch": {"BLT": 3.99, "Chicken": 5.99, "Salad": 4.50},
    "Dinner": {"Cheeseburger": 9.99, "Salad": 7.50, "Special": 8.49},
}

print(menu["Lunch"])

print(menu["Lunch"]["Salad"])

{'BLT': 3.99, 'Chicken': 5.99, 'Salad': 4.5}
4.5


### Caveats

- Downsides of mutables?
- Modifying a `dict` while iterating through it.
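A short sketch of the second caveat, with made-up scores: deleting entries while looping over the dictionary itself raises a `RuntimeError` in CPython, while looping over a snapshot of the keys is safe.

```python
scores = {"A": 100, "B": 20, "C": 48}  # illustrative data

# removing entries while looping over the dict itself fails
error_message = None
try:
    for name in scores:
        if scores[name] < 50:
            del scores[name]
except RuntimeError as err:
    error_message = str(err)
    print("RuntimeError:", error_message)

# looping over a snapshot of the keys is safe
for name in list(scores):
    if scores[name] < 50:
        del scores[name]

print(scores)
```

Note that the failed loop had already removed one entry before the error was raised, which is part of why this pattern is hard to debug.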

In [23]:
def something(d):
    # remove low scores from a copy while looping over the original,
    # so the dict being iterated never changes size
    d_copy = d.copy()
    for k, v in d.items():
        if v < 50:
            d_copy.pop(k)
    return d_copy


scores = {"A": 100, "B": 20, "C": 48}
print(something(scores))

{'A': 100}


In [None]:
# iteration example
d = {"A": 1, "B": 2, "C": 3}
to_remove = []
for key, value in d.items():
    if value == 2:
        to_remove.append(key)
for key in to_remove:
    d.pop(key)

print(d)

In [22]:
students = {
    "Anne": 98,
    "Mitch": 13,
    "Zach": 65,
}

below_60 = []

for student in students:
    grade = students[student]
    if grade < 60:
        below_60.append(student)

for name in below_60:
    students.pop(name)

print(students)

{'Anne': 98, 'Zach': 65}


## `set`

Sets contain an unordered collection of *unique* & *immutable* (more precisely, hashable) values.

 - Unique: no duplicates

 - Immutable: values cannot be `dict`, `set`, or `list`.


Sets themselves are *mutable*.
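Both halves of that statement can be seen in a small sketch with made-up points: the members must be hashable (a tuple works, a list does not), while the set itself can still grow in place.

```python
points = {(0, 0), (1, 2)}   # tuples are hashable, so they can be members

# a list is mutable and unhashable, so it is rejected
try:
    points.add([3, 4])
except TypeError as err:
    print("TypeError:", err)

# the set itself is mutable: it can grow in place
points.add((3, 4))
points.add((3, 4))          # adding a duplicate changes nothing
print(len(points))
```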

In [23]:
# defining a set
animals = {"llama", "panda", "ostrich"}
print(animals)

# or can be made from an iterable
animals = set(["llama", "panda", "ostrich"])
print(animals)

{'panda', 'ostrich', 'llama'}
{'panda', 'llama', 'ostrich'}


In [24]:
s = set()

In [25]:
# no duplicates
animals = set(["llama", "panda", "ostrich", "ostrich", "panda"])
print(animals)

{'panda', 'llama', 'ostrich'}


In [27]:
lst = [1, 23, 4920, 2091, 4920, 4920, 4920, 23]

In [28]:
deduped = list(set(lst))
print(deduped)

[4920, 1, 2091, 23]



### Set Theory Operations

Sets are fundamentally mathematical in nature and contain operations based on set theory. They allow the following operations:

 - Union (`union()` or `|`): A set containing all elements that are in either set.

 - Difference (`difference()` or `-`): A set that consists of elements that are in the first set but not the second.

 - Intersection (`intersection()` or `&`): A set that consists of all elements that are in both sets.

 - Symmetric difference (`symmetric_difference()` or `^`): A set that consists of elements that are in exactly one of the two sets.
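One practical difference between the two spellings, sketched with made-up values: the method form accepts any iterable, while the operator form requires a set on both sides. Neither form modifies the original set.

```python
evens = {2, 4, 6}  # illustrative data

# the method form accepts any iterable
combined = evens.union([1, 2, 3])
print(combined == {1, 2, 3, 4, 6})

# the operator form requires a set on both sides
operator_failed = False
try:
    evens | [1, 2, 3]
except TypeError as err:
    operator_failed = True
    print("TypeError:", err)

# none of these operations modify the original set
print(evens == {2, 4, 6})
```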



In [24]:
# The following creates a set of single strings 'a','b','c','d','e'
# and another set of single strings 'b','d','x','y','z'
A = set("abcde")
B = set(["b", "d", "x", "y", "z"])

print("A = ", A)
print()
print("B = ", B)

A = {'a', 'c', 'b', 'e', 'd'}

B = {'y', 'z', 'b', 'x', 'd'}


In [25]:
# Union Operation
new_set = A | B
print(new_set)
print("---")
new_set = A.union(B) # Same operation as above but using method
print(new_set)

{'y', 'a', 'z', 'c', 'b', 'x', 'e', 'd'}
---
{'y', 'a', 'z', 'c', 'b', 'x', 'e', 'd'}


In [26]:
# Difference Operation
new_set = A - B
print(new_set)
print("---")
new_set = B.difference(A) # note that order matters for difference
print(new_set)

{'c', 'a', 'e'}
---
{'y', 'z', 'x'}


In [33]:
# Intersection Operation
new_set = A & B
print(new_set)
print("---")
new_set = A.intersection(B) # same operation as above but using method
print(new_set)

{'d', 'b'}
---
{'d', 'b'}


In [32]:
# Symmetric Difference Operation
new_set = A ^ B
print(new_set)
print("---")
new_set = A.symmetric_difference(B) # same operation as above but using method
print(new_set)

{'z', 'y', 'e', 'c', 'x', 'a'}
---
{'z', 'y', 'e', 'c', 'x', 'a'}


### Other Set Methods

| Method | Purpose | 
|--------|---------|
| `s.add(item)` | Adds an item to set. |
| `s.update(iterable)` | Adds all items from iterable to the set. |
| `s.remove(item)` | Remove an item from set, raise `KeyError` if it is not present. |
| `s.discard(item)` | Remove an item from set if it is present, fail silently if not. |
| `s.pop()` | Remove an arbitrary item from the set. |
| `s.clear()` | Remove all items from the set. |
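The difference between `remove` and `discard` only shows up when the item is missing. A minimal sketch with a made-up hand of cards:

```python
hand = {"A", "K", "Q"}  # illustrative data

hand.discard("2")        # not present: nothing happens
print(len(hand))

remove_failed = False
try:
    hand.remove("2")     # not present: raises KeyError
except KeyError:
    remove_failed = True
print(remove_failed)

# both methods return None, so the result is not worth printing
print(hand.discard("A"))
```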

In [32]:
s = {1, 2, 3}
print(s.remove(4))  # remove() raises if the item is missing
#print(s)

KeyError: 4


In [33]:
s = set() # why not {}?

s.update(["A", "2", "3", "4", "5", "6", "7", "8", "9", "J", "Q", "K"])

s.remove("A")
print("Removed Ace")
print(s)

Removed Ace
{'J', '5', '8', '4', '6', '9', 'Q', 'K', '2', '3', '7'}


In [34]:
s.pop()

'J'

In [None]:
s.discard("9")
# print("Discarded Ace")
print(s)

In [None]:
card = s.pop()
print("Popped", card)
print(s)

In [None]:
print("---")
s.add("Joker")
print(s)


"Honda Civic" in [
 "Honda Civic",
 "Ford Focus",
 "Honda Civic",
 "Honda Civic",
 "Honda Civic",
 "Honda Civic",
 "Honda Civic",
 "Escalade",
]

In [35]:
d1 = {"eggs": 2, "pancakes": 100, "juice": 1}
d2 = {"eggs": 3, "waffles": 1, "coffee": 1}
d3 = {"eggs": 1, "fruit salad": 1}

print("All 3 ordered:", set(d1) & set(d2) & set(d3))
print("Only ordered by #1:", set(d1) - set(d2))

All 3 ordered: {'eggs'}
Only ordered by #1: {'juice', 'pancakes'}


In [None]:
set(d1.items())

In [None]:
s = {"one", "two", "three", "four"}
for x in s:
    print(x)

In [37]:
students = [
    {"name": "adam", "num": 123},
    {"name": "quynh", "num": 456},
    {"name": "quynh", "num": 456},
    {"name": "adam", "num": 999},
]

s = set()
for student in students:
    s.add(tuple(student.items()))
    # not s.add(student): a dict is unhashable
deduplicated = s


In [38]:
for student in deduplicated:
    print(dict(student))

{'name': 'adam', 'num': 123}
{'name': 'adam', 'num': 999}
{'name': 'quynh', 'num': 456}


## Discussion

#### Are sets sequences?

#### Why do set members need to be immutable?

#### How can we store compound values in sets?

#### Why do dictionary keys have the same restrictions?

In [46]:
# frozenset demo
nums = [1, 2, 2, 2, 3, 3]
frozen_nums = frozenset(nums)
print(frozen_nums)

frozenset({1, 2, 3})


In [50]:
nested = {frozen_nums, frozenset("ABC")}

print(nested)

frozen_nums.add(4)

{frozenset({1, 2, 3}), frozenset({'B', 'C', 'A'})}


AttributeError: 'frozenset' object has no attribute 'add'

In [56]:
xx = set("hello")

In [57]:
vowels = set("aeiou")
print(vowels)

{'e', 'a', 'u', 'o', 'i'}


In [58]:
xx - vowels

{'h', 'l'}

## Mutability

Mutable values can be changed in place.

We've seen that `list` is mutable, and now `dict` and `set` as well.

#### Mutable
 - `list`
 - `dict`
 - `set`
 
#### Immutable
 - `str`
 - `tuple`
 - `frozenset`
 - scalars: `int`, `float`, `complex`, `bool`, `None`

In [None]:
# list
d = [1, 2, 3]
d.append(4)
print(d)

In [None]:
# str
s = "Hello"
s = s + " World"
s

# how did s change?

In [None]:
s = "Hello World"
t = s.lower()
print(s)
print(t)
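One way to check what "changed in place" means, sketched with made-up names: keep a second reference to the original object and compare after the operation. With a `list` both names still point at the same, modified object. With a `str` the name is rebound to a new object and the original is untouched.

```python
nums = [1, 2, 3]
nums_alias = nums
nums.append(4)                   # the list is changed in place
print(nums_alias is nums)        # still the same object
print(nums_alias)

greeting = "Hello"
greeting_alias = greeting
greeting = greeting + " World"   # builds a new str and rebinds the name
print(greeting_alias is greeting)
print(greeting_alias)            # the original string is untouched
```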