{ "cells": [ { "cell_type": "markdown", "id": "4444d9dd", "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "# Compound Data Types" ] }, { "cell_type": "markdown", "id": "a5918cff-1c93-41bd-811a-69d97c797f49", "metadata": { "tags": [], "toc-hr-collapsed": true }, "source": [ "## Iteration\n", "\n", "Last week we introduced `for` loops.\n", "\n", "```\n", "for var_name in iterable:\n", " statement # presumably using var_name\n", "```\n", "\n", "What is an **iterable**? Why not just say **sequence**?\n", "\n", "What **sequences** have we seen?\n", "\n", "### More Iterables" ] }, { "cell_type": "markdown", "id": "c0f1720d-937e-4030-b3a4-18c1382fb3ec", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### range\n", "\n", "Another iterable!\n", "\n", "`range(stop)` # goes from 0 to (stop-1)\n", "\n", "`range(start, stop)` # goes from start to (stop-1)\n", "\n", "Same rules as slice, always **inclusive** of start, **exclusive** of stop.\n", "\n", "or as you might write: ```[start, stop)``` -- we've seen this before with slicing" ] }, { "cell_type": "code", "execution_count": null, "id": "66d82d83-b8b8-4ad4-9f5a-b237d8bbe1d8", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "for x in range(12):\n", " print(x)" ] }, { "cell_type": "code", "execution_count": null, "id": "4f6334e0-eeaa-45f7-a3c0-f975911f5ddb", "metadata": { "tags": [] }, "outputs": [], "source": [ "for x in range(8, 12):\n", " print(x)" ] }, { "cell_type": "code", "execution_count": null, "id": "8280f1b2-1935-496c-b372-3ccc3d1bc7f2", "metadata": { "tags": [] }, "outputs": [], "source": [ "z = range(12) # hmm\n", "print(type(z))" ] }, { "cell_type": "code", "execution_count": null, "id": "58dc9fc1-9056-4be4-9f91-8747aa7e7925", "metadata": { "tags": [] }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "7ed88d5c-e848-46f8-8fc5-8d48fadc303e", "metadata": { "tags": [] }, "outputs": [], 
"source": [ "i = 0\n", "for x in [\"A\", \"B\", \"C\"]:\n", " print(i, x)\n", " i += 1" ] }, { "cell_type": "markdown", "id": "c241b5fd-d0b6-4d72-a554-238931d28d36", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### `enumerate`\n", "\n", "Another function that returns an iterable, for when we need the index along with the object.\n", "\n", "`enumerate(original_iterable)` yields two element tuples: `(index, element)` for every item in the original." ] }, { "cell_type": "code", "execution_count": null, "id": "f8a79062-4012-4700-9611-83a4cdcd641c", "metadata": { "tags": [] }, "outputs": [], "source": [ "# \"incorrect\" example\n", "# find using range/len - as you might think to write it based on past experience\n", "def find_r(s, letter_to_find):\n", " for i in range(len(s)):\n", " if s[i] == letter_to_find:\n", " return i\n", " return -1" ] }, { "cell_type": "code", "execution_count": null, "id": "844ce2af-b8f2-4dd3-b1fe-46ff050b2664", "metadata": { "tags": [] }, "outputs": [], "source": [ "find_r(\"Hello World\", \"W\")" ] }, { "cell_type": "code", "execution_count": null, "id": "21715b8f-4ce2-4c63-879b-69e06e9cef00", "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# find using enumerate - Pythonic, more efficient\n", "def find_e(s, letter_to_find):\n", " for i, letter in enumerate(s): # tuple unpacking\n", " print(i, letter)\n", " if letter == letter_to_find:\n", " return i\n", " return -1" ] }, { "cell_type": "code", "execution_count": null, "id": "9b37d43a-bb10-420f-b86e-c70a8c56c55e", "metadata": { "tags": [] }, "outputs": [], "source": [ "find_e(\"Hello world\", \"w\")" ] }, { "cell_type": "code", "execution_count": null, "id": "75e6a9da-e5a9-4f8a-93f2-ee0964a6efb4", "metadata": { "tags": [] }, "outputs": [], "source": [ "find_r(\"Hello world\", \"?\")" ] }, { "cell_type": "code", "execution_count": null, "id": "04a6fdb6-f136-47a8-b6dc-c32c87a40542", "metadata": { "tags": [] }, 
"outputs": [], "source": [ "s = \"Hello world\"\n", "s.find(\"w\") # built-ins are best" ] }, { "cell_type": "markdown", "id": "527d4283-34bb-4a44-86c8-d1bee46be555", "metadata": {}, "source": [ "Note: For HW#0 it is OK to use range for iteration, for future HWs if you are using the index & value, `enumerate` is the Pythonic way to do this." ] }, { "cell_type": "markdown", "id": "13234aa7-ec30-44f1-8453-778db6ecd6ce", "metadata": { "tags": [] }, "source": [ "### aside: sequence unpacking\n", "\n", "When you know exactly how many elements are in a sequence, you can use this syntax to \"unpack\" them into variables:" ] }, { "cell_type": "code", "execution_count": null, "id": "bedcf76f-d5f9-42d3-bc01-a845dd2e75b1", "metadata": { "tags": [] }, "outputs": [], "source": [ "tup = (1, 2, 3)\n", "lst = [\"a\", \"b\", \"c\"]\n", "\n", "x, y, z = tup\n", "print(x, y, z)" ] }, { "cell_type": "code", "execution_count": null, "id": "a1bdb25b-f6fc-4a93-b8a0-14b7e3e5eaab", "metadata": {}, "outputs": [], "source": [ "for idx, elem in enumerate(iterable):\n", " pass" ] }, { "cell_type": "code", "execution_count": 1, "id": "dc82927d-c336-4eff-abde-4d9dc1507bc5", "metadata": { "tags": [] }, "outputs": [], "source": [ "x = 7\n", "y = 8" ] }, { "cell_type": "code", "execution_count": 2, "id": "f1e0714f-5d7b-426b-9748-a82b5c179251", "metadata": {}, "outputs": [], "source": [ "x, y = y, x" ] }, { "cell_type": "code", "execution_count": null, "id": "7646739a-8b0e-45cb-b63d-f1e54a9aa05b", "metadata": { "tags": [] }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 3, "id": "e1d328df-5016-4f82-9f8c-8c425f47146c", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8 7\n" ] } ], "source": [ "print(x, y)" ] }, { "cell_type": "markdown", "id": "2b894ba7", "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [], "toc-hr-collapsed": true }, "source": [ "## `dict`\n", "\n", "A collection of key-value pairs. 
(aka map/hashmap in other languages)\n", "\n", "- Keys must be hashable: `tuple` (of hashable elements), `str`, scalars -- why?\n", "- Values are references and can be any type.\n", "- Dynamically resizable\n", "- Implemented using a hash table; lookup is constant-time on average. **O(1)**\n", "\n", "- Iterable? Yes\n", "- Mutable? Yes\n", "- Sequence? No. (Why not?)" ] }, { "cell_type": "code", "execution_count": 10, "id": "7aec887f-2f55-4f8f-80ce-ea3324fde2c6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Anna', 2024: 42, 2023: 12}\n" ] } ], "source": [ "record1 = {\n", "    \"name\": \"Anna\",\n", "    2024: 42,\n", "    2023: 12,\n", "}\n", "print(record1)" ] }, { "cell_type": "code", "execution_count": 2, "id": "b5123db7", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# declaration\n", "record1 = {\n", "    \"name\": \"Anna\",\n", "    \"age\": 42,\n", "}\n", "record1[\"name\"] = \"James\"\n", "\n", "empty = {}\n", "\n", "# alternate form\n", "record2 = dict(age=42, name=\"Anna\")\n", "# list(\"a\", \"b\")\n", "\n", "# can also construct from sequence of tuples\n", "\n", "record3 = dict(\n", "    [\n", "        (\"name\", \"Anna\"),\n", "        (\"age\", 42)\n", "    ]\n", ")\n", "\n", "# can compare for equality (key order does not matter;\n", "# False here because record1's name was changed to \"James\" above)\n", "record1 == record2" ] }, { "cell_type": "code", "execution_count": 12, "id": "9948e31b-aff8-4ce4-8a28-50d0d07dc18a", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Anna', 'age': 42} {'age': 42, 'name': 'Anna'}\n" ] } ], "source": [ "print(record1, record2)" ] }, { "cell_type": "code", "execution_count": 5, "id": "af8c64ca", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "James\n" ] } ], "source": [ "# indexing by key\n", "print(record1[\"name\"])" ] }, { "cell_type": "code", 
"execution_count": 6, "id": "903268d7-73e0-4a63-9404-0c52936c6e9c", "metadata": { "tags": [] }, "outputs": [], "source": [ "record1[\"name\"] = \"Anne\"" ] }, { "cell_type": "code", "execution_count": 7, "id": "7f1c685a", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Anne', 'age': 42}\n", "True\n", "False\n" ] } ], "source": [ "# 'in' tests if a key exists (not a value!)\n", "print(record1)\n", "print(\"name\" in record1)\n", "print(42 in record1)" ] }, { "cell_type": "code", "execution_count": 18, "id": "41b1f5a1", "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "dict_keys(['name', 'age'])\n", "dict_values(['Anna', 42])\n", "dict_items([('name', 'Anna'), ('age', 42)])\n" ] } ], "source": [ "# keys, values, items\n", "print(record1.keys())\n", "print(record1.values())\n", "print((record1.items()))" ] }, { "cell_type": "code", "execution_count": 11, "id": "b73dbddb-5e64-4340-9037-d6400fca8218", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "name Anne\n", "age 42\n" ] } ], "source": [ "for k, v in record1.items():\n", " print(k, v)" ] }, { "cell_type": "code", "execution_count": 22, "id": "300408f0-e9f2-4cdc-96ba-7db185154f16", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "name Anna\n", "age 42\n" ] } ], "source": [ "for k,v in record1.items():\n", " print(k, v)" ] }, { "cell_type": "code", "execution_count": 16, "id": "0f6a9550-84b0-4f2d-8cb6-e48272be69a7", "metadata": { "tags": [] }, "outputs": [ { "ename": "TypeError", "evalue": "unhashable type: 'dict'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[16], line 
1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;43mhash\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m{\u001b[49m\u001b[43m}\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'dict'" ] } ], "source": [ "hash({})" ] }, { "cell_type": "code", "execution_count": 14, "id": "033855c4-ec3a-4ea9-ba67-9abee0715840", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hash(1)" ] }, { "cell_type": "code", "execution_count": 19, "id": "0566050d-86ae-4961-983c-bcea78960580", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{(1, 2, 3): 4}\n" ] } ], "source": [ "d = {}\n", "d[(1, 2, 3)] = 4\n", "print(d)" ] }, { "cell_type": "code", "execution_count": 25, "id": "37c96ad5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "hash('abc')=-7376796221354515387\n", "hash(1234.3)=691752902764004562\n", "hash((1,2,3))=529344067295497451\n" ] }, { "ename": "TypeError", "evalue": "unhashable type: 'list'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[25], line 7\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mhash\u001b[39m(\u001b[38;5;241m1234.3\u001b[39m)\u001b[38;5;132;01m=}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 5\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mhash\u001b[39m((\u001b[38;5;241m1\u001b[39m,\u001b[38;5;241m2\u001b[39m,\u001b[38;5;241m3\u001b[39m))\u001b[38;5;132;01m=}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m----> 7\u001b[0m 
\u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28;43mhash\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m2\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m3\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m4\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;132;01m=}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'list'" ] } ], "source": [ "## hashable?\n", "\n", "print(f\"{hash('abc')=}\")\n", "print(f\"{hash(1234.3)=}\")\n", "print(f\"{hash((1,2,3))=}\")\n", "\n", "print(f\"{hash([1,2,3,4])=}\")" ] }, { "cell_type": "code", "execution_count": 1, "id": "007eeb5c-550b-4ede-9c87-a411d75d1dcd", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "-4894370073748428294" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hash(\"abc\")" ] }, { "cell_type": "code", "execution_count": 29, "id": "98e1b75f-f9a2-47f4-822a-57c2053295ad", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "8446955659539365509" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hash(\"abd\")" ] }, { "cell_type": "code", "execution_count": 30, "id": "d9939e43", "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "unhashable type: 'list'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[30], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m d2 \u001b[38;5;241m=\u001b[39m {}\n\u001b[0;32m----> 2\u001b[0m \u001b[43md2\u001b[49m\u001b[43m[\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[38;5;241;43m2\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m3\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m]\u001b[49m \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mOK\u001b[39m\u001b[38;5;124m\"\u001b[39m\n", "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'list'" ] } ], "source": [ "d2 = {}\n", "d2[[1, 2, 3]] = \"OK\"" ] }, { "cell_type": "code", "execution_count": null, "id": "90ea61ba", "metadata": {}, "outputs": [], "source": [ "hash(\"Python\")" ] }, { "cell_type": "markdown", "id": "9b3ffe31", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Mutability\n", "\n", "Dictionaries are *mutable*, you can change, expand, and shrink them in place.\n", "\n", "This means we aren't copying/creating new dictionaries on every edit." ] }, { "cell_type": "code", "execution_count": 20, "id": "56ada375", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'spam': 1, 'eggs': 2, 'coffee': 1, 'sausage': 1}\n" ] } ], "source": [ "order = {\"spam\": 1, \"eggs\": 2, \"coffee\": 1}\n", "\n", "order[\"sausage\"] = 1\n", "print(order)" ] }, { "cell_type": "code", "execution_count": 3, "id": "aa6d1aed", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'spam': 5, 'coffee': 1}\n" ] } ], "source": [ "del order[\"eggs\"]\n", "print(order)" ] }, { "cell_type": "code", "execution_count": 4, "id": "8450549b", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'spam': 5, 'coffee': 1, 'bagel': 1}\n" ] } ], "source": [ "order[\"bagel\"] = 1\n", "print(order)" ] }, { "cell_type": "code", "execution_count": 5, "id": "ae853627", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(3611625396340438220, -2119394878459364811)" ] 
}, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hash(\"bagel\"), hash(\"Bagel\")" ] }, { "cell_type": "code", "execution_count": 6, "id": "f5d1307f", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "spam\n", "coffee\n", "bagel\n" ] } ], "source": [ "## dictionaries are iterable (iterating yields the keys)\n", "\n", "for key in order:\n", "    print(key)" ] }, { "cell_type": "code", "execution_count": null, "id": "96ece38d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# use .items() to loop over (key, value) pairs, or .values() for values only\n", "for key, value in order.items():\n", "    print(f\"{key=} {value=}\")\n", "\n", "\n", "print(order.items())" ] }, { "cell_type": "code", "execution_count": null, "id": "182bd434", "metadata": {}, "outputs": [], "source": [ "# same loop without unpacking: each item is a (key, value) tuple\n", "for a_tuple in order.items():\n", "    print(a_tuple[0], a_tuple[1])" ] }, { "cell_type": "markdown", "id": "01a7d316", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### common dictionary methods\n", "\n", "| Operation | Meaning |\n", "|-----------|---------|\n", "| `d.keys()` | View of all keys. |\n", "| `d.values()` | View of all values. |\n", "| `d.items()` | View of (key, value) tuples. |\n", "| `d.copy()` | Make a (shallow) copy. |\n", "| `d.clear()` | Remove all items. |\n", "| `d.get(key, default=None)` | Same as `d[key]`, except `default` is returned if the key isn't present. |\n", "| `d.pop(key[, default])` | Fetch item & remove it from dict. Raises `KeyError` if the key is missing and no default is given. |\n", "| `len(d)` | Number of stored entries. 
|\n", "\n", "See all at https://docs.python.org/3/library/stdtypes.html#dict" ] }, { "cell_type": "code", "execution_count": 12, "id": "d5801652-9284-4a55-8f94-53f4a1cce0cf", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "james ordered 0 fish\n" ] } ], "source": [ "d = order\n", "#print(order)\n", "key = \"fish\"\n", "\n", "print(\"james ordered\", d.get(key, 0), key)" ] }, { "cell_type": "code", "execution_count": 13, "id": "c8d24981-0150-483e-a338-a617bb7e92f8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'spam': 5, 'coffee': 1, 'bagel': 1}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d" ] }, { "cell_type": "code", "execution_count": 17, "id": "ca4b8660-8307-4f52-a9e9-154d51c7ab94", "metadata": { "tags": [] }, "outputs": [ { "ename": "KeyError", "evalue": "'coffee'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[17], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43md\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpop\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcoffee\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m)\n", "\u001b[0;31mKeyError\u001b[0m: 'coffee'" ] } ], "source": [ "\n", "print(d.pop(\"coffee\"))" ] }, { "cell_type": "code", "execution_count": 15, "id": "2f3dcdac-5818-4d84-87a5-d3375bfb35b2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'spam': 5, 'bagel': 1}" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d" ] }, { "cell_type": "code", "execution_count": null, "id": "462515ae-a3e3-4886-bfab-d95b6c847b84", "metadata": { "tags": [] }, "outputs": [], "source": [ "len(record1)" ] }, { "cell_type": 
"code", "execution_count": null, "id": "567de037-1184-4814-9be1-15a7b66db504", "metadata": { "tags": [] }, "outputs": [], "source": [ "record1" ] }, { "cell_type": "code", "execution_count": null, "id": "bef0fcac", "metadata": {}, "outputs": [], "source": [ "order\n", "\n", "number_ordered = order.pop(\"spam\", 0)\n", "print(number_ordered)" ] }, { "cell_type": "code", "execution_count": null, "id": "bd5364b0", "metadata": {}, "outputs": [], "source": [ "print(order)" ] }, { "cell_type": "markdown", "id": "6cf39963", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Dictionary View Objects\n", "\n", "As noted above, `keys(), values() and items()` return \"view objects.\"\n", "\n", "The returned object is a dynamic view, so when the dictionary changes, the view changes." ] }, { "cell_type": "code", "execution_count": null, "id": "0f1881a1", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "dishes = {\"eggs\": 2, \"sausage\": 1, \"bacon\": 1, \"spam\": 500}\n", "\n", "# Keys is a view object of the keys from the dishes dictionary\n", "keys = dishes.keys()\n", "values = dishes.values()\n", "items = dishes.items()\n", "\n", "print(keys)\n", "print(values)\n", "print(items)" ] }, { "cell_type": "code", "execution_count": null, "id": "674cc686", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# View objects are dynamic and reflect dictionary changes\n", "\n", "# Lets delete the 'eggs' entry\n", "del dishes[\"eggs\"]\n", "\n", "# Notice the both the views have removed key and its value\n", "print(keys)\n", "print(values)\n", "print(items)" ] }, { "cell_type": "code", "execution_count": 21, "id": "b6658174", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'BLT': 3.99, 'Chicken': 5.99, 'Salad': 4.5}\n", "4.5\n" ] } ], "source": [ "# Nested Dictionaries Example\n", "\n", "menu = {\n", " \"Breakfast\": {\"Eggs\": 2.19, \"Toast\": 0.99, 
\"Orange Juice\": 1.99},\n", " \"Lunch\": {\"BLT\": 3.99, \"Chicken\": 5.99, \"Salad\": 4.50},\n", " \"Dinner\": {\"Cheeseburger\": 9.99, \"Salad\": 7.50, \"Special\": 8.49},\n", "}\n", "\n", "print(menu[\"Lunch\"])\n", "\n", "print(menu[\"Lunch\"][\"Salad\"])" ] }, { "cell_type": "markdown", "id": "da42d6b5", "metadata": {}, "source": [ "### Caveats\n", "\n", "- Downsides of mutables?\n", "- Modifying a `dict` while iterating through it." ] }, { "cell_type": "code", "execution_count": 23, "id": "ecb495c8-ac3a-4b70-9e74-1bfed1d4b466", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'A': 100}\n" ] } ], "source": [ "def something(d):\n", " to_remove = []\n", "\n", " d_copy = d.copy()\n", " for k, v in d.items():\n", " if v < 50:\n", " d_copy.pop(k)\n", " #to_remove.append(k)\n", "\n", " #for item in to_remove:\n", " # d.pop(item)\n", " # ...\n", " return d_copy\n", "\n", "\n", "scores = {\"A\": 100, \"B\": 20, \"C\": 48}\n", "something(scores)\n", "print(scores)" ] }, { "cell_type": "code", "execution_count": null, "id": "9c988c56", "metadata": {}, "outputs": [], "source": [ "# iteration example\n", "d = {\"A\": 1, \"B\": 2, \"C\": 3}\n", "to_remove = []\n", "for key, value in d.items():\n", " if value == 2:\n", " to_remove.append(key)\n", "for key in to_remove:\n", " d.pop(key)\n", "\n", "print(d)" ] }, { "cell_type": "code", "execution_count": 22, "id": "410613ac", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'Anne': 98, 'Zach': 65}\n" ] } ], "source": [ "students = {\n", " \"Anne\": 98,\n", " \"Mitch\": 13,\n", " \"Zach\": 65,\n", "}\n", "\n", "below_60 = []\n", "\n", "for student in students:\n", " grade = students[student]\n", " if grade < 60:\n", " below_60.append(student)\n", "\n", "for name in below_60:\n", " students.pop(name)\n", "\n", "print(students)" ] }, { "cell_type": "markdown", "id": "976e988f", "metadata": { "slideshow": { "slide_type": "slide" }, 
"toc-hr-collapsed": true }, "source": [ "## `set`\n", "\n", "Sets contain an unordered collection of *unique* & *immutable* values.\n", "\n", " - Unique: no duplicates\n", "\n", " - Immutable: values cannot be `dict`, `set`, `list`.\n", "\n", "\n", "Sets themselves are *mutable*." ] }, { "cell_type": "code", "execution_count": 23, "id": "3a3db482", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'panda', 'ostrich', 'llama'}\n", "{'panda', 'llama', 'ostrich'}\n" ] } ], "source": [ "# defining a set\n", "animals = {\"llama\", \"panda\", \"ostrich\"}\n", "print(animals)\n", "\n", "# or can be made from an iterable\n", "animals = set([\"llama\", \"panda\", \"ostrich\"])\n", "print(animals)" ] }, { "cell_type": "code", "execution_count": null, "id": "b240ba8b-2e54-464c-8003-bca5b66c532e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 24, "id": "2bb720e8-5830-4fd3-a7a5-cf50af5a0868", "metadata": { "tags": [] }, "outputs": [], "source": [ "s = set()" ] }, { "cell_type": "code", "execution_count": 25, "id": "4628529f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'panda', 'llama', 'ostrich'}\n" ] } ], "source": [ "# no duplicates\n", "animals = set([\"llama\", \"panda\", \"ostrich\", \"ostrich\", \"panda\"])\n", "print(animals)" ] }, { "cell_type": "code", "execution_count": 27, "id": "bbe43b2e-d95d-4e16-be98-ac03f227003d", "metadata": { "tags": [] }, "outputs": [], "source": [ "lst = [1, 23, 4920, 2091, 4920, 4920, 4920, 23]" ] }, { "cell_type": "code", "execution_count": 28, "id": "1fd3e328-c0f5-4fd0-8407-b1cd33010f90", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[4920, 1, 2091, 23]\n" ] } ], "source": [ "deduped = list(set(lst))\n", "print(deduped)" ] }, { "cell_type": "markdown", "id": "56cb6bab", "metadata": {}, "source": [ "\n", "### Set Theory 
Operations\n", "\n", "Sets are mathematical in nature and support operations from set theory:\n", "\n", " - Union (`union()` or `|`): A set containing all elements that are in either set.\n", "\n", " - Difference (`difference()` or `-`): A set that consists of elements that are in the first set but not the other.\n", "\n", " - Intersection (`intersection()` or `&`): A set that consists of all elements that are in both sets.\n", "\n", " - Symmetric difference (`symmetric_difference()` or `^`): A set that consists of elements that are in exactly one of the two sets.\n", "\n" ] }, { "cell_type": "code", "execution_count": 24, "id": "9ef87b91", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A = {'a', 'c', 'b', 'e', 'd'}\n", "\n", "B = {'y', 'z', 'b', 'x', 'd'}\n" ] } ], "source": [ "# The following creates a set of single-character strings 'a','b','c','d','e'\n", "# and another set of single-character strings 'b','d','x','y','z'\n", "A = set(\"abcde\")\n", "B = set([\"b\", \"d\", \"x\", \"y\", \"z\"])\n", "\n", "print(\"A = \", A)\n", "print()\n", "print(\"B = \", B)" ] }, { "cell_type": "code", "execution_count": 25, "id": "cd7cd6d7", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'y', 'a', 'z', 'c', 'b', 'x', 'e', 'd'}\n", "---\n", "{'y', 'a', 'z', 'c', 'b', 'x', 'e', 'd'}\n" ] } ], "source": [ "# Union Operation\n", "new_set = A | B\n", "print(new_set)\n", "print(\"---\")\n", "new_set = A.union(B)  # same operation as above but using method\n", "print(new_set)" ] }, { "cell_type": "code", "execution_count": 26, "id": "cb6bd2f9", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'c', 'a', 'e'}\n", "---\n", "{'y', 'z', 'x'}\n" ] } ], "source": [ "# Difference Operation\n", "new_set = A - B\n", "print(new_set)\n", "print(\"---\")\n", "new_set = B.difference(A)  # note that order matters for difference\n", "print(new_set)" ] }, { "cell_type": "code", "execution_count": 33, "id": "4e516175", 
"metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'d', 'b'}\n", "---\n", "{'d', 'b'}\n" ] } ], "source": [ "# Intersection Operation\n", "new_set = A & B\n", "print(new_set)\n", "print(\"---\")\n", "new_set = A.intersection(B) # same operation as above but using method\n", "print(new_set)" ] }, { "cell_type": "code", "execution_count": 32, "id": "6d6fcff7", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'z', 'y', 'e', 'c', 'x', 'a'}\n", "---\n", "{'z', 'y', 'e', 'c', 'x', 'a'}\n" ] } ], "source": [ "# Symmetric Difference Operation\n", "new_set = A ^ B\n", "print(new_set)\n", "print(\"---\")\n", "new_set = A.symmetric_difference(B) # same operation as above but using method\n", "print(new_set)" ] }, { "cell_type": "markdown", "id": "2558302f", "metadata": {}, "source": [ "### Other Set Methods\n", "\n", "| Method | Purpose | \n", "|--------|---------|\n", "| `s.add(item)` | Adds an item to set. |\n", "| `s.update(iterable)` | Adds all items from iterable to the set. |\n", "| `s.remove(item)` | Remove an item from set. |\n", "| `s.discard(item)` | Remove an item from set if it is present, fail silently if not. |\n", "| `s.pop()` | Remove an arbitrary item from the set. |\n", "| `s.clear()` | Remove all items from the set. 
|" ] }, { "cell_type": "code", "execution_count": 32, "id": "e7be6cfb-5778-4106-a454-ab13cbd96e30", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "s = {1, 2, 3}\n", "# discard ignores a missing item and returns None;\n", "# s.remove(4) would raise KeyError instead\n", "print(s.discard(4))\n", "#print(s)" ] }, { "cell_type": "code", "execution_count": 33, "id": "322c8f1a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Removed Ace\n", "{'J', '5', '8', '4', '6', '9', 'Q', 'K', '2', '3', '7'}\n" ] } ], "source": [ "s = set()  # why not {}?\n", "\n", "s.update([\"A\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"J\", \"Q\", \"K\"])\n", "\n", "s.remove(\"A\")\n", "print(\"Removed Ace\")\n", "print(s)" ] }, { "cell_type": "code", "execution_count": 34, "id": "41a8ea98-1ffe-43b9-9548-fa43197cad94", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'J'" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.pop()" ] }, { "cell_type": "code", "execution_count": null, "id": "2555b568", "metadata": {}, "outputs": [], "source": [ "s.discard(\"9\")\n", "# print(\"Discarded 9\")\n", "print(s)" ] }, { "cell_type": "code", "execution_count": null, "id": "ccc84afc", "metadata": {}, "outputs": [], "source": [ "card = s.pop()\n", "print(\"Popped\", card)\n", "print(s)" ] }, { "cell_type": "code", "execution_count": null, "id": "eb3ed63b", "metadata": {}, "outputs": [], "source": [ "print(\"---\")\n", "s.add(\"Joker\")\n", "print(s)\n", "\n", "\n", "# membership tests also work on a list, but a list is scanned element by element\n", "\"Honda Civic\" in [\n", "    \"Honda Civic\",\n", "    \"Ford Focus\",\n", "    \"Honda Civic\",\n", "    \"Honda Civic\",\n", "    \"Honda Civic\",\n", "    \"Honda Civic\",\n", "    \"Honda Civic\",\n", "    \"Escalade\",\n", "]" ] }, { "cell_type": "code", "execution_count": 35, "id": "c1ec32b4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "All 3 ordered: {'eggs'}\n", "Only ordered by #1: {'juice', 'pancakes'}\n" ] } ], "source": [ "d1 
= {\"eggs\": 2, \"pancakes\": 100, \"juice\": 1}\n", "d2 = {\"eggs\": 3, \"waffles\": 1, \"coffee\": 1}\n", "d3 = {\"eggs\": 1, \"fruit salad\": 1}\n", "\n", "# set(d) builds a set of the dict's keys\n", "print(\"All 3 ordered:\", set(d1) & set(d2) & set(d3))\n", "print(\"Only ordered by #1:\", set(d1) - set(d2) - set(d3))" ] }, { "cell_type": "code", "execution_count": null, "id": "434fc861", "metadata": {}, "outputs": [], "source": [ "set(d1.items())" ] }, { "cell_type": "code", "execution_count": null, "id": "32718b95", "metadata": {}, "outputs": [], "source": [ "s = {\"one\", \"two\", \"three\", \"four\"}\n", "for x in s:\n", "    print(x)" ] }, { "cell_type": "code", "execution_count": 37, "id": "c86b847d", "metadata": {}, "outputs": [], "source": [ "students = [\n", "    {\"name\": \"adam\", \"num\": 123},\n", "    {\"name\": \"quynh\", \"num\": 456},\n", "    {\"name\": \"quynh\", \"num\": 456},\n", "    {\"name\": \"adam\", \"num\": 999},\n", "]\n", "\n", "s = set()\n", "for student in students:\n", "    # a dict is unhashable, so store a hashable tuple of its items\n", "    s.add(tuple(student.items()))\n", "    # not:\n", "    #s.add(student)\n", "deduplicated = s\n" ] }, { "cell_type": "code", "execution_count": 38, "id": "236f3706", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'adam', 'num': 123}\n", "{'name': 'adam', 'num': 999}\n", "{'name': 'quynh', 'num': 456}\n" ] } ], "source": [ "for student in deduplicated:\n", "    print(dict(student))" ] }, { "cell_type": "markdown", "id": "5ddacfcf", "metadata": { "slideshow": { "slide_type": "slide" }, "toc-hr-collapsed": true }, "source": [ "## Discussion\n", "\n", "#### Are sets sequences?\n", "\n", "#### Why do set members need to be immutable?\n", "\n", "#### How can we store compound values in sets?\n", "\n", "#### Why do dictionary keys have the same restrictions?" 
] }, { "cell_type": "code", "execution_count": 46, "id": "cdf70650-cd3f-440c-a9e1-2d4d27dfca99", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "frozenset({1, 2, 3})\n" ] } ], "source": [ "# frozenset demo\n", "nums = [1, 2, 2, 2, 3, 3]\n", "frozen_nums = frozenset(nums)\n", "print(frozen_nums)" ] }, { "cell_type": "code", "execution_count": 50, "id": "f3a752f5-c01e-4b7e-83f9-560f2db27559", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{frozenset({1, 2, 3}), frozenset({'B', 'C', 'A'})}\n" ] }, { "ename": "AttributeError", "evalue": "'frozenset' object has no attribute 'add'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[50], line 5\u001b[0m\n\u001b[1;32m 1\u001b[0m nested \u001b[38;5;241m=\u001b[39m {frozen_nums, \u001b[38;5;28mfrozenset\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mABC\u001b[39m\u001b[38;5;124m\"\u001b[39m)}\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(nested)\n\u001b[0;32m----> 5\u001b[0m \u001b[43mfrozen_nums\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43madd\u001b[49m(\u001b[38;5;241m4\u001b[39m)\n", "\u001b[0;31mAttributeError\u001b[0m: 'frozenset' object has no attribute 'add'" ] } ], "source": [ "nested = {frozen_nums, frozenset(\"ABC\")}\n", "\n", "print(nested)\n", "\n", "frozen_nums.add(4)" ] }, { "cell_type": "code", "execution_count": 56, "id": "690909f7-5bd4-4040-94ee-fad99d4e6c85", "metadata": { "tags": [] }, "outputs": [], "source": [ "xx = set(\"hello\")" ] }, { "cell_type": "code", "execution_count": 57, "id": "72ed3c15-7a07-40a1-b5bd-af08213283a9", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'e', 'a', 'u', 'o', 'i'}\n" ] } ], "source": [ "vowels = set(\"aeiou\")\n", "print(vowels)" ] }, { 
"cell_type": "code", "execution_count": 58, "id": "b3c19f35-ced1-44a9-b82a-c6ef116a7a63", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'h', 'l'}" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xx - vowels" ] }, { "cell_type": "markdown", "id": "3037023e", "metadata": { "slideshow": { "slide_type": "slide" }, "toc-hr-collapsed": true }, "source": [ "## Mutability\n", "\n", "Mutable values can be changed in place.\n", "\n", "We've seen that `list` was mutable, and `dict` and `set` as well now.\n", "\n", "#### Mutable\n", " - `list`\n", " - `dict`\n", " - `set`\n", " \n", "#### Immutable\n", " - `str`\n", " - `tuple`\n", " - `frozenset`\n", " - scalars: `int`, `float`, `complex`, `bool`, `None`" ] }, { "cell_type": "code", "execution_count": null, "id": "38a9d1bf", "metadata": {}, "outputs": [], "source": [ "# list\n", "d = [1, 2, 3]\n", "d.append(4)\n", "print(d)" ] }, { "cell_type": "code", "execution_count": null, "id": "62198c82", "metadata": {}, "outputs": [], "source": [ "# str\n", "s = \"Hello\"\n", "s = s + \" World\"\n", "s\n", "\n", "# how did s change?" 
] }, { "cell_type": "code", "execution_count": null, "id": "b52f9779-83e9-4781-a785-3c8fff6341cf", "metadata": { "tags": [] }, "outputs": [], "source": [ "s = \"Hello World\"\n", "t = s.lower()\n", "print(s)\n", "print(t)" ] }, { "cell_type": "code", "execution_count": null, "id": "5a9bf4ab-dbcd-4dd3-a928-89d33ce88273", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.15" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "305.8px" }, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }