01 and 02

James Turk 2024-09-30 00:23:37 -05:00
parent 694be1e596
commit 23a364774e
6 changed files with 183 additions and 206 deletions


@ -1327,7 +1327,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 1,
"id": "f3b01815",
"metadata": {
"slideshow": {
@ -1353,12 +1353,12 @@
"\n",
"bad_tuple = (1+492)\n",
"\n",
"print(bad_tuple)"
"print(bad_tuple) # why is this not a tuple?"
]
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 2,
"id": "f6c54b13",
"metadata": {
"slideshow": {
@ -1368,16 +1368,12 @@
},
"outputs": [],
"source": [
"multi_item = (1, 2.0, \"three\")\n",
"\n",
"# parentheses are optional\n",
"\n",
"multi_item2 = 1, 2.0"
"multi_item = (1, 2.0, \"three\")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 4,
"id": "310a20b3-ef99-43cf-a9eb-60674beb8967",
"metadata": {
"tags": []
@ -1387,12 +1383,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 2.0)\n"
"(1, 2.0, 'three')\n"
]
}
],
"source": [
"print(multi_item2)"
"print(multi_item)"
]
},
{
@ -2551,7 +2547,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
"version": "3.10.15"
},
"toc": {
"base_numbering": 1,
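Review note on the tuple cells in this notebook: the new `# why is this not a tuple?` comment is left as a question. A minimal sketch of the answer follows; `one_item` is an illustrative name that does not appear in the notebook. Parentheses alone only group an expression, and it is the comma that creates a tuple.

```python
# Parentheses alone only group an expression; the result is a plain int.
bad_tuple = (1 + 492)

# A trailing comma is what makes a one-item tuple.
# (one_item is an illustrative name, not taken from the notebook)
one_item = (1 + 492,)

# With several items the commas are already there, so parentheses are optional.
multi_item = 1, 2.0, "three"

print(type(bad_tuple).__name__)  # int
print(type(one_item).__name__)   # tuple
print(multi_item)                # (1, 2.0, 'three')
```

This is also why the removed `multi_item2 = 1, 2.0` cell worked without parentheses.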


@ -29,7 +29,7 @@
"source": [
"## Iteration\n",
"\n",
"Last week we ended on `for` loops.\n",
"Last week we introduced `for` loops.\n",
"\n",
"```\n",
"for var_name in iterable:\n",
@ -38,6 +38,8 @@
"\n",
"What is an **iterable**? Why not just say **sequence**?\n",
"\n",
"What **sequences** have we seen?\n",
"\n",
"### More Iterables"
]
},
@ -60,7 +62,7 @@
"\n",
"Same rules as slice, always **inclusive** of start, **exclusive** of stop.\n",
"\n",
"*or as you'd write mathematically:* ```[start, stop)``` -- we've seen this before with slicing"
"or as you might write: ```[start, stop)``` -- we've seen this before with slicing"
]
},
{
@ -154,14 +156,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "7ed88d5c-e848-46f8-8fc5-8d48fadc303e",
"metadata": {
"tags": []
},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 A\n",
"1 B\n",
"2 C\n"
]
}
],
"source": [
"\n",
"i = 0\n",
"for x in [\"A\", \"B\", \"C\"]:\n",
" print(i, x)\n",
@ -2602,7 +2613,7 @@
"toc-hr-collapsed": true
},
"source": [
"## Functions\n",
"## Functions Revisited\n",
"\n",
"A function is a set of statements that can be called more than once.\n",
"\n",
@ -2667,84 +2678,6 @@
"This means mutability determines whether or not a function can modify a parameter in the outer scope."
]
},
{
"cell_type": "markdown",
"id": "d9f602cd-d726-47fa-b3d6-ec51a86f760d",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### return\n",
"\n",
"- `return` may appear anywhere in a function body, including multiple times.\n",
"\n",
"- The first `return` encountered exits the function.\n",
"\n",
"- Every function in python returns a value. \n",
"\n",
"- If no `return` statement is present, `None` is implicitly returned."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3da80043-16b9-47e7-9aa2-157b5fa29ea0",
"metadata": {},
"outputs": [],
"source": [
"def is_even(num):\n",
" return num % 2 == 0\n",
"\n",
"\n",
"print(is_even(3))"
]
},
{
"cell_type": "markdown",
"id": "31cd6b0d-6932-43cb-94ea-183fbe3491b4",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### `pass` statement\n",
"\n",
"Can be used whenever you need to leave a block empty. Usually temporarily.\n",
"\n",
"```python\n",
"\n",
"if x < 0:\n",
" pass # TODO: figure this out later\n",
"\n",
"\n",
"def func():\n",
" pass\n",
"```\n",
"\n",
"**What does func return?**"
]
},
{
"cell_type": "markdown",
"id": "61fdaebf-2b45-4c83-a4c0-f46bbeced5c1",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### docstrings\n",
"\n",
"Functions should provide docstrings, which are strings declared as the first statement within the function body.\n",
"\n",
"Almost always use triple-quotes to allow multi-line formatting.\n",
"\n",
"The style guide & assignments show examples of the format we expect."
]
},
{
"cell_type": "markdown",
"id": "4729a9a2-66af-4d9c-af2e-441c8486a92c",
@ -2753,7 +2686,7 @@
"toc-hr-collapsed": true
},
"source": [
"# I/O"
"## I/O"
]
},
{
@ -2761,7 +2694,7 @@
"id": "59141a54-daf9-4a55-ad87-b80d34260378",
"metadata": {},
"source": [
"## `print()`\n",
"### `print()`\n",
"\n",
"`print(*objects, sep=' ', end='\\n', file=sys.stdout, flush=False)`\n",
"\n",
@ -2807,27 +2740,67 @@
},
{
"cell_type": "markdown",
"id": "2bc22aaa-6192-4d5d-a01f-4c36e8e41ac0",
"id": "d07f8320-b048-477b-8b59-f171b5dbecd3",
"metadata": {},
"source": [
"## Files\n",
"### pathlib\n",
"\n",
"Another built in type in Python.\n",
"There are a few ways of working with files in Python, mostly due to improvements over time.\n",
"\n",
"Requires us to understand a bit more about how files & memory work.\n",
"You'll still sometimes see code that uses the older method with `open`, but there's almost no reason to write code in that style now that `pathlib` is widely available.\n",
"\n",
"### Typical workflow:\n",
"To use `pathlib`, you'll need to import the `Path` class. (We'll discuss these imports more soon.)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "118ba56e-91c3-4ad9-935e-2a44d1fd064c",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path"
]
},
{
"cell_type": "markdown",
"id": "9c24256b-e9df-44da-884c-db0670bda68b",
"metadata": {},
"source": [
"Imports like this should be at the top of the file.\n",
"\n",
"To use this type you'll create objects with file paths, for example:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "c07cd9ee-48bc-4ae1-b813-20380bb2733d",
"metadata": {},
"outputs": [],
"source": [
"# this looks like a function call\n",
"# but the capital letter denotes that this is instead a class\n",
"file_path = Path(\"data/names.txt\")"
]
},
{
"cell_type": "markdown",
"id": "cfec4041-a186-40ed-9967-e096d4f11ffb",
"metadata": {},
"source": [
"#### Typical workflow:\n",
"\n",
"- Read contents of file(s) from disk into working memory.\n",
"- Parse and/or manipulate data as needed.\n",
"- (Optional) Write data back to disk with modifications.\n",
"\n",
"### Other Workflows\n",
"#### Other Workflows\n",
"\n",
"- Append-only (e.g. logging)\n",
"- Streaming data (needed for large files where we can't fit into memory)\n",
"\n",
"### Text vs. Binary\n",
"#### Text vs. Binary\n",
"\n",
"We're opening our files in the default, text mode. It is also possible to open files in a binary mode where it isn't assumed we're reading strings."
]
@ -2850,74 +2823,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"id": "e205aaba",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"data/emails.txt\n"
]
}
],
"source": [
"# to access a file's contents, we need to open it\n",
"fd = open(\"emails.txt\")\n",
"\n",
"print(fd)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a4277ac",
"metadata": {},
"outputs": [],
"source": [
"# fd is a `file` object, we can use methods to read from the file\n",
"emails = fd.read()\n",
"print(type(emails))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd55b361",
"metadata": {},
"outputs": [],
"source": [
"# read() got all the data at once, split with \\n newlines\n",
"\n",
"# We can also iterate over the lines in the file\n",
"\n",
"fd.readlines()"
]
},
{
"cell_type": "markdown",
"id": "86dc9c0c-0712-4100-8da1-58bc084ed08c",
"metadata": {},
"source": [
"Open files have a 'cursor', we've reached the end of the file (EOF) so there isn't more to read."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f3691c99",
"metadata": {},
"outputs": [],
"source": [
"# if we use 'seek' we can rewind to the beginning of the file\n",
"fd.seek(0)\n",
"fd.readlines()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0e12a4c5",
"metadata": {},
"outputs": [],
"source": [
"# we can also iterate over the file\n",
"f = open(\"emails.txt\")\n",
"for email in f.readlines():\n",
" print(email.strip()) # extra newline?"
"# to access a file's contents, we create the path, and then\n",
"# use read_text()\n",
"emails_path = Path(\"data/emails.txt\")\n",
"emails = emails_path.read_text()"
]
},
{
@ -2932,48 +2854,55 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 22,
"id": "d0aa3bf0",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[38;5;8m───────┬────────────────────────────────────────────────────────────────────────\u001b[0m\n",
" \u001b[38;5;8m│ \u001b[0mFile: \u001b[1mdata/animals.txt\u001b[0m <EMPTY>\n",
"\u001b[38;5;8m───────┴────────────────────────────────────────────────────────────────────────\u001b[0m\n"
]
}
],
"source": [
"!rm names.txt\n",
"names_file = Path(\"data/animals.txt\").open(\"w\")\n",
"names_file.write(\"Aardvark\\nChimpanzee\\nElephant\\n\")\n",
"\n",
"f = open(\"names.txt\", \"w\")\n",
"f.write(\"Bob\\nPhil\\n\")\n",
"f.write(\"Sally\\n\")\n",
"f.write(\"Rebecca\\n\")\n",
"f.write(\"Joan\\n\")\n",
"f.close()\n",
"\n",
"!cat names.txt"
"# (the ! indicates this is a shell command, not Python)\n",
"!cat data/animals.txt"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 23,
"id": "d9e2b317",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[38;5;8m───────┬────────────────────────────────────────────────────────────────────────\u001b[0m\n",
" \u001b[38;5;8m│ \u001b[0mFile: \u001b[1mdata/animals.txt\u001b[0m\n",
"\u001b[38;5;8m───────┼────────────────────────────────────────────────────────────────────────\u001b[0m\n",
"\u001b[38;5;8m 1\u001b[0m \u001b[38;5;8m│\u001b[0m \u001b[37mAardvark\u001b[0m\n",
"\u001b[38;5;8m 2\u001b[0m \u001b[38;5;8m│\u001b[0m \u001b[37mChimpanzee\u001b[0m\n",
"\u001b[38;5;8m 3\u001b[0m \u001b[38;5;8m│\u001b[0m \u001b[37mElephant\u001b[0m\n",
"\u001b[38;5;8m 4\u001b[0m \u001b[38;5;8m│\u001b[0m \u001b[37mKangaroo\u001b[0m\n",
"\u001b[38;5;8m───────┴────────────────────────────────────────────────────────────────────────\u001b[0m\n"
]
}
],
"source": [
"f = open(\"names.txt\", \"a\")\n",
"f.write(\"Hector\\n\")\n",
"f.flush()\n",
"!cat names.txt"
]
},
{
"cell_type": "markdown",
"id": "d8b46e54-4aca-4587-9a02-899bc04bbf5c",
"metadata": {},
"source": [
"**Important:** Opening in write mode clears the contents of the file.\n",
"\n",
"\"r\" : Read (default)\n",
"\n",
"\"w\" : Write\n",
"\n",
"\"a\" : Append"
"# open(\"w\") erases the file; use \"a\" if you want to append\n",
"names_file = Path(\"data/animals.txt\").open(\"a\")\n",
"names_file.write(\"Kangaroo\\n\")\n",
"names_file.flush()\n",
"!cat data/animals.txt"
]
},
{
@ -2981,9 +2910,13 @@
"id": "554800e9-bc9e-4c03-94f8-34da10d205fe",
"metadata": {},
"source": [
"#### `close`\n",
"#### `flush` and `close`\n",
"\n",
"Very important to close a file.\n",
"`flush` ensures that the in-memory contents actually get written out to the file, so they are saved.\n",
"\n",
"(Analogy: program crashes and you lose your unsaved work)\n",
"\n",
"When you are finished with a file, it is important to `close` it.\n",
"\n",
"- Frees resources.\n",
"- Allows other programs to access file contents.\n",
@ -3003,7 +2936,7 @@
"\n",
"```python\n",
"\n",
"with open(filename) as variable:\n",
"with path.open() as variable:\n",
" statement1\n",
" statement2\n",
"```\n",
@ -3126,10 +3059,46 @@
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "02eb6878-7a2d-46b0-a0f3-e7aab18ebe11",
"metadata": {},
"source": [
"### Note: Relative Paths\n",
"\n",
"You may find that if you are running your code from, for example, the homework1 directory instead of homework1/problem3, you'd need to modify this path to be `Path(\"problem3/towing.csv\")`.\n",
"\n",
"That is because by default, paths are *relative*, meaning that they are assumed to start in the directory that you are running your code from.\n",
"\n",
"This can be frustrating at first: you want your code to work the same regardless of what directory you are in.\n",
"\n",
"### Building an absolute path\n",
"\n",
"To get around this, you can construct an absolute path:\n",
"\n",
"First, you can use the special `__file__` variable, which contains the path to the current file when your code is run as a script or imported as a module.\n",
"\n",
"Then you can use that as the \"anchor\" of your path, and navigate from there.\n",
"\n",
"A common pattern then is to get the current file's parent, and navigate from there:\n",
"\n",
"```python\n",
"from pathlib import Path\n",
"\n",
"path = Path(__file__).parent / \"towing.csv\"\n",
"```\n",
"\n",
"This line uses the special built-in variable `__file__` to get the path of the Python file itself.\n",
"It then gets this file's parent directory (`.parent`) and appends the filename \"towing.csv\" to it.\n",
"\n",
"Using this technique in your code allows you to set paths that don't depend on the current working directory.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0ab6c18-cf27-41a9-96fb-bf1fe78dfdca",
"id": "5a9bf4ab-dbcd-4dd3-a928-89d33ce88273",
"metadata": {},
"outputs": [],
"source": []
@ -3151,7 +3120,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
"version": "3.10.15"
},
"toc": {
"base_numbering": 1,
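Review note on the file I/O cells in this notebook: the write and append cells leave `names_file` open, which is why the first `!cat` shows an empty file. A minimal sketch of the same steps using the `with path.open()` form the notebook introduces later, so each file is flushed and closed automatically. The temporary directory and the names `data_dir`, `animals_path`, and `animals_file` are assumptions made so the sketch runs anywhere without touching `data/`.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the sketch does not touch real data.
data_dir = Path(tempfile.mkdtemp())
animals_path = data_dir / "animals.txt"

# "w" erases any existing contents; the with block flushes and closes for us.
with animals_path.open("w") as animals_file:
    animals_file.write("Aardvark\nChimpanzee\nElephant\n")

# "a" appends to the end of the file instead of erasing it.
with animals_path.open("a") as animals_file:
    animals_file.write("Kangaroo\n")

# read_text() opens the file, reads everything into one string, and closes it.
animals = animals_path.read_text().splitlines()
print(animals)
```

Because the `with` block closes the file on exit, no explicit `flush()` or `close()` call is needed here.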

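Review note on the "Building an absolute path" cell: `__file__` is typically not defined inside a notebook cell, so the pattern is easier to try when wrapped in a small helper that takes the anchor file as a parameter. The sketch below is one way to do that; `path_next_to` and the `solution.py` location are hypothetical names used only for illustration.

```python
from pathlib import Path


def path_next_to(anchor_file, filename):
    """Return the path of `filename` in the same directory as `anchor_file`."""
    return Path(anchor_file).parent / filename


# In a script you would pass __file__ as the anchor:
#     towing_path = path_next_to(__file__, "towing.csv")
# Here a made-up location is used so the sketch also runs in a notebook.
towing_path = path_next_to("homework1/problem3/solution.py", "towing.csv")
print(towing_path)
```

The result is only as absolute as the anchor passed in; calling `.resolve()` on the anchor path is a common extra step when a fully absolute path is required.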
4
data/animals.txt Normal file

@ -0,0 +1,4 @@
Aardvark
Chimpanzee
Elephant
Kangaroo

3
data/cnetids.txt Normal file

@ -0,0 +1,3 @@
borja
jturk
lamonts

3
data/emails.txt Normal file

@ -0,0 +1,3 @@
borja@cs.uchicago.edu
jturk@uchicago.edu
lamonts@uchicago.edu

2
data/names.txt Normal file

@ -0,0 +1,2 @@
Bob
Phil