{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Data Model\n", "\n", "## Python Data Model\n", "\n", "Reference: https://docs.python.org/3/reference/datamodel.html\n", "\n", "### StaticArray Example\n", "\n", "To demonstrate operator overloading, we'll implement a sequence type seen in other languages known as a *static array*:\n", "\n", "- A static array is a sequence type (review: what makes a sequence type?) with a fixed capacity, that is, a fixed number of items the collection can hold.\n", "\n", "- Resizing the array is not allowed after initialization.\n", "\n", "- We will define a class ``StaticArray`` that will allow us to use built-in Python operators.\n", "\n", "We'll be able to use it like this:\n", "\n", "```python\n", ">>> from static_array import StaticArray\n", ">>> sa = StaticArray([1, 2, 3])\n", ">>> print(sa * 2)\n", "# should produce the following output:\n", "# [1, 2, 3, 1, 2, 3]\n", ">>> print(sa[1])\n", "2\n", "```" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "from collections.abc import Iterable\n", "\n", "\n", "class StaticArray:\n", "    def __init__(self, init_val, capacity=5):\n", "        if isinstance(init_val, Iterable):\n", "            self.items = list(init_val)\n", "            self.capacity = len(self.items)\n", "        else:\n", "            self.items = [init_val] * capacity\n", "            self.capacity = capacity\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sa = StaticArray([1, 2, 3])\n", "print(sa)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sa = StaticArray(0, 5)\n", "print(sa)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the default output is not very helpful; we can fix that by defining a __repr__ method\n", "\n", "\n", "class StaticArray:\n", "    def __init__(self, init_val, capacity=5):\n", "        if isinstance(init_val, Iterable):\n", "            self.items = list(init_val)\n", "            self.capacity = len(self.items)\n", "        else:\n", "            self.items = [init_val] * capacity\n", "            self.capacity = capacity\n", "\n", "    def __repr__(self):\n", "        return f\"StaticArray({self.items})\"\n", "\n", "\n", "sa = StaticArray([1, 2, 3])\n", "print(sa)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sa = StaticArray(0, 5)\n", "print(sa)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Emulating Collections & Sequences\n", "\n", "**Collections**\n", "\n", "- Have a length: `len(obj)`\n", "- Can be iterated over: `for item in obj`\n", "- Can be queried for membership: `item in obj`\n", "\n", "**Sequences**\n", "\n", "- Everything a collection can do\n", "- Can be indexed: `obj[0]`\n", "\n", "| You Write... | Python Calls... |\n", "|--------------|-----------------|\n", "| `len(obj)` | `obj.__len__()` |\n", "| `for item in obj` | `obj.__iter__()` |\n", "| `item in obj` | `obj.__contains__(item)` |\n", "| `obj[i]` | `obj.__getitem__(i)` |\n", "| `obj[i] = x` | `obj.__setitem__(i, x)` |\n", "| `del obj[i]` | `obj.__delitem__(i)` |\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class StaticArray:\n", "    def __init__(self, init_val, capacity=5):\n", "        if isinstance(init_val, Iterable):\n", "            self.items = list(init_val)\n", "            self.capacity = len(self.items)\n", "        else:\n", "            self.items = [init_val] * capacity\n", "            self.capacity = capacity\n", "\n", "    def __repr__(self):\n", "        return f\"StaticArray({self.items})\"\n", "\n", "    def __str__(self):\n", "        return f\"StaticArray({self.items})\"\n", "\n", "    def __len__(self):\n", "        return self.capacity\n", "\n", "    def __contains__(self, item):\n", "        return item in self.items\n", "\n", "    def __getitem__(self, index):\n", "        if index >= self.capacity or index < -self.capacity:\n", "            raise IndexError(\"Index out of range\")\n", "        return self.items[index]\n", "\n", "    def __setitem__(self, index, val):\n", "        if 
index >= self.capacity or index < -self.capacity:\n", "            raise IndexError(\"Index out of range\")\n", "        self.items[index] = val\n", "\n", "    def __delitem__(self, index):\n", "        raise NotImplementedError(\"StaticArray does not support deletion\")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sa = StaticArray([1, \"hi\", 3.14, True])\n", "len(sa)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(42 in sa)\n", "print(\"hi\" in sa)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sa[3]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sa[42] = \"hello\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Numeric Operators\n", "\n", "| You Write... | Python Calls... |\n", "|--------------|-----------------|\n", "| `x + y` | `x.__add__(y)` |\n", "| `x - y` | `x.__sub__(y)` |\n", "| `x * y` | `x.__mul__(y)` |\n", "| `x / y` | `x.__truediv__(y)` |\n", "| `x // y` | `x.__floordiv__(y)` |\n", "| `x % y` | `x.__mod__(y)` |\n", "| `x ** y` | `x.__pow__(y)` |\n", "| `x @ y` | `x.__matmul__(y)` |\n", "\n", "### Reverse / Reflected / Right Operators\n", "\n", "Python falls back to these methods on the right-hand operand when the left-hand operand does not implement the operation (its method is missing or returns `NotImplemented`).\n", "\n", "| You Write... | Python Calls... |\n", "|--------------|-----------------|\n", "| `x + y` | `y.__radd__(x)` |\n", "| `x - y` | `y.__rsub__(x)` |\n", "| `x * y` | `y.__rmul__(x)` |\n", "| `x / y` | `y.__rtruediv__(x)` |\n", "| `x // y` | `y.__rfloordiv__(x)` |\n", "| `x % y` | `y.__rmod__(x)` |\n", "| `x ** y` | `y.__rpow__(x)` |\n", "| `x @ y` | `y.__rmatmul__(x)` |\n", "\n", "![](images/reverse_operators.png)\n", "\n", "### Comparison Operators\n", "\n", "| You Write... | Python Calls... 
|\n", "|--------------|-----------------| \n", "| `x < y` | `x.__lt__(y)` |\n", "| `x <= y` | `x.__le__(y)` |\n", "| `x > y` | `x.__gt__(y)` |\n", "| `x >= y` | `x.__ge__(y)` |\n", "| `x == y` | `x.__eq__(y)` |\n", "| `x != y` | `x.__ne__(y)` |\n", "\n", "\n", "### Iteration\n", "\n", "Iterable vs. Iterator (Review)\n", "\n", "Objects like lists, tuples, and strings are *iterable*.\n", "\n", "To keep track of the position within a given iteration (for loop, calls to `next`), Python uses an *iterator*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ll = [1, 2, 3, 4]\n", "iterator = iter(ll)\n", "print(\"iterator 1 next()\", next(iterator))\n", "print(\"iterator 1 next()\", next(iterator))\n", "iterator2 = iter(ll)\n", "print(\"iterator 2 next()\", next(iterator2))\n", "print(\"iterator 1 next()\", next(iterator))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To be iterable, a class needs an `__iter__` method that returns an iterator.\n", "\n", "An iterator is an object with a `__next__` method that returns the next item in the iteration. It should raise `StopIteration` when there are no more items.\n", "\n", "Common Pattern: If a class only needs to be iterable once, it can return itself as the iterator, thus fulfilling both roles." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "iterable = [1, 2, 3]\n", "for i in iterable:\n", "    print(i)\n", "\n", "# the for loop above is roughly equivalent to:\n", "iterator = iter(iterable)\n", "while True:\n", "    try:\n", "        print(next(iterator))\n", "    except StopIteration:\n", "        break" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "class SimpleRange:\n", "    def __init__(self, n):\n", "        self.current = 0\n", "        self.n = n\n", "\n", "    def __iter__(self):\n", "        print(\"iter has been called\")\n", "        return self\n", "\n", "    def __next__(self):\n", "        if self.current >= self.n:\n", "            print(\"at the end\")\n", "            raise StopIteration\n", "        else:\n", "            print(f\"next was called, moving {self.current} to {self.current+1}\")\n", "            self.current += 1\n", "            return self.current - 1\n", "\n", "    def __repr__(self):\n", "        return f\"SimpleRange({self.n}, current={self.current})\"\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "iter has been called\n", "next was called, moving 0 to 1\n", "iter has been called\n", "next was called, moving 1 to 2\n", "0 1\n", "next was called, moving 2 to 3\n", "0 2\n", "at the end\n", "at the end\n" ] } ], "source": [ "sr = SimpleRange(3)\n", "for i in sr:\n", "    for j in sr:\n", "        print(i, j)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sr = SimpleRange(5)\n", "siter = iter(sr)\n", "print(siter)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "siter is sr" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "next(siter)\n", "print(siter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Iteration Advice\n", "\n", "1. Do not implement ``__next__()`` in a class that should only be an iterable.\n", "2. To support multiple traversals, the iterator must be a separate object.\n", "3. 
A common design pattern is to delegate iteration to a separate iterator class.\n", "\n", "For example, we can define a ``StaticArrayIterator`` class that is in charge of iterating through the items of a ``StaticArray`` object." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "# Adding __iter__ to StaticArray\n", "class StaticArrayIterator:\n", "    def __init__(self, values):\n", "        self.values = values\n", "        self.position = 0\n", "\n", "    def __iter__(self):\n", "        # iterators return themselves from __iter__\n", "        return self\n", "\n", "    def __next__(self):\n", "        if self.position >= len(self.values):\n", "            raise StopIteration\n", "        item = self.values[self.position]\n", "        self.position += 1\n", "        return item\n", "\n", "    def __repr__(self):\n", "        return f\"iterating over {self.values}, at position {self.position}\"\n", "\n", "\n", "# Adding __iter__\n", "\n", "\n", "class StaticArray:\n", "    def __init__(self, capacity, initial=None):\n", "        self._items = [initial] * capacity\n", "        self._capacity = capacity\n", "\n", "    @classmethod\n", "    def from_iterable(cls, iterable):\n", "        # materialize first so iterables without len() also work\n", "        items = list(iterable)\n", "        new_array = cls(len(items))\n", "        for idx, item in enumerate(items):\n", "            new_array._items[idx] = item\n", "        return new_array\n", "\n", "    def __repr__(self):\n", "        # __repr__ is the unambiguous string representation\n", "        # of an object\n", "        return f\"StaticArray({self._capacity}, {self._items})\"\n", "\n", "    def __str__(self):\n", "        return repr(self._items)\n", "\n", "    # Sequence Operations ####\n", "\n", "    def __len__(self):\n", "        return self._capacity\n", "\n", "    def __contains__(self, x):\n", "        return x in self._items\n", "\n", "    def __getitem__(self, i):\n", "        if i >= self._capacity or i < -self._capacity:\n", "            raise IndexError  # an invalid index\n", "        return self._items[i]\n", "\n", "    def __setitem__(self, i, x):\n", "        if i >= self._capacity or i < -self._capacity:\n", "            raise IndexError  # an invalid index\n", "        self._items[i] = x\n", "\n", "    def __delitem__(self, i):\n", "        raise 
NotImplementedError(\"Cannot delete from a static array\")\n", "\n", "    # Iterable Operations ####\n", "    def __iter__(self):\n", "        return StaticArrayIterator(self._items.copy())\n" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 3, 4, 5]\n", "1 1\n", "1 2\n", "1 3\n", "1 4\n", "1 5\n", "2 1\n", "2 2\n", "2 3\n", "2 4\n", "2 5\n", "3 1\n", "3 2\n", "3 3\n", "3 4\n", "3 5\n", "4 1\n", "4 2\n", "4 3\n", "4 4\n", "4 5\n", "5 1\n", "5 2\n", "5 3\n", "5 4\n", "5 5\n" ] } ], "source": [ "sa = StaticArray(5, 2)\n", "sa[0] = 1\n", "sa[1] = 2\n", "sa[2] = 3\n", "sa[3] = 4\n", "sa[4] = 5\n", "print(sa)\n", "for x in sa:\n", "    for y in sa:\n", "        print(x, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Context Managers / `with`\n", "\n", "We also saw this idea of needing to clean up after ourselves when we used `with` to open files.\n", "\n", "```python\n", "with open(filename) as f:\n", "    # do things with f\n", "    g(f)\n", "# f is guaranteed to be closed even if\n", "# exceptions are raised within the with block\n", "```\n", "\n", "### Yet another set of dunder methods...\n", "\n", "A context manager defines `__enter__`, which runs on entry to the `with` block and whose return value is bound by `as`, and `__exit__`, which runs when the block exits, even if an exception was raised." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class DatabaseConnection:\n", "    def __init__(self, username, password):\n", "        # connect to database\n", "        self.username = username\n", "        self.password = password\n", "        self.connected = True\n", "\n", "    def __enter__(self):\n", "        print(\"__enter__\")\n", "        # the return value is what `as` binds, so return self\n", "        return self\n", "\n", "    def __exit__(self, exc_type, exc_val, exc_traceback):\n", "        print(\"__exit__\")\n", "        if exc_type:\n", "            print(\"rolling back changes\")\n", "        self.connected = False\n", "        # returning None (falsy) lets any exception propagate\n", "\n", "    def query(self, sql):\n", "        print(\"ran query\", sql)\n", "\n", "    def __repr__(self):\n", "        return f\"Connection connected={self.connected}\"\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "db = DatabaseConnection(\"hello\", \"world\")\n", "db.query(\"SELECT * FROM users;\")\n", "# do something dangerous\n", "1 / 0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# our connection is possibly left in a broken state\n", "print(db)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with DatabaseConnection(\"hello\", \"world\") as db:\n", "    # __enter__\n", "    db.query(\"SELECT * from users;\")\n", "    1 / 0\n", "    # __exit__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# changes were rolled back, and our connection is safe\n", "db.connected\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Callable Objects Examples\n", "\n", "Functions have a few attributes like `__name__` and `__doc__` that we can use to introspect them." 
] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "add\n", "Adds two numbers\n" ] } ], "source": [ "def add(x, y):\n", "    \"\"\"Adds two numbers\"\"\"\n", "    return x + y\n", "\n", "\n", "print(add.__name__)\n", "print(add.__doc__)\n", "\n", "x = add" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "x.__name__" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "tags": [] }, "outputs": [], "source": [ "class Example:\n", "    def __init__(self, name):\n", "        self.name = name\n", "        self.num_calls = 0\n", "\n", "    def __call__(self, *args):\n", "        print(self.num_calls)\n", "        self.num_calls += 1\n", "        print(self.name, \"got\", args)\n", "\n", "example = Example(\"one\")\n", "two = Example(\"two\")" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "7\n", "one got (1, 2, 3)\n" ] } ], "source": [ "example(1, 2, 3)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "two got ()\n" ] } ], "source": [ "two()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Functions also have a `__call__` method, and defining `__call__` on our own classes makes their instances callable. 
For example:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "class Memoized:\n", "    def __init__(self, func):\n", "        self.cache = {}\n", "        self.wrapped_func = func\n", "\n", "    def __call__(self, *args):\n", "        if args not in self.cache:\n", "            self.cache[args] = self.wrapped_func(*args)\n", "        return self.cache[args]\n" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "running expensive_func\n", "6\n", "6\n" ] } ], "source": [ "@Memoized\n", "def expensive_func(a, b, c):\n", "    print(\"running expensive_func\")\n", "    return a + b + c\n", "\n", "# the decorator is equivalent to: expensive_func = Memoized(expensive_func)\n", "\n", "print(expensive_func(1, 2, 3))\n", "print(expensive_func(1, 2, 3))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class PartialFunc:\n", "    # simplified functools.partial\n", "\n", "    def __init__(self, func, *args, **kwargs):\n", "        self.func = func\n", "        self.args = args\n", "        self.kwargs = kwargs\n", "\n", "    def __call__(self, *args, **kwargs):\n", "        temp_kwargs = self.kwargs.copy()\n", "        temp_kwargs.update(kwargs)\n", "        return self.func(*(self.args + args), **temp_kwargs)\n", "\n", "    @property\n", "    def __name__(self):\n", "        return f\"{self.func.__name__}(args={self.args} kwargs={self.kwargs})\"\n", "\n", "    @property\n", "    def __doc__(self):\n", "        return self.func.__doc__\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def add(x, y):\n", "    \"\"\"Adds two numbers\"\"\"\n", "    return x + y\n", "\n", "add_5 = PartialFunc(add, 5)\n", "print(add_5(10))\n", "\n", "print(add_5.__name__)\n", "print(add_5.__doc__)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { 
"display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.15" } }, "nbformat": 4, "nbformat_minor": 4 }