51042-notes/18.testing.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Testing\n",
    "\n",
    "As you've no doubt discovered through working on PAs, tests are useful for ensuring that code works as expected.\n",
    "\n",
    "We break our code into functions and classes to encapsulate functionality that we intend to reuse.\n",
    "These boundaries also provide a natural place to test our code.\n",
    "\n",
    "If you only have one big function that does everything, it can be difficult to test:\n",
    "\n",
    "```python\n",
    "\n",
    "def advance_all_students_with_passing_grades():\n",
    "    conn = sqlite3.connect('students.db')\n",
    "    c = conn.cursor()\n",
    "    c.execute('''\n",
    "    SELECT student.id, avg(class.grade) as average \n",
    "    FROM students JOIN classes ON students.id = classes.student_id\n",
    "    GROUP BY student.id HAVING average >= 70\n",
    "    ''')\n",
    "    students = c.fetchall()\n",
    "    for student in students:\n",
    "        c.execute('UPDATE student_enrollment SET grade = grade + 1 WHERE student_id = ?', (student[0],))\n",
    "    conn.commit()\n",
    "    conn.close()\n",
    "\n",
    "```\n",
    "\n",
    "How would you test this function?  You'd need to have a database with a specific set of data in it and then run the function and check that the data was updated as expected.\n",
    "\n",
    "If you break the function up into smaller functions, you can test each function in isolation:\n",
    "\n",
    "```python\n",
    "\n",
    "def get_students(conn, grade_threshold=70):\n",
    "    ...\n",
    "\n",
    "def advance_student(conn, student):\n",
    "    ...\n",
    "\n",
    "```\n",
    "\n",
    "Now you can test each function in isolation.\n",
    "\n",
    "By having the function take parameters, you can also test the function with different inputs.\n",
    "\n",
    "It is also possible to test the function with a mock database connection that doesn't actually connect to a database but provides the same interface.\n",
    "\n",
    "This is called \"mocking\" and is a useful technique for testing code that interacts with external systems.\n",
    "\n",
    "## `pytest`\n",
    "\n",
    "`pytest` is a popular testing framework for Python, and the one we've been using in class.\n",
    "\n",
    "It is easy to use and provides a lot of useful features.  Python has a built in `unittest` module, but it is more verbose and less flexible.\n",
    "\n",
    "`pytest` provides both a command line tool `pytest`, which you've been using, and a library that you can use to help you write tests.\n",
    "\n",
    "## `pytest` command line tool\n",
    "\n",
    "When you run `pytest`, it will look for files named `test_*.py` in the current directory and its subdirectories. It will then run any functions in those files that start with `test_`.\n",
    "\n",
    "If you take a look at any PA, you'll see that there are files named `test_*.py` in the `tests` directory.  This is a common convention, but you can also place the files in other directories.\n",
    "\n",
    "Within each `test_module1.py` file, there are functions that start with `test_`.  These are the tests that `pytest` will run.  You can include other functions in the file as helper functions, but they won't be run by `pytest`.\n",
    "\n",
    "### Simple Example\n",
    "\n",
    "```python\n",
    "# my_module.py\n",
    "from collections import namedtuple\n",
    "\n",
    "Point = namedtuple('Point', ['x', 'y'])\n",
    "\n",
    "def circle_contains(radius: float, center: Point, point: Point):\n",
    "    return (point.x - center.x) ** 2 + (point.y - center.y) ** 2 <= radius ** 2\n",
    "\n",
    "def points_within(radius: float, center: Point, points: list[Point]):\n",
    "    \"\"\" Find all points within a circle. \"\"\"\n",
    "    return [point for point in points if circle_contains(radius, center, point)]\n",
    "```\n",
    "\n",
    "```python\n",
    "# test_my_module.py\n",
    "\n",
    "from my_module import circle_contains, points_within\n",
    "\n",
    "origin = Point(0, 0)\n",
    "\n",
    "def test_circle_contains():\n",
    "    # centered at origin, radius 1\n",
    "    result = circle_contains(1, origin, origin)\n",
    "    assert result\n",
    "    assert circle_contains(1, origin, Point(.5, .5)), \"circle should contain 0.5, 0.5\"\n",
    "\n",
    "def test_circle_contains_edge():\n",
    "    assert circle_contains(1, origin, Point(1, 0))  # on the circle\n",
    "\n",
    "def test_circle_contains_outside():\n",
    "    assert not circle_contains(1, origin, Point(1.1, 0))\n",
    "```\n",
    "\n",
    "Now running `pytest` would run the test function and report the results.\n",
    "\n",
    "## `assert`\n",
    "\n",
    "The `assert` statement is used to check that a condition is true.\n",
    "\n",
    "If the condition is `True`, nothing happens. If the condition is `False`, an `AssertionError` is raised.\n",
    "\n",
    "You can also provide a message to be printed if the assertion fails:\n",
    "\n",
    "```python\n",
    "assert 1 == 2, \"1 is not equal to 2\"\n",
    "```\n",
    "\n",
    "**Note:** `assert` is not a function. Using parentheses leads to confusing results because the parentheses are interpreted as a tuple.\n",
    "\n",
    "```python\n",
    "assert(1 == 2, \"1 is not equal to 2\")\n",
    "\n",
    "# This is equivalent to:\n",
    "assert (1 == 2, \"1 is not equal to 2\")\n",
    "\n",
    "```\n",
    "\n",
    "### Aside: Truthiness\n",
    "\n",
    "In Python, every type has an implicit conversion to a boolean value.  This is called \"truthiness\".\n",
    "\n",
    "The following values are considered \"falsey\":\n",
    "\n",
    "* `False`\n",
    "* `None`\n",
    "* `0`   # int\n",
    "* `0.0` # float\n",
    "* `0j`  # complex\n",
    "* \"\"    # empty string\n",
    "* []    # empty list\n",
    "* ()    # empty tuple\n",
    "* {}    # empty dict\n",
    "* set() # empty set\n",
    "\n",
    "All other values are considered `True`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False is False\n",
      "None is False\n",
      "0 is False\n",
      "0.0 is False\n",
      "0j is False\n",
      " is False\n",
      "[] is False\n",
      "() is False\n",
      "{} is False\n",
      "set() is False\n",
      "True is True\n",
      "42 is True\n",
      "3.14 is True\n",
      "hello is True\n",
      "[1, 2, 3] is True\n",
      "{'a': 1} is True\n"
     ]
    }
   ],
   "source": [
    "values = [False, None, 0, 0.0, 0j, \"\", [], (), {}, set()]\n",
    "values += [True, 42, 3.14, \"hello\", [1, 2, 3], {\"a\": 1}]\n",
    "for value in values:\n",
    "    # notice we're using the value as a boolean expression here\n",
    "    if value:\n",
    "        print(f\"{value} is True\")\n",
    "    else:\n",
    "        print(f\"{value} is False\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing good tests\n",
    "\n",
    "Above we had the tests\n",
    "\n",
    "```python\n",
    "def test_circle_contains():\n",
    "    # centered at origin, radius 1\n",
    "    assert circle_contains(1, origin, origin)\n",
    "    assert circle_contains(1, origin, Point(.5, .5))\n",
    "\n",
    "def test_circle_contains_edge():\n",
    "    assert circle_contains(1, origin, Point(1, 0))  # on the circle\n",
    "\n",
    "def test_circle_contains_outside():\n",
    "    assert not circle_contains(1, origin, Point(1.1, 0))\n",
    "```\n",
    "\n",
    "Why not just do this?\n",
    "\n",
    "```python\n",
    "\n",
    "def test_circle_contains():\n",
    "    # centered at origin, radius 1\n",
    "    assert circle_contains(1, origin, origin)\n",
    "    assert circle_contains(1, origin, Point(.5, .5))\n",
    "    assert circle_contains(1, origin, Point(1, 0))  # on the circle\n",
    "    assert not circle_contains(1, origin, Point(1.1, 0))\n",
    "```\n",
    "\n",
    "If the first assertion fails, the second assertion will not be run.  This makes it harder to debug the problem.\n",
    "\n",
    "As you've no doubt noticed throughout these courses, granular tests make debugging easier. It is also easier to reason about what isn't yet being tested.\n",
    "\n",
    "**What bugs could be lurking in the code above?**\n",
    "\n",
    "### General Rules\n",
    "\n",
    "* Each test should test one thing.\n",
    "* Each test should be independent of the others.\n",
    "* Each test should be repeatable.\n",
    "* Each test should be easy to understand.\n",
    "\n",
    "### What Tests To Write\n",
    "\n",
    "When considering what to test, usually there are a couple of obvious cases.  For a string distance function, you might consider you need to have at least one test for when the strings are identical, and one test for when the strings are completely different.\n",
    "\n",
    "Those are the obvious cases, it is then worth considering edge cases.  What about an empty string?  Perhaps a string with one character as well?\n",
    "\n",
    "**(0, 1, Many)** is a good rule to consider.\n",
    "\n",
    "\n",
    "### Test One Thing\n",
    "\n",
    "If you were testing a function that summed a list of numbers, consider these two approaches:\n",
    "\n",
    "```python\n",
    "def test_sum():\n",
    "    assert sum([]) == 0\n",
    "    assert sum([1]) == 1\n",
    "    assert sum([1, 2, 3]) == 6\n",
    "    with pytest.raises(TypeError):\n",
    "        sum([1, \"hello\"])\n",
    "```\n",
    "\n",
    "```python\n",
    "def test_sum_empty():\n",
    "    assert sum([]) == 0\n",
    "\n",
    "def test_sum_one():\n",
    "    assert sum([1]) == 1\n",
    "\n",
    "def test_sum_many():\n",
    "    assert sum([1, 2, 3]) == 6\n",
    "\n",
    "def test_sum_type_error():\n",
    "    ...\n",
    "    with pytest.raises(TypeError):\n",
    "        sum([1, \"hello\"])\n",
    "```\n",
    "\n",
    "By having each test focus on one thing, test failures will be more informative.\n",
    "\n",
    "### Test Independence\n",
    "\n",
    "Tests should be independent of each other.  This means that if one test fails, it should not affect the outcome of any other test.\n",
    "\n",
    "This can be a challenge when testing functions that modify data or global state.\n",
    "\n",
    "For example:\n",
    "\n",
    "```python\n",
    "def test_create_user():\n",
    "    db = Database(\"test.db\")\n",
    "    db.create_user(username=\"alice\")\n",
    "    assert db.get_user(username=\"alice\").id == 1\n",
    "\n",
    "def test_delete_user():\n",
    "    db = Database(\"test.db\")\n",
    "    db.delete_user(username=\"alice\")\n",
    "    assert db.get_user(username=\"alice\") is None\n",
    "```\n",
    "\n",
    "These tests are not independent.  If the first test fails, the second test will fail because the database will be empty.\n",
    "\n",
    "You'd instead likely need to do something like this:\n",
    "\n",
    "```python\n",
    "def create_test_database():\n",
    "    remove_file_if_exists(\"test.db\")\n",
    "    db = Database(\"test.db\")\n",
    "    db.init_schema()\n",
    "    return db\n",
    "\n",
    "def test_create_user():\n",
    "    db = create_test_database()\n",
    "    db.create_user(username=\"alice\")\n",
    "    assert db.get_user(username=\"alice\").id == 1\n",
    "\n",
    "def test_delete_user():\n",
    "    db = create_test_database()\n",
    "    db.create_user(username=\"alice\")\n",
    "    db.delete_user(username=\"alice\")\n",
    "    assert db.get_user(username=\"alice\") is None\n",
    "```\n",
    "\n",
    "Note: `test_delete_user` will fail if `create_user` doesn't work.  There's still a dependency in terms of behavior in this case, but you can see that the tests can now be run independently of one another since each starts with a blank database.\n",
    "\n",
    "#### `pytest` Fixtures\n",
    "\n",
    "Another way to handle this is to use `pytest` fixtures.  A fixture is a function that is run before each test.  It can be used to set up the test environment.\n",
    "\n",
    "```python\n",
    "import pytest\n",
    "\n",
    "@pytest.fixture\n",
    "def db():\n",
    "    remove_file_if_exists(\"test.db\")\n",
    "    db = Database(\"test.db\")\n",
    "    db.init_schema()\n",
    "    return db\n",
    "\n",
    "# parameter names must match fixture names\n",
    "def test_create_user(db):\n",
    "    db.create_user(username=\"alice\")\n",
    "    assert db.get_user(username=\"alice\").id == 1\n",
    "\n",
    "def test_delete_user(db):\n",
    "    db.create_user(username=\"alice\")\n",
    "    db.delete_user(username=\"alice\")\n",
    "    assert db.get_user(username=\"alice\") is None\n",
    "```\n",
    "\n",
    "This is a powerful feature that can be used to set up complex test environments.\n",
    "\n",
    "### Test Repeatability\n",
    "\n",
    "Tests should be repeatable.  This means that if a test fails, it should be possible to run it again and get the same result.\n",
    "\n",
    "This means that tests should not depend on external factors such as:\n",
    "\n",
    "* The current time or date\n",
    "* Random numbers\n",
    "* The state of the network\n",
    "* The state of the database\n",
    "\n",
    "To reduce the chance of a test failing due to an external factor, you can use a library like `freezegun` to freeze the current time that a test sees.  The `mock` module can be used to mock out external functions so they return consistent data for the purpose of the test.\n",
    "\n",
    "`freezegun`: <https://github.com/spulec/freezegun>\n",
    "\n",
    "`mock`: <https://docs.python.org/3/library/unittest.mock.html>\n",
    "\n",
    "### Test Readability\n",
    "\n",
    "Tests should be easy to understand.  This means that the test should be written in a way that makes it clear what is being tested and what the expected result is.\n",
    "\n",
    "Make liberal use of comments and descriptive test names to make it clear what is being tested so that when a modification to the code in the future breaks a test, it is easy to understand why.\n",
    "\n",
    "## Test Driven Devleopment\n",
    "\n",
    "Test Driven Development (TDD) is a software development process that involves writing tests before writing the code that will be tested.\n",
    "\n",
    "In many ways this is how your PAs have been structured. By knowing what the expected output is, you can write tests that will fail if the code is not working correctly.\n",
    "\n",
    "It can be useful to write tests before writing the code that will be tested.  This can help you think through the problem and make sure you understand what the code is supposed to do.\n",
    "\n",
    "## Unit Testing vs. Integration Testing\n",
    "\n",
    "Unit tests are tests that test individual units of code.  These are usually functions or methods.\n",
    "\n",
    "Integration tests are tests that test how different units of code work together. It can be useful to have a mixture of both, but unit tests are usually easier to write and maintain.\n",
    "\n",
    "In our initial example, we broke `advance_all_students_with_passing_grade` into smaller functions.  It may still make sense in some cases to test the integration of these functions to make sure that (e.g.) the list of users being returned is still in the same format expected by the `advance_student` function.\n",
    "\n",
    "## `pytest` Features\n",
    "\n",
    "### `pytest.fixture`\n",
    "\n",
    "As shown above, `pytest` fixtures can be used to set up the test environment or provide data to tests.\n",
    "\n",
    "```python\n",
    "import pytest\n",
    "\n",
    "@pytest.fixture\n",
    "def user_list()\n",
    "    return [\n",
    "        {\"name\": \"alice\", \"id\": 1, \"email\": \"alice@domain\"},\n",
    "        {\"name\": \"carol\", \"id\": 3, \"email\": \"carol@domain\"},\n",
    "        {\"name\": \"bob\", \"id\": 2, \"email\": \"bob@domain\"},\n",
    "        {\"name\": \"diego\", \"id\": 4, \"email\": \"diego@otherdomain\"},\n",
    "    ]\n",
    "\n",
    "def test_sort_users(user_list):\n",
    "    sorted_list = sort_users(user_list)\n",
    "    assert sorted_list == [\n",
    "        {\"name\": \"alice\", \"id\": 1, \"email\": \"alice@domain\"},\n",
    "        {\"name\": \"bob\", \"id\": 2, \"email\": \"bob@domain\"},\n",
    "        {\"name\": \"carol\", \"id\": 3, \"email\": \"carol@domain\"},\n",
    "        {\"name\": \"diego\", \"id\": 4, \"email\": \"diego@otherdomain\"},\n",
    "    ]\n",
    "\n",
    "def test_filter_users(user_list):\n",
    "    filtered_list = filter_users(user_list, domain=\"domain\")\n",
    "    assert filtered_list == [\n",
    "        {\"name\": \"alice\", \"id\": 1, \"email\": \"alice@domain\"},\n",
    "        {\"name\": \"bob\", \"id\": 2, \"email\": \"bob@domain\"},\n",
    "        {\"name\": \"carol\", \"id\": 3, \"email\": \"carol@domain\"},\n",
    "    ]\n",
    "```\n",
    "\n",
    "### `pytest.raises`\n",
    "\n",
    "It's often desirable to test that certain errors were raised.  `pytest.raises` can be used to test that a function raises an exception.\n",
    "\n",
    "```python\n",
    "def test_reject_invalid_domain():\n",
    "    with pytest.raises(ValueError):\n",
    "        validate_email(\"alice@invalid$\")\n",
    "```\n",
    "\n",
    "### Parameterized Tests\n",
    "\n",
    "Sometimes the same test needs to be run with different inputs. `pytest` provides a way to do this with parameterized tests.\n",
    "\n",
    "```python\n",
    "@pytest.mark.parametrize(\"str1,str2,expected\", [\n",
    "    (\"abc\", \"abd\", 1),\n",
    "    (\"abc\", \"abc\", 0),\n",
    "    (\"abc\", \"xyz\", 3),\n",
    "    (\"abc\", \"abcd\", 1),\n",
    "])\n",
    "def test_hamming_distance(str1, str2, expected):\n",
    "    assert hamming_distance(str1, str2) == expected\n",
    "```\n",
    "\n",
    "This runs as three distinct tests in `pytest`, converting each input to a distinct test by calling the test function with the parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_hamming_distance[abc-abd-1]\n",
    "test_hamming_distance[abc-abc-0]\n",
    "test_hamming_distance[abc-xyz-3]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}