# Object-Oriented Programming


Object-oriented programming is a paradigm built around "objects": bundles of related data and the code that operates on it.

A common misconception is that a language needs classes to be object-oriented.  While classes are the most common feature provided in OO-focused languages, one can write code in any language that fits this paradigm.

In a hypothetical language that only had data structures and functions, we might write code like:

In [None]:
import random

person_a = {"name": "Andy", "costume": "Cowboy", "candy": []}
person_b = {"name": "Gil", "costume": "Robot", "candy": []}
person_c = {"name": "Lisa", "costume": "Ghost", "candy": []}

candy_bag = ["Kit Kat", "Kit Kat", "Lollipop", "M&Ms"]

def costume_is_scary(person):
    return person["costume"] in ("Ghost", "Wolfman", "Mummy")

def do_trick(person):
    print(f"{person['name']} did a trick")

def trick_or_treat(person):
    success = give_candy(candy_bag, person)
    # extra candy for scary costumes!
    if costume_is_scary(person):
        give_candy(candy_bag, person)
    if not success:
        do_trick(person)

def give_candy(candy_bag, person):
    if candy_bag:
        candy = random.choice(candy_bag)
        candy_bag.remove(candy)
        person["candy"].append(candy)
        return True
    else:
        return False

A common pattern is a group of functions that all need the same data structure as a parameter.  Objects give us a way to connect that code to the data, making it clearer what should be passed in and reducing the chance of errors.

The same code might be rewritten as:

In [None]:
import random

class Person:
    def __init__(self, name, costume):
        self.name = name
        self.costume = costume
        self.candy = []

    def is_scary(self):
        return self.costume in ("Ghost", "Wolfman", "Mummy")
    
    def do_trick(self):
        self.tricks = True
        print(f"{self.name} did a trick")
        
    def accept_candy(self, candy):
        self.candy.append(candy)
        
class NoCandy(Exception):
    pass

class House:
    def __init__(self, initial_candy):
        self.candy = initial_candy
    
    def get_candy(self):
        if not self.candy:
            raise NoCandy("no more candy!")
        candy = random.choice(self.candy)
        self.candy.remove(candy)
        return candy
    

def trick_or_treat(person, house):
    try:
        candy = house.get_candy()
        person.accept_candy(candy)
        if person.is_scary():
            person.accept_candy(house.get_candy())
    except NoCandy:
        person.do_trick()

These classes provide blueprints for the data & actions a "person" has.  We also turn our "candy_bag" list into a full-fledged `House` object, since presumably we'd have multiple copies of it in a real-world application.

## Terminology

- **Object** - An encapsulation of data & related operations.
- **Class** - A blueprint for an object, providing methods that will act on instances of the data.
- **Instance** - An object created from a class "blueprint".
- **Method** - A function that is tied to a specific class.
- **Attribute** - Data that is tied to a specific instance.
- **Constructor** - A special method that creates & populates an instance of a class.
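
To make the vocabulary concrete, here is a minimal sketch that labels each term in a small class. `Pumpkin` and its attributes are hypothetical names chosen for illustration; they are not part of the examples above.

```python
class Pumpkin:                        # class: the blueprint
    def __init__(self, weight_lbs):   # constructor: populates a new instance
        self.weight_lbs = weight_lbs  # attribute: data tied to this instance
        self.carved = False

    def carve(self):                  # method: a function tied to the class
        self.carved = True


small = Pumpkin(3)    # instance: an object created from the blueprint
large = Pumpkin(20)   # a second, independent instance
large.carve()

print(small.carved, large.carved)  # False True
```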


## Everything in Python is an Object

`isinstance` is the preferred way to check if an item is of a particular type.

It can return `True` for multiple types; we'll see why shortly.

In [11]:
isinstance([1, 2, 3], list)

True

In [12]:
isinstance([1, 2, 3], tuple)

False

In [13]:
isinstance([1, 2, 3], object)

True

In [14]:
s = set([1,2,3])

# using constructors here for demo purposes, generally would use a literal (e.g. [], 0, "") for these
ll = list()  
ll.append(str())
ll.append(int())
ll.append(float())
ll.append(s)
ll.append(print)

print(ll)

['', 0, 0.0, {1, 2, 3}, <built-in function print>]


In [15]:
[isinstance(item, object) for item in ll]

[True, True, True, True, True]

Keeping this in mind can help keep things straight when we delve deeper into making our own objects.

Let's revisit a few things that we already know:

- Each `list` is independent of all others; when you create a new one via `list()` (or `[]`), that is an **instance**.
- Calling a method like `.append` operates on the instance it is called on.
- Some methods modify the underlying object (`.append`), while others just provide a return value like any other function.  (What are some non-modifying methods?)

## Classes in Python

### Instances, Classes, and Instantiation

One way to think of classes is as blueprints for creating specific realizations.

A blueprint for a car, for example, can specify features that vary from car to car (color, transmission type, etc.).  We can create multiple car **instances** with different values for a given attribute.

In [16]:
class Car:
    # __init__ is a special method
    # known as a double-underscore or dunder method
    #  in Python it represents our constructor
    def __init__(self, make, model, year=2000):
        self.make = make
        self.model = model
        self.year = year
        self.mileage = 0
        self.hybrid = False
        
# to actually create Cars, we need to call this constructor
car1 = Car("Honda", "Civic", 2019)
car2 = Car("Chevy", "Volt", 2022)
print(car1.make, car1.model, car1.year)
print(car2.make, car2.model, car2.year)
car3 = car2

Honda Civic 2019
Chevy Volt 2022


In [5]:
car3 is car2

True

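Calling the class, as in `Car("Honda", "Civic", 2019)`, is known as *instantiation*: making an instance of the class.  Plain assignment such as `car3 = car2` does not instantiate anything; it gives the existing instance a second name, which is why `car3 is car2` is `True`.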
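
The `car3 is car2` check is worth a closer look: assignment only adds another name for the same instance, while each call to the class builds a new one. A minimal sketch, using a pared-down `Car` with only the attributes needed here (`car4` is a name introduced for this example):

```python
class Car:
    def __init__(self, make, model):
        self.make = make
        self.model = model
        self.mileage = 0


car2 = Car("Chevy", "Volt")
car3 = car2                    # no new instance, just another name
car4 = Car("Chevy", "Volt")    # instantiation: a brand-new instance

car3.mileage = 500             # visible through car2 as well
print(car2.mileage)            # 500
print(car3 is car2)            # True
print(car4 is car2)            # False, even though the values match
```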

### `self` & methods

The first parameter of an instance method is always the instance itself, named `self` by convention.

This parameter is never passed directly; it is a local reference to the instance the method is being called upon.


In [17]:
class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year
        self.mileage = 0
        self.hybrid = False
        
    def print_report(self):
        print(f"{self.year} {self.make} {self.model} with {self.mileage} miles")
        
    def drive(self, miles):
        self.mileage += miles
        
car1 = Car("Honda", "Civic", 2019)
car2 = Car("Chevy", "Volt", 2022)


In [18]:
car1.print_report()

2019 Honda Civic with 0 miles


In [19]:
car2.drive(500)
car2.print_report()

2022 Chevy Volt with 500 miles


In [20]:
car1.print_report()

2019 Honda Civic with 0 miles


In [21]:
print(car2.mileage)

500


Because of `self`, methods can know which instance they are operating upon.

#### How does this work?

This is confusing at first glance: where does `self` come from?

It is actually the "parameter before the dot": `car2.print_report()` passes `car2` as `self`.


In [22]:
# explicitly call Car.print_report and pass self
Car.print_report(car2)   
# this is not how we normally call methods! (but it works)

2022 Chevy Volt with 500 miles


In [23]:
ll = []
list.append(ll, 4) # list is class, ll is self here
ll

[4]

#### What happens if `self` is omitted?


In [24]:
class Mistake:
    def __init__(self):
        print("constructor!")
    
    def method_no_self(self):
        print("method!")

In [25]:
m = Mistake()
m.method_no_self()
# rewritten as Mistake.method_no_self(m)

constructor!
method!
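
Note that `method_no_self` above does declare `self`, so the call succeeds. Here is a sketch of what happens when the parameter really is left out (the class name `RealMistake` is hypothetical). Python still passes the instance as the first argument, so the call fails with a `TypeError`. The exact wording of the message varies slightly between Python versions.

```python
class RealMistake:
    def method_no_self():      # no self parameter declared
        print("method!")


m = RealMistake()
try:
    # rewritten as RealMistake.method_no_self(m): one argument, zero parameters
    m.method_no_self()
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```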


### Attributes

- Created on assignment, like other variables.
- `self.name = value`
- All attributes are accessible from inside the class and outside:
  - `self.name` from inside.
  - `instance_name.name` from outside.
  
**Best practice: create all attributes inside constructor!**

Why?

In [26]:
my_car = Car("DMC", "DeLorean", 1982)
#print(my_car.miles_driven)
my_car.driver = "Marty" # allowed, but to be avoided
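
One answer to the "Why?" above: an attribute created somewhere other than the constructor only exists after that code has run, so reading it can fail depending on call order. A minimal sketch, using a hypothetical `Trip` class:

```python
class Trip:
    def __init__(self, destination):
        self.destination = destination
        # self.stops is NOT created here

    def add_stop(self, stop):
        # the attribute is created on the first call, not at construction
        if not hasattr(self, "stops"):
            self.stops = []
        self.stops.append(stop)


trip = Trip("Hill Valley")
try:
    print(trip.stops)          # add_stop() has not been called yet
except AttributeError as err:
    print("AttributeError:", err)

trip.add_stop("Twin Pines Mall")
print(trip.stops)              # ['Twin Pines Mall']
```

Had `self.stops = []` been set in `__init__`, every `Trip` would have the attribute from the start and the `hasattr` check would be unnecessary.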

#### Exception to the rule: function objects

Functions are objects, and can have attributes assigned to them as well.

We sometimes do this since there's no opportunity to assign them before. (Because functions do not have constructors we can modify.)

In [27]:
def f():
    print("called f()")
    #f.call_count = 0 # NO
f.call_count = 0

In [28]:
f.call_count += 1
f()
print(f.call_count)

called f()
1


In [29]:
# using a decorator to add call_count to any function
def counter(func):
    # inner.call_count can't be set here: inner isn't defined yet
    def inner(*args, **kwargs):
        inner.call_count += 1
        print(f"call count {inner.call_count}")
        return func(*args, **kwargs)
    inner.call_count = 0
    return inner

In [30]:
@counter
def f():
    print("called f()")

In [31]:
@counter
def g():
    print("called g()")

In [32]:
f()
f()
f()

call count 1
called f()
call count 2
called f()
call count 3
called f()


In [None]:
g()

## Encapsulation

Why might it be a bad idea to allow users to change attributes?


In [34]:
# oops!
car2.mileage -= 100
car2.hybrid = "no"

Furthermore, imagine we've noticed sometimes `year` is an `int` and other times a `str`.  We could decide to remedy this in our constructor like:

```python
class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = int(year)
        self.mileage = 0
        
    def drive(self, miles):
        if miles < 0:
            return # maybe an error instead? 
        self.mileage += miles
```

We can also protect against trying to roll back the odometer by driving in reverse.

If other code is assigning to internal variables, we need to make these checks/changes in dozens of places.

**Encapsulation** therefore allows the implementation behind a class's interface to be changed with *minimal impact* upon users of the class.

Good object-oriented design involves thinking through what **interface** you're providing to code making use of your objects.

Python has changed how (e.g.) `dict` works internally several times over the decades.  Imagine if each time they did so, methods like `.keys()` and `.pop()` stopped working.
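
As a sketch of the point about keeping checks in one place, the `drive` method can be the only code that touches the odometer. This variant raises a `ValueError` instead of silently returning, which is one possible answer to the question in the code comment above, not the only reasonable one.

```python
class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = int(year)   # accepts 2022 or "2022"
        self.mileage = 0

    def drive(self, miles):
        # the single place where the odometer rule is enforced
        if miles < 0:
            raise ValueError("miles must be non-negative")
        self.mileage += miles


car = Car("Chevy", "Volt", "2022")
car.drive(500)
try:
    car.drive(-100)
except ValueError as err:
    print("rejected:", err)

print(car.year, car.mileage)   # 2022 500
```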

### "private" in Python

Some languages use access specifiers like "private", "public", "protected" to handle this.  Python instead relies on convention.

A name with a single underscore at the front is meant to be "internal" to the class, and should not be modified except from methods of that class.

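A name with a double underscore at the front (and no trailing double underscore) is name-mangled by Python: inside class `Car`, `__mileage` is stored as `_Car__mileage`, which keeps outside code and subclasses from assigning to it accidentally.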


In [21]:
class Car: 
    def __init__(self, make, model, year):
        self._make = make 
        self._model = model 
        self._year = year
        self.__mileage = 0
                
    def drive(self, miles):
        if miles > 0:
            self.__mileage += miles
        else:
            ...
            
    def print_report(self):
        print(f"{self._year} {self._make} {self._model} with {self.__mileage} miles")
    
car1 = Car("Honda", "Civic", 2019)
car2 = Car("Chevy", "Volt", 2022)

car2.drive(500)
car2.print_report()

2022 Chevy Volt with 500 miles


In [20]:
print(car2.__mileage)

AttributeError: 'Car' object has no attribute '__mileage'

In [36]:
car2._make = "???"
print(car2._make) 
# soft protection, can still access but "at your own risk"

???
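
The `AttributeError` above is the result of name mangling: inside the body of class `Car`, a name such as `__mileage` is rewritten to `_Car__mileage`. A short sketch with a trimmed `Car` shows where the value actually lives. Reaching for the mangled name from outside is shown only for illustration; doing so defeats the purpose of the convention.

```python
class Car:
    def __init__(self):
        self.__mileage = 0      # stored as _Car__mileage

    def drive(self, miles):
        if miles > 0:
            self.__mileage += miles


car = Car()
car.drive(500)

print(vars(car))                # {'_Car__mileage': 500}
print(car._Car__mileage)        # 500

# assigning from outside the class is NOT mangled: this creates a new,
# unrelated attribute and leaves the real odometer untouched
car.__mileage = -1
print(vars(car))                # {'_Car__mileage': 500, '__mileage': -1}
```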


### Dunder Methods

Methods that begin and end with double-underscore are called special methods or dunder methods.  These allow us to define classes that implement existing protocols.

* `__repr__`
* `__str__`
* `__eq__`

In [38]:
class Car: 
    def __init__(self, make, model, year):
        self._make = make 
        self._model = model 
        self._year = year
        self.__mileage = 0

    def drive(self, miles):
        if miles > 0:
            self.__mileage += miles
        else:
            ...
       
    def __eq__(self, other):
        # we can decide equality does/doesn't include mileage
        return (self._make == other._make 
                and self._model == other._model 
                and self._year == other._year)
    
    def __repr__(self):
        return f"Car({self._make}, {self._model}, {self._year})"

    def __str__(self):
        return f"{self._year} {self._make} {self._model} with {self.__mileage} miles"

In [39]:
truck = Car("Ford", "F-150", 1985)
truck2 = Car("Ford", "F-150", 1985)

In [42]:
truck

Car(Ford, F-150, 1985)

In [40]:
str(truck) # explicit conversion to string, uses truck.__str__()

'1985 Ford F-150 with 0 miles'

In [41]:
print(str(truck)) # uses str(truck)

1985 Ford F-150 with 0 miles


In [34]:
truck == car1 # eq

False

In [36]:
truck.__eq__(car1)

False

In [35]:
truck == truck2 # eq

True

In [None]:
truck is truck2

### `str` vs `repr`

`repr` is supposed to be a programmatic representation, used in debugging output. In Jupyter/IPython, if the last expression in a cell evaluates to a value, we see its repr by default.

`str` is used when `print` is called, or on an explicit conversion to string with `str(obj)`, as shown above.

If only `__repr__` is defined, then `str(obj)` will use `__repr__`, so if you don't have a need for them to differ, then define `__repr__`.
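
A small sketch of that fallback, using two hypothetical classes (`OnlyRepr` and `Both`) that exist only for this comparison:

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


class Both:
    def __repr__(self):
        return "Both()"

    def __str__(self):
        return "a friendlier description"


print(str(OnlyRepr()))    # OnlyRepr()  (falls back to __repr__)
print(str(Both()))        # a friendlier description
print(repr(Both()))       # Both()
print([Both(), Both()])   # [Both(), Both()]  (containers use repr for items)
```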

## Protocols, Duck-Typing, and Polymorphism

In a language like C++, functions can be created with one name but different argument lists.

```c++
void foo(int x);
void foo(double x);
void foo(int x, double y);
```

The compiler decides which function to call at compile time based on the argument types given.

This is called function overloading, a compile-time form of "polymorphism".

We've seen one way to achieve similar results via variadic arguments.

In Python, polymorphism stems from the idea **"the meaning of an operation depends on the objects being operated on"**.

```python
1 + 5  # addition
"1" + "5" # string concatenation
[1,2,3] + [4,5] # list concatenation
```

Remember, we mentioned that everything in Python is an `object` and `object`s have operations associated with them. 

```python
def times(x, y):
     return x * y
```

As long as our objects `x` and `y` support the `*` protocol, it is safe to call `times(x, y)`.
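
A quick sketch of `times` applied to a few built-in types. The same function body does multiplication or repetition depending on its arguments, and fails only when the pair of objects does not support `*`:

```python
def times(x, y):
    return x * y


print(times(3, 4))          # 12
print(times("ab", 3))       # ababab
print(times([0], 3))        # [0, 0, 0]

try:
    times("ab", "cd")       # str * str is not defined
except TypeError as err:
    print("TypeError:", err)
```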

### Duck Typing

In Python, instead of forcing our arguments to be specific types, we use something known as "Duck Typing."  This comes from the expression:
"If it looks like a duck, and it quacks like a duck, it might as well be a duck."
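
As a minimal illustration, the function below never checks the type of its argument; it only relies on a `.speak()` method being present. `Duck`, `Robot`, and `announce` are hypothetical names made up for this sketch.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # no isinstance check: any object with a .speak() method will do
    return f"{type(thing).__name__} says {thing.speak()}"


for thing in (Duck(), Robot()):
    print(announce(thing))
```

Passing an object without `.speak()` would raise an `AttributeError` at the point of the call, which is the trade-off of checking behavior instead of type.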

### Protocols

Another way of thinking about this is that objects of a given type follow a certain protocol.

- "addable"
- "comparable"
- "iterable"
- "callable"

These are typically implemented via dunder methods.  To be "addable," an object needs an `__add__` method at minimum.  To be "comparable," it needs `__eq__` plus at least one of `__lt__` or `__gt__`.
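
Here is a sketch of a class that opts into the "addable" and "comparable" protocols. `Money` is a hypothetical class invented for this example, and it assumes the other operand is also a `Money`.

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):       # "addable": enables a + b
        return Money(self.cents + other.cents)

    def __eq__(self, other):        # "comparable": enables a == b
        return self.cents == other.cents

    def __lt__(self, other):        # enables a < b, which sorted() relies on
        return self.cents < other.cents

    def __repr__(self):
        return f"Money({self.cents})"


a, b = Money(250), Money(100)
print(a + b)                # Money(350)
print(a + b == Money(350))  # True
print(sorted([a, b]))       # [Money(100), Money(250)]
```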

Let's look at iterable now through this new lens:


In [2]:
l = [1, 2, 3, 4, 5]
g = (x**2 for x in l)
r = range(8)

# iterable: we can use it in a for loop
for x in l:  # or g, or r
    print(x)

1
2
3
4
5


In [6]:
g

<generator object <genexpr> at 0x1085c0860>

What actually happens here?

The iteration protocol has a few requirements:

- When an iterable is passed to an iteration context (for loop, comprehension, `map`, etc.), the context will call `iter(iterable)` to obtain the object's iterator.
- `iter(obj)` calls `obj.__iter__()`.
- Iterator objects return values one at a time; we can call `next(iterator)` to obtain the next value.
- `next(it)` calls `it.__next__()`.
- `StopIteration` is raised when there are no more values.
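
The requirements above can be sketched with a hand-written iterable. `Countdown` and `CountdownIterator` are hypothetical classes for illustration, and the `while` loop is only an approximation of what a `for` loop does internally.

```python
class Countdown:
    """Iterable that counts down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # each call returns a fresh iterator, like a list does
        return CountdownIterator(self.start)


class CountdownIterator:
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # iterators are themselves iterable and return themselves
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# roughly what `for x in Countdown(3): print(x)` does behind the scenes
it = iter(Countdown(3))
while True:
    try:
        x = next(it)
    except StopIteration:
        break
    print(x)  # 3, then 2, then 1
```

Because `Countdown.__iter__` builds a new iterator each time, the same `Countdown` object can be looped over repeatedly, whereas a generator can only be consumed once.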

In [8]:
l = [1, 2, 3]
g = (x**2 for x in l)
r = range(8)

li = iter(l)
gi = iter(g)
ri = iter(r)
li2 = iter(l)

print(li)
print(gi)
print(ri)
print(li2)

<list_iterator object at 0x108583100>
<generator object <genexpr> at 0x1085c2c20>
<range_iterator object at 0x1085804b0>
<list_iterator object at 0x108580ca0>


In [13]:
next(li)  # li had already been advanced past its last item when this cell ran

StopIteration: 

In [10]:
next(li2)

1

In [None]:
print(next(li))
print(next(gi))
print(next(ri))

In [None]:
print(next(li))
print(next(gi))
print(next(ri))

### Discussion

- What else is iterable?
- What are other protocols we've seen?
- Do all iterables eventually raise `StopIteration`?