2.5 Object-Oriented Programming
Object-oriented programming (OOP) is a method for organizing programs that brings together many of the ideas introduced in this chapter. Like abstract data types, objects create an abstraction barrier between the use and implementation of data. Like dispatch dictionaries in message passing, objects respond to behavioral requests. Like mutable data structures, objects have local state that is not directly accessible from the global environment. The Python object system provides new syntax to ease the task of implementing all of these useful techniques for organizing programs.
But the object system offers more than just convenience; it enables a new metaphor for designing programs in which several independent agents interact within the computer. Each object bundles together local state and behavior in a way that hides the complexity of both behind a data abstraction. Our example of a constraint program began to develop this metaphor by passing messages between constraints and connectors. The Python object system extends this metaphor with new ways to express how different parts of a program relate to and communicate with each other. Not only do objects pass messages, they also share behavior among other objects of the same type and inherit characteristics from related types.
The paradigm of object-oriented programming has its own vocabulary that reinforces the object metaphor. We have seen that an object is a data value that has methods and attributes, accessible via dot notation. Every object also has a type, called a class. New classes can be defined in Python, just as new functions can be defined.
2.5.1 Objects and Classes
A class serves as a template for all objects whose type is that class. Every object is an instance of some particular class. The objects we have used so far all have built-in classes, but new classes can be defined similarly to how new functions can be defined. A class definition specifies the attributes and methods shared among objects of that class. We will introduce the class statement by revisiting the example of a bank account.
When introducing local state, we saw that bank accounts are naturally modeled as mutable values that have a
balance. A bank account object should have a
withdraw method that updates the account balance and returns the requested amount, if it is available. We would like additional behavior to complete the account abstraction: a bank account should be able to return its current balance, return the name of the account holder, and accept deposits.
Account class allows us to create multiple instances of bank accounts. The act of creating a new object instance is known as instantiating the class. The syntax in Python for instantiating a class is identical to the syntax of calling a function. In this case, we call
Account with the argument
'Jim', the account holder's name.
>>> a = Account('Jim')
An attribute of an object is a name-value pair associated with the object, which is accessible via dot notation. The attributes specific to a particular object, as opposed to all objects of a class, are called instance attributes. Each
Account has its own balance and account holder name, which are examples of instance attributes. In the broader programming community, instance attributes may also be called fields, properties, or instance variables.
>>> a.holder 'Jim' >>> a.balance 0
Functions that operate on the object or perform object-specific computations are called methods. The side effects and return value of a method can depend upon, and change, other attributes of the object. For example,
deposit is a method of our
a. It takes one argument, the amount to deposit, changes the
balance attribute of the object, and returns the resulting balance.
>>> a.deposit(15) 15
In OOP, we say that methods are invoked on a particular object. As a result of invoking the
withdraw method, either the withdrawal is approved and the amount is deducted and returned, or the request is declined and the account prints an error message.
>>> a.withdraw(10) # The withdraw method returns the balance after withdrawal 5 >>> a.balance # The balance attribute has changed 5 >>> a.withdraw(10) 'Insufficient funds'
As illustrated above, the behavior of a method can depend upon the changing attributes of the object. Two calls to
withdraw with the same argument return different results.
2.5.2 Defining Classes
User-defined classes are created by
class statements, which consist of a single clause. A class statement defines the class name and a base class (discussed in the section on Inheritance), then includes a suite of statements to define the attributes of the class:
class <name>(<base class>): <suite>
When a class statement is executed, a new class is created and bound to
<name> in the first frame of the current environment. The suite is then executed. Any names bound within the
<suite> of a
class statement, through
def or assignment statements, create or modify attributes of the class.
Classes are typically organized around manipulating instance attributes, which are the name-value pairs associated not with the class itself, but with each object of that class. The class specifies the instance attributes of its objects by defining a method for initializing new objects. For instance, part of initializing an object of the
Account class is to assign it a starting balance of 0.
<suite> of a
class statement contains
def statements that define new methods for objects of that class. The method that initializes objects has a special name in Python,
__init__ (two underscores on each side of "init"), and is called the constructor for the class.
>>> class Account(object): def __init__(self, account_holder): self.balance = 0 self.holder = account_holder
__init__ method for
Account has two formal parameters. The first one,
self, is bound to the newly created
Account object. The second parameter,
account_holder, is bound to the argument passed to the class when it is called to be instantiated.
The constructor binds the instance attribute name
0. It also binds the attribute name
holder to the value of the name
account_holder. The formal parameter
account_holder is a local name to the
__init__ method. On the other hand, the name
holder that is bound via the final assignment statement persists, because it is stored as an attribute of
self using dot notation.
Having defined the
Account class, we can instantiate it.
>>> a = Account('Jim')
This "call" to the
Account class creates a new object that is an instance of
Account, then calls the constructor function
__init__ with two arguments: the newly created object and the string
'Jim'. By convention, we use the parameter name
self for the first argument of a constructor, because it is bound to the object being instantiated. This convention is adopted in virtually all Python code.
Now, we can access the object's
holder using dot notation.
>>> a.balance 0 >>> a.holder 'Jim'
Identity. Each new account instance has its own balance attribute, the value of which is independent of other objects of the same class.
>>> b = Account('Jack') >>> b.balance = 200 >>> [acc.balance for acc in (a, b)] [0, 200]
To enforce this separation, every object that is an instance of a user-defined class has a unique identity. Object identity is compared using the
is not operators.
>>> a is a True >>> a is not b True
Despite being constructed from identical calls, the objects bound to
b are not the same. As usual, binding an object to a new name using assignment does not create a new object.
>>> c = a >>> c is a True
New objects that have user-defined classes are only created when a class (such as
Account) is instantiated with call expression syntax.
Methods. Object methods are also defined by a
def statement in the suite of a
class statement. Below,
withdraw are both defined as methods on objects of the
>>> class Account(object): def __init__(self, account_holder): self.balance = 0 self.holder = account_holder def deposit(self, amount): self.balance = self.balance + amount return self.balance def withdraw(self, amount): if amount > self.balance: return 'Insufficient funds' self.balance = self.balance - amount return self.balance
While method definitions do not differ from function definitions in how they are declared, method definitions do have a different effect. The function value that is created by a
def statement within a
class statement is bound to the declared name, but bound locally within the class as an attribute. That value is invoked as a method using dot notation from an instance of the class.
Each method definition again includes a special first parameter
self, which is bound to the object on which the method is invoked. For example, let us say that
deposit is invoked on a particular
Account object and passed a single argument value: the amount deposited. The object itself is bound to
self, while the argument is bound to
amount. All invoked methods have access to the object via the
self parameter, and so they can all access and manipulate the object's state.
To invoke these methods, we again use dot notation, as illustrated below.
>>> tom_account = Account('Tom') >>> tom_account.deposit(100) 100 >>> tom_account.withdraw(90) 10 >>> tom_account.withdraw(90) 'Insufficient funds' >>> tom_account.holder 'Tom'
When a method is invoked via dot notation, the object itself (bound to
tom_account, in this case) plays a dual role. First, it determines what the name
withdraw is not a name in the environment, but instead a name that is local to the
Account class. Second, it is bound to the first parameter
self when the
withdraw method is invoked. The details of the procedure for evaluating dot notation follow in the next section.
2.5.3 Message Passing and Dot Expressions
Methods, which are defined in classes, and instance attributes, which are typically assigned in constructors, are the fundamental elements of object-oriented programming. These two concepts replicate much of the behavior of a dispatch dictionary in a message passing implementation of a data value. Objects take messages using dot notation, but instead of those messages being arbitrary string-valued keys, they are names local to a class. Objects also have named local state values (the instance attributes), but that state can be accessed and manipulated using dot notation, without having to employ
nonlocal statements in the implementation.
The central idea in message passing was that data values should have behavior by responding to messages that are relevant to the abstract type they represent. Dot notation is a syntactic feature of Python that formalizes the message passing metaphor. The advantage of using a language with a built-in object system is that message passing can interact seamlessly with other language features, such as assignment statements. We do not require different messages to "get" or "set" the value associated with a local attribute name; the language syntax allows us to use the message name directly.
Dot expressions. The code fragment
tom_account.deposit is called a dot expression. A dot expression consists of an expression, a dot, and a name:
<expression> . <name>
<expression> can be any valid Python expression, but the
<name> must be a simple name (not an expression that evaluates to a name). A dot expression evaluates to the value of the attribute with the given
<name>, for the object that is the value of the
The built-in function
getattr also returns an attribute for an object by name. It is the function equivalent of dot notation. Using
getattr, we can look up an attribute using a string, just as we did with a dispatch dictionary.
>>> getattr(tom_account, 'balance') 10
We can also test whether an object has a named attribute with
>>> hasattr(tom_account, 'deposit') True
The attributes of an object include all of its instance attributes, along with all of the attributes (including methods) defined in its class. Methods are attributes of the class that require special handling.
Method and functions. When a method is invoked on an object, that object is implicitly passed as the first argument to the method. That is, the object that is the value of the
<expression> to the left of the dot is passed automatically as the first argument to the method named on the right side of the dot expression. As a result, the object is bound to the parameter
To achieve automatic
self binding, Python distinguishes between functions, which we have been creating since the beginning of the course, and bound methods, which couple together a function and the object on which that method will be invoked. A bound method value is already associated with its first argument, the instance on which it was invoked, which will be named
self when the method is called.
We can see the difference in the interactive interpreter by calling
type on the returned values of dot expressions. As an attribute of a class, a method is just a function, but as an attribute of an instance, it is a bound method:
>>> type(Account.deposit) <class 'function'> >>> type(tom_account.deposit) <class 'method'>
These two results differ only in the fact that the first is a standard two-argument function with parameters
amount. The second is a one-argument method, where the name
self will be bound to the object named
tom_account automatically when the method is called, while the parameter
amount will be bound to the argument passed to the method. Both of these values, whether function values or bound method values, are associated with the same
deposit function body.
We can call
deposit in two ways: as a function and as a bound method. In the former case, we must supply an argument for the
self parameter explicitly. In the latter case, the
self parameter is bound automatically.
>>> Account.deposit(tom_account, 1001) # The deposit function requires 2 arguments 1011 >>> tom_account.deposit(1000) # The deposit method takes 1 argument 2011
getattr behaves exactly like dot notation: if its first argument is an object but the name is a method defined in the class, then
getattr returns a bound method value. On the other hand, if the first argument is a class, then
getattr returns the attribute value directly, which is a plain function.
Practical guidance: naming conventions. Class names are conventionally written using the CapWords convention (also called CamelCase because the capital letters in the middle of a name are like humps). Method names follow the standard convention of naming functions using lowercased words separated by underscores.
In some cases, there are instance variables and methods that are related to the maintenance and consistency of an object that we don't want users of the object to see or use. They are not part of the abstraction defined by a class, but instead part of the implementation. Python's convention dictates that if an attribute name starts with an underscore, it should only be accessed within methods of the class itself, rather than by users of the class.
2.5.4 Class Attributes
Some attribute values are shared across all objects of a given class. Such attributes are associated with the class itself, rather than any individual instance of the class. For instance, let us say that a bank pays interest on the balance of accounts at a fixed interest rate. That interest rate may change, but it is a single value shared across all accounts.
Class attributes are created by assignment statements in the suite of a
class statement, outside of any method definition. In the broader developer community, class attributes may also be called class variables or static variables. The following class statement creates a class attribute for
Account with the name
>>> class Account(object): interest = 0.02 # A class attribute def __init__(self, account_holder): self.balance = 0 self.holder = account_holder # Additional methods would be defined here
This attribute can still be accessed from any instance of the class.
>>> tom_account = Account('Tom') >>> jim_account = Account('Jim') >>> tom_account.interest 0.02 >>> jim_account.interest 0.02
However, a single assignment statement to a class attribute changes the value of the attribute for all instances of the class.
>>> Account.interest = 0.04 >>> tom_account.interest 0.04 >>> jim_account.interest 0.04
Attribute names. We have introduced enough complexity into our object system that we have to specify how names are resolved to particular attributes. After all, we could easily have a class attribute and an instance attribute with the same name.
As we have seen, a dot expressions consist of an expression, a dot, and a name:
<expression> . <name>
To evaluate a dot expression:
- Evaluate the
<expression>to the left of the dot, which yields the object of the dot expression.
<name>is matched against the instance attributes of that object; if an attribute with that name exists, its value is returned.
<name>does not appear among instance attributes, then
<name>is looked up in the class, which yields a class attribute value.
- That value is returned unless it is a function, in which case a bound method is returned instead.
In this evaluation procedure, instance attributes are found before class attributes, just as local names have priority over global in an environment. Methods defined within the class are bound to the object of the dot expression during the third step of this evaluation procedure. The procedure for looking up a name in a class has additional nuances that will arise shortly, once we introduce class inheritance.
Assignment. All assignment statements that contain a dot expression on their left-hand side affect attributes for the object of that dot expression. If the object is an instance, then assignment sets an instance attribute. If the object is a class, then assignment sets a class attribute. As a consequence of this rule, assignment to an attribute of an object cannot affect the attributes of its class. The examples below illustrate this distinction.
If we assign to the named attribute
interest of an account instance, we create a new instance attribute that has the same name as the existing class attribute.
>>> jim_account.interest = 0.08
and that attribute value will be returned from a dot expression.
>>> jim_account.interest 0.08
However, the class attribute
interest still retains its original value, which is returned for all other accounts.
>>> tom_account.interest 0.04
Changes to the class attribute
interest will affect
tom_account, but the instance attribute for
jim_account will be unaffected.
>>> Account.interest = 0.05 # changing the class attribute >>> tom_account.interest # changes instances without like-named instance attributes 0.05 >>> jim_account.interest # but the existing instance attribute is unaffected 0.08
When working in the OOP paradigm, we often find that different abstract data types are related. In particular, we find that similar classes differ in their amount of specialization. Two classes may have similar attributes, but one represents a special case of the other.
For example, we may want to implement a checking account, which is different from a standard account. A checking account charges an extra $1 for each withdrawal and has a lower interest rate. Here, we demonstrate the desired behavior.
>>> ch = CheckingAccount('Tom') >>> ch.interest # Lower interest rate for checking accounts 0.01 >>> ch.deposit(20) # Deposits are the same 20 >>> ch.withdraw(5) # withdrawals decrease balance by an extra charge 14
CheckingAccount is a specialization of an
Account. In OOP terminology, the generic account will serve as the base class of
CheckingAccount will be a subclass of
Account. (The terms parent class and superclass are also used for the base class, while child class is also used for the subclass.)
A subclass inherits the attributes of its base class, but may override certain attributes, including certain methods. With inheritance, we only specify what is different between the subclass and the base class. Anything that we leave unspecified in the subclass is automatically assumed to behave just as it would for the base class.
Inheritance also has a role in our object metaphor, in addition to being a useful organizational feature. Inheritance is meant to represent is-a relationships between classes, which contrast with has-a relationships. A checking account is-a specific type of account, so having a
CheckingAccount inherit from
Account is an appropriate use of inheritance. On the other hand, a bank has-a list of bank accounts that it manages, so neither should inherit from the other. Instead, a list of account objects would be naturally expressed as an instance attribute of a bank object.
2.5.6 Using Inheritance
We specify inheritance by putting the base class in parentheses after the class name. First, we give a full implementation of the
Account class, which includes docstrings for the class and its methods.
>>> class Account(object): """A bank account that has a non-negative balance.""" interest = 0.02 def __init__(self, account_holder): self.balance = 0 self.holder = account_holder def deposit(self, amount): """Increase the account balance by amount and return the new balance.""" self.balance = self.balance + amount return self.balance def withdraw(self, amount): """Decrease the account balance by amount and return the new balance.""" if amount > self.balance: return 'Insufficient funds' self.balance = self.balance - amount return self.balance
A full implementation of
CheckingAccount appears below.
>>> class CheckingAccount(Account): """A bank account that charges for withdrawals.""" withdraw_charge = 1 interest = 0.01 def withdraw(self, amount): return Account.withdraw(self, amount + self.withdraw_charge)
Here, we introduce a class attribute
withdraw_charge that is specific to the
CheckingAccount class. We assign a lower value to the
interest attribute. We also define a new
withdraw method to override the behavior defined in the
Account class. With no further statements in the class suite, all other behavior is inherited from the base class
>>> checking = CheckingAccount('Sam') >>> checking.deposit(10) 10 >>> checking.withdraw(5) 4 >>> checking.interest 0.01
checking.deposit evaluates to a bound method for making deposits, which was defined in the
Account class. When Python resolves a name in a dot expression that is not an attribute of the instance, it looks up the name in the class. In fact, the act of "looking up" a name in a class tries to find that name in every base class in the inheritance chain for the original object's class. We can define this procedure recursively. To look up a name in a class.
- If it names an attribute in the class, return the attribute value.
- Otherwise, look up the name in the base class, if there is one.
In the case of
deposit, Python would have looked for the name first on the instance, and then in the
CheckingAccount class. Finally, it would look in the
Account class, where
deposit is defined. According to our evaluation rule for dot expressions, since
deposit is a function looked up in the class for the
checking instance, the dot expression evaluates to a bound method value. That method is invoked with the argument
10, which calls the deposit method with
self bound to the
checking object and
amount bound to
The class of an object stays constant throughout. Even though the
deposit method was found in the
deposit is called with
self bound to an instance of
CheckingAccount, not of
Calling ancestors. Attributes that have been overridden are still accessible via class objects. For instance, we implemented the
withdraw method of
CheckingAccount by calling the
withdraw method of
Account with an argument that included the
Notice that we called
self.withdraw_charge rather than the equivalent
CheckingAccount.withdraw_charge. The benefit of the former over the latter is that a class that inherits from
CheckingAccount might override the withdrawal charge. If that is the case, we would like our implementation of
withdraw to find that new value instead of the old one.
2.5.7 Multiple Inheritance
Python supports the concept of a subclass inheriting attributes from multiple base classes, a language feature called multiple inheritance.
Suppose that we have a
SavingsAccount that inherits from
Account, but charges customers a small fee every time they make a deposit.
>>> class SavingsAccount(Account): deposit_charge = 2 def deposit(self, amount): return Account.deposit(self, amount - self.deposit_charge)
Then, a clever executive conceives of an
AsSeenOnTVAccount account with the best features of both
SavingsAccount: withdrawal fees, deposit fees, and a low interest rate. It's both a checking and a savings account in one! "If we build it," the executive reasons, "someone will sign up and pay all those fees. We'll even give them a dollar."
>>> class AsSeenOnTVAccount(CheckingAccount, SavingsAccount): def __init__(self, account_holder): self.holder = account_holder self.balance = 1 # A free dollar!
In fact, this implementation is complete. Both withdrawal and deposits will generate fees, using the function definitions in
>>> such_a_deal = AsSeenOnTVAccount("John") >>> such_a_deal.balance 1 >>> such_a_deal.deposit(20) # $2 fee from SavingsAccount.deposit 19 >>> such_a_deal.withdraw(5) # $1 fee from CheckingAccount.withdraw 13
Non-ambiguous references are resolved correctly as expected:
>>> such_a_deal.deposit_charge 2 >>> such_a_deal.withdraw_charge 1
But what about when the reference is ambiguous, such as the reference to the
withdraw method that is defined in both
CheckingAccount? The figure below depicts an inheritance graph for the
AsSeenOnTVAccount class. Each arrow points from a subclass to a base class.
For a simple "diamond" shape like this, Python resolves names from left to right, then upwards. In this example, Python checks for an attribute name in the following classes, in order, until an attribute with that name is found:
AsSeenOnTVAccount, CheckingAccount, SavingsAccount, Account, object
There is no correct solution to the inheritance ordering problem, as there are cases in which we might prefer to give precedence to certain inherited classes over others. However, any programming language that supports multiple inheritance must select some ordering in a consistent way, so that users of the language can predict the behavior of their programs.
Further reading. Python resolves this name using a recursive algorithm called the C3 Method Resolution Ordering. The method resolution order of any class can be queried using the
mro method on all classes.
>>> [c.__name__ for c in AsSeenOnTVAccount.mro()] ['AsSeenOnTVAccount', 'CheckingAccount', 'SavingsAccount', 'Account', 'object']
The precise algorithm for finding method resolution orderings is not a topic for this course, but is described by Python's primary author with a reference to the original paper.
2.5.8 The Role of Objects
The Python object system is designed to make data abstraction and message passing both convenient and flexible. The specialized syntax of classes, methods, inheritance, and dot expressions all enable us to formalize the object metaphor in our programs, which improves our ability to organize large programs.
In particular, we would like our object system to promote a separation of concerns among the different aspects of the program. Each object in a program encapsulates and manages some part of the program's state, and each class statement defines the functions that implement some part of the program's overall logic. Abstraction barriers enforce the boundaries between different aspects of a large program.
Object-oriented programming is particularly well-suited to programs that model systems that have separate but interacting parts. For instance, different users interact in a social network, different characters interact in a game, and different shapes interact in a physical simulation. When representing such systems, the objects in a program often map naturally onto objects in the system being modeled, and classes represent their types and relationships.
On the other hand, classes may not provide the best mechanism for implementing certain abstractions. Functional abstractions provide a more natural metaphor for representing relationships between inputs and outputs. One should not feel compelled to fit every bit of logic in a program within a class, especially when defining independent functions for manipulating data is more natural. Functions can also enforce a separation of concerns.
Multi-paradigm languages like Python allow programmers to match organizational paradigms to appropriate problems. Learning to identify when to introduce a new class, as opposed to a new function, in order to simplify or modularize a program, is an important design skill in software engineering that deserves careful attention.