Special Methods & Protocols
Special methods (also called magic methods) are the secret sauce to Python’s duck typing.
Defining the appropriate special methods in your classes is how you make your class act like the standard classes.
What’s in a Name?
We’ve seen at least one special method so far:
__init__
It’s all in the double underscores…
Pronounced “dunder”
try: dir(2)
or dir(list)
Generally Useful Special Methods
Most classes should at least have these special methods:
object.__str__
:Called by the str() built-in function and by the print function to compute the informal string representation of an object.
object.__repr__
:Called by the repr() built-in function to compute the official string representation of an object.
Ideally:
eval( repr(something) ) == something
This means that the “repr” is what you type to create the object. In practice, this is impractical for complex objects… but it is still a more “formal” form.
Note that if you don’t define a
__str__
method, then the__repr__
will be used. And the base class (object
) has a__repr__
defined, so every class automatically gets one – but it’s ugly :-)
Protocols
The set of special methods needed to emulate a particular type of Python object is called a protocol.
Your classes can “become” like Python built-in classes by implementing the methods in a given protocol.
Remember, these are more guidelines than laws. Implement only what you need.
The Numerics Protocol
Do you want your class to behave like a number? Implement these methods:
object.__add__(self, other)
object.__sub__(self, other)
object.__mul__(self, other)
object.__matmul__(self, other)
object.__truediv__(self, other)
object.__floordiv__(self, other)
object.__mod__(self, other)
object.__divmod__(self, other)
object.__pow__(self, other[, modulo])
object.__lshift__(self, other)
object.__rshift__(self, other)
object.__and__(self, other)
object.__xor__(self, other)
object.__or__(self, other)
(or the fraction you actually need).
Operator Overloading
Most of the previous examples map to “operators”: +, - , *, //, /, %
etc. This is often known as “operator overloading”, as you are redefining what the operators mean for that specific type.
Note that you can define these operators to do ANYTHING you want – but it is a really good idea to only define them to mean something that makes sense in the usual way.
One interesting exception to this rule is the pathlib.Path
class, that has defined __truediv__
to mean path concatenation:
In [19]: import pathlib
In [20]: p1 = pathlib.Path.cwd()
In [21]: p1
Out[21]: PosixPath('/Users/Chris/PythonStuff/UWPCE/PythonCertDevel')
In [22]: p1 / "a_filename"
Out[22]: PosixPath('/Users/Chris/PythonStuff/UWPCE/PythonCertDevel/a_filename')
While this is not division in any sense, the slash is used as a path separator – so this does make intuitive sense. At least to me – I think it’s pretty cool.
Comparing
If you want your objects to be comparable:
A > B
A < B
A >= B
etc…
There is a full set of magic methods you can use to override the “comparison operators”
__lt__ : < (less than)
__le__ : <= (less than or equal)
__eq__ : == (equal)
__ge__ : >= (greater than or equal)
__gt__ : > (greater than)
__ne__ : != (not equal)
These are known as the “rich comparison” operators, as they allow fuller featured comparisons. In particular, they are used by numpy to provide “element-wise” comparison – that is, comparing two arrays yields an array of results, rather than a single result:
In [26]: import numpy as np
In [27]: arr1 = np.array([3,4,5,6,7,8,9])
In [28]: arr2 = np.array([9,2,6,2,6,3,9])
In [29]: arr1 > arr2
Out[29]: array([False, True, False, True, True, True, False], dtype=bool)
In [30]: arr1 == arr2
Out[30]: array([False, False, False, False, False, False, True], dtype=bool)
This is just one example – the point is that for your particular class, you can define these comparisons however you want.
Total Ordering
You may notice that those operators are kind of redundant – if A > B is True
then we know that A < B is False
and A <= B is False
.
In fact, there is a mathematical / computer science concept known as “Total Order”: (https://en.wikipedia.org/wiki/Total_order), which strictly defines “well behaved” objects in this regard.
There may be some special cases, where these rules may not apply for your classes (though I can’t think of any :-) ), but for the most part, you want your classes, if they support comparisons at all, to be well behaved, or “total ordered”.
Because this is the common case, Python comes with a nifty utility that implements total ordering for you:
https://docs.python.org/3.6/library/functools.html#functools.total_ordering
It can be found in the functools module, and allows you to specify __eq__ and only one of: __lt__()
, __le__()
, __gt__()
, or __ge__()
. It will then fill in the others for you.
Note: if you define only one, it should be __lt__
, because this is the one used for sorting (see below for more about that).
Here is the truncated example from the docs:
@total_ordering
class Student:
def __eq__(self, other):
return ((self.lastname.lower(), self.firstname.lower()) ==
(other.lastname.lower(), other.firstname.lower()))
def __lt__(self, other):
return ((self.lastname.lower(), self.firstname.lower()) <
(other.lastname.lower(), other.firstname.lower()))
Note that this makes it a lot easier than implementing all six comparison operators. However, if you read the doc, it lets you know that total_ordering
has poor performance – it is doing extra method call re-direction when the operators are used. If performance matters to your use case (and it probably doesn’t), you need to write all six comparison dunders.
Sorting
Python has a handful of sorting methods built in:
list.sort()
– for sorting a list in place.
sorted(iterable)
– for creating a sorted copy of an iterable (sequence).
And a couple of more obscure ones.
In order for your custom objects to be sortable, they need the __lt__
(less than) magic method defined – that’s about it.
So if you are using the total_ordering
decorator, it’s best to define __eq__
and __lt__
– that way sorting will be able to use a “native” method for sorting, and maybe get better performance.
Sort key methods
By default, the sorting methods use __lt__
for comparison, and that algorithm calls __lt__
O(n log(n)) times. But if you pass a “key” function in to the sort call:
a_list.sort(key=key_fun)
then the key_fun is only called n times, and if the key returns a simple type, like an integer or float, then the sorting will be faster.
So it often helps to provide a sort_key() method on your class, so it can be passed in to the sort methods:
class Simple:
"""
simple class to demonstrate a simple sorting key method
"""
def __init__(self, val):
self.val = val
def sort_key(self):
return self.val
And to use it:
list_of_Simple_objects.sort(key=Simple.sort_key)
See: sort_key.py
for a complete example with timing. Here is an example of running it:
Timing for 10000 items
regular sort took: 0.04288s
key sort took: 0.004779s
performance improvement factor: 8.9726
So almost 9 times faster for a 10,000 item list. Pretty good, eh?
An Example
Each of these methods supports a common Python operation.
For example, to make ‘+’ work with a sequence type in a vector-like fashion,
implement __add__
:
def __add__(self, v):
"""return the element-wise vector sum of self and v
"""
assert len(self) == len(v)
return vector([x1 + x2 for x1, x2 in zip(self, v)])
[a slightly more complete example may be seen here vector.py
]
Emulating Standard types
Making your classes behave like the built-ins.
The Container Protocol
Want to make a container type? Here’s what you need:
object.__len__(self)
object.__getitem__(self, key)
object.__setitem__(self, key, value)
object.__delitem__(self, key)
object.__iter__(self)
object.__reversed__(self)
object.__contains__(self, item)
object.__index__(self)
__len__
is called when len(object) is called.
__reversed__
is called when reversed(object) is called.
__contains__
is called with in
is used: something in object
__iter__
is used for iteration – called when in a for loop.
__index__
is used to convert the object into an integer for indexing. So you don’t define this in a container type but rather define it for a type so it can be used as an index. If you have a class that could reasonably be interpreted as an index, you should define this. It should return an integer. This was added to support multiple integer types for numpy.
Indexing and Slicing
__getitem__
and set__item__
are used when indexing:
x = object[i]
calls __getitem__
, and object[i] = something
calls __setitem__
.
But indexing is pretty complex in python. There is simple indexing: object[i]
, but there is also slicing: object[i:j:skip]
When you implement __getitem__(self, index)
, index
will simply be the index if it’s a simple index, but if it’s slicing, it will be a slice
object. Python also supports multiple slices:
object[a:b,c:d]
These are used in numpy to support multi-dimensional arrays, for instance.
In this case, a tuple of slice objects is passed in.
See: index_slicing.py
Callable classes
We’ve been using functions a lot:
def my_fun(something):
do_something
...
return something
And then we can call it:
result = my_fun(some_arguments)
But what if we need to store some data to know how to evaluate that function?
Example: a function that computes a quadratic function:
You could pass in a, b and c each time:
def quadratic(x, a, b, c):
return a * x**2 + b * x + c
But what if you are using the same a, b, and c numerous times?
Or what if you need to pass this in to something (like map) that requires a function that takes a single argument?
“Callables”
Various places in Python expect a “callable” – something that you can call like a function:
a_result = something(some_arguments)
“Something” in this case is often a function, but can be anything else that is “callable”.
What have we been introduced to recently that is “callable”, but not a function object?
Custom callable objects
The trick is one of Python’s “magic methods”
__call__(*args, **kwargs)
If you define a __call__
method in your class, it will be used when
code “calls” an instance of your class:
class Callable:
def __init__(self, .....)
some_initilization
def __call__(self, some_parameters)
Then you can do:
callable_instance = Callable(some_arguments)
result = callable_instance(some_arguments)
Callable example
An example of writing a callable class:
Write a class for a quadratic equation.
The initializer for that class should take the parameters:
a, b, c
It should store those parameters as attributes.
The resulting instance should evaluate the function when called, and return the result:
my_quad = Quadratic(a=2, b=3, c=1)
my_quad(0)
Here’s one way to do that:
quadratic.py
Protocols in Summary
Use special methods when you want your class to act like a “standard” type or class in some way.
Look up the special methods you need and define them (and only the ones you need).
There’s more to read about the details of implementing these methods:
References
Here is a good reference for magic methods:
http://minhhh.github.io/posts/a-guide-to-pythons-magic-methods
And with a bit more explanation: