Mimicking Immutability in Python with Type Hints
Immutable variables are supported by many languages, with some going even further by having immutability by default. Python is unfortunately not one of those languages. Still, we can leverage type hints and mypy to gain some of the benefits of immutability!
Why do we even want immutability?
Why bother with immutability at all, especially in a language like Python, where achieving an acceptable level of it is nontrivial?
Immutability makes our programs easier to reason about. It helps us reduce the realm of possibility, and it can provide some assurances about our data. Knowing that a variable is immutable gives us one less thing to worry about, and lessening our worries is always welcome!
While these benefits are particularly pronounced when dealing with multithreaded code (e.g. by helping to prevent data races), single-threaded code can also benefit greatly. Immutability helps prevent accidental modification to our data. If we have mutable state that we accidentally modify, then at best our program will fail with a sufficiently informative error message, allowing us to quickly identify and fix the bug.
Unfortunately, another likely outcome is that our state will become silently corrupted, causing seemingly unrelated bugs in our program. These bugs are hard to identify and frustrating to deal with. They also have the potential to become catastrophic, especially if the accidental corruption causes the state to violate invariants that other parts of our code expect to be upheld.
In contrast, accidental modifications to immutable state fail with either of the following:
- A straightforward error message at runtime:
$ python test.py
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    state.x = 2
  File "<string>", line 4, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'x'
- Or, even better, with an error message at “compile time” from either your compiler or static analysis tool (like mypy):
$ mypy test.py
test.py:9: error: Property "x" defined in "State" is read-only
Found 1 error in 1 file (checked 1 source file)
Applying this to Python
While Python does not have built-in language support for general immutability, we can leverage type hints from the typing module to get us some of the way there.
To start, let's look at two type hints that are frequently used in Python code: List and Dict. Both List and Dict declare methods that support mutation, like append and __setitem__. So, mypy reports no errors with the following:
from typing import Dict, List
l: List[int] = [1, 2, 3]
d: Dict[int, str] = {1: "Hello, world!"}
l.append(4)
d[0] = "Hi!"
$ mypy test.py
Success: no issues found in 1 source file
Great! This meets our expectations. We can freely mutate lists and dictionaries within Python, and the analogous type hints declare all of the appropriate methods to support this.
Now, what if we want an immutable list or an immutable dictionary? Unfortunately, Python does not have a frozenset-style alternative for lists or dicts (tuple comes closest for lists, but it is a separate type rather than a frozen list). We can instead reach into the typing module for "immutable" versions, namely Sequence and Mapping respectively.
Calling Sequence and Mapping “immutable” may be slightly off. They do not enforce immutability; instead, they just do not declare methods that can mutate the object.
Here's an example of what I mean:
from typing import Mapping
d: Mapping[str, bool] = {"a": True, "b": False}
d["c"] = False
With the corresponding mypy output:
$ mypy test.py
test.py:7: error: Unsupported target for indexed assignment ("Mapping[str, bool]")
Found 1 error in 1 file (checked 1 source file)
Mapping does not declare __setitem__, which is implicitly called with the d["c"] = False syntax, and this causes mypy to report an error. We've used type hints and mypy to catch unintentional mutation without having to run our code!
Looking at the Python standard library documentation for collections abstract base classes, we can see that Mapping does not declare __delitem__ either. We can quickly verify that Mapping + mypy catches attempts to delete items:
from typing import Mapping
d: Mapping[str, bool] = {"a": True, "b": False}
del d["a"]
$ mypy test.py
test.py:7: error: "Mapping[str, bool]" has no attribute "__delitem__"; maybe "__getitem__"?
Found 1 error in 1 file (checked 1 source file)
Yay!
Sequence + mypy gives us the same benefits for lists, as Sequence also does not declare __setitem__ or __delitem__:
from typing import Sequence
l: Sequence[int] = [1, 2, 3]
l[0] = -1
del l[2]
$ mypy test.py
test.py:7: error: Unsupported target for indexed assignment ("Sequence[int]")
test.py:8: error: "Sequence[int]" has no attribute "__delitem__"; maybe "__getitem__"?
Found 2 errors in 1 file (checked 1 source file)
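The same annotations carry over to function signatures, which is where I find them most useful. Here's a minimal sketch; the function total_score and its data are made up for illustration and aren't from any real codebase:

```python
from typing import Mapping, Sequence


def total_score(names: Sequence[str], scores: Mapping[str, int]) -> int:
    """Sum the scores for the given names; unknown names count as 0."""
    # Writing names.append(...) or scores[...] = ... in here would be
    # reported by mypy, since Sequence and Mapping declare no mutating methods.
    return sum(scores.get(name, 0) for name in names)


# Lists, tuples, and dicts all satisfy the read-only annotations.
print(total_score(["ann", "bob"], {"ann": 3, "bob": 4}))
print(total_score(("ann", "cy"), {"ann": 3}))
```

A nice side effect is that callers aren't forced to build a list: anything that behaves like a sequence will do.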
Frozen dataclasses
I think dataclasses deserve a special mention here. I love dataclasses. They're one of my favorite parts of Python. They can provide so many benefits to your codebase, most of which are out of the scope of this post.
One of their benefits, though, is the fact that you can declare your dataclasses to be "frozen", like so:
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    x: int
Modifying attributes of an instance of a frozen dataclass will fail at runtime with a FrozenInstanceError:
state = State(1)
state.x = 2
$ python test.py
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    state.x = 2
  File "<string>", line 4, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'x'
And if that wasn't exciting enough, mypy is smart enough to understand that dataclasses can be frozen, and will report an error with the above code:
$ mypy test.py
test.py:10: error: Property "x" defined in "State" is read-only
Found 1 error in 1 file (checked 1 source file)
frozen=True also gives some nice benefits beyond this, like automatically adding a __hash__ implementation provided that eq=True (which is the default).
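To make the __hash__ point concrete, here's a small sketch showing frozen instances used as set members and dict keys. The Point class is a hypothetical example, not something from earlier in this post:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int


# Equal instances hash the same, so duplicates collapse in a set.
visited = {Point(0, 0), Point(0, 0), Point(1, 2)}
print(len(visited))

# Frozen instances also work as dictionary keys.
labels = {Point(1, 2): "goal"}
print(labels[Point(1, 2)])
```

With a plain @dataclass (eq=True but not frozen), __hash__ is set to None, so the set literal above would raise a TypeError instead.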
While it would have been nicer for frozen=True to be the default, dataclasses still provide an elegant way of grouping related data in your program, akin to structs in other languages. They save a lot of boilerplate, and they provide a nice foundation for other great libraries like dataclasses-json.
Of course, this is still Python
Unfortunately, you can't create truly immutable objects in Python. The Python interpreter ignores type hints when running your code, so the following will work:
from typing import Sequence
x: Sequence[int] = [1, 2]
x.append(3)
assert x == [1, 2, 3]
mypy will flag an error here, but Python will happily run this because as far as Python is concerned, type(x) == <class 'list'>. The same is true with Mapping:
from typing import Mapping
x: Mapping[str, int] = {"a": 1}
x["b"] = 2
assert x == {"a": 1, "b": 2}
Even our beloved frozen dataclasses can be mutated, albeit with slightly arcane syntax:
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    x: int

state = State(1)
object.__setattr__(state, "x", 2)
assert state.x == 2
(Please don't do this.)
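If you do want ordinary mutation to fail at runtime for a particular value, the standard library offers a couple of partial answers: tuple for sequences and types.MappingProxyType for a read-only view over a dict. This is only a sketch with made-up names (_backing, settings, ports), and it comes with the same caveat as everything else in this section: it's a speed bump, not a wall.

```python
from types import MappingProxyType
from typing import List, Mapping, Sequence

# Hypothetical example data, not from the post.
_backing = {"debug": False}
settings: Mapping[str, bool] = MappingProxyType(_backing)  # read-only view
ports: Sequence[int] = (80, 443)  # tuples have no mutating methods

rejected: List[str] = []

try:
    # mypy flags this line as well, hence the ignore comment.
    settings["debug"] = True  # type: ignore
except TypeError:
    rejected.append("mapping assignment")

try:
    ports.append(8080)  # type: ignore
except AttributeError:
    rejected.append("tuple append")

print(rejected)

# Caveat: the proxy is only a view, so whoever holds the backing dict
# can still change what the view shows.
_backing["debug"] = True
print(settings["debug"])
```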
But let's remember our "threat model"
We're not pursuing immutability to prevent all cases of our state being modified. This is Python; if someone wants to mutate something they will succeed.
We're pursuing immutability to help reduce bugs in our code and catch developer mistakes. The risk of a developer accidentally modifying state using object.__setattr__ is very low compared to a developer accidentally modifying state via state.x = 2 or foo(state.x).
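The foo(state.x) case is worth a closer look, because frozen=True only stops attribute reassignment; it says nothing about the objects the attributes point to. Below is a rough sketch of the difference a read-only field annotation makes. The classes and the foo function here are hypothetical, written just for this illustration:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class MutableInside:  # hypothetical: frozen, but holds a mutable list
    items: List[int]


@dataclass(frozen=True)
class ReadOnlyInside:  # hypothetical: frozen, with a read-only field type
    items: Sequence[int]


def foo(values: List[int]) -> None:
    """A helper that mutates its argument in place."""
    values.append(99)


a = MutableInside([1, 2])
foo(a.items)  # accepted by mypy, and it silently changes a.items
print(a.items)

b = ReadOnlyInside((1, 2))
# foo(b.items)  # mypy rejects this call: a Sequence[int] is not a List[int]
print(b.items)
```

So combining frozen=True with Sequence and Mapping field types lets mypy catch this kind of accident too.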
While Python gives us some tools for this, they are not on by default. Therefore, we need to be conscious of them while developing, ideally making them a habit.
As the above illustrates, this is not as good as languages like Rust or Haskell where you need to opt in to mutability. Another thing to watch out for is that mypy's type inference defaults to the mutable versions of types. We can see this by using the reveal_type() mypy builtin:
x = [1, 2, 3]
reveal_type(x)
y = {"hi": 1}
reveal_type(y)
z = {True, False}
reveal_type(z)
$ mypy test.py
test.py:5: note: Revealed type is 'builtins.list[builtins.int*]'
test.py:8: note: Revealed type is 'builtins.dict[builtins.str*, builtins.int*]'
test.py:11: note: Revealed type is 'builtins.set[builtins.bool*]'
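Explicit annotations override that inference. The snippet below is a small sketch of the three literals above with read-only annotations; AbstractSet is the typing module's read-only counterpart for sets, which hasn't come up yet in this post:

```python
from typing import AbstractSet, Mapping, Sequence

# With explicit annotations, mypy treats each value through its
# read-only interface instead of the inferred list/dict/set.
x: Sequence[int] = [1, 2, 3]
y: Mapping[str, int] = {"hi": 1}
z: AbstractSet[bool] = {True, False}

# Reading works as usual; x.append(4), y["yo"] = 2, or z.add(True)
# would each be reported by mypy.
print(x[0], y["hi"], True in z)
```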
I try to simulate immutability by default in Python by always reaching for Sequence and Mapping in my method signatures and variables instead of List and Dict. I also try to add frozen=True to every dataclass I create, unless I know for sure that I need mutability (and dataclasses.replace() isn't sufficient).
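In case dataclasses.replace() is new to you, here's a quick sketch of how it lets you "update" a frozen instance by building a modified copy. I've added a hypothetical label field to State purely to show that untouched fields carry over:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    x: int
    label: str = "initial"  # hypothetical extra field for illustration


old = State(1)
new = replace(old, x=2)  # builds a new State; old is left untouched

print(old)
print(new)
```

Because the original is never modified, any code still holding a reference to old keeps seeing the values it started with.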
So, mimicking immutability in Python unfortunately has to be a sustained effort while you are developing. But I believe the long-term benefits to your project outweigh this ongoing investment.