How To Adopt Mypy On Bigger Projects
tl;dr: don't turn on all the checks for a subset of files. Check as many files as possible, and add checks on a module-by-module basis.
This post is a bit of a response to some online complaining by Armin Ronacher about mypy. While I agree in the abstract that mypy's payoff feels less nice than TypeScript's, I spent 30 minutes or so looking at Sentry's codebase and felt that Armin's problems are not "just" mypy problems, but problems compounded by how Sentry is choosing to adopt it.
Adding typing to Python is a pain in the butt in many situations. The payoffs don't come nearly as quickly as with TypeScript, and things like Django are just really tricky to get working.
Having said that, there are very good payoffs once you get to very high coverage. On larger codebases, though, not all paths are created equal.
Unfortunately, Sentry's mypy.ini is, at the time of this writing, a good example of how not to do this.
The most important thing, by far, is that you want mypy to look at as many of your files as possible.
Imagine you have three Python modules:
A is your platonic ideal of a fully typed module.
B has some work to do, and
C is kind of cowboy land.
Many projects will say "OK, we will run mypy on A":

    [mypy]
    files = src/A/
    disallow_any_generics=True
    disallow_untyped_defs=True
    disallow_untyped_calls=True
    # 100 other amazing correctness and style options
This will make sure A stays clean, but you won't get any checking of the usage of A elsewhere. So you might go to the next step, of including B:

    [mypy]
    files = src/A/, src/B/
    # all the important correctness options
But now you hit a bunch of errors, because B is only partially cleaned up!
This is the trap of the perfectionist using any form of linting tool like mypy. You want all the good options, so you turn them on. But then you can only cover a subset of files.
You could have 100% of the checks on 10% of the code, or... 50% of the checks on 50% of the code? It's foolish to pretend there's some objective metric here, but I think the general idea is sound.
If you instead do something like:

    [mypy]
    files = src/A/, src/B/
    disallow_any_generics=True
    disallow_untyped_defs=True
    disallow_untyped_calls=True

    [mypy-B.*]
    disallow_untyped_calls=False
    disallow_untyped_defs=True
You will now at least get some type checking on some of the code.
But what you really actually want to do, to try and get src/C/ into the game, is:

    [mypy]
    files = src/
    disallow_any_generics=True
    disallow_untyped_defs=True
    disallow_untyped_calls=True

    [mypy-B.*]
    disallow_untyped_calls=False
    disallow_untyped_defs=True

    [mypy-C.*]
    disallow_untyped_calls=False
    disallow_untyped_defs=False
    disallow_any_generics=False
The important part here is that maybe we have to disable some extra checks to get C into mypy's set of checks, even if it's only a fragment.
This obviously looks very tedious. I would recommend writing a template that generates some of these for people who have loads of modules. That leaves most of the tedium at the "configuring mypy per module" step, rather than the "spend a bunch of time handling Any false positives" step.
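A generator for those per-module sections could be as small as the sketch below. It is only an illustration: render_mypy_ini, STRICT_DEFAULTS and RELAXATIONS are hypothetical names, and the options are just the three used in the configs above.

```python
from typing import Dict

# Strict defaults for the top-level [mypy] section (the options used above).
STRICT_DEFAULTS: Dict[str, bool] = {
    "disallow_any_generics": True,
    "disallow_untyped_defs": True,
    "disallow_untyped_calls": True,
}

# Hypothetical per-module relaxations: module pattern -> options to override.
RELAXATIONS: Dict[str, Dict[str, bool]] = {
    "B.*": {"disallow_untyped_calls": False},
    "C.*": {
        "disallow_untyped_calls": False,
        "disallow_untyped_defs": False,
        "disallow_any_generics": False,
    },
}


def render_mypy_ini(
    files: str,
    defaults: Dict[str, bool],
    relaxations: Dict[str, Dict[str, bool]],
) -> str:
    """Render the text of a mypy.ini with one override section per pattern."""
    lines = ["[mypy]", f"files = {files}"]
    lines += [f"{name}={value}" for name, value in defaults.items()]
    for pattern, overrides in relaxations.items():
        lines.append("")  # blank line between sections
        lines.append(f"[mypy-{pattern}]")
        lines += [f"{name}={value}" for name, value in overrides.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mypy_ini("src/", STRICT_DEFAULTS, RELAXATIONS))
```

Keeping the relaxations in one dictionary also gives you a visible to-do list: deleting an entry is how a module graduates to the strict defaults.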
disallow_untyped_defs is something that feels very wrong to set to False, but we shouldn't forget that mypy will still check the bodies of the functions that do have annotations! You don't want to just have good function definitions, you want those usages to be checked in other places, even if that type checking process is incomplete.
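As a rough illustration (all function names here are made up), this is the kind of module that is allowed with disallow_untyped_defs set to False and still gets useful checking. It assumes mypy's default of not checking the bodies of unannotated functions.

```python
def parse_port(value: str) -> int:
    """Fully annotated helper: mypy checks this body and its checked call sites."""
    return int(value)


def legacy_handler(raw):
    # Unannotated: allowed when disallow_untyped_defs=False. By default mypy
    # treats `raw` as Any and skips this body, so nothing is reported here.
    return parse_port(raw)


def default_port() -> int:
    # Annotated caller: mypy checks this body and should flag the call below,
    # because an int is passed where parse_port expects a str.
    # It still runs, since int(8080) is valid at runtime.
    return parse_port(8080)
```

The mistake in default_port is deliberate. It is the sort of thing that only surfaces once the file is inside mypy's run at all, even with the untyped legacy_handler sitting right next to it.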
A point about Any: the unfortunate reality is that right now mypy will often end up making many things into Any when code is incompletely transitioned into type-conformance, and that cascades into a lot of places. Because of this, disallowing Any usage conflicts with things like allowing missing imports. Again, the point is that you want code that can be checked, to be checked. Being aggressive about Any usage as a prerequisite to pulling code into mypy's checks will lead to a lot of friction.
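A small sketch of that cascade, with hypothetical names: an unannotated function returns Any as far as mypy is concerned, and Any is compatible with whatever annotation it is assigned to. This assumes disallow_untyped_calls is off for the module, as in the relaxed sections above.

```python
def legacy_lookup(key):
    # Unannotated, so mypy treats the return value as Any.
    settings = {"port": "8080"}
    return settings[key]


def connect_port() -> int:
    # mypy accepts this assignment: an Any value is compatible with the
    # declared int, even though the runtime value is actually a str.
    port: int = legacy_lookup("port")
    return port
```

Everything downstream of connect_port now believes it has an int. Annotating legacy_lookup is what makes the mismatch visible, which is a better use of effort than banning Any up front.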
Any of the disallow_*_any_* options will directly conflict with the objective of getting good mypy coverage. Disabling them will get you some checks, though more on the level of pylint than TypeScript in some cases.
There is a weird side effect of adopting mypy this way: sometimes you make a very minor change of improving/adding a function signature, and suddenly you uncover 10 little bugs. So long as, as a team, you are willing to let people fix what's reasonable, and otherwise punt on the hard problems so as not to slow things down too much, you can end up slowly, but surely, getting to a nice set of typed code. And, of course, all of your new code can be fresh and clean.
Some other random tips:
assert isinstance(thing, SomeClass) is a tried-and-true way of type narrowing. If you really care about the check always running (asserts are stripped under python -O), use if not isinstance(thing, SomeClass): raise ValueError(...). Union problems can go away real easily this way.
TypedDict is not that fun ergonomically. It does what it says it will do, but you really have to read the PEP to not just lose all value. Sometimes just opting for dataclasses (or tuples!) is much nicer. More recent Python versions make tuples nicer to write out.
It's important to consider how much you care about types in your test code. I know I'm saying to cover a huge chunk of code, but, well... test code will be run and do its thing. It might be worth having less strict things there.
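To make the isinstance tip concrete, here is a minimal sketch of both narrowing styles. Cat, Dog and the two speak functions are invented for the example.

```python
from typing import Union


class Cat:
    def meow(self) -> str:
        return "meow"


class Dog:
    def bark(self) -> str:
        return "woof"


def speak_assert(pet: Union[Cat, Dog]) -> str:
    # After the assert, mypy treats `pet` as Cat, so .meow() type-checks.
    assert isinstance(pet, Cat)
    return pet.meow()


def speak_raise(pet: Union[Cat, Dog]) -> str:
    # Same narrowing, but this check survives `python -O`, which strips asserts.
    if not isinstance(pet, Cat):
        raise ValueError(f"expected Cat, got {type(pet).__name__}")
    return pet.meow()
```

Without either check, mypy rejects pet.meow() because Dog has no such method. With one of them, the Union complaint goes away and the runtime gets a guard for free.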