Implementing Rust's dbg! in Python
Rust has an amazing dbg macro that lets you quickly set up an expression printer that will also put in the source line. It also returns the value of the expression so you can even easily inline the printing when you want to!
#!rust
let a = 2;
let b = dbg!(a * 2) + 1;
// ^-- prints: [src/main.rs:2] a * 2 = 4
assert_eq!(b, 5);
Doing a bunch of Python, I want this in Python as well. I want a debug macro, one that gives me the file location + the expression used to calculate the value.
My ideal would be to have a function where dbg(foo.bar[1])
that would:
- print
[source line] foo.bar[1] = <the value>
- return the value so we can use it within expressions easily
When doing this by hand I always reach for f
-strings, to do something like:
#!python
print(f"foo.bar[1]={foo.bar[1]}")
This is annoying because you end up typing the same expression twice, you can't inline it, and it just doesn't look very cool.
The first thing when trying to implement this is to come to terms with Python's lack of macros. You can't really mess with the syntax tree, so you're really only operating on calculated values. So it's really hard to capture an expression and its result in a single operation because, well, your expressions get evaluated.
So it'll be hard, to print some_expr = <result of some_expr>
without providing the string of some_expr
at one point.
However, thanks to eval
, you can flip this idea on its head. You need the string for the expression, but you don't need the expression itself!
#!python
foo = {
'bar': 3
}
print(eval("foo['bar']")) # prints 3
One step backwards, let's now go two steps forward. We can start writing our dbg
helper function
#!python
foo = {
"bar": 3
}
def dbg(expr):
result = eval(expr)
print(f"{expr} = {result}")
dbg("foo['bar']") # foo['bar'] = 3
So far so good, we have a nice pretty printer here at the cost of two quotation marks.
Of course this doesn't work perfectly:
#!python
expr = 5
dbg("expr") # expr = expr
Turns out that when calling eval
without arguments we are evaluating within the scope of dbg
. Our examples work because dbg
is defined alongside our expressions (so foo
is accessible within dbg
) but it's susceptible to name shadowing or outright not being able to calculate the expression because it could be defined in a different scope
#!python
def dbg(expr):
# we're calculating the result here....
result = eval(expr)
print(f"{expr} = {result}")
# but actually want to calculate
# the result within this scope
dbg("expr")
eval
by default will evaluate the expression within the current frame. But we can pass in our own globals and locals to instead evaluate the expression within an arbitrary environment
#!python
x = 3
print(eval("x", {"x": 4})) # gives 4
So let's reproduce the caller's environment to get us the right results. Enter inspect
:
There are four main kinds of services provided by this module: type checking, getting source code, inspecting classes and functions, and examining the interpreter stack.
Let's try out inspect.stack()
#!python
import inspect
def dbg(expr):
print(inspect.stack()[0].code_context)
result = eval(expr)
print(f"{expr} = {result}")
return result
dbg("foo['bar']")
expr = 5
dbg("expr")
if dbg("True"):
print("Passed")
trying this out, we get the following:
#!python
[' print(inspect.stack()[0].code_context)\n']
foo['bar'] = 3
[' print(inspect.stack()[0].code_context)\n']
expr = expr
[' print(inspect.stack()[0].code_context)\n']
True = True
Passed
Since this is in a function call we'll actually need to go up the stack one frame (inspect.stack()[1]
) to get the proper line:
#!python
['dbg("foo[\'bar\']")\n']
foo['bar'] = 3
['dbg("expr")\n']
expr = expr
['if dbg("True"):\n']
True = True
Passed
OK, now we are getting closer to where we need. the second element in our stack is where we'll also get the file path and the line number. Let's clean this up a bit and now we get:
#!python
[/home/rtpg/proj/configfiles/pyhelpers.py: 21] foo['bar'] = 3
[/home/rtpg/proj/configfiles/pyhelpers.py: 25] expr = expr
[/home/rtpg/proj/configfiles/pyhelpers.py: 27] True = True
Here we have the full file path... one might do a trick here to stick with the file name:
#!python
def dbg(expr):
frame = inspect.stack()[1].frame
filename = frame.filename.split("/")[-1]
print(f"[{filename}: {frame.lineno}] {expr} = {result}")
return result
OK so with this we end up with the following nice printed output
#!python
[pyhelpers.py: 21] foo['bar'] = 3
[pyhelpers.py: 25] expr = expr
[pyhelpers.py: 27] True = True
OK so this is good on the formatting. But we still haven't fixed eval
! That's a tiny fix this:
#!python
def dbg(expr):
frame = inspect.stack()[1].frame
result = eval(expr, frame.f_globals, frame.f_locals)
print(f"{expr} = {result}")
return result
This gets us the right values, properly evaluating expr
from the calling scope instead of dbg
:
#!python
[pyhelpers.py: 21] foo['bar'] = 3
[pyhelpers.py: 25] expr = 5
[pyhelpers.py: 27] True = True
The f_globals
and f_locals
on the frames is still one of my favorite Python features. It lets basically anyone write very good debugging tools without having to do a bunch of magic.
As an added bonus, if you're an IPython user you can add this to a startup script over in ~/.ipython/profile_default/
and make this available in any interactive shell.
For those who value silly things like correctness, performance, and intellectual integrity your journey ends here. This is a nice little helper that you can use in your projects, and there's nothing else for you to see here.
For those who like to live dangerously, I have something to show you.....
So.
Inside the frame, we have a bunch of information, but one thing that's pretty nice for explorative debugging is the code_context
#!python
print(frame.code_context[0])
# prints dbg('foo["bar"]')
So we actually have the source line from the dbg
call built in!
Let's imagine going back a bit to our "idealized" debug function:
#!python
dbg(foo['bar'])
Now, if we ran such a function, the body would receive the value of foo['bar']
and not the expression. But thanks to the stack we could theoretically recover the original expression!
This gets a bit tricker if you want to do more clever stuff like:
#!python
if dbg(my_func(3, foo["bar"] + "()"))) == 5:
do_stuff()
But let's try to work this out from first principles a bit.
If you have the source code line with the dbg
call, how can you recover the expression being debugged?
The first thing is just to find the beginning of the expression. That's pretty easy, just look for dbg(
#!text
if dbg(my_func(3, foo["bar"] + "()")) == 5:
^^^^
So we find the beginning of the expression, right? And since a function call in Python is an open parentheses, some expressions, then a close parentheses, we just have to find the close parentheses and we have the string fragment representing our expression. Let's apply the logic here:
#!text
if dbg(my_func(3, foo["bar"] + "()")) == 5:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Great! Our expression is my_func(3, foo["bar"] + "(
. Definitely what we wanted. Yep.
This is one of those things where the only way to parse out an expression is to actually parse it out. Simple string matching won't cut it, especially when you start getting inner function calls in the mix.
You could write a simple parser, try and count parens, open quotes, escaping in quotes, all the fun little games. But honestly I want a thing that works, and I don't want to think about it too hard.
I just want to find out where the expression is in this string.
So here's my amazing plan to deal with this:
#!python
ast.parse('my_func(3, foo["bar"] + "()")) == 5:')
# fails
ast.parse('my_func(3, foo["bar"] + "()")) == 5')
# fails
ast.parse('my_func(3, foo["bar"] + "()")) == ')
# fails
...
ast.parse('my_func(3, foo["bar"] + "()")')
# doesn't fail => I have my expression!
I just start with the longest possible string, and slowly shrink it until I find something that looks like an expression. Because the )
from the debug call imbalances everything I'm like... 80% sure that this way of finding the expression is correct.
There are some edge cases that make this strategy not work. For example if you do something like:
#!python
return dbg(x) + dbg(y)
You have two dbg
s on the same line, so it's hard to know which call you're in, because you're only provided the line number. (On some other project I spent a lot of time hand-wringing on this exact problem and could not find a solution, even with tricks like counting executions. I'm fairly convinced this is equivalent to the halting problem somehow)
And to be honest my solution here isn't battle-tested, it just feels right and works with the complicated expressions I've thrown at it. But it was a fun thing to write and now the following works:
#!python
dbg(foo["bar"])
# [pyhelpers.py: 49] foo["bar"] = 3
if dbg(my_func(3, foo["bar"] + "()")) == 5:
# [pyhelpers.py: 52] my_func(3, foo["bar"] + "()") = 5
print("Success")
No more stringifying silliness, no more calls to eval
, just a really silly parsing hack that has no real foundation except my gut feeling.
Good enough for me!
the final implemetaion is below:
#!python
def dbg(result):
"""
Recover the expression giving the result, and then
print a helpful debug statement showing this
"""
frame = inspect.stack()[1]
expr = extract_dbg(frame.code_context[0])
filename = frame.filename.split("/")[-1]
print(f"[{filename}: {frame.lineno}] {expr} = {result}")
return result
def extract_dbg(code_fragment):
# from a line of source text, try and find the expression
# given to a call to dbg
expression_options = code_fragment.split("dbg(")
if len(expression_options) != 2:
# if there are either multiple dbg statements
# or I can't find the dbg line, bail
return "???"
# get the part to the right of dbg(
expr_candidate = expression_options[1]
while expr_candidate:
try:
ast.parse(expr_candidate)
return expr_candidate
except SyntaxError:
expr_candidate = expr_candidate[:-1]
# didn't find anything, also give up
return "???"
2020-12-24 Update: Thanks to everyone who read this and offered feedback and comments on this! It's great to have stuff you write actually read by your peers.
Loads of people mentioned icecream as a library that's offering this functionality (along with tests and a fun name). If I were pulling this into a project I think this would be the option I would go for.
Another important detail: Python 3.8's f
-strings now include a wonderful =
formatting option to make it easier to print the expressions:
print(f"{foo=}")
will print out foo=3
, so that mostly relieves pressure from some of the original motivation here.
And for the specific finding of the executing source, Alex Hall wrote this amazing little library, that goes much much deeper into this problem.
Its solution to resolving the issue of only knowing the line number: taking the source code, adding a sentinel string, compiling, and then comparing the bytecode to try and sniff out the result.
There's more details in the actual github page, but it's quite a smart trick (and, I would say, based in a much more principled foundation).