Checking Out CPython 3.14's remote debugging protocol
python -m pdb -p pid lets you connect a pdb session to a running Python process. This post goes into a part of what makes this possible.
The barrier to entry for writing general debugging tools for Python programs has always been quite low. Unlike in many languages, you rarely find yourself working with weird internals. Instead, debugging tools can be built on pretty straightforward knowledge of the language.
This is powered by the language treating things like exception tracebacks as first-class objects, by standard library modules like traceback, and of course by the power of eval/exec.
And thanks to this, tools like pdb can be provided out of the box, and easily customized by people without much (if any) understanding of the internals of the language.
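As a small illustration of that (a toy of my own, not anything from pdb's documentation), customizing pdb can be as simple as subclassing pdb.Pdb and adding a do_<name> method, since its command loop comes from cmd.Cmd:

# a toy sketch of customizing pdb: subclass it and add a new command
import pdb

class MyPdb(pdb.Pdb):
    # commands are just methods named do_<command>, courtesy of cmd.Cmd
    def do_shout(self, arg):
        """shout <text>: print <text> loudly, just to show the mechanism"""
        print(arg.upper() + "!!!")

# MyPdb().set_trace() behaves like pdb.set_trace(), plus the extra command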
Some might comment that the existence of things like pdb has prevented Python programmers from "simply" learning how to use more universal debuggers like gdb/lldb. But I believe that loads of excellent REPL-based debugging experimentation happens because people can simply write tools like ipdb or pdb++ in Python.
There's always been a bit of a catch, though. Generally these debugging tools require you first to edit your program source.
You can easily stick a:
import pdb
pdb.set_trace()
right in the middle of your program near a problematic spot, or even set up an exception handler beforehand to capture outright exceptions.
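For the "exception handler beforehand" route, a minimal sketch (my own, using only the standard library) could look like this:

# open a post-mortem pdb session whenever an uncaught exception escapes
import sys
import pdb
import traceback

def debug_hook(exc_type, exc_value, exc_traceback):
    # print the traceback as usual, then drop into an interactive session
    traceback.print_exception(exc_type, exc_value, exc_traceback)
    pdb.post_mortem(exc_traceback)

sys.excepthook = debug_hook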
But if you haven't done this work beforehand, you'll likely end up needing to edit your source code somehow, restart your program, and then go into it.
There are tools that can work without restarting the program. Beyond language-agnostic debugging tools, tools like py-spy can work on running processes. But they often rely on tricks and need to know very specific details about CPython to work.
Python 3.14 offers some new functionality to standardize injecting Python code into a running process. This should reduce the need for a lot of these hacks.
sys.remote_exec takes a process ID of an existing Python process, along with the path of a Python script. Calling sys.remote_exec(pid, script_path) will signal to the remote process that we want it to run the script at script_path "soon".
# remote-script.py
print("Hello there")
# assuming that you have a Python process
# running at PID 95000
# inside a Python REPL
import sys
sys.remote_exec(95000, "remote-script.py")
# process 95000, once it has the opportunity to,
# will read the remote-script.py file, and execute
# the Python code it finds.
# over in the process 95000's standard output
Hello there
What does this look like in practice with a "real" program?
# hello.py
import time
import os

total = 0

def add_to_total(value):
    global total
    total += value
    # simulate slowness
    time.sleep(value)

def main():
    print("Number adder")
    print(f"{os.getpid()=}")
    global total
    while True:
        result = input("Give me an integer >>")
        try:
            result = int(result)
            add_to_total(result)
            print(f"The total is now {total}")
        except ValueError:
            print("Invalid input")

if __name__ == "__main__":
    main()
This program takes some integers as input and calculates a cumulative sum. It's a very basic stand-in for a client-server program where you might wonder "... what is the program doing and why am I waiting so long?"
Running the program starts a "REPL" taking integers to add them up, printing out the PID.
[1] % uv run python hello.py
Number adder
os.getpid()=95700
Give me an integer >>3
The total is now 3
Give me an integer >>
The program is now waiting for input.
Over in remote-script.py we'll put in some code that we want to inject into the process. Here, we will just print out the stack to see what the program is doing.
print("Hi from remote-script.py!")
from traceback import print_stack
print_stack()
With my original program still running, I've opened a Python REPL in a different shell, and run sys.remote_exec, pointing at the PID of my number adding process.
>>> import sys
>>> sys.remote_exec(95700, "remote-script.py")
The call to sys.remote_exec returns immediately, but the remote process at first didn't seem to do anything, still waiting on another integer from me:
Give me an integer >>
But once I actually provide an integer and return from the input call, I will see that my remote script runs.
Give me an integer >>5
Hi from remote-script.py!
File "/Users/rtpg/proj/remote-exec-sample/hello.py", line 30, in <module>
main()
File "/Users/rtpg/proj/remote-exec-sample/hello.py", line 20, in main
result = input("Give me an integer >>")
File "<string>", line 6, in <module>
The total is now 8
Give me an integer >>
This behavior illustrates a couple of things:
- the remote script execution kicked in on the return from the call to input.
The high level idea is that CPython's interpreter checks to see if any remote execution has been requested at specific points in the interpreter's main loop. So if your Python program is just waiting on external input, it's likely you're not making any progress in Python land... and so your remote script won't execute.
- the remote script execution happens within the context of whatever's running at the moment
our remote script's call to print_stack printed out a trace showing that we were inside our program's main, with the last frame being our script running inline. This means that you can very easily grasp what your program is doing, instead of your script being run in an isolated way (see the sketch below).
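Since the injected code seems to run as an ordinary frame on top of the target's stack (that's what the <string> frame in the output above suggests), a remote script can poke at the interrupted frame directly. A hypothetical sketch, assuming the f_back of the script's own frame is the target's active frame:

# hypothetical remote script: peek at the frame we interrupted
import sys

caller = sys._getframe(0).f_back  # assumed: the target's active frame
if caller is not None:
    print(f"interrupted {caller.f_code.co_name} at line {caller.f_lineno}")
    for name, value in caller.f_locals.items():
        print(f"  {name} = {value!r}")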
Even though you need the program to make some forward progress before remote execution kicks in, in many cases you will eventually yield back to Python, so you will hopefully get some information by the time your remote script executes.
In the following continuation, I add 40, so the program calls time.sleep(40). When I call sys.remote_exec during that sleep, the backtrace doesn't appear until the end of that sleep call. But it did appear!
Give me an integer >>40
Hi from remote-script.py!
File "/Users/rtpg/proj/remote-exec-sample/hello.py", line 30, in <module>
main()
File "/Users/rtpg/proj/remote-exec-sample/hello.py", line 23, in main
add_to_total(result)
File "/Users/rtpg/proj/remote-exec-sample/hello.py", line 10, in add_to_total
time.sleep(value)
File "<string>", line 6, in <module>
The total is now 48
At this point the world is your oyster.
You could go spelunking for data in the remote process:
# figure out the total
import __main__
print(f"script saw value of {__main__.total}")
You could use a library like remote_pdb to set up an interactive debugging session:
# with remote_pdb needing to be installed ahead of time
import remote_pdb
remote_pdb.set_trace()
With this, when you call sys.remote_exec you'll see a prompt like:
RemotePdb session open at 127.0.0.1:51509, waiting for connection ...
You can then connect to the pdb session with netcat, and have your interactive debugging session.
You might think that this is too many steps. If you're already on the same host, you can run python -m pdb -p pid to get a shell into a running process directly! To state the obvious, though: this will suspend your running process! Maybe not a great thing to do in production.
What about the traceback.print_stack() in our example? That won't suspend the running process! Scripts that dump state and do nothing else, when done carefully, pose a lot less risk to the running process.
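One way to keep that risk low (a habit of mine, not anything CPython prescribes) is to make the injected script defensive: catch its own errors, and write results somewhere you control instead of relying on the target's stdout. The file path below is just an example:

# a defensive remote script: swallow our own errors, and write the
# dump to a file instead of the target's stdout
import traceback

try:
    with open("/tmp/remote-dump.txt", "w") as f:
        traceback.print_stack(file=f)
except Exception:
    pass  # never let the injected script's own problems leak out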
Though we're a far cry from dtrace-style protections against breaking our existing process: we're still running arbitrary Python code.
But even without the CPython team showing up to build the debugger, you could build out something like support for pdb -p pid yourself!
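As a rough illustration of that idea (the names and structure here are mine, not anything shipped with CPython), a tiny "attach and dump the stack" tool is not much more than staging a script in a temp file and calling sys.remote_exec:

# attach.py: a hypothetical helper around sys.remote_exec
# usage: python attach.py <pid>
import sys
import tempfile

SCRIPT = "import traceback; traceback.print_stack()\n"

def attach(pid):
    # sys.remote_exec wants a path on disk, so stage the code in a temp
    # file; leave it in place, since the target reads it later, not now
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(SCRIPT)
        path = f.name
    sys.remote_exec(pid, path)

if __name__ == "__main__":
    attach(int(sys.argv[1]))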
Being able to do this kind of script injection in a fully supported way, thanks to CPython providing both the right kind of hooks and a reference implementation, means that the barrier to entry for writing tools is lower than ever.
Some reading:
- The remote debug attachment protocol HOWTO. This describes what you need to do to implement something like sys.remote_exec. Valuable if you intend to build this kind of tooling yourself.
- PEP 768 – Safe external debugger interface for CPython. This is the original PEP for this functionality. Includes links to existing hacks and details on how this works.