Commit 21b4799

gh-131591: Add remote debugging attachment protocol documentation
Add a developer-facing document describing the protocol used by remote_exec(pid, script) to execute Python code in a running process. This is intended to guide debugger and tool authors in reimplementing the protocol.
1 parent e42bda9 commit 21b4799

2 files changed: +337 -0 lines changed

Diff for: Doc/howto/index.rst (+2)

@@ -34,6 +34,7 @@ Python Library Reference.
    mro.rst
    free-threading-python.rst
    free-threading-extensions.rst
+   remote_debugging.rst

 General:

@@ -66,3 +67,4 @@ Debugging and profiling:
 * :ref:`gdb`
 * :ref:`instrumentation`
 * :ref:`perf_profiling`
+* :ref:`remote-debugging`

Diff for: Doc/howto/remote_debugging.rst (+335)

@@ -0,0 +1,335 @@
.. _remote-debugging:

Remote Debugging Attachment Protocol
====================================

This section explains the low-level protocol that allows external code to inject and execute
a Python script inside a running CPython process.

This is the mechanism implemented by the :func:`sys.remote_exec` function, which
instructs a remote Python process to execute a ``.py`` file. This section is not about using that
function; instead, it explains how the underlying protocol works so that it can be
reimplemented in any language.

The protocol assumes you already know the process you want to target and the code you want it to run.
It therefore takes two pieces of information:

- The process ID (``pid``) of the Python process you want to interact with.
- A path to a Python script file (``.py``) that contains the code to be executed.

Once injected, the script is executed by the target process’s interpreter the next time it reaches
a safe evaluation point. This allows tools to trigger
code execution remotely without modifying the Python program itself.

In the sections that follow, we’ll walk through each step of this protocol in detail: how to locate
the interpreter in memory, how to access internal structures safely, and how to trigger the execution
of your script. Where necessary, we’ll highlight differences across platforms (Linux, macOS, Windows),
and include example code to help clarify each part of the process.

Locating the PyRuntime Structure
================================

The ``PyRuntime`` structure holds CPython's global interpreter state and serves as
the entry point to other internal data, including the list of interpreters,
thread states, and debugger support fields.

To interact with a remote Python process, a debugger must first compute the memory
address of the ``PyRuntime`` structure inside the target process. This address cannot be
hardcoded or derived from the binary file alone, since it depends on where the
operating system mapped the binary into memory.

The process for locating ``PyRuntime`` is platform-specific, but follows the same
high-level approach:

1. Identify where the Python executable or shared library was loaded in the target process.
2. Parse the corresponding binary file on disk to find the offset of the
   ``.PyRuntime`` section.
3. Compute the in-memory address of ``PyRuntime`` by relocating the section offset
   to the base address found in step 1.

Each subsection below explains what must be done and provides a short example of how this
can be implemented.

.. rubric:: Linux (ELF)

To locate the ``PyRuntime`` structure on Linux:

1. Inspect the memory mappings of the target process (e.g. from ``/proc/<pid>/maps``)
   to find the memory region where the Python executable or shared ``libpython``
   library is loaded. Record its base address.
2. Load the binary file from disk and parse its ELF section headers.
   Locate the ``.PyRuntime`` section and determine its file offset.
3. Add the section offset to the base address to compute the address of the
   ``PyRuntime`` structure in memory.

An example implementation might look like:

.. code-block:: python

    def find_py_runtime_linux(pid):
        # Step 1: Try to find the Python executable in memory
        binary_path, base_address = find_mapped_binary(pid, name_contains="python")
        # Step 2: Fallback to shared library if executable is not found
        if binary_path is None:
            binary_path, base_address = find_mapped_binary(pid, name_contains="libpython")
        # Step 3: Parse ELF headers of the binary to get .PyRuntime section offset
        section_offset = parse_elf_section_offset(binary_path, ".PyRuntime")
        # Step 4: Compute PyRuntime address in memory
        return base_address + section_offset

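The two lookups in the Linux steps can be sketched on plain data, which makes them easy to test without attaching to a process. The names ``find_mapped_binary_in_maps`` and ``find_elf64_section`` are hypothetical and are not part of CPython. The first takes the text of ``/proc/<pid>/maps``. The second takes the bytes of a 64-bit little-endian ELF file and returns both the section's virtual address and its file offset, since which of the two must be added to the base address depends on how the binary's segments are mapped. Treat this as a minimal sketch under those assumptions rather than a complete implementation.

```python
import struct


def find_mapped_binary_in_maps(maps_text, name_contains):
    """Return (path, base_address) for the first file-backed mapping whose
    file name contains name_contains, or (None, None) if nothing matches."""
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5].strip()
        if name_contains in path.rsplit("/", 1)[-1]:
            start = fields[0].split("-", 1)[0]
            return path, int(start, 16)
    return None, None


def find_elf64_section(data, wanted):
    """Return (sh_addr, sh_offset) of the named section in a 64-bit
    little-endian ELF image, or None if the section is absent."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF image")
    # ELF64 header: e_shoff at 0x28; e_shentsize, e_shnum, e_shstrndx at 0x3A.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)

    def section(index):
        base = e_shoff + index * e_shentsize
        (sh_name,) = struct.unpack_from("<I", data, base)
        sh_addr, sh_offset, sh_size = struct.unpack_from("<QQQ", data, base + 16)
        return sh_name, sh_addr, sh_offset, sh_size

    # Section names live in the string table section indexed by e_shstrndx.
    _, _, str_offset, str_size = section(e_shstrndx)
    names = data[str_offset:str_offset + str_size]
    for index in range(e_shnum):
        sh_name, sh_addr, sh_offset, _ = section(index)
        end = names.index(b"\0", sh_name)
        if names[sh_name:end].decode("ascii") == wanted:
            return sh_addr, sh_offset
    return None
```

A tool would typically read ``/proc/<pid>/maps`` and the binary from disk, then pass their contents to these functions.
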
.. rubric:: macOS (Mach-O)

To locate the ``PyRuntime`` structure on macOS:

1. Obtain a handle to the target process that allows memory inspection.
2. Walk the memory regions of the process to identify the one that contains the
   Python binary or shared library. Record its base address and associated file path.
3. Load that binary file from disk and parse the Mach-O headers to find the
   ``__DATA,__PyRuntime`` section.
4. Add the section's offset to the base address of the loaded binary to compute
   the address of the ``PyRuntime`` structure.

An example implementation might look like:

.. code-block:: python

    def find_py_runtime_macos(pid):
        # Step 1: Get access to the process's memory
        handle = get_memory_access_handle(pid)
        # Step 2: Try to find the Python executable in memory
        binary_path, base_address = find_mapped_binary(handle, name_contains="python")
        # Step 3: Fallback to libpython if executable is not found
        if binary_path is None:
            binary_path, base_address = find_mapped_binary(handle, name_contains="libpython")
        # Step 4: Parse Mach-O headers to get __DATA,__PyRuntime section offset
        section_offset = parse_macho_section_offset(binary_path, "__DATA", "__PyRuntime")
        # Step 5: Compute PyRuntime address in memory
        return base_address + section_offset

.. rubric:: Windows (PE)

To locate the ``PyRuntime`` structure on Windows:

1. Enumerate all modules loaded in the target process.
   Identify the module corresponding to ``python.exe`` or ``pythonXY.dll``, where X and Y
   are the major and minor version numbers of the Python version, and record its base address.
2. Load the binary from disk and parse the PE section headers.
   Locate the ``.PyRuntime`` section and determine its relative virtual address (RVA).
3. Add the RVA to the module’s base address to compute the full in-memory address
   of the ``PyRuntime`` structure.

An example implementation might look like:

.. code-block:: python

    def find_py_runtime_windows(pid):
        # Step 1: Try to find the Python executable in memory
        binary_path, base_address = find_loaded_module(pid, name_contains="python")
        # Step 2: Fallback to shared pythonXY.dll if executable is not found
        if binary_path is None:
            binary_path, base_address = find_loaded_module(pid, name_contains="python3")
        # Step 3: Parse PE section headers to get .PyRuntime RVA
        section_rva = parse_pe_section_offset(binary_path, ".PyRuntime")
        # Step 4: Compute PyRuntime address in memory
        return base_address + section_rva

Reading _Py_DebugOffsets
========================

Once the address of the ``PyRuntime`` structure has been computed in the target
process, the next step is to read the ``_Py_DebugOffsets`` structure located at
its beginning.

This structure contains version-specific field offsets needed to navigate
interpreter and thread state memory safely.

To read and validate the debug offsets:

1. Read the memory at the address of ``PyRuntime``, up to the size of
   ``_Py_DebugOffsets``. This structure is located at the very start of the
   ``PyRuntime`` block.

2. Verify that the contents of the structure are valid. In particular:

   - The ``cookie`` field must match the expected debug marker.
   - The ``version`` field must match the version of the Python interpreter
     used by the calling process (i.e., the debugger or controlling runtime).
   - If either the caller or the target process is running a pre-release version
     (such as an alpha, beta, or release candidate), then the versions must match
     exactly.
   - The ``free_threaded`` flag must match between the caller and the target process.

3. If the structure passes validation, the debugger may now safely use the
   provided offsets to locate fields in interpreter and thread state structures.

If any validation step fails, the debugger should abort rather than attempting to
access incompatible memory layouts.

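One possible reading of the version rules above can be sketched as a small predicate. It assumes the ``version`` field uses the same packed layout as ``sys.hexversion`` (major, minor, micro, release level, serial), and it interprets "must match" for final releases as matching major and minor numbers. Both points are assumptions that should be checked against the CPython version being targeted, and ``versions_compatible`` is a hypothetical name.

```python
FINAL_RELEASE = 0xF  # release-level nibble used by sys.hexversion for final builds


def versions_compatible(local_hex, remote_hex):
    """Compare two hexversion-style integers under the assumed rules."""

    def major_minor(version):
        return version >> 16          # top two bytes: major and minor

    def release_level(version):
        return (version >> 4) & 0xF   # 0xA alpha, 0xB beta, 0xC candidate, 0xF final

    if major_minor(local_hex) != major_minor(remote_hex):
        return False
    # If either side is a pre-release, require an exact match.
    if FINAL_RELEASE not in (release_level(local_hex),) or \
            release_level(remote_hex) != FINAL_RELEASE:
        return local_hex == remote_hex
    return True
```

For example, a caller could pass ``sys.hexversion`` as ``local_hex`` and the value read from the target as ``remote_hex``.
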
An example of how a debugger might read and validate ``_Py_DebugOffsets``:

.. code-block:: python

    def read_debug_offsets(pid, py_runtime_addr):
        # Step 1: Read memory from the target process at the PyRuntime address
        data = read_process_memory(pid, address=py_runtime_addr, size=DEBUG_OFFSETS_SIZE)
        # Step 2: Deserialize the raw bytes into a _Py_DebugOffsets structure
        debug_offsets = parse_debug_offsets(data)
        # Step 3: Validate compatibility
        if debug_offsets.cookie != EXPECTED_COOKIE:
            raise RuntimeError("Invalid or missing debug cookie")
        if debug_offsets.version != LOCAL_PYTHON_VERSION:
            raise RuntimeError("Mismatch between caller and target Python versions")
        if debug_offsets.free_threaded != LOCAL_FREE_THREADED:
            raise RuntimeError("Mismatch in free-threaded configuration")
        return debug_offsets

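The deserialization step can be illustrated for the first three fields only. The sketch below assumes the structure begins with an 8-byte cookie followed by two 64-bit integers (``version`` and ``free_threaded``), that the target is little-endian, and that the cookie value is ``b"xdebugpy"``. These details are assumptions about CPython's internal header and may change between versions; ``parse_debug_offsets_header`` is a hypothetical helper.

```python
import struct

# Assumed layout of the start of _Py_DebugOffsets: char[8], uint64, uint64.
HEADER = struct.Struct("<8sQQ")
ASSUMED_COOKIE = b"xdebugpy"  # assumption, verify against the target's headers


def parse_debug_offsets_header(data):
    """Decode cookie, version and free_threaded from the raw bytes."""
    if len(data) < HEADER.size:
        raise ValueError("not enough bytes for the debug offsets header")
    cookie, version, free_threaded = HEADER.unpack_from(data)
    if cookie != ASSUMED_COOKIE:
        raise RuntimeError("Invalid or missing debug cookie")
    return {
        "cookie": cookie,
        "version": version,
        "free_threaded": bool(free_threaded),
    }
```

The remaining offsets follow the header in the real structure; decoding them requires the full field list for the targeted CPython version.
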
Locating the Interpreter and Thread State
=========================================

After validating the ``_Py_DebugOffsets`` structure, the next step is to locate the
interpreter and thread state objects within the target process. These structures
hold essential runtime context and are required for writing debugger control
information.

- The ``PyInterpreterState`` structure represents a Python interpreter instance.
  Each interpreter holds its own module imports, built-in state, and thread list.
  Most applications use only one interpreter, but CPython supports creating multiple
  interpreters in the same process.

- The ``PyThreadState`` structure represents a thread running within an interpreter.
  This is where evaluation state and the control fields used by the debugger live.

To inject and run code remotely, the debugger must locate a valid ``PyThreadState``
to target. Typically, this is the main thread, but in some cases, the debugger may
want to attach to a specific thread by its native thread ID.

To locate a thread:

1. Use the offset ``runtime_state.interpreters_head`` to find the address of the
   first interpreter in the ``PyRuntime`` structure. This is the entry point to
   the list of active interpreters.

2. Use the offset ``interpreter_state.threads_main`` to locate the main thread
   of that interpreter. This is the simplest and most reliable thread to target.

3. Optionally, use ``interpreter_state.threads_head`` to walk the linked list of
   all threads. For each ``PyThreadState``, compare the ``native_thread_id``
   field (using ``thread_state.native_thread_id``) to find a specific thread.

   This is useful when the debugger allows the user to select which thread to inject into,
   or when targeting a thread that's actively running.

4. Once a valid ``PyThreadState`` is found, record its address. This will be used
   in the next step to write debugger control fields and schedule execution.

An example of locating the main thread:

.. code-block:: python

    def find_main_thread_state(pid, py_runtime_addr, debug_offsets):
        # Step 1: Read interpreters_head from PyRuntime
        interp_head_ptr = py_runtime_addr + debug_offsets.runtime_state.interpreters_head
        interp_addr = read_pointer(pid, interp_head_ptr)
        if interp_addr == 0:
            raise RuntimeError("No interpreter found in the target process")
        # Step 2: Read the threads_main pointer from the interpreter
        threads_main_ptr = interp_addr + debug_offsets.interpreter_state.threads_main
        thread_state_addr = read_pointer(pid, threads_main_ptr)
        if thread_state_addr == 0:
            raise RuntimeError("Main thread state is not available")
        return thread_state_addr

To locate a specific thread by native thread ID:

.. code-block:: python

    def find_thread_by_id(pid, interp_addr, debug_offsets, target_tid):
        # Start at threads_head and walk the linked list
        thread_ptr = read_pointer(
            pid, interp_addr + debug_offsets.interpreter_state.threads_head
        )
        while thread_ptr:
            native_tid_ptr = thread_ptr + debug_offsets.thread_state.native_thread_id
            native_tid = read_int(pid, native_tid_ptr)
            if native_tid == target_tid:
                return thread_ptr
            thread_ptr = read_pointer(pid, thread_ptr + debug_offsets.thread_state.next)
        raise RuntimeError("Thread with the given ID was not found")

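The linked-list walk can be exercised without a live process by substituting a byte buffer for the target's memory. In the sketch below, ``FakeMemory``, ``read_u64`` and ``find_thread`` are hypothetical names, pointers and thread IDs are assumed to be 64-bit little-endian values, and the offsets are passed in as plain integers rather than taken from ``_Py_DebugOffsets``. The cycle guard is an extra precaution added here, since the list may be mutated while it is being read if the target is not suspended.

```python
import struct


class FakeMemory:
    """Stand-in for a target process: a flat byte buffer addressed from 0."""

    def __init__(self, size):
        self.buf = bytearray(size)

    def read(self, address, size):
        return bytes(self.buf[address:address + size])

    def write_u64(self, address, value):
        struct.pack_into("<Q", self.buf, address, value)


def read_u64(mem, address):
    """Read one 64-bit little-endian value from the memory object."""
    return struct.unpack("<Q", mem.read(address, 8))[0]


def find_thread(mem, head_addr, next_off, tid_off, target_tid):
    """Follow 'next' pointers from the list head until target_tid is found."""
    thread = read_u64(mem, head_addr)
    seen = set()
    while thread and thread not in seen:
        seen.add(thread)  # guard against a corrupted or cyclic list
        if read_u64(mem, thread + tid_off) == target_tid:
            return thread
        thread = read_u64(mem, thread + next_off)
    return None
```

A real tool would replace ``FakeMemory.read`` with a platform-specific call that reads the target's address space.
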
Once a valid thread state has been identified, the debugger can use it to modify
control fields and request execution in the next stage of the protocol.

Writing Control Information
===========================

Once a valid thread state has been located, the debugger can write control fields
that instruct the target process to execute a script at the next safe opportunity.

Each thread state contains a ``_PyRemoteDebuggerSupport`` structure, which is used
to coordinate communication between the debugger and the interpreter. The debugger
uses offsets from ``_Py_DebugOffsets`` to locate three key fields, two inside this
structure and one in the thread state itself:

- ``debugger_script_path``: A buffer where the debugger writes the full path to
  a Python source file (``.py``). The file must exist and be readable by the
  target process.

- ``debugger_pending_call``: An integer flag. When set to ``1``, it signals
  that a script is ready to be executed.

- ``eval_breaker``: A field of the thread state that is checked periodically by
  the evaluation loop. To notify the interpreter of pending debugger activity,
  the debugger sets the ``_PY_EVAL_PLEASE_STOP_BIT`` in this field. This causes
  the interpreter to pause and check for debugger-related actions before
  continuing with normal execution.

To safely modify these fields, most debuggers should suspend the process before
writing to memory. This avoids race conditions that may occur if the interpreter
is actively running.

To perform the injection:

1. Write the script path into the ``debugger_script_path`` buffer.
2. Set the ``debugger_pending_call`` flag to ``1``.
3. Read the value of ``eval_breaker``, set the stop bit, and write the updated
   value back.

An example implementation might look like:

.. code-block:: python

    def inject_script(pid, thread_state_addr, debug_offsets, script_path):
        # Base offset to the _PyRemoteDebuggerSupport struct
        support_base = (
            thread_state_addr +
            debug_offsets.debugger_support.remote_debugger_support
        )
        # 1. Write script path
        script_path_ptr = support_base + debug_offsets.debugger_support.debugger_script_path
        write_string(pid, script_path_ptr, script_path)
        # 2. Set debugger_pending_call = 1
        pending_ptr = support_base + debug_offsets.debugger_support.debugger_pending_call
        write_int(pid, pending_ptr, 1)
        # 3. Set _PY_EVAL_PLEASE_STOP_BIT in eval_breaker
        eval_breaker_ptr = thread_state_addr + debug_offsets.debugger_support.eval_breaker
        breaker = read_int(pid, eval_breaker_ptr)
        # _PY_EVAL_PLEASE_STOP_BIT is (1 << 5) in CPython's internal headers at the
        # time of writing; it is not the least significant bit. Verify the value
        # against the target version before relying on it.
        breaker |= (1 << 5)
        write_int(pid, eval_breaker_ptr, breaker)

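Before the path is written, it has to be turned into bytes that fit the target buffer. The sketch below shows one way to do that; ``encode_script_path`` is a hypothetical helper, the buffer size is passed in by the caller because it is not specified here, and the use of the filesystem encoding with a trailing NUL byte is an assumption about how the target reads the buffer.

```python
import os


def encode_script_path(script_path, buffer_size):
    """Return the NUL-terminated bytes to write into debugger_script_path.

    Raises ValueError if the encoded path does not fit in buffer_size bytes.
    """
    # Use an absolute path so the target does not depend on its own
    # current working directory to find the file.
    raw = os.fsencode(os.path.abspath(script_path)) + b"\0"
    if len(raw) > buffer_size:
        raise ValueError(
            f"script path needs {len(raw)} bytes, buffer holds {buffer_size}"
        )
    return raw
```

Rejecting an oversized path before writing avoids overwriting memory that follows the buffer in the target process.
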
After these writes are complete, the debugger may resume the process (if it was paused).
The interpreter will check ``eval_breaker`` at the next evaluation checkpoint,
detect the pending call, and load and execute the specified Python file. The debugger is responsible
for ensuring that the file remains on disk and readable by the target interpreter
when it is accessed.

Summary
=======

To inject and execute a script in a remote Python process:

1. Locate the ``PyRuntime`` structure in the target process's memory.
2. Read and validate the ``_Py_DebugOffsets`` structure at the start of ``PyRuntime``.
3. Use the offsets to locate a valid ``PyThreadState``.
4. Write the path to a Python script into ``debugger_script_path``.
5. Set ``debugger_pending_call = 1``.
6. Set ``_PY_EVAL_PLEASE_STOP_BIT`` in ``eval_breaker``.
7. Resume the process (if paused). The script will be executed at the next safe eval point.
