This section explains the low-level protocol that allows external code to inject and execute a Python script inside a running CPython process.
This is the mechanism implemented by the :func:`sys.remote_exec` function, which
instructs a remote Python process to execute a .py
file. This section is not about using that
function, instead, it explains how the underlying protocol works so that it can be
reimplemented in any language.
The protocol assumes you already know the process you want to target and the code you want it to run. That’s why it takes two pieces of information:
- The process ID (
pid
) of the Python process you want to interact with. - A path to a Python script file (
.py
) that contains the code to be executed.
Once injected, the script is executed by the target process’s interpreter the next time it reaches a safe evaluation point. This allows tools to trigger code execution remotely without modifying the Python program itself.
In the sections that follow, we’ll walk through each step of this protocol in detail: how to locate the interpreter in memory, how to access internal structures safely, and how to trigger the execution of your script. Where necessary, we’ll highlight differences across platforms (Linux, macOS, Windows), and include example code to help clarify each part of the process.
The PyRuntime
structure holds CPython's global interpreter state and serves as
the entry point to other internal data, including the list of interpreters,
thread states, and debugger support fields.
To interact with a remote Python process, a debugger must first compute the memory
address of the PyRuntime
structure inside the target process. This cannot be
hardcoded or inferred symbolically, since its location depends on how the binary was
mapped into memory by the operating system.
The process for locating PyRuntime
is platform-specific, but follows the same
high-level approach:
- Identify where the Python executable or shared library was loaded in the target process.
- Parse the corresponding binary file on disk to find the offset of the
.PyRuntime
section. - Compute the in-memory address of
PyRuntime
by relocating the section offset to the base address found in step 1.
Each subsection below explains what must be done and provides a short example of how this can be implemented.
Linux (ELF)
To locate the PyRuntime
structure on Linux:
- Inspect the memory mappings of the target process (e.g. from
/proc/<pid>/maps
) to find the memory region where the Python executable or sharedlibpython
library is loaded. Record its base address. - Load the binary file from disk and parse its ELF section headers.
Locate the
.PyRuntime
section and determine its file offset. - Add the section offset to the base address to compute the address of the
PyRuntime
structure in memory.
An example implementation might look like:
def find_py_runtime_linux(pid):
# Step 1: Try to find the Python executable in memory
binary_path, base_address = find_mapped_binary(pid, name_contains="python")
# Step 2: Fallback to shared library if executable is not found
if binary_path is None:
binary_path, base_address = find_mapped_binary(pid, name_contains="libpython")
# Step 3: Parse ELF headers of the binary to get .PyRuntime section offset
section_offset = parse_elf_section_offset(binary_path, ".PyRuntime")
# Step 4: Compute PyRuntime address in memory
return base_address + section_offset
macOS (Mach-O)
To locate the PyRuntime
structure on macOS:
- Obtain a handle to the target process that allows memory inspection.
- Walk the memory regions of the process to identify the one that contains the Python binary or shared library. Record its base address and associated file path.
- Load that binary file from disk and parse the Mach-O headers to find the
__DATA,__PyRuntime
section. - Add the section's offset to the base address of the loaded binary to compute
the address of the
PyRuntime
structure.
An example implementation might look like:
def find_py_runtime_macos(pid):
# Step 1: Get access to the process's memory
handle = get_memory_access_handle(pid)
# Step 2: Try to find the Python executable in memory
binary_path, base_address = find_mapped_binary(handle, name_contains="python")
# Step 3: Fallback to libpython if executable is not found
if binary_path is None:
binary_path, base_address = find_mapped_binary(handle, name_contains="libpython")
# Step 4: Parse Mach-O headers to get __DATA,__PyRuntime section offset
section_offset = parse_macho_section_offset(binary_path, "__DATA", "__PyRuntime")
# Step 5: Compute PyRuntime address in memory
return base_address + section_offset
Windows (PE)
To locate the PyRuntime
structure on Windows:
- Enumerate all modules loaded in the target process.
Identify the module corresponding to
python.exe
orpythonXY.dll
, where X and Y are the major and minor version numbers of the Python version, and record its base address. - Load the binary from disk and parse the PE section headers.
Locate the
.PyRuntime
section and determine its relative virtual address (RVA). - Add the RVA to the module’s base address to compute the full in-memory address
of the
PyRuntime
structure.
An example implementation might look like:
def find_py_runtime_windows(pid):
# Step 1: Try to find the Python executable in memory
binary_path, base_address = find_loaded_module(pid, name_contains="python")
# Step 2: Fallback to shared pythonXY.dll if executable is not found
if binary_path is None:
binary_path, base_address = find_loaded_module(pid, name_contains="python3")
# Step 3: Parse PE section headers to get .PyRuntime RVA
section_rva = parse_pe_section_offset(binary_path, ".PyRuntime")
# Step 4: Compute PyRuntime address in memory
return base_address + section_rva
Once the address of the PyRuntime
structure has been computed in the target
process, the next step is to read the _Py_DebugOffsets
structure located at
its beginning.
This structure contains version-specific field offsets needed to navigate interpreter and thread state memory safely.
To read and validate the debug offsets:
- Read the memory at the address of
PyRuntime
, up to the size of_Py_DebugOffsets
. This structure is located at the very start of thePyRuntime
block. - Verify that the contents of the structure are valid. In particular:
- The
cookie
field must match the expected debug marker. - The
version
field must match the version of the Python interpreter used by the calling process (i.e., the debugger or controlling runtime). - If either the caller or the target process is running a pre-release version (such as an alpha, beta, or release candidate), then the versions must match exactly.
- The
free_threaded
flag must match between the caller and the target process.
- The
- If the structure passes validation, the debugger may now safely use the provided offsets to locate fields in interpreter and thread state structures.
If any validation step fails, the debugger should abort rather than attempting to access incompatible memory layouts.
An example of how a debugger might read and validate _Py_DebugOffsets
:
def read_debug_offsets(pid, py_runtime_addr):
# Step 1: Read memory from the target process at the PyRuntime address
data = read_process_memory(pid, address=py_runtime_addr, size=DEBUG_OFFSETS_SIZE)
# Step 2: Deserialize the raw bytes into a _Py_DebugOffsets structure
debug_offsets = parse_debug_offsets(data)
# Step 3: Validate compatibility
if debug_offsets.cookie != EXPECTED_COOKIE:
raise RuntimeError("Invalid or missing debug cookie")
if debug_offsets.version != LOCAL_PYTHON_VERSION:
raise RuntimeError("Mismatch between caller and target Python versions")
if debug_offsets.free_threaded != LOCAL_FREE_THREADED:
raise RuntimeError("Mismatch in free-threaded configuration")
return debug_offsets
After validating the _Py_DebugOffsets
structure, the next step is to locate the
interpreter and thread state objects within the target process. These structures
hold essential runtime context and are required for writing debugger control
information.
- The
PyInterpreterState
structure represents a Python interpreter instance. Each interpreter holds its own module imports, built-in state, and thread list. Most applications use only one interpreter, but CPython supports creating multiple interpreters in the same process. - The
PyThreadState
structure represents a thread running within an interpreter. This is where evaluation state and the control fields used by the debugger live.
To inject and run code remotely, the debugger must locate a valid PyThreadState
to target. Typically, this is the main thread, but in some cases, the debugger may
want to attach to a specific thread by its native thread ID.
To locate a thread:
Use the offset
runtime_state.interpreters_head
to find the address of the first interpreter in thePyRuntime
structure. This is the entry point to the list of active interpreters.Use the offset
interpreter_state.threads_main
to locate the main thread of that interpreter. This is the simplest and most reliable thread to target.Optionally, use
interpreter_state.threads_head
to walk the linked list of all threads. For eachPyThreadState
, compare thenative_thread_id
field (usingthread_state.native_thread_id
) to find a specific thread.This is useful when the debugger allows the user to select which thread to inject into, or when targeting a thread that's actively running.
Once a valid
PyThreadState
is found, record its address. This will be used in the next step to write debugger control fields and schedule execution.
An example of locating the main thread:
def find_main_thread_state(pid, py_runtime_addr, debug_offsets):
# Step 1: Read interpreters_head from PyRuntime
interp_head_ptr = py_runtime_addr + debug_offsets.runtime_state.interpreters_head
interp_addr = read_pointer(pid, interp_head_ptr)
if interp_addr == 0:
raise RuntimeError("No interpreter found in the target process")
# Step 2: Read the threads_main pointer from the interpreter
threads_main_ptr = interp_addr + debug_offsets.interpreter_state.threads_main
thread_state_addr = read_pointer(pid, threads_main_ptr)
if thread_state_addr == 0:
raise RuntimeError("Main thread state is not available")
return thread_state_addr
To locate a specific thread by native thread ID:
def find_thread_by_id(pid, interp_addr, debug_offsets, target_tid):
# Start at threads_head and walk the linked list
thread_ptr = read_pointer(
pid, interp_addr + debug_offsets.interpreter_state.threads_head
)
while thread_ptr:
native_tid_ptr = thread_ptr + debug_offsets.thread_state.native_thread_id
native_tid = read_int(pid, native_tid_ptr)
if native_tid == target_tid:
return thread_ptr
thread_ptr = read_pointer(pid, thread_ptr + debug_offsets.thread_state.next)
raise RuntimeError("Thread with the given ID was not found")
Once a valid thread state has been identified, the debugger can use it to modify control fields and request execution in the next stage of the protocol.
Once a valid thread state has been located, the debugger can write control fields that instruct the target process to execute a script at the next safe opportunity.
Each thread state contains a _PyRemoteDebuggerSupport
structure, which is used
to coordinate communication between the debugger and the interpreter. The debugger
uses offsets from _Py_DebugOffsets
to locate three key fields:
debugger_script_path
: A buffer where the debugger writes the full path to a Python source file (.py
). The file must exist and be readable by the target process.debugger_pending_call
: An integer flag. When set to1
, it signals that a script is ready to be executed.eval_breaker
: A field checked periodically by the evaluation loop. To notify the interpreter of pending debugger activity, the debugger sets the_PY_EVAL_PLEASE_STOP_BIT
in this field. This causes the interpreter to pause and check for debugger-related actions before continuing with normal execution.
To safely modify these fields, most debuggers should suspend the process before writing to memory. This avoids race conditions that may occur if the interpreter is actively running.
To perform the injection:
- Write the script path into the
debugger_script_path
buffer. - Set the
debugger_pending_call
flag to1
. - Read the value of
eval_breaker
, set the stop bit, and write the updated value back.
An example implementation might look like:
def inject_script(pid, thread_state_addr, debug_offsets, script_path):
# Base offset to the _PyRemoteDebuggerSupport struct
support_base = (
thread_state_addr +
debug_offsets.debugger_support.remote_debugger_support
)
# 1. Write script path
script_path_ptr = support_base + debug_offsets.debugger_support.debugger_script_path
write_string(pid, script_path_ptr, script_path)
# 2. Set debugger_pending_call = 1
pending_ptr = support_base + debug_offsets.debugger_support.debugger_pending_call
write_int(pid, pending_ptr, 1)
# 3. Set _PY_EVAL_PLEASE_STOP_BIT in eval_breaker
eval_breaker_ptr = thread_state_addr + debug_offsets.debugger_support.eval_breaker
breaker = read_int(pid, eval_breaker_ptr)
# Set the least significant bit (this is _PY_EVAL_PLEASE_STOP_BIT)
breaker |= 1
write_int(pid, eval_breaker_ptr, breaker)
After these writes are complete, the debugger may resume the process (if it was paused).
The interpreter will check eval_breaker
at the next evaluation checkpoint,
detect the pending call, and load and execute the specified Python file. The debugger is responsible
for ensuring that the file remains on disk and readable by the target interpreter
when it is accessed.
To inject and execute a script in a remote Python process:
- Locate the
PyRuntime
structure in the target process's memory. - Read and validate the
_Py_DebugOffsets
structure at the start ofPyRuntime
. - Use the offsets to locate a valid
PyThreadState
. - Write the path to a Python script into
debugger_script_path
. - Set
debugger_pending_call = 1
. - Set
_PY_EVAL_PLEASE_STOP_BIT
ineval_breaker
. - Resume the process (if paused). The script will be executed at the next safe eval point.