Restore trace callback when the profiler is disabled #334

TTsangSC · 2025-04-14T14:16:17Z

(Closes #333.)

Summary

This draft PR adds C(-ython) level support to:

Retrieve and cache the trace-callback state (PyThreadState.c_tracefunc and PyThreadSatte.c_traceobj; note that sys.gettrace() only gets the latter and misses the former, hence the need for C-level shenanigans) in LineProfiler.enable()
Restore the trace-callback state in LineProfiler.disable()
Optionally call the existing trace callback from within the profiler callback, controlled by the wrap_trace argument and the ${LINE_PROFILE_WRAP_TRACE} environment variable

(UPDATE 17 Apr) It also adds 5 tests to the new test module tests/test_sys_trace.py:

test_callback_preservation():
Checks the restoration of the trace-callback state after using the profiler.
test_callback_wrapping():
Checks trace-callback interoperability – that the existing trace callback can be used by the profiler's callback.
test_wrapping_throwing_callback():
Checks the continuation of line profiling and the disabling of the existing callback in the event that it errors out.
test_wrapping_line_event_disabling_callback():
Checks the continuation of line profiling and the hiding of line events from the existing frame-local callback in the event that it disables frame line events.
test_wrapping_thread_local_callbacks():
Checks the usage and restoration of thread-local existing trace callbacks by the profiler.

Since the trace-callback state is restored from the cached value in .disable(), the now-unused line_profiler/unset_trace.{c,h} and references thereto in build instructions and docs have also been removed.

Motivation

As noted in #333, LineProfiler currently interacts destructively with sys.settrace(), setting it to None when the profiler is .disable()-ed. Beside being in a way "not clean", this also has multiple unfortunate side effects, such as prohibiting cooperative use with other trace-callback based monitoring tools, such as coverage tools and debuggers.

Notably, the discarding of the trace callback is what has been tanking the coverage that codecov reports, because coverage.py can't see anything after any test calls LineProfiler.enable() in-process.

Effects

(UPDATE 17 Apr: see "Performance impacts" for updated timings.)

Note the substantial increase in coverages in line_profiler.py and profiler_mixin.py:

Run on\Filename	`line_profiler.py`	`profiler_mixin.py`	(Overall)
`main`	67% (66/207 missing)	42% (89/165 missing)	59% (380/1005 missing)
This PR	81% (44/206 missing)	89% (20/165 missing)	72% (252/1007 missing)
This PR w/`LINE_PROFILE_WRAP_TRACE=1`	81% (44/206 missing)	93% (12/165 missing)	73% (244/1007 missing)

Sample output (`main`; click to expand)

(py3.13)  $ run_tests.py 

...

================================== test session starts ==================================

...

---------- coverage: platform darwin, python 3.13.2-final-0 ----------
Name                                                                                     
                        Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------------------------------------
---------------------------------------------------------
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/__init__.py      
                            6      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/__main__.py      
                            1      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/__ini
t__.py                      0      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/ast_p
rofile_transformer.py      41      5     12      3    85%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/ast_t
ree_profiler.py            41      0     10      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/autop
rofile.py                  14      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/line_
profiler_utils.py          12      4     10      2    55%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/profm
od_extractor.py            88      7     44      5    91%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/run_m
odule.py                   43      1     18      2    95%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/util_
static.py                 236     99    134     25    52%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/explicit_profiler
.py                        82     40     28      0    51%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/ipython_extension
.py                        69     69     22      0     0%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py 
                          207     66     92     20    67%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/profiler_mixin.py
                          165     89     54     11    42%
-----------------------------------------------------------------------------------------
---------------------------------------------------------
TOTAL                                                                                    
                         1005    380    424     68    59%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml

====================== 66 passed, 1 skipped, 2 warnings in 12.53s =======================

...

Sample output (this PR; click to expand)

(3.13-cooperative-tracing)  $ run_tests.py

...

================================== test session starts ==================================

...

___________________ coverage: platform darwin, python 3.13.2-final-0 ____________________

Name                                                                                     
                              Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__init__.py
                                  6      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__main__.py
                                  1      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/__init__.py                      0      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_profile_transformer.py      41      5     12      3    85%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_tree_profiler.py            41      0      4      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/autoprofile.py                  14      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/line_profiler_utils.py          12      4     10      2    55%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/profmod_extractor.py            91      7     36      5    91%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/run_module.py                   43      1     12      2    95%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/util_static.py                 236     99    130     25    52%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/explicit_pr
ofiler.py                        82      3     24      6    90%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/ipython_ext
ension.py                        69     69     16      0     0%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/line_profil
er.py                           206     44     72      6    81%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/profiler_mi
xin.py                          165     20     42      3    89%
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
TOTAL                                                                                    
                               1007    252    358     52    72%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml
============================ 69 passed, 1 skipped in 16.00s =============================

...

Sample output (this PR w/ trace-callback wrapping; click to expand)

(3.13-cooperative-tracing)  $ LINE_PROFILE_WRAP_TRACE=1 run_tests.py

...

================================== test session starts ==================================

...

___________________ coverage: platform darwin, python 3.13.2-final-0 ____________________

Name                                                                                     
                              Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__init__.py
                                  6      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__main__.py
                                  1      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/__init__.py                      0      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_profile_transformer.py      41      5     12      3    85%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_tree_profiler.py            41      0      4      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/autoprofile.py                  14      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/line_profiler_utils.py          12      4     10      2    55%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/profmod_extractor.py            91      7     36      5    91%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/run_module.py                   43      1     12      2    95%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/util_static.py                 236     99    130     25    52%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/explicit_pr
ofiler.py                        82      3     24      6    90%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/ipython_ext
ension.py                        69     69     16      0     0%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/line_profil
er.py                           206     44     72      6    81%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/profiler_mi
xin.py                          165     12     42      3    93%
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
TOTAL                                                                                    
                               1007    244    358     52    73%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml
============================ 69 passed, 1 skipped in 18.54s =============================

...

Caveats

(UPDATE 17 Apr: the optimizations mentioned below have been made.)

As shown by the test-suite output above, this PR may introduce some overhead to the tracing function. However:

The impact on the measured line timings should be minimal, since any interaction with the cached callback is after the collection of timing data.
A considerable part of the apparent overhead may just have to do with how the cached callback is run more often (i.e. not being outright discarded) after this PR.

Running the entire test suite without pytest-cov shows the following results:

Run on\Real time elapsed (s; between 10 runs)	Mean	Best	σ
`main`	7.62	7.42	0.180
This PR	7.65	7.42	0.210
This PR w/`LINE_PROFILE_WRAP_TRACE=1`	7.47	7.34	0.0983

Hence, in the case where we aren't interacting with other tracing facilities, the overhead is negligible and well within normal variations. Still, it may be worth looking into optimizing wrap_trace (as implemented in line_profiler/_line_profiler.pyx::call_c_trace_state_hook_safe()) since that seem to have more severe performance implications. Since it mostly deals with objects available on the C-level (CTraceState, PyFrameObject.f_trace_lines, PyFrameObject.f_trace), it's likely that we can eke out more performance by moving call_c_trace_state_hook_safe() from Cython to C – or just merge it with the C-level call_c_trace_state_hook() function which it calls.

Update 17 Apr

Reorganization

The tests which have to do with sys trace callbacks have been moved from the tests/test_line_profiler.py module to the new tests/test_sys_trace.py.

Handling of edge cases

In order for trace-callback wrapping to be more transparent (i.e. for the wrapped callback to behave more as if it was free-standing):

In normal execution, a trace callback can unset the sys callback, either via erroring out or explicitly calling sys.settrace(None) (or its C equivalent). If a wrapped callback does so, e.g. in coverage.py::coverage/ctracer/tracer.c::CTracer_dealloc() (L87)):
- The existing sys callback is restored so that profiling continues, as was previously done in this PR; but
- The profiler's callback drops the cached callback from that point on, effectively disabling the latter; and
- When the profiler is .disable()-ed, it unsets the sys trace callback instead of restoring it to the (dropped) cached value.
In normal execution, a trace callback can set .f_trace_lines to false in a frame, disabling line events for the frame (which obstructs further line profiling). If a wrapped callback does so; e.g. in coverage.py::coverage/ctracer/tracer.c::CTracer_handle_call() (L548)):
- .f_trace_lines is reset so that profiling continues, as was previously done in this PR; but
- The frame's local trace callable .f_trace is wrapped so that it is no longer invoked for line events.

Moreover, support for multithreaded environments (where each thread can have its own sys trace function) has been added.

Performance impacts

The previous timing info in "Effects" was unduly biased against the PR in that:

It was later noted that the virtual environments hosting the main and PR branches had different dependency versions (e.g. pytest); updating them to be consistent substantially closed the gap.
Since the new tests temper with the sys trace callback, they have to be run in subprocesses to isolate their side effects, resulting in relatively higher overhead.

As such, the venv dependencies have been updated to be consistent (pytest==8.3.5, pytest-cov==6.1.1, coverage==7.8.0), and we now run run_test.py or pytest with -k 'not sys_trace' to deselect the new tests. The updated timings are:

With pytest-cov (run_tests.py -k 'not sys_trace')

Without pytest-cov (pytest -k 'not sys_trace')

Run on\Real time elapsed (s; between 3 runs)	Mean	Best	σ
`main`	15.83	15.77	0.049
This PR	15.87	15.80	0.052
This PR w/`LINE_PROFILE_WRAP_TRACE=1`	15.83	15.80	0.033

Run on\Real time elapsed (s; between 10 runs)	Mean	Best	σ
`main`	7.045	6.97	0.058
This PR	7.043	6.94	0.109
This PR w/`LINE_PROFILE_WRAP_TRACE=1`	7.159	7.0	0.101

Hence the overhead introduced by the PR seems negligible. (The coverage improvements remain the same as in "Effects" above.)

codecov · 2025-04-14T15:01:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.32%. Comparing base (771d0b7) to head (edf2daa).
Report is 1 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #334       +/-   ##
===========================================
+ Coverage   63.78%   82.32%   +18.53%     
===========================================
  Files          13       13               
  Lines        1041     1041               
  Branches      228      228               
===========================================
+ Hits          664      857      +193     
+ Misses        316      142      -174     
+ Partials       61       42       -19

see 5 files with indirect coverage changes

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e35b6cf...edf2daa. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Erotemic · 2025-04-14T15:42:58Z

Because this modifies C code, it's going to take more time for me to carefully review. I think this will be independent of the other changes you were interested in working on, but let me know if any PR is a blocker for other features. The feature I'm most interested in right now, is the one where -p will mark a submodule to be autoprofiled on import (or we could do it the easy way and just pre-import the specified modules in kernprof as an initial pass with plans to improve on it later).

With all of these changes, I'm also starting to think it would be a good idea for me to spend some effort building regression tests that monitor the overhead of line profiler on benchmark use-cases. Right now I currently test it out and check that it doesn't feel sluggish, but that's not sustainable.

TTsangSC · 2025-04-14T15:58:42Z

Fair points, please take your time.

Since we already have provisions for running "setup" code in kernprof, it's probably easy to also hack that for doing pre-imports. I'll give that a try, as well as try to port the AST stuff over in the longer run. Oh and of course the TOML stuff that we talked about yesterday.

line_profiler/_line_profiler.pyx {fetch,restore}_c_trace_state() New helper functions to help with caching the trace callback call_c_trace_state_hook() (FIXME: doesn't seem to work properly) New helper function for calling the old trace callback from within `python_trace_callback()` LineProfiler __init__() New keyword argument `call_existing_trace` (and environment variable `${LINE_PROFILER_CALL_EXISTING_TRACE}`) to control whether `python_trace_callback()` should call the old trace callback enable(), disable() Now managing a cache of the old trace callback, restoring it when the profiler is disabled call_existing_trace New property corresponding to the synonymous argument python_trace_callback() Added conditional switch to call the old trace callback (FIXME) tests/test_line_profiler.py::test_trace_callback_preservation() New test that `sys.gettrace()` is restored after disabling the profiler

line_profiler/_line_profiler.pyx - Reordered definitions (fixes apparent namespacing issue with older Pythons/Cythons) - Removed unused references to `unset_trace()` line_profiler/unset_trace.{c,h} Removed due to being unused

line_profiler/_line_profiler.pyx c_trace_state_has_hook(), get_optional_attr(), set_optional_attr() New C-level helper functions call_c_trace_state_hook_safe() New Cython wrapper around `call_c_trace_state_hook()` which guards against the following actions in the wrapped trace callback which may hinder line profiling: - (Un-)Setting the `sys` trace callback - Setting `frame.f_trace` - Setting `frame.f_trace_lines` Notably, this allows `python_trace_callback()` to wrap around `coverage.tracer.CTracer` without either stepping on the other's toes LineProfiler - Extended docstring - Renamed argument/attribute: `call_existing_trace` -> `wrap_trace` - Renamed environment variable from which the default value for the above is read: `LINE_PROFILER_CALL_EXISTING_TRACE` -> `LINE_PROFILE_WRAP_TRACE` python_trace_callback() - Removed variables that are declared but unused - Updated implementation to use `call_c_trace_state_hook_safe()` instead of `call_c_trace_state_hook()`

line_profiler/_line_profiler.pyx CTraceState fetch_c_trace_state(), restore_c_trace_state() c_trace_state_has_hook(), call_c_trace_state_hook() get_optional_attr(), set_optional_attr() Added documentation call_c_trace_state_hook_safe() No longer restoring `frame.f_trace`, which interferes with `Python/sysmodule.c::trace_trampoline()` and prevents Python-level trace functions from functioning for non-call events tests/test_line_profiler.py test_trace_callback_preservation() - Refactored to always be run in a subprocess and on a temporary Python source file - Updated the run Python source for more informative assertions test_trace_callback_wrapping() New test for using an existing trace callback in conjunction with the profiler

tests/test_line_profiler.py::run_in_subproc() Replaced `tempfile.NamedTemporaryFile()` with `.TemporaryDirectory()` to circumvent double-open issues on Windows

line_profiler/_line_profiler.pyx TraceCallback Renamed from `CTraceState` alloc_callback(), free_callback() New functions for allocating `TraceCallback` objects c_trace_state_has_hook() Merged into `call_cached_callback()` fetch_callback() - Refactored and renamed from `fetch_c_trace_state()` - No longer `malloc()`-ing and returning a pointer, instead assigning to the fields of the pointer passed as the argument restore_callback() - Refactored and renamed from `restore_c_trace_state()` - No longer `free()`-ing the supplied pointer, instead just assigning `NULL` to its fields call_callback() - Merger of `call_c_trace_state_hook[_safe]()` - Now staying entirely in C-land (where possible), avoiding calling the `PyObject` attribute-access functions - Refactored heap-allocated swap `TraceCallback` objects into stack-allocated ones - Added check for disabling of the `sys` trace function -- if that happens, profiling continues (as with before) but the wrapped/cached callback is no longer called - Added check for disabling of frame-local line events -- if that happens, `.f_trace_lines` is restored (as with before) but the frame-local trace callable (`.f_trace`) is wrapped and shielded from line events get_optional_attr(), set_optional_attr() Removed now-unused methods disable_line_events() New Python-level wrapper for shielding a frame-local trace callable from line events LineProfiler.wrap_trace Cythonized definition LineProfiler.{dis,en}able(), python_trace_callback() Now handling per-thread caching of the `sys` callback state

tests/test_line_profiler.py Moved tests that have to do with tracing to another test module tests/test_sys_trace.py New test module housing the test (and facilities therefor) for trace function manipulations: - isolate_test_in_subproc() Refactored from `./test_line_profiler.py::run_in_subproc()` into a decorator, which causes the wrapped test callable to be called in a subprocess - test_trace_callback_preservation(), test_trace_callback_wrapping() Refactored from the synonymous tests in `./test_line_profiler.py` so that: - The test content is inside the function body instead of a string literal - The helper functions and classes are moved into the test module's scope - test_wrapping_throwing_callback() test_wrapping_line_event_disabling_callback() New tests for "problematic" callbacks that unset themselves or attempt to disable line-profiling

line_profiler._line_profiler.pyx::call_callback() - Fixed obscure bug occurring when the function tries to import the `line_profiler._line_profiler` module object when `coverage.py` tracing is on - More verbose error messages because C code doesn't show up in stack traces

tests/test_sys_trace.py test_callback_preservation() test_callback_wrapping() test_wrapping_line_event_disabling_callback() Minor stylistic updates test_wrapping_thread_local_callbacks() New test for wrapping around thread-local tracing functions in a multithreaded environment

line_profiler/_line_profiler.pyx::alloc_callback() Added missing handling for when `malloc()` fails CHANGELOG.rst Added entry

TTsangSC marked this pull request as ready for review April 15, 2025 07:54

TTsangSC force-pushed the cooperative-tracing branch from 649750a to 0b393a4 Compare May 5, 2025 14:24

TTsangSC added 11 commits May 17, 2025 19:23

Fixes in line_profiler/_line_profiler.pyx

e3cf3ef

line_profiler/_line_profiler.pyx - Reordered definitions (fixes apparent namespacing issue with older Pythons/Cythons) - Removed unused references to `unset_trace()` line_profiler/unset_trace.{c,h} Removed due to being unused

Scrubbed references to unset_trace()

9c768bd

Fixing permission oddities on Windows

c7f3804

tests/test_line_profiler.py::run_in_subproc() Replaced `tempfile.NamedTemporaryFile()` with `.TemporaryDirectory()` to circumvent double-open issues on Windows

CHANGELOG entry; fixed edge-case memory bug

edf2daa

line_profiler/_line_profiler.pyx::alloc_callback() Added missing handling for when `malloc()` fails CHANGELOG.rst Added entry

TTsangSC force-pushed the cooperative-tracing branch from 0b393a4 to edf2daa Compare May 17, 2025 17:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Restore trace callback when the profiler is disabled #334

Restore trace callback when the profiler is disabled #334

TTsangSC commented Apr 14, 2025 •

edited

Loading

codecov bot commented Apr 14, 2025 •

edited

Loading

Erotemic commented Apr 14, 2025

TTsangSC commented Apr 14, 2025

Restore trace callback when the profiler is disabled #334

Are you sure you want to change the base?

Restore trace callback when the profiler is disabled #334

Conversation

TTsangSC commented Apr 14, 2025 • edited Loading

Summary

Motivation

Effects

Caveats

Update 17 Apr

Reorganization

Handling of edge cases

Performance impacts

codecov bot commented Apr 14, 2025 • edited Loading

Codecov Report

Erotemic commented Apr 14, 2025

TTsangSC commented Apr 14, 2025

TTsangSC commented Apr 14, 2025 •

edited

Loading

codecov bot commented Apr 14, 2025 •

edited

Loading