Skip to content

Restore trace callback when the profiler is disabled #334

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

TTsangSC
Copy link
Contributor

@TTsangSC TTsangSC commented Apr 14, 2025

(Closes #333.)

Summary

This draft PR adds C(-ython) level support to:

  • Retrieve and cache the trace-callback state (PyThreadState.c_tracefunc and PyThreadSatte.c_traceobj; note that sys.gettrace() only gets the latter and misses the former, hence the need for C-level shenanigans) in LineProfiler.enable()
  • Restore the trace-callback state in LineProfiler.disable()
  • Optionally call the existing trace callback from within the profiler callback, controlled by the wrap_trace argument and the ${LINE_PROFILE_WRAP_TRACE} environment variable

(UPDATE 17 Apr) It also adds 5 tests to the new test module tests/test_sys_trace.py:

  • test_callback_preservation():
    Checks the restoration of the trace-callback state after using the profiler.
  • test_callback_wrapping():
    Checks trace-callback interoperability – that the existing trace callback can be used by the profiler's callback.
  • test_wrapping_throwing_callback():
    Checks the continuation of line profiling and the disabling of the existing callback in the event that it errors out.
  • test_wrapping_line_event_disabling_callback():
    Checks the continuation of line profiling and the hiding of line events from the existing frame-local callback in the event that it disables frame line events.
  • test_wrapping_thread_local_callbacks():
    Checks the usage and restoration of thread-local existing trace callbacks by the profiler.

Since the trace-callback state is restored from the cached value in .disable(), the now-unused line_profiler/unset_trace.{c,h} and references thereto in build instructions and docs have also been removed.

Motivation

As noted in #333, LineProfiler currently interacts destructively with sys.settrace(), setting it to None when the profiler is .disable()-ed. Beside being in a way "not clean", this also has multiple unfortunate side effects, such as prohibiting cooperative use with other trace-callback based monitoring tools, such as coverage tools and debuggers.

Notably, the discarding of the trace callback is what has been tanking the coverage that codecov reports, because coverage.py can't see anything after any test calls LineProfiler.enable() in-process.

Effects

(UPDATE 17 Apr: see "Performance impacts" for updated timings.)

Note the substantial increase in coverages in line_profiler.py and profiler_mixin.py:

Run on\Filename line_profiler.py profiler_mixin.py (Overall)
main 67% (66/207 missing) 42% (89/165 missing) 59% (380/1005 missing)
This PR 81% (44/206 missing) 89% (20/165 missing) 72% (252/1007 missing)
This PR w/LINE_PROFILE_WRAP_TRACE=1 81% (44/206 missing) 93% (12/165 missing) 73% (244/1007 missing)
Sample output (`main`; click to expand)
(py3.13)  $ run_tests.py 

...

================================== test session starts ==================================

...

---------- coverage: platform darwin, python 3.13.2-final-0 ----------
Name                                                                                     
                        Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------------------------------------
---------------------------------------------------------
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/__init__.py      
                            6      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/__main__.py      
                            1      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/__ini
t__.py                      0      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/ast_p
rofile_transformer.py      41      5     12      3    85%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/ast_t
ree_profiler.py            41      0     10      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/autop
rofile.py                  14      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/line_
profiler_utils.py          12      4     10      2    55%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/profm
od_extractor.py            88      7     44      5    91%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/run_m
odule.py                   43      1     18      2    95%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/autoprofile/util_
static.py                 236     99    134     25    52%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/explicit_profiler
.py                        82     40     28      0    51%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/ipython_extension
.py                        69     69     22      0     0%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/line_profiler.py 
                          207     66     92     20    67%
/Users/terence/python_env/line_profiler_dev/line_profiler/line_profiler/profiler_mixin.py
                          165     89     54     11    42%
-----------------------------------------------------------------------------------------
---------------------------------------------------------
TOTAL                                                                                    
                         1005    380    424     68    59%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml

====================== 66 passed, 1 skipped, 2 warnings in 12.53s =======================

...
Sample output (this PR; click to expand)
(3.13-cooperative-tracing)  $ run_tests.py

...

================================== test session starts ==================================

...

___________________ coverage: platform darwin, python 3.13.2-final-0 ____________________

Name                                                                                     
                              Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__init__.py
                                  6      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__main__.py
                                  1      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/__init__.py                      0      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_profile_transformer.py      41      5     12      3    85%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_tree_profiler.py            41      0      4      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/autoprofile.py                  14      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/line_profiler_utils.py          12      4     10      2    55%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/profmod_extractor.py            91      7     36      5    91%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/run_module.py                   43      1     12      2    95%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/util_static.py                 236     99    130     25    52%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/explicit_pr
ofiler.py                        82      3     24      6    90%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/ipython_ext
ension.py                        69     69     16      0     0%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/line_profil
er.py                           206     44     72      6    81%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/profiler_mi
xin.py                          165     20     42      3    89%
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
TOTAL                                                                                    
                               1007    252    358     52    72%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml
============================ 69 passed, 1 skipped in 16.00s =============================

...
Sample output (this PR w/ trace-callback wrapping; click to expand)
(3.13-cooperative-tracing)  $ LINE_PROFILE_WRAP_TRACE=1 run_tests.py

...

================================== test session starts ==================================

...

___________________ coverage: platform darwin, python 3.13.2-final-0 ____________________

Name                                                                                     
                              Stmts   Miss Branch BrPart  Cover
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__init__.py
                                  6      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/__main__.py
                                  1      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/__init__.py                      0      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_profile_transformer.py      41      5     12      3    85%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/ast_tree_profiler.py            41      0      4      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/autoprofile.py                  14      0      0      0   100%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/line_profiler_utils.py          12      4     10      2    55%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/profmod_extractor.py            91      7     36      5    91%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/run_module.py                   43      1     12      2    95%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/autoprofile
/util_static.py                 236     99    130     25    52%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/explicit_pr
ofiler.py                        82      3     24      6    90%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/ipython_ext
ension.py                        69     69     16      0     0%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/line_profil
er.py                           206     44     72      6    81%
/Users/terence/python_env/line_profiler_dev/cooperative-tracing/line_profiler/profiler_mi
xin.py                          165     12     42      3    93%
-----------------------------------------------------------------------------------------
---------------------------------------------------------------
TOTAL                                                                                    
                               1007    244    358     52    73%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml
============================ 69 passed, 1 skipped in 18.54s =============================

...

Caveats

(UPDATE 17 Apr: the optimizations mentioned below have been made.)

As shown by the test-suite output above, this PR may introduce some overhead to the tracing function. However:

  • The impact on the measured line timings should be minimal, since any interaction with the cached callback is after the collection of timing data.
  • A considerable part of the apparent overhead may just have to do with how the cached callback is run more often (i.e. not being outright discarded) after this PR.

Running the entire test suite without pytest-cov shows the following results:

Run on\Real time elapsed (s; between 10 runs) Mean Best σ
main 7.62 7.42 0.180
This PR 7.65 7.42 0.210
This PR w/LINE_PROFILE_WRAP_TRACE=1 7.47 7.34 0.0983

Hence, in the case where we aren't interacting with other tracing facilities, the overhead is negligible and well within normal variations. Still, it may be worth looking into optimizing wrap_trace (as implemented in line_profiler/_line_profiler.pyx::call_c_trace_state_hook_safe()) since that seem to have more severe performance implications. Since it mostly deals with objects available on the C-level (CTraceState, PyFrameObject.f_trace_lines, PyFrameObject.f_trace), it's likely that we can eke out more performance by moving call_c_trace_state_hook_safe() from Cython to C – or just merge it with the C-level call_c_trace_state_hook() function which it calls.

Update 17 Apr

Reorganization

The tests which have to do with sys trace callbacks have been moved from the tests/test_line_profiler.py module to the new tests/test_sys_trace.py.

Handling of edge cases

In order for trace-callback wrapping to be more transparent (i.e. for the wrapped callback to behave more as if it was free-standing):

  • In normal execution, a trace callback can unset the sys callback, either via erroring out or explicitly calling sys.settrace(None) (or its C equivalent). If a wrapped callback does so, e.g. in coverage.py::coverage/ctracer/tracer.c::CTracer_dealloc() (L87)):
    • The existing sys callback is restored so that profiling continues, as was previously done in this PR; but
    • The profiler's callback drops the cached callback from that point on, effectively disabling the latter; and
    • When the profiler is .disable()-ed, it unsets the sys trace callback instead of restoring it to the (dropped) cached value.
  • In normal execution, a trace callback can set .f_trace_lines to false in a frame, disabling line events for the frame (which obstructs further line profiling). If a wrapped callback does so; e.g. in coverage.py::coverage/ctracer/tracer.c::CTracer_handle_call() (L548)):
    • .f_trace_lines is reset so that profiling continues, as was previously done in this PR; but
    • The frame's local trace callable .f_trace is wrapped so that it is no longer invoked for line events.

Moreover, support for multithreaded environments (where each thread can have its own sys trace function) has been added.

Performance impacts

The previous timing info in "Effects" was unduly biased against the PR in that:

  • It was later noted that the virtual environments hosting the main and PR branches had different dependency versions (e.g. pytest); updating them to be consistent substantially closed the gap.
  • Since the new tests temper with the sys trace callback, they have to be run in subprocesses to isolate their side effects, resulting in relatively higher overhead.

As such, the venv dependencies have been updated to be consistent (pytest==8.3.5, pytest-cov==6.1.1, coverage==7.8.0), and we now run run_test.py or pytest with -k 'not sys_trace' to deselect the new tests. The updated timings are:

With pytest-cov (run_tests.py -k 'not sys_trace')

Without pytest-cov (pytest -k 'not sys_trace')

Run on\Real time elapsed (s; between 3 runs) Mean Best σ
main 15.83 15.77 0.049
This PR 15.87 15.80 0.052
This PR w/LINE_PROFILE_WRAP_TRACE=1 15.83 15.80 0.033
Run on\Real time elapsed (s; between 10 runs) Mean Best σ
main 7.045 6.97 0.058
This PR 7.043 6.94 0.109
This PR w/LINE_PROFILE_WRAP_TRACE=1 7.159 7.0 0.101

Hence the overhead introduced by the PR seems negligible. (The coverage improvements remain the same as in "Effects" above.)

Copy link

codecov bot commented Apr 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.32%. Comparing base (771d0b7) to head (edf2daa).
Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##             main     #334       +/-   ##
===========================================
+ Coverage   63.78%   82.32%   +18.53%     
===========================================
  Files          13       13               
  Lines        1041     1041               
  Branches      228      228               
===========================================
+ Hits          664      857      +193     
+ Misses        316      142      -174     
+ Partials       61       42       -19     

see 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e35b6cf...edf2daa. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Erotemic
Copy link
Member

Because this modifies C code, it's going to take more time for me to carefully review. I think this will be independent of the other changes you were interested in working on, but let me know if any PR is a blocker for other features. The feature I'm most interested in right now, is the one where -p will mark a submodule to be autoprofiled on import (or we could do it the easy way and just pre-import the specified modules in kernprof as an initial pass with plans to improve on it later).

With all of these changes, I'm also starting to think it would be a good idea for me to spend some effort building regression tests that monitor the overhead of line profiler on benchmark use-cases. Right now I currently test it out and check that it doesn't feel sluggish, but that's not sustainable.

@TTsangSC
Copy link
Contributor Author

Fair points, please take your time.

Since we already have provisions for running "setup" code in kernprof, it's probably easy to also hack that for doing pre-imports. I'll give that a try, as well as try to port the AST stuff over in the longer run. Oh and of course the TOML stuff that we talked about yesterday.

@TTsangSC TTsangSC marked this pull request as ready for review April 15, 2025 07:54
@TTsangSC TTsangSC force-pushed the cooperative-tracing branch from 649750a to 0b393a4 Compare May 5, 2025 14:24
TTsangSC added 11 commits May 17, 2025 19:23
line_profiler/_line_profiler.pyx
    {fetch,restore}_c_trace_state()
        New helper functions to help with caching the trace callback
    call_c_trace_state_hook()
        (FIXME: doesn't seem to work properly)
        New helper function for calling the old trace callback from
        within `python_trace_callback()`
    LineProfiler
        __init__()
            New keyword argument `call_existing_trace` (and environment
            variable `${LINE_PROFILER_CALL_EXISTING_TRACE}`) to control
            whether `python_trace_callback()` should call the old trace
            callback
        enable(), disable()
            Now managing a cache of the old trace callback, restoring it
            when the profiler is disabled
        call_existing_trace
            New property corresponding to the synonymous argument
    python_trace_callback()
        Added conditional switch to call the old trace callback (FIXME)

tests/test_line_profiler.py::test_trace_callback_preservation()
    New test that `sys.gettrace()` is restored after disabling the
    profiler
line_profiler/_line_profiler.pyx
    - Reordered definitions (fixes apparent namespacing issue with older
      Pythons/Cythons)
    - Removed unused references to `unset_trace()`

line_profiler/unset_trace.{c,h}
    Removed due to being unused
line_profiler/_line_profiler.pyx
    c_trace_state_has_hook(), get_optional_attr(), set_optional_attr()
        New C-level helper functions
    call_c_trace_state_hook_safe()
        New Cython wrapper around `call_c_trace_state_hook()` which
        guards against the following actions in the wrapped trace
        callback which may hinder line profiling:
        - (Un-)Setting the `sys` trace callback
        - Setting `frame.f_trace`
        - Setting `frame.f_trace_lines`
        Notably, this allows `python_trace_callback()` to wrap around
        `coverage.tracer.CTracer` without either stepping on the other's
        toes
    LineProfiler
        - Extended docstring
        - Renamed argument/attribute:
          `call_existing_trace` -> `wrap_trace`
        - Renamed environment variable from which the default value for
          the above is read:
          `LINE_PROFILER_CALL_EXISTING_TRACE`
          -> `LINE_PROFILE_WRAP_TRACE`
    python_trace_callback()
        - Removed variables that are declared but unused
        - Updated implementation to use `call_c_trace_state_hook_safe()`
          instead of `call_c_trace_state_hook()`
line_profiler/_line_profiler.pyx
    CTraceState
    fetch_c_trace_state(), restore_c_trace_state()
    c_trace_state_has_hook(), call_c_trace_state_hook()
    get_optional_attr(), set_optional_attr()
        Added documentation
    call_c_trace_state_hook_safe()
        No longer restoring `frame.f_trace`, which interferes with
        `Python/sysmodule.c::trace_trampoline()` and prevents
        Python-level trace functions from functioning for non-call
        events

tests/test_line_profiler.py
    test_trace_callback_preservation()
        - Refactored to always be run in a subprocess and on a temporary
          Python source file
        - Updated the run Python source for more informative assertions
    test_trace_callback_wrapping()
        New test for using an existing trace callback in conjunction
        with the profiler
tests/test_line_profiler.py::run_in_subproc()
    Replaced `tempfile.NamedTemporaryFile()` with
    `.TemporaryDirectory()` to circumvent double-open issues on Windows
line_profiler/_line_profiler.pyx
    TraceCallback
        Renamed from `CTraceState`
    alloc_callback(), free_callback()
        New functions for allocating `TraceCallback` objects
    c_trace_state_has_hook()
        Merged into `call_cached_callback()`
    fetch_callback()
        - Refactored and renamed from `fetch_c_trace_state()`
        - No longer `malloc()`-ing and returning a pointer, instead
          assigning to the fields of the pointer passed as the argument
    restore_callback()
        - Refactored and renamed from `restore_c_trace_state()`
        - No longer `free()`-ing the supplied pointer, instead just
          assigning `NULL` to its fields
    call_callback()
        - Merger of `call_c_trace_state_hook[_safe]()`
        - Now staying entirely in C-land (where possible), avoiding
          calling the `PyObject` attribute-access functions
        - Refactored heap-allocated swap `TraceCallback` objects into
          stack-allocated ones
        - Added check for disabling of the `sys` trace function -- if
          that happens, profiling continues (as with before) but the
          wrapped/cached callback is no longer called
        - Added check for disabling of frame-local line events -- if
          that happens, `.f_trace_lines` is restored (as with before)
          but the frame-local trace callable (`.f_trace`) is wrapped and
          shielded from line events
    get_optional_attr(), set_optional_attr()
        Removed now-unused methods
    disable_line_events()
        New Python-level wrapper for shielding a frame-local trace
        callable from line events
    LineProfiler.wrap_trace
        Cythonized definition
    LineProfiler.{dis,en}able(), python_trace_callback()
        Now handling per-thread caching of the `sys` callback state
tests/test_line_profiler.py
    Moved tests that have to do with tracing to another test module

tests/test_sys_trace.py
    New test module housing the test (and facilities therefor) for trace
    function manipulations:
    - isolate_test_in_subproc()
          Refactored from `./test_line_profiler.py::run_in_subproc()`
          into a decorator, which causes the wrapped test callable to be
          called in a subprocess
    - test_trace_callback_preservation(), test_trace_callback_wrapping()
          Refactored from the synonymous tests in
          `./test_line_profiler.py` so that:
          - The test content is inside the function body instead of a
            string literal
          - The helper functions and classes are moved into the test
            module's scope
    - test_wrapping_throwing_callback()
      test_wrapping_line_event_disabling_callback()
          New tests for "problematic" callbacks that unset themselves or
          attempt to disable line-profiling
line_profiler._line_profiler.pyx::call_callback()
    - Fixed obscure bug occurring when the function tries to import the
      `line_profiler._line_profiler` module object when `coverage.py`
      tracing is on
    - More verbose error messages because C code doesn't show up in
      stack traces
tests/test_sys_trace.py
    test_callback_preservation()
    test_callback_wrapping()
    test_wrapping_line_event_disabling_callback()
        Minor stylistic updates
    test_wrapping_thread_local_callbacks()
        New test for wrapping around thread-local tracing functions in a
        multithreaded environment
line_profiler/_line_profiler.pyx::alloc_callback()
    Added missing handling for when `malloc()` fails

CHANGELOG.rst
    Added entry
@TTsangSC TTsangSC force-pushed the cooperative-tracing branch from 0b393a4 to edf2daa Compare May 17, 2025 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Under-reporting of coverages (more generally, lack of monitoring tool interoperability)
2 participants