fix(prof): follow PHP globals model in allocation profiler #3175

Merged
merged 18 commits into from
May 8, 2025

Conversation

realFlowControl
Member

@realFlowControl realFlowControl commented Apr 1, 2025

Description

Since adding support for ZTS builds of PHP, we have stored our profiler state in TLS (thread-local storage). The assumption was that on ZTS PHP this keeps the state per thread, which is what ZTS needs, while on NTS PHP the state still lives in a TLS variable, which should be harmless because there is only one thread.

It turns out there is at least one extension out there which does the following:

  • PHP user-land code calls into an \extension\namespace\do_something() function
  • the C-implementation of that PHP function spawns a thread (even in NTS PHP)
  • this other thread might execute a closure that was given to an init function of the extension earlier
  • this closure might then allocate memory through the ZendMM
  • as our profiler has installed its own custom ZendMM handlers, we get called to track the allocation
  • we need to forward to the original ZendMM for the actual allocation
  • to find that, we look it up in the TLS allocation state variable
  • which is not initialized (as that thread is not a PHP thread) so we crash 💥
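
The failure mode in the list above can be reproduced in miniature. This is only a sketch under assumptions: `TLS_PREV_HEAP`, `GLOBAL_PREV_HEAP`, and the lookup helpers are hypothetical stand-ins and not the profiler's real types, and an atomic is used for the global purely to keep the example free of `unsafe`. It shows that state initialized on the "PHP thread" is invisible through a thread-local on a foreign thread, while a process-wide global is still reachable.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

thread_local! {
    // Hypothetical per-thread state: address of the original heap, set only
    // on threads that ran the init hook.
    static TLS_PREV_HEAP: Cell<Option<usize>> = Cell::new(None);
}

// Hypothetical process-wide state (0 means "not set").
static GLOBAL_PREV_HEAP: AtomicUsize = AtomicUsize::new(0);

// Stand-in for the init hook that runs on the PHP thread.
fn init_on_php_thread(heap: usize) {
    TLS_PREV_HEAP.set(Some(heap));
    GLOBAL_PREV_HEAP.store(heap, Ordering::Release);
}

// Lookup through TLS: only sees what the *current* thread initialized.
fn tls_lookup() -> Option<usize> {
    TLS_PREV_HEAP.get()
}

// Lookup through the global: sees what any thread initialized.
fn global_lookup() -> Option<usize> {
    match GLOBAL_PREV_HEAP.load(Ordering::Acquire) {
        0 => None,
        heap => Some(heap),
    }
}

fn main() {
    init_on_php_thread(0x1000);
    assert_eq!(tls_lookup(), Some(0x1000));

    // A thread spawned by an extension never ran the init hook.
    let (tls, global) = thread::spawn(|| (tls_lookup(), global_lookup()))
        .join()
        .unwrap();
    assert_eq!(tls, None); // the situation that led to the crash
    assert_eq!(global, Some(0x1000)); // still reachable through a global
}
```

In the sketch the uninitialized case is an explicit `None`; in the profiler the equivalent was an uninitialized allocator state, hence the crash.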

Even though what that extension does is a race condition and violates thread safety, we should still not crash. In effect, the allocation profiler turned a race condition that would crash only very rarely into a guaranteed crash 😜

PROF-11471

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

@realFlowControl realFlowControl changed the title follow php globals model fix(prof) follow PHP globals model Apr 1, 2025
@github-actions github-actions bot added profiling Relates to the Continuous Profiler tracing labels Apr 1, 2025
@codecov-commenter

codecov-commenter commented Apr 1, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.26%. Comparing base (a1c3353) to head (9b58e58).
Report is 1 commits behind head on master.

@@            Coverage Diff            @@
##             master    #3175   +/-   ##
=========================================
  Coverage     79.26%   79.26%           
  Complexity     2948     2948           
=========================================
  Files           118      118           
  Lines         11633    11633           
=========================================
  Hits           9221     9221           
  Misses         2412     2412           
Flag Coverage Δ
tracer-php 79.26% <ø> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
@pr-commenter

pr-commenter bot commented Apr 1, 2025

Benchmarks [ profiler ]

Benchmark execution time: 2025-05-08 18:18:13

Comparing candidate commit 9b58e58 in PR branch florian/zts-global-state with baseline commit a1c3353 in branch master.

Found 4 performance improvements and 0 performance regressions! Performance is the same for 25 metrics, 7 unstable metrics.

scenario:php-profiler-exceptions-with-profiler

  • 🟩 execution_time [-6.256ms; -2.169ms] or [-6.991%; -2.424%]

scenario:php-profiler-exceptions-with-profiler-and-timeline

  • 🟩 execution_time [-4.358ms; -2.571ms] or [-4.902%; -2.891%]

scenario:walk_stack/50

  • 🟩 wall_time [-866.976ns; -854.784ns] or [-5.399%; -5.323%]

scenario:walk_stack/99

  • 🟩 wall_time [-800.878ns; -796.946ns] or [-5.012%; -4.988%]

@realFlowControl realFlowControl force-pushed the florian/zts-global-state branch 18 times, most recently from 71e58f0 to 386c78f Compare April 2, 2025 14:26
@realFlowControl realFlowControl force-pushed the florian/zts-global-state branch from 386c78f to 43c1d6f Compare April 8, 2025 08:31
@realFlowControl realFlowControl force-pushed the florian/zts-global-state branch from 4b03225 to 779fa59 Compare April 29, 2025 10:49
@realFlowControl
Member Author

I've committed your suggestion

@realFlowControl realFlowControl marked this pull request as ready for review May 2, 2025 19:28
@realFlowControl realFlowControl requested a review from a team as a code owner May 2, 2025 19:28
@realFlowControl realFlowControl force-pushed the florian/zts-global-state branch from 2bdc1c9 to 0aada77 Compare May 5, 2025 07:44
@morrisonlevi morrisonlevi changed the title fix(prof) follow PHP globals model fix(prof): follow PHP globals model May 6, 2025
Collaborator

@morrisonlevi morrisonlevi left a comment

This version of the patch is much easier to review, thank you! I pushed up some cleanup which shouldn't be controversial.

I think I see a better/cleaner way to do this that we should discuss over Zoom (although you are welcome to try it while I am sleeping). Basically, I think we can move from UnsafeCell to Cell, mark ZendMMState as Copy, and use a Cell for the pure global version too. Then, we drop the macros.

At a high level, this works because LocalKey has had convenience methods since Rust 1.73 (local_key_cell_methods). This avoids the `with` dance:

use std::cell::Cell;

thread_local! {
    static X: Cell<i32> = Cell::new(1);
}

// No `with` stuff needed!
assert_eq!(X.get(), 1);
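
As a runnable illustration of those convenience methods, here is a small sketch that compares the closure-based `with` form with the direct `get`/`set`/`replace` calls on a `LocalKey<Cell<T>>`. The helper names `bump_with` and `bump_direct` are made up for this example.

```rust
use std::cell::Cell;

thread_local! {
    static X: Cell<i32> = Cell::new(1);
}

// Pre-1.73 style: every access goes through a closure passed to `with`.
fn bump_with() -> i32 {
    X.with(|cell| {
        let next = cell.get() + 1;
        cell.set(next);
        next
    })
}

// Rust 1.73+ style: `get` and `set` are available on the LocalKey itself.
fn bump_direct() -> i32 {
    let next = X.get() + 1;
    X.set(next);
    next
}

fn main() {
    X.set(1);
    assert_eq!(bump_with(), 2);
    assert_eq!(bump_direct(), 3);
    // `replace` stores a new value and returns the previous one.
    assert_eq!(X.replace(10), 3);
    assert_eq!(X.get(), 10);
}
```

Note that `get` requires the stored type to be `Copy`, which is why the suggestion above involves marking the state struct as Copy.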

So for us, this kind of thing:

    #[cfg(php_zts)]
    ZEND_MM_STATE.with(|cell| {
        let zend_mm_state = cell.get();
        zend_mm_state_shutdown(zend_mm_state);
    });

    #[cfg(not(php_zts))]
    unsafe {
        zend_mm_state_shutdown(ptr::addr_of_mut!(ZEND_MM_STATE));
    }

Becomes this (zend_mm_state_shutdown needs to change too):

    ZEND_MM_STATE.set(zend_mm_state_shutdown());

Both ZTS and NTS are handled seamlessly. Or at least, that's the idea.
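
To make the idea concrete, here is a rough sketch of how a `Copy` state struct plus `Cell` can give both flavors the same call site. Everything in it is an assumption for illustration: `MmState`, `GlobalCell`, `NTS_STATE`, `ZTS_STATE`, and `state_shutdown` are hypothetical names, both flavors are shown side by side rather than selected with `#[cfg(php_zts)]`, and the `unsafe impl Sync` is only defensible under the single-threaded assumption stated in its comment. It is not the code that was merged.

```rust
use std::cell::Cell;

// Hypothetical stand-in for the allocator state; `Copy` is what lets
// `Cell::get` hand out the whole struct by value.
#[derive(Clone, Copy, Debug, PartialEq)]
struct MmState {
    prev_heap: usize,
    installed: bool,
}

impl MmState {
    const fn uninstalled() -> Self {
        MmState { prev_heap: 0, installed: false }
    }
}

// Thin wrapper so a `Cell` can live in a plain `static` (the NTS flavor).
struct GlobalCell<T>(Cell<T>);

// SAFETY (assumption of this sketch): the value is only accessed from one
// thread at a time, as in an NTS process. The compiler does not check this.
unsafe impl<T> Sync for GlobalCell<T> {}

impl<T: Copy> GlobalCell<T> {
    const fn new(value: T) -> Self {
        GlobalCell(Cell::new(value))
    }
    fn get(&self) -> T {
        self.0.get()
    }
    fn set(&self, value: T) {
        self.0.set(value)
    }
}

// NTS flavor: one process-wide state.
static NTS_STATE: GlobalCell<MmState> = GlobalCell::new(MmState::uninstalled());

thread_local! {
    // ZTS flavor: one state per thread.
    static ZTS_STATE: Cell<MmState> = Cell::new(MmState::uninstalled());
}

// Returns the post-shutdown state by value instead of mutating through a pointer.
fn state_shutdown() -> MmState {
    MmState::uninstalled()
}

fn main() {
    let installed = MmState { prev_heap: 0x1000, installed: true };
    NTS_STATE.set(installed);
    ZTS_STATE.set(installed);
    // Reading one field copies the struct by value first.
    assert_eq!(NTS_STATE.get().prev_heap, 0x1000);
    assert!(ZTS_STATE.get().installed);

    // The call site has the same shape for both flavors.
    NTS_STATE.set(state_shutdown());
    ZTS_STATE.set(state_shutdown());
    assert_eq!(NTS_STATE.get(), MmState::uninstalled());
    assert_eq!(ZTS_STATE.get(), MmState::uninstalled());
}
```

With a real build-time switch, only one of the two statics would exist under a shared name, so the `get`/`set` call sites would not need any `cfg` attributes of their own.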

@morrisonlevi morrisonlevi force-pushed the florian/zts-global-state branch from ad433ef to d32ae2e Compare May 6, 2025 20:25
@morrisonlevi
Collaborator

morrisonlevi commented May 6, 2025

@realFlowControl, please review my recent commit. It implements the idea to use Cell instead of UnsafeCell. As I showed you in Slack, the compiler is good at not copying the whole struct just to use one field. At least, in the places I checked ^_^

It also deduplicates initialization_panic, alloc_prof_panic_alloc, alloc_prof_panic_realloc, and alloc_prof_panic_free and does some other cleanup.

@morrisonlevi morrisonlevi changed the title fix(prof): follow PHP globals model fix(prof): follow PHP globals model in allocation profiler May 6, 2025
Honestly, this is mostly just to push a new commit up to the branch.
The gitlab CI seemed to somehow pick up the wrong branch, and I want
to see if this reproduces it.
Collaborator

@morrisonlevi morrisonlevi left a comment

I approve, but as I contributed to this, Florian should review my work. Also, probably do some manual testing of it too (just usual stuff, not asking for anything extreme).

@realFlowControl realFlowControl requested review from a team as code owners May 8, 2025 06:37
@realFlowControl realFlowControl enabled auto-merge (squash) May 8, 2025 18:08
@realFlowControl realFlowControl merged commit 6e42e4c into master May 8, 2025
710 of 735 checks passed
@realFlowControl realFlowControl deleted the florian/zts-global-state branch May 8, 2025 18:48
@github-actions github-actions bot added this to the 1.10.0 milestone May 8, 2025
@bwoebi bwoebi modified the milestones: 1.10.0, 1.9.0 May 27, 2025
Labels
profiling Relates to the Continuous Profiler
4 participants