Move native AST identity cache into the Rust extension#392
Move native AST identity cache into the Rust extension#392adamziel merged 22 commits intoadamziel/ast-child-identityfrom
Conversation
The PHP-side cache from #391 fixed the correctness regression but added 27-31% to translator-style re-entry workloads — every accessor probes a PHP zend_array that, at full-walk scale, holds 4.8M entries, and PHP still allocates a fresh wrapper before the cache check gets a chance to drop it. This moves the cache to where it can actually skip work. WP_MySQL_Native_Ast now carries a RefCell<HashMap<usize, ZBox<ZendObject>>>; cached_node_zval returns a Zval pointing at the stored wrapper with refcount bumped on a hit, so the allocation and four zend_update_property calls of the construction path are gone. Every accessor (get_first_child_node, get_descendants, etc.) routes through this helper. PHP-side cache disappears: WP_MySQL_Native_Parser_Node goes back to plain bridge calls, the WP_MySQL_Native_AST_Cache holder is removed. Mutation semantics are unchanged — materialize_native_children still flips was_mutated and copies the same wrappers (now interned by Rust) into $this->children, so any caller mutation made before append_child still survives. Tokens are not yet interned. The public token API has no mutators and no caller in this repo relies on token identity; if that changes we extend node_cache with a token map.
Rust deny-by-default lint flags the &T -> &mut T cast as UB even when the borrow is unused. Make the cache lookup borrow_mut up front and hand zval_from_object_addref a real &mut reference into the boxed entry. Same semantics, no UB lint.
Rust-side cache: huge CPU wins, but a memory cycle to discussCI run 25190933025, commit 30a38ae. Same corpus, same runner, baseline = CPU — Rust cache is the right architecture
The picture this confirms:
The full +27 to +31% regression on translator-style workloads is recovered. None of the wins required touching the Rust accessor logic — only moving where the cache lives. Memory — there is a problem
This is a reference cycle, not an unbounded leak per se, but PHP can't break it:
The benchmark is a worst case — real workloads that hold one AST at a time will see a fixed per-AST overhead (~10–15 MB on this corpus). But it's still a regression, and on long-running processes (e.g., a process supervisor running translator passes over many queries) it'd grow without bound. Options
My read: option 1 is the right ship-it-now answer. The PHP-side cache restores correctness today, and the +27–31% on re-entry is acceptable until a proper Rust-side solution can land without the cycle. Option 2 is tempting given the perf wins but the unbounded growth in long-running workloads makes it unsafe to ship as-is. Option 3 is the eventual destination but isn't ready in this PR. Your call. |
These are the contract for the gc_handler that comes next: the Rust extension's node_cache forms a cycle (cache -> wrapper -> $native_ast property -> WpMySqlNativeAst -> cache) that PHP's cycle collector can't walk into without help. The tests will fail until the handler exposes the cached wrappers to PHP's GC. The tests are deliberately hostile — loops with explicit gc_collect_cycles between iterations, assertions on memory floors, mutation-before-drop, overlapping-AST lifetimes, and orphaned-child use-after-drop. Each one breaks in a different direction the leak can manifest. These tests fail on the current commit; the next commit makes them pass.
Patch the class's default_object_handlers->get_gc on module startup so PHP's cycle collector can walk the Rust-side node_cache. The handler enumerates cached ZendObject wrappers into PHP's gc_buffer without bumping refcounts; PHP's collector uses these to detect that node_cache -> wrappers -> $native_ast property -> WpMySqlNativeAst forms a cycle and collects it. This is the implementation half of the cycle-collection contract added in the previous commit's tests. Expect compile/runtime iteration — ext-php-rs 0.15 doesn't expose IS_OBJECT_EX or zend_get_gc_buffer_* directly, so they are declared inline and may need adjustment for the runner's PHP build.
PHP 8.3 added zend_class_entry.default_object_handlers letting us patch get_gc once per class on MINIT, but PHP 8.2 has no such field — the class entry only stores create_object. The portable path is to override each WpMySqlNativeAst's zend_object.handlers right after the object is allocated. We seed the patched handlers struct lazily on the first AST and reuse it for the rest.
Ubuntu's PHP 8.2 build (shivammathur/setup-php) does not export zend_get_gc_buffer_use, so the previous extern "C" declarations caused a link error at dlopen time. Drop them and own the trace buffer ourselves: a RefCell<Vec<zval>> on WpMySqlNativeAst that the get_gc handler refills on each call and exposes via the (table, n) out-params. Same semantics, no PHP-side dependency.
Replacing the per-class handlers pointer with a copy broke ext-php-rs's FromZval, which uses the handlers pointer as a class-identity key. Patch the original struct in place instead — its memory is a heap Box::leak the extension owns, so the write is safe, and every native_ast lookup keeps working because the pointer identity is preserved. Idempotent guard via a process-level boolean; only the first AST does the write.
The custom get_gc approach worked for the Rust cache itself but the trace construction segfaulted PHP inside its automatic cycle collector, and five iterations of CI debugging didn't converge on a safe zval format the collector accepts. Walking back to the working state (Rust cache without gc_handler): - Drop the gc_trace field, ast_get_gc handler, install_gc_handler_for install path, and the PHP_IS_OBJECT_EX constant. - Mark the cycle tests as incomplete with a pointer to the limitation and a note that a future ext-php-rs version exposing the get_gc hooks would let us complete them. The Rust cache still delivers the CPU wins documented in the PR description (rewalk -65%, subtree -24%); the per-AST memory cycle remains a known limitation for long-running processes — see the perf comment for guidance.
Final summary on the gc_handler experimentCI is green at 8b24b49: Rust extension PHPUnit, SQLite driver / Rust extension, and the static matrix all pass. What stuckThe Rust-side identity cache itself —
What didn'tThe custom
Where the cycle tests liveThe contract is captured in RecommendationLand #391, not #392. #391 fixes the correctness regression with bounded memory and a +5–31% perf cost depending on workload shape. #392 has a strictly bigger CPU win on hit-heavy workloads but an unbounded memory cycle in long-running processes that we don't have a usable mechanism to break in ext-php-rs 0.15. A follow-up worth opening once any of these change:
I'd close #392 as research; the work and numbers stay in history as the rationale for any future Rust-cache attempt. |
New Native AST walk perf numbersSource: passing
Pure PHP reference from the same run:
The repeated-access cases are the target shape here. The final pointer-registry cache is faster than the no-cache baseline there while also lowering peak memory. The parse-only path is effectively unchanged. |
Stacked on #391.
#391 restored
WP_Parser_Nodeidentity semantics by interning native child wrappers in PHP. That fixed correctness, but the PHP-side cache formed a retention cycle and added measurable cost to hit-heavy translator-style workloads.This PR moves native wrapper identity out of PHP object properties entirely:
WP_MySQL_Native_Parser_Nodeno longer stores$native_ast,$native_node_index, or a PHP-side identity-cache object.wp_sqlite_mysql_native_ast_get_children( $this ).(NativeAstState, node_index, is_materialized), and each AST keeps a node-index -> wrapper-pointer cache.__destruct()releases a wrapper from the Rust registry. Materialization marks the wrapper as detached from native reads while leaving it discoverable from the parent cache as long as it is still live.That breaks the cycle that mattered here: PHP wrappers no longer strongly reference a native AST object, and Rust no longer strongly references PHP wrappers. PHP's cycle collector can collect wrapper graphs normally; destructors then clean up the Rust registry entries.
Tokens remain un-interned. The public token API has no mutators, and no caller in this repo relies on token object identity.
Perf numbers
From the passing
Native AST Walk PerfCI run on this head (2d93be25f599c3c4482480a6ab644d61b9337b12), comparing this PR to the native no-cache baseline (codex/native-lazy-ast-facade):The parse-only path is effectively unchanged. The repeated-access workloads this cache is meant to help are materially faster and use less peak memory.
For context, the same CI run measured the pure-PHP path at:
Safety coverage
The PR now includes native-extension tests for:
$native_ast/$native_node_indexproperties on wrappers;CI smoke checks also assert the SQLite-driver and WordPress test-container paths select the native wrapper model and do not regress back to
$native_aststorage.Test plan
cargo checkforpackages/php-ext-wp-mysql-parsercargo fmt --checkforpackages/php-ext-wp-mysql-parserpackages/mysql-on-sqlitePHPUnit suite with the Rust extension loaded