refactor(ssz): drop BaseTreeView, use tuple for ContainerTreeView#139
wemeetagain merged 11 commits into main
Summary of Changes
Hello @twoeths, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly.
Code Review
This pull request is a significant refactoring of the ContainerTreeView implementation to align it with the ssz-ts version, decoupling it from BaseTreeView. The changes are extensive and introduce a more self-contained and feature-rich ContainerTreeView.
I've found a couple of critical issues related to memory safety and correctness in the new implementation that need to be addressed. One is a bug in the commit function that could lead to reading uninitialized memory, and the other is an unsafe shallow copy in the set function for composite types, which could cause memory corruption. I've also included a medium-severity suggestion regarding adherence to the project's style guide for struct initialization, which could improve performance and reduce stack usage.
    .pool = self.pool,
    .child_data = .{null} ** ST.chunk_count,
    .original_nodes = .{null} ** ST.chunk_count,
    .root = self.root,
if we copy self.root without ref, will we run into double unrefs if the original self.root was unrefed?
good catch, resolved by calling init() instead (27d8022)
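
The double-unref concern in this thread can be illustrated with a minimal sketch. `Node` and `View` below are hypothetical stand-ins, not the project's types; the only assumption is that a view owns exactly one reference to its root and releases it in `deinit()`. Cloning through `init()` gives the clone its own reference, so both views can be torn down independently.

```zig
const std = @import("std");

/// Hypothetical stand-in for a pool-managed tree node; only the ref count matters here.
const Node = struct {
    refs: u32 = 0,

    fn ref(self: *Node) void {
        self.refs += 1;
    }

    fn unref(self: *Node) void {
        std.debug.assert(self.refs > 0); // a double unref would trip this assert
        self.refs -= 1;
    }
};

/// Hypothetical view that owns exactly one reference to its root.
const View = struct {
    root: *Node,

    fn init(root: *Node) View {
        root.ref(); // take one reference on behalf of this view
        return .{ .root = root };
    }

    fn deinit(self: *View) void {
        self.root.unref(); // release the reference taken in init()
    }

    /// Safe clone: goes through init(), so the clone holds its own reference.
    fn clone(self: *const View) View {
        return View.init(self.root);
    }
};

test "clone via init keeps ref counts balanced" {
    var node = Node{};
    var a = View.init(&node);
    var b = a.clone();
    try std.testing.expectEqual(@as(u32, 2), node.refs);
    a.deinit();
    b.deinit();
    try std.testing.expectEqual(@as(u32, 0), node.refs);
}
```

Copying `.root = self.root` without a matching `ref()` would leave two views sharing one reference, and the second `deinit()` would unref a count that is already zero.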
Pull request overview
This PR refactors the TreeView implementations to better align with the TypeScript ssz-ts library, removing the tightly coupled BaseTreeView abstraction and giving each TreeView type more freedom in its implementation.
Key Changes:
- Removed BaseTreeView and TreeViewData abstractions in favor of isolated TreeView implementations
- Changed TreeView.init() to return pointers (*Self) instead of values (Self) for better memory management
- Implemented ContainerTreeView using tuples to store child data/views instead of hash maps
- Added utility modules for shared functionality (clone_opts.zig, child_nodes.zig, assert.zig)
- Refactored list and array views to track length internally and only update tree on commit()
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/int/ssz/tree_view/list_composite.zig | Updated to use pointer-based TreeView API, changed getRoot() calls, and fixed deref patterns |
| test/int/ssz/tree_view/list_basic.zig | Updated getAll() to not require allocator parameter, changed to pointer-based API |
| test/int/ssz/tree_view/container.zig | Updated Field() to return pointer types for composite fields, adjusted getRoot() usage |
| test/int/ssz/tree_view/bit_vector.zig | Changed from base_view.data to direct data field access |
| test/int/ssz/tree_view/bit_list.zig | Changed from base_view.data to direct data field access |
| test/int/ssz/tree_view/array_composite.zig | Updated to pointer-based API and removed unnecessary deinit calls for borrowed references |
| test/int/ssz/tree_view/array_basic.zig | Updated getAll() signature and pointer-based API usage |
| src/ssz/tree_view/utils/clone_opts.zig | New utility file defining CloneOpts struct for clone operations |
| src/ssz/tree_view/utils/child_nodes.zig | New utility file with shared child node management functions |
| src/ssz/tree_view/utils/assert.zig | New utility file for compile-time TreeView type assertions |
| src/ssz/tree_view/root.zig | Removed BaseTreeView and TreeViewData exports |
| src/ssz/tree_view/list_composite.zig | Refactored to store chunks directly and track length internally |
| src/ssz/tree_view/list_basic.zig | Refactored to store chunks directly and track length internally |
| src/ssz/tree_view/container.zig | Complete rewrite using tuple-based child storage with test added |
| src/ssz/tree_view/chunks.zig | Refactored BasicPackedChunks and CompositeChunks to be self-contained |
| src/ssz/tree_view/bit_vector.zig | Refactored to store BitArray data directly |
| src/ssz/tree_view/bit_list.zig | Refactored to store BitArray data directly |
| src/ssz/tree_view/bit_array.zig | Refactored BitArray to be self-contained with own fields |
| src/ssz/tree_view/base.zig | Deleted file - BaseTreeView and TreeViewData removed |
| src/ssz/tree_view/array_composite.zig | Refactored to store chunks directly |
| src/ssz/tree_view/array_basic.zig | Refactored to store chunks directly |
| src/ssz/root.zig | Removed BaseTreeView and TreeViewData exports |
spiral-ladder left a comment
Partial review with @wemeetagain on a recorded call
    fn getLength(self: *Self) !usize {
        return self.chunks.getLength();
    }
Consider removing this entirely since we only call this during init(). Prefer self.chunks.getLength() directly.
Suggested change:

    - if (self._len > ST.limit) {
    + if (self._len >= ST.limit) {
          return error.LengthOverLimit;
      }
Seems like a redundant check since every update to self._len should include this check by correctness (which we do already check when we update it above) + this should be inclusive of ST.limit. If we want to keep this, we can turn this into an assertion.
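
One way to read the "check on update, assert on commit" idea is sketched below. `LengthTracker` is a hypothetical type, and the sketch assumes that a list holding exactly `limit` elements is valid, so the invariant is `len <= limit`; it is not the code under review.

```zig
const std = @import("std");

/// Hypothetical length tracker for a list view with a comptime `limit`.
fn LengthTracker(comptime limit: usize) type {
    return struct {
        const Self = @This();

        len: usize = 0,

        /// Growing is the only place the limit is checked. A push is refused
        /// once `len` has reached `limit`, so `len <= limit` always holds.
        pub fn push(self: *Self) error{LengthOverLimit}!void {
            if (self.len >= limit) return error.LengthOverLimit;
            self.len += 1;
        }

        /// At commit time the invariant is only asserted, not re-validated.
        pub fn commit(self: *const Self) void {
            std.debug.assert(self.len <= limit);
        }
    };
}

test "length never exceeds the limit" {
    var tracker = LengthTracker(2){};
    try tracker.push();
    try tracker.push();
    try std.testing.expectError(error.LengthOverLimit, tracker.push());
    try std.testing.expectEqual(@as(usize, 2), tracker.len);
    tracker.commit();
}
```

With the check placed before the increment, `>=` is the right comparison in `push()`, while a check placed after the fact (as in `commit()`) would compare against the invariant `len <= limit`.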
Suggested change:

      if (existing) |child_value| {
          return child_value;
      } else {
          const node = try self.root.getNodeAtDepth(self.pool, ST.chunk_depth, field_index);
    +     errdefer self.pool.unref(node);
missing errdefer here + other similar areas
I don't see why we have to errdefer unref() here. The get() API simply returns a borrowed reference to the child, and getNodeAtDepth() does not create a new ref at all.
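
The distinction drawn in this reply, borrowed versus owned references, can be sketched as follows. All names (`Node`, `borrow`, `acquire`, `mayFail`) are hypothetical; the sketch only assumes that a borrowing lookup leaves the ref count untouched while an owning lookup increments it.

```zig
const std = @import("std");

/// Hypothetical ref-counted node that starts with one owner.
const Node = struct {
    refs: u32 = 1,

    fn ref(self: *Node) void {
        self.refs += 1;
    }

    fn unref(self: *Node) void {
        self.refs -= 1;
    }
};

/// Borrowing lookup: hands back the node without touching its ref count.
fn borrow(node: *Node) *Node {
    return node;
}

/// Owning lookup: takes a new reference that the caller must release.
fn acquire(node: *Node) *Node {
    node.ref();
    return node;
}

fn mayFail(fail: bool) !void {
    if (fail) return error.Failed;
}

fn useBorrowed(node: *Node, fail: bool) !void {
    const n = borrow(node);
    _ = n;
    try mayFail(fail); // nothing was acquired, so nothing to release on error
}

fn useOwned(node: *Node, fail: bool) !*Node {
    const n = acquire(node);
    errdefer n.unref(); // release the reference we took if a later step fails
    try mayFail(fail);
    return n;
}

test "errdefer is only needed for owned references" {
    var node = Node{};
    try std.testing.expectError(error.Failed, useBorrowed(&node, true));
    try std.testing.expectEqual(@as(u32, 1), node.refs);

    try std.testing.expectError(error.Failed, useOwned(&node, true));
    try std.testing.expectEqual(@as(u32, 1), node.refs);

    const owned = try useOwned(&node, false);
    try std.testing.expectEqual(@as(u32, 2), node.refs);
    owned.unref();
}
```

Adding `errdefer unref` after a borrowing lookup would release a reference the caller never took, which is the opposite bug.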
Suggested change:

    - try child_view.commit();
    - const child_changed = if (self.original_nodes[i]) |orig_node| blk: {
    -     break :blk orig_node != child_view.getRoot();
    - } else true;
    + const child_changed = try child_view.commit();
Could we perhaps return a bool here instead of using original_nodes to track if the child changed?
sounds good, it involves changes across all TreeViews and could avoid having to store child_nodes
will track as a separate issue
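
A rough sketch of the suggested shape, where `commit()` reports whether anything changed so the parent does not need to remember the original nodes. `ChildView` and `ParentView` are hypothetical toy types, and the `error{OutOfMemory}` set is only there so that `try` mirrors the suggestion; this is not the follow-up implementation.

```zig
const std = @import("std");

/// Hypothetical child view with one pending value.
const ChildView = struct {
    pending: ?u64 = null,
    committed: u64 = 0,

    fn set(self: *ChildView, value: u64) void {
        self.pending = value;
    }

    /// Applies the pending value and reports whether the committed state changed.
    fn commit(self: *ChildView) error{OutOfMemory}!bool {
        const value = self.pending orelse return false;
        self.pending = null;
        if (value == self.committed) return false;
        self.committed = value;
        return true;
    }
};

/// Hypothetical parent that relies on the child's return value instead of
/// comparing against a stored copy of the child's original node.
const ParentView = struct {
    children: [2]ChildView = .{ .{}, .{} },

    fn commit(self: *ParentView) error{OutOfMemory}!bool {
        var changed = false;
        for (&self.children) |*child| {
            if (try child.commit()) changed = true;
        }
        return changed;
    }
};

test "parent learns about changes from the child commit result" {
    var parent = ParentView{};
    try std.testing.expect(!(try parent.commit()));
    parent.children[1].set(9);
    try std.testing.expect(try parent.commit());
    try std.testing.expect(!(try parent.commit()));
}
```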
Suggested change:

    -         self.child_data[i] = null;
          }
      }
      inline for (0..ST.chunk_count) |i| {
          // these nodes are unref by root
    -     self.original_nodes[i] = null;
Do we need to set these to null if we're only calling this at deinit? Plus, could we perhaps just inline this function since deinit() is the only place we use this?
> Do we need to set these to null if we're only calling this at deinit?
yes, deinit is for cleaning up everything, plus we don't want to keep dangling pointers there once the child TreeViews are also deinited
> Plus, could we perhaps just inline this function since deinit() is the only place we use this?
that's one of the steps you see across the different TreeView structs, so I prefer to leave it there
I made them private to make them easier to reason about, see 28c0ca6
perhaps rename this to vector_basic.zig? Same comment with array_composite.zig
not part of this work, happy to do it in a separate PR
aa1a82e to 7caaa2a
The key thing about this PR is that it provides a way to implement different types of TreeView to improve lodestar, see PR-232.
## Summary

The tree view refactor in #139 correctly killed the heap-allocated `BaseTreeView` indirection, but it overcorrected: it scattered five fields (`allocator`, `pool`, `root`, `children_nodes`, `changed`) into every view type and recovered sharing through `ChildNodes`, a set of free functions that take `anytype` and duck-type their way to the right field names.

This PR fixes the abstraction:

- **`TreeViewState`**: a concrete struct embedded by value in each chunk-based view (`BasicPackedChunks`, `CompositeChunks`, `BitArray`). Not a pointer, not a trait, just a struct with typed methods. When you see `state: TreeViewState`, you know what state it carries and what operations exist. The compiler gives real errors at the call site, not inside a generic utility that reconstructs the interface from field names.
- **`StaticBitSet`** for `ContainerTreeView.changed`: the field count is comptime-known, so the dirty-tracking hashmap (`AutoArrayHashMapUnmanaged(usize, void)`) becomes a zero-allocation bitset. No hashmap init, no hashmap deinit, no allocator needed for dirty tracking.

`ContainerTreeView` does **not** use `TreeViewState`: its comptime tuple + fixed arrays are a genuinely different storage shape, and the bitset is the right dirty tracker for it. The split is now explicit in the type system rather than implicit in "which types happen to have the right field names."

### What changed

| Before | After |
|---|---|
| `ChildNodes.getChildNode(self, gindex)` (anytype dispatch) | `self.state.getChildNode(gindex)` (concrete method on `TreeViewState`) |
| `ChildNodes.Change.commit(self)` (anytype dispatch) | `self.state.commitNodes()` (concrete method) |
| `CompositeChunks.commit()` duplicated sort/setNodesGrouped/ref/unref | Flushes child views into `children_nodes`, delegates to `commitNodes()` |
| `getLength`/`setLength` on generic `TreeViewState` | Inlined into `BasicPackedChunks`/`CompositeChunks` (list-specific) |
| `changed: AutoArrayHashMapUnmanaged(usize, void)` in ContainerTreeView | `changed: StaticBitSet(N)` (zero allocation) |

## Test plan

- [x] `zig build test:ssz`: all 194 tests pass
- [ ] `zig build test:spec_tests -Dpreset=minimal`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
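
As a rough illustration of the bitset-based dirty tracking described in this summary, the sketch below uses `std.StaticBitSet` with a comptime-known field count. `DirtyTracker`, `field_count`, and the method names are hypothetical and only show the idea; they are not the PR's code.

```zig
const std = @import("std");

/// Hypothetical comptime-known number of container fields.
const field_count = 5;

/// Hypothetical dirty tracker: one bit per field, no allocator involved.
const DirtyTracker = struct {
    changed: std.StaticBitSet(field_count) = std.StaticBitSet(field_count).initEmpty(),

    fn markChanged(self: *DirtyTracker, index: usize) void {
        self.changed.set(index);
    }

    fn isChanged(self: *const DirtyTracker, index: usize) bool {
        return self.changed.isSet(index);
    }

    /// Clears all dirty bits and returns how many fields were dirty.
    fn commit(self: *DirtyTracker) usize {
        const dirty = self.changed.count();
        self.changed = std.StaticBitSet(field_count).initEmpty();
        return dirty;
    }
};

test "bitset dirty tracking needs no allocation" {
    var tracker = DirtyTracker{};
    tracker.markChanged(1);
    tracker.markChanged(3);
    tracker.markChanged(3); // marking twice is idempotent

    try std.testing.expect(tracker.isChanged(1));
    try std.testing.expect(!tracker.isChanged(0));
    try std.testing.expectEqual(@as(usize, 2), tracker.commit());
    try std.testing.expectEqual(@as(usize, 0), tracker.changed.count());
}
```

Because the set lives inline in the struct, there is nothing to initialize with an allocator and nothing to free in `deinit()`.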
For the non-container view types that still use TreeViewState, the allocation pattern has been inverted: previously TreeViewData was heap-allocated (pointer) while the view itself was a stack value wrapping it; now TreeViewState is embedded as a value inside the view, and the view itself is heap-allocated (pointer), along with type-specific fields like _len, children_data, etc. This makes it harder to pool allocations with a mempool: the old *TreeViewData was a uniform-sized struct shared across all view types, making it a natural candidate for a fixed-size pool. Now each view type (ListBasicTreeView, BitVectorTreeView, ...) has a different size, requiring per-type pools or arenas instead. Is it necessary to perform this paradigm shift to achieve the related performance optimizations?
Yeah I think those are all accurate observations. And I suppose tradeoffs to have ssz-type-specific view optimizations and a coherent data model?
Having the comptime tuple-based ContainerTreeView (unique size based on the ssz type) necessarily breaks the reusably sized state paradigm, and lends itself towards converting views (at least ContainerTreeView) to be pointer types. And mixing pointer and value types imo is not feasible.
Yes, the comptime tuple-based ContainerTreeView improves performance hugely; if it needs a paradigm change, we should go with it.
Motivation
The previous `BaseTreeView` design had several structural problems:
- […] (`{.base_view=...}`), risking double-free or dangling pointers
- […] `length`, cached child arrays) could not be stored in child views without hacks
- […] `ContainerNodeStruct` (and similar) hard to implement
- `commit()` could not propagate through modified child TreeViews because children were copies, not references

Part of #78.
Changes
- Drop `BaseTreeView`: each TreeView is now a self-contained struct. Reusable logic (e.g. `getChildNode`, `setChildNode`, `getLength`, `setLength`) is shared via standalone utilities, not inheritance.
- `ContainerTreeView` now uses a comptime tuple instead of a runtime `Map`. Each field slot holds either a reference to a child TreeView or a native basic-type value. This gives type-level access to the child type at a given index, enabling a future `ContainerNodeStruct` TreeView.
- `ArrayBasicTreeView` / `ArrayCompositeTreeView` now track:
  - `Map` of dirty nodes (committed to cache only on `commit()`)
  - `length` field updated lazily
  - `Map` of child TreeView references (so parent `commit()` propagates correctly)
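
The comptime tuple storage described above can be sketched as follows. The slot layout, `ChildView`, and `Slots` are hypothetical; the sketch assumes two basic fields and one composite field, and only shows that each index carries its own slot type (a native value or a pointer to a child view), which a runtime map cannot express.

```zig
const std = @import("std");

/// Hypothetical child view for a composite field.
const ChildView = struct { value: u64 };

/// One slot per container field, each with its own type: basic fields hold
/// an optional native value, composite fields an optional child view pointer.
const Slots = std.meta.Tuple(&.{ ?u64, ?bool, ?*ChildView });

test "tuple slots are typed per index" {
    var child = ChildView{ .value = 7 };
    var slots: Slots = .{ null, null, null };

    slots[0] = 42; // basic field stored as a native value
    slots[2] = &child; // composite field stored as a reference to its view

    // The slot type is known at compile time for every index.
    try std.testing.expect(@TypeOf(slots[0]) == ?u64);
    try std.testing.expect(@TypeOf(slots[2]) == ?*ChildView);

    try std.testing.expectEqual(@as(?u64, 42), slots[0]);
    try std.testing.expect(slots[1] == null);
    try std.testing.expectEqual(@as(u64, 7), slots[2].?.value);
}
```

Since the child is held by pointer, a change made through `slots[2]` is visible on the original child view, which is what lets a parent `commit()` see modifications made through child references.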