refactor(ssz): drop BaseTreeView, use tuple for ContainerTreeView#139

Merged: wemeetagain merged 11 commits into main from te/refactor_treeview_2 on Mar 17, 2026
twoeths (Collaborator) commented Dec 22, 2025

Motivation

The previous BaseTreeView design had several structural problems:

  • Child TreeViews were returned as shallow copies ({.base_view=...}), risking double-free or dangling pointers
  • Extra per-type fields (e.g. length, cached child arrays) could not be stored in child views without hacks
  • All TreeView types were tightly coupled, making ContainerNodeStruct (and similar) hard to implement
  • Parent commit() could not propagate through modified child TreeViews because children were copies, not references

Part of #78.

Changes

Drop BaseTreeView — each TreeView is now a self-contained struct. Reusable logic (e.g. getChildNode, setChildNode, getLength, setLength) is shared via standalone utilities, not inheritance.

ContainerTreeView now uses a comptime tuple instead of a runtime Map. Each field slot holds either a reference to a child TreeView or a native basic-type value. This gives type-level access to the child type at a given index, enabling future ContainerNodeStructTreeView.

ArrayBasicTreeView / ArrayCompositeTreeView now track:

  • A Map of dirty nodes (committed to cache only on commit())
  • A length field updated lazily
  • For composite: a Map of child TreeView references (so parent commit() propagates correctly)
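
To make the tuple idea concrete, here is a minimal sketch, not the PR's actual code. `InnerView`, `ToyContainerView`, `Slot`, and the `changed` bool array are hypothetical names invented for illustration; the real ContainerTreeView also holds a pool, a root node, and allocator-backed state. The point is only that a comptime field index selects a slot whose type is known to the compiler.

```zig
const std = @import("std");

// Hypothetical child view standing in for a composite field's TreeView.
const InnerView = struct {
    value: u32,
    dirty: bool = false,

    fn commit(self: *InnerView) void {
        self.dirty = false;
    }
};

// Toy container with one basic field (u64) and one composite field.
const ToyContainerView = struct {
    const field_count = 2;
    // One slot per field; null means "not loaded yet".
    const ChildData = struct { ?u64, ?*InnerView };

    child_data: ChildData = .{ null, null },
    changed: [field_count]bool = .{ false, false },

    // The slot type is resolved at compile time from the field index.
    fn Slot(comptime i: usize) type {
        return std.meta.fields(ChildData)[i].type;
    }

    fn get(self: *const ToyContainerView, comptime i: usize) Slot(i) {
        return self.child_data[i];
    }

    fn set(self: *ToyContainerView, comptime i: usize, value: Slot(i)) void {
        self.child_data[i] = value;
        self.changed[i] = true;
    }

    // Pointer slots forward commit() to the referenced child view.
    fn commit(self: *ToyContainerView) void {
        inline for (0..field_count) |i| {
            if (self.child_data[i]) |child| {
                if (comptime @TypeOf(child) == *InnerView) child.commit();
            }
            self.changed[i] = false;
        }
    }
};
```

Because the composite slot stores a pointer rather than a copy, the parent's `commit()` reaches the same child object the caller modified, which is the propagation problem listed under Motivation.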

gemini-code-assist (Contributor)

Summary of Changes

Hello @twoeths, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the ContainerTreeView implementation to align with the ssz-ts design principles. The primary goal is to enhance the flexibility and extensibility of TreeView implementations by removing the tight coupling with BaseTreeView. This is achieved by directly managing child TreeView instances or basic type nodes within the container, introducing a change tracking mechanism for efficient updates, and updating the get and set operations accordingly. The changes facilitate future enhancements and better support for complex SSZ types.

Highlights

  • Decoupling from BaseTreeView: The ContainerTreeView implementation has been refactored to remove its tight coupling with BaseTreeView, allowing for more specialized and flexible data structure management.
  • Direct Child Management: Child TreeView instances or basic type Node.Ids are now stored directly within the ContainerTreeView using a dynamically generated tuple (child_data), improving control and performance over child elements.
  • Optimized Commit Mechanism: A new changed hash map tracks modifications to child fields, enabling commit operations to efficiently update only the affected parts of the Merkle tree, rather than recomputing the entire structure.
  • Alignment with ssz-ts: This refactoring brings the Zig implementation of ContainerTreeView closer to the design patterns found in the ssz-ts library, addressing previous limitations and enhancing compatibility.

gemini-code-assist (Bot) left a comment

Code Review

This pull request is a significant refactoring of the ContainerTreeView implementation to align it with the ssz-ts version, decoupling it from BaseTreeView. The changes are extensive and introduce a more self-contained and feature-rich ContainerTreeView.

I've found a couple of critical issues related to memory safety and correctness in the new implementation that need to be addressed. One is a bug in the commit function that could lead to reading uninitialized memory, and the other is an unsafe shallow copy in the set function for composite types, which could cause memory corruption. I've also included a medium-severity suggestion regarding adherence to the project's style guide for struct initialization, which could improve performance and reduce stack usage.

@twoeths twoeths marked this pull request as ready for review December 24, 2025 08:22
@twoeths twoeths requested a review from a team as a code owner January 4, 2026 08:12
Comment thread src/ssz/tree_view/container.zig Outdated
.pool = self.pool,
.child_data = .{null} ** ST.chunk_count,
.original_nodes = .{null} ** ST.chunk_count,
.root = self.root,
Collaborator

if we copy self.root without ref, will we run into double unrefs if the original self.root was unrefed?

Collaborator Author

good catch, resolved it via calling init() instead 27d8022
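
A minimal sketch of why copying `root` without taking a ref is hazardous, and why routing the copy through `init()` avoids it. `ToyPool`, `ToyView`, and their signatures are assumptions made up for this example; the real pool and `init()` in the PR will differ.

```zig
const std = @import("std");

// Hypothetical ref-counted node pool with four node slots.
const ToyPool = struct {
    ref_counts: [4]u32 = .{ 0, 0, 0, 0 },

    fn ref(self: *ToyPool, id: usize) void {
        self.ref_counts[id] += 1;
    }

    fn unref(self: *ToyPool, id: usize) void {
        // A double unref would trip this assertion.
        std.debug.assert(self.ref_counts[id] > 0);
        self.ref_counts[id] -= 1;
    }
};

const ToyView = struct {
    pool: *ToyPool,
    root: usize,

    // init() takes its own ref on the root node.
    fn init(pool: *ToyPool, root: usize) ToyView {
        pool.ref(root);
        return .{ .pool = pool, .root = root };
    }

    // Going through init() gives the clone its own ref, so each
    // deinit() releases exactly one ref.
    fn clone(self: *const ToyView) ToyView {
        return ToyView.init(self.pool, self.root);
    }

    fn deinit(self: *ToyView) void {
        self.pool.unref(self.root);
    }
};
```

Had `clone()` copied `.root = self.root` directly, two views would share one ref and the second `deinit()` would unref a node whose count is already zero.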

Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the TreeView implementations to better align with the TypeScript ssz-ts library, removing the tightly coupled BaseTreeView abstraction and giving each TreeView type more freedom in its implementation.

Key Changes:

  • Removed BaseTreeView and TreeViewData abstractions in favor of isolated TreeView implementations
  • Changed TreeView.init() to return pointers (*Self) instead of values (Self) for better memory management
  • Implemented ContainerTreeView using tuples to store child data/views instead of hash maps
  • Added utility modules for shared functionality (clone_opts.zig, child_nodes.zig, assert.zig)
  • Refactored list and array views to track length internally and only update tree on commit()

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| test/int/ssz/tree_view/list_composite.zig | Updated to use pointer-based TreeView API, changed getRoot() calls, and fixed deref patterns |
| test/int/ssz/tree_view/list_basic.zig | Updated getAll() to not require allocator parameter, changed to pointer-based API |
| test/int/ssz/tree_view/container.zig | Updated Field() to return pointer types for composite fields, adjusted getRoot() usage |
| test/int/ssz/tree_view/bit_vector.zig | Changed from base_view.data to direct data field access |
| test/int/ssz/tree_view/bit_list.zig | Changed from base_view.data to direct data field access |
| test/int/ssz/tree_view/array_composite.zig | Updated to pointer-based API and removed unnecessary deinit calls for borrowed references |
| test/int/ssz/tree_view/array_basic.zig | Updated getAll() signature and pointer-based API usage |
| src/ssz/tree_view/utils/clone_opts.zig | New utility file defining CloneOpts struct for clone operations |
| src/ssz/tree_view/utils/child_nodes.zig | New utility file with shared child node management functions |
| src/ssz/tree_view/utils/assert.zig | New utility file for compile-time TreeView type assertions |
| src/ssz/tree_view/root.zig | Removed BaseTreeView and TreeViewData exports |
| src/ssz/tree_view/list_composite.zig | Refactored to store chunks directly and track length internally |
| src/ssz/tree_view/list_basic.zig | Refactored to store chunks directly and track length internally |
| src/ssz/tree_view/container.zig | Complete rewrite using tuple-based child storage with test added |
| src/ssz/tree_view/chunks.zig | Refactored BasicPackedChunks and CompositeChunks to be self-contained |
| src/ssz/tree_view/bit_vector.zig | Refactored to store BitArray data directly |
| src/ssz/tree_view/bit_list.zig | Refactored to store BitArray data directly |
| src/ssz/tree_view/bit_array.zig | Refactored BitArray to be self-contained with own fields |
| src/ssz/tree_view/base.zig | Deleted file - BaseTreeView and TreeViewData removed |
| src/ssz/tree_view/array_composite.zig | Refactored to store chunks directly |
| src/ssz/tree_view/array_basic.zig | Refactored to store chunks directly |
| src/ssz/root.zig | Removed BaseTreeView and TreeViewData exports |

spiral-ladder (Collaborator) left a comment

Partial review with @wemeetagain on a recorded call

Comment thread src/ssz/tree_view/list_basic.zig Outdated
Comment on lines +195 to +197
    fn getLength(self: *Self) !usize {
        return self.chunks.getLength();
    }

Collaborator

Suggested change
    - fn getLength(self: *Self) !usize {
    -     return self.chunks.getLength();
    - }

Consider removing this entirely since we only call this during init(). Prefer self.chunks.getLength() directly.

Collaborator Author

removed in ee71eb2

Comment thread src/ssz/tree_view/list_basic.zig Outdated
Comment on lines +204 to +206
    if (self._len > ST.limit) {
        return error.LengthOverLimit;
    }

Collaborator

Suggested change
    - if (self._len > ST.limit) {
    + if (self._len >= ST.limit) {
          return error.LengthOverLimit;
      }

Seems like a redundant check since every update to self._len should include this check by correctness (which we do already check when we update it above) + this should be inclusive of ST.limit. If we want to keep this, we can turn this into an assertion.

Collaborator Author

addressed in cd9e0ff
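
The boundary under discussion can be sketched as follows. This is an illustration only, assuming a list may hold exactly `limit` elements (as SSZ lists do); `ToyList` and its methods are hypothetical and do not reflect what cd9e0ff actually changed. Under that assumption, a check made before incrementing uses `>=`, while an after-the-fact invariant is `len <= limit` and can be an assertion.

```zig
const std = @import("std");

const ToyList = struct {
    // Hypothetical limit; the real value comes from the SSZ type (ST.limit).
    const limit: usize = 4;

    len: usize = 0,

    // Check before incrementing: a full list (len == limit) rejects the push.
    fn push(self: *ToyList) !void {
        if (self.len >= limit) return error.LengthOverLimit;
        self.len += 1;
    }

    // At commit time the invariant already holds, so it is only asserted.
    fn commit(self: *const ToyList) void {
        std.debug.assert(self.len <= limit);
    }
};
```

With every update guarded this way, the commit-time check can never fire in a correct program, which is the redundancy the review comment points at.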

    if (existing) |child_value| {
        return child_value;
    } else {
        const node = try self.root.getNodeAtDepth(self.pool, ST.chunk_depth, field_index);

Collaborator

Suggested change
      const node = try self.root.getNodeAtDepth(self.pool, ST.chunk_depth, field_index);
    + errdefer self.pool.unref(node);

missing errdefer here + other similar areas

Collaborator Author

I don't see why we have to errdefer unref() here: the get() API simply returns a borrowed reference to the child, and getNodeAtDepth() does not create a new ref at all
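
The distinction between owned and borrowed references can be sketched like this. `ToyPool`, `mayFail`, `acquireOwned`, and `peekBorrowed` are hypothetical; whether the real getNodeAtDepth() takes a ref is as stated by the author above, not something this sketch verifies.

```zig
const std = @import("std");

// Hypothetical pool tracking a single ref count.
const ToyPool = struct {
    refs: u32 = 0,

    fn ref(self: *ToyPool) void {
        self.refs += 1;
    }

    fn unref(self: *ToyPool) void {
        self.refs -= 1;
    }
};

fn mayFail(fail: bool) !void {
    if (fail) return error.Boom;
}

// Owned: this function takes a ref, so a later failure must release it.
// On success the caller owns the ref.
fn acquireOwned(pool: *ToyPool, fail: bool) !void {
    pool.ref();
    errdefer pool.unref();
    try mayFail(fail);
}

// Borrowed: no ref is taken, so there is nothing to release on error.
fn peekBorrowed(pool: *const ToyPool, fail: bool) !u32 {
    try mayFail(fail);
    return pool.refs;
}
```

An `errdefer unref` on a borrowed node would release a ref the function never took, which is itself a ref-count bug.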

Comment thread src/ssz/tree_view/container.zig Outdated
Comment on lines +156 to +159
    try child_view.commit();
    const child_changed = if (self.original_nodes[i]) |orig_node| blk: {
        break :blk orig_node != child_view.getRoot();
    } else true;

Collaborator

Suggested change
    - try child_view.commit();
    - const child_changed = if (self.original_nodes[i]) |orig_node| blk: {
    -     break :blk orig_node != child_view.getRoot();
    - } else true;
    + const child_changed = try child_view.commit();

Could we perhaps return a bool here instead of using original_nodes to track if the child changed?

Collaborator Author

sounds good, it involves changes across all TreeViews and could avoid having to store child_nodes
will track as a separate issue
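
A rough sketch of what the suggested bool-returning commit() could look like. `ToyChild`, `ToyParent`, `pending`, and `rehash_count` are invented for illustration, and error handling is omitted (the suggestion above uses `try`, so the real signature would presumably return an error union).

```zig
const std = @import("std");

const ToyChild = struct {
    root: u64,
    pending: ?u64 = null,

    // Returns true only when the committed root differs from the old one.
    fn commit(self: *ToyChild) bool {
        const next = self.pending orelse return false;
        self.pending = null;
        const changed = next != self.root;
        self.root = next;
        return changed;
    }
};

const ToyParent = struct {
    child: *ToyChild,
    rehash_count: u32 = 0,

    // The parent needs no copy of the child's original node: the child
    // reports whether its root moved.
    fn commit(self: *ToyParent) bool {
        const child_changed = self.child.commit();
        if (child_changed) self.rehash_count += 1;
        return child_changed;
    }
};
```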

Comment on lines +122 to +127
            self.child_data[i] = null;
        }
    }
    inline for (0..ST.chunk_count) |i| {
        // these nodes are unref by root
        self.original_nodes[i] = null;

Collaborator

Suggested change
    -         self.child_data[i] = null;
          }
      }
      inline for (0..ST.chunk_count) |i| {
          // these nodes are unref by root
    -     self.original_nodes[i] = null;

Do we need to set these to null if we're only calling this at deinit? Plus, could we perhaps just inline this function since deinit() is the only place we use this?

Collaborator Author

> Do we need to set these to null if we're only calling this at deinit?

yes, deinit is for cleaning up everything, plus we don't want to keep dangling pointers there once the child TreeViews are also deinited

> Plus, could we perhaps just inline this function since deinit() is the only place we use this?

that's part of the steps when you look at the different TreeView structs, so I prefer to leave it there
I made them private to make them easier to reason about, see 28c0ca6

Collaborator

perhaps rename this to vector_basic.zig? Same comment with array_composite.zig

Collaborator Author

not part of this work, happy to do it in a separate PR

@twoeths twoeths marked this pull request as draft February 17, 2026 08:07
@twoeths twoeths force-pushed the te/refactor_treeview_2 branch from aa1a82e to 7caaa2a Compare March 3, 2026 11:42
@twoeths twoeths marked this pull request as ready for review March 10, 2026 09:55
twoeths (Collaborator Author) commented Mar 10, 2026

the key thing about this PR is that it provides a way to implement different types of TreeView to improve lodestar, see PR-232

@twoeths twoeths changed the title from "refactor: align TreeView implementations to ssz-ts" to "refactor(ssz): drop BaseTreeView, use tuple for ContainerTreeView" on Mar 10, 2026
This was referenced Mar 16, 2026
## Summary

The tree view refactor in #139 correctly killed the heap-allocated
`BaseTreeView` indirection, but it overcorrected — it scattered five
fields (`allocator`, `pool`, `root`, `children_nodes`, `changed`) into
every view type and recovered sharing through `ChildNodes`, a set of
free functions that take `anytype` and duck-type their way to the right
field names.

This PR fixes the abstraction:

- **`TreeViewState`** — a concrete struct embedded by value in each
chunk-based view (`BasicPackedChunks`, `CompositeChunks`, `BitArray`).
Not a pointer, not a trait — just a struct with typed methods. When you
see `state: TreeViewState`, you know what state it carries and what
operations exist. The compiler gives real errors at the call site, not
inside a generic utility that reconstructs the interface from field
names.

- **`StaticBitSet`** for `ContainerTreeView.changed` — the field count
is comptime-known, so the dirty-tracking hashmap
(`AutoArrayHashMapUnmanaged(usize, void)`) becomes a zero-allocation
bitset. No hashmap init, no hashmap deinit, no allocator needed for
dirty tracking.

`ContainerTreeView` does **not** use `TreeViewState` — its comptime
tuple + fixed arrays are a genuinely different storage shape, and the
bitset is the right dirty tracker for it. The split is now explicit in
the type system rather than implicit in "which types happen to have the
right field names."
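
As an illustration of the bitset approach, here is a minimal sketch using `std.StaticBitSet`. The field count of 5 and the `dirtyIndices` helper are assumptions for the example, not code from either PR.

```zig
const std = @import("std");

// Hypothetical container with five fields; the count is comptime-known.
const field_count = 5;
const Changed = std.StaticBitSet(field_count);

// Collects the indices of dirty fields in ascending order and returns
// how many were written to `out`. No allocator is involved.
fn dirtyIndices(changed: Changed, out: *[field_count]usize) usize {
    var n: usize = 0;
    var it = changed.iterator(.{});
    while (it.next()) |i| {
        out[n] = i;
        n += 1;
    }
    return n;
}
```

Marking a field dirty is `changed.set(i)`, and clearing everything after a commit is a plain reassignment to `Changed.initEmpty()`. Iteration yields indices in ascending order, so no separate sort step is needed here.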

### What changed

| Before | After |
|---|---|
| `ChildNodes.getChildNode(self, gindex)` (anytype dispatch) | `self.state.getChildNode(gindex)` (concrete method on `TreeViewState`) |
| `ChildNodes.Change.commit(self)` (anytype dispatch) | `self.state.commitNodes()` (concrete method) |
| `CompositeChunks.commit()` duplicated sort/setNodesGrouped/ref/unref | Flushes child views into `children_nodes`, delegates to `commitNodes()` |
| `getLength`/`setLength` on generic `TreeViewState` | Inlined into `BasicPackedChunks`/`CompositeChunks` (list-specific) |
| `changed: AutoArrayHashMapUnmanaged(usize, void)` in ContainerTreeView | `changed: StaticBitSet(N)` (zero allocation) |
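
The difference between the two dispatch styles in the table can be sketched as below. All names (`getRootDuck`, `ToyState`, `ToyListView`, `ToyBitView`) are hypothetical stand-ins, and the real `TreeViewState` carries more fields than shown.

```zig
const std = @import("std");

// Duck-typed helper: compiles for any type that happens to have a
// `root` field, and fails inside this function body otherwise.
fn getRootDuck(view: anytype) u64 {
    return view.root;
}

// Concrete shared state with typed methods.
const ToyState = struct {
    root: u64,
    dirty_count: u32 = 0,

    fn setRoot(self: *ToyState, new_root: u64) void {
        self.root = new_root;
        self.dirty_count += 1;
    }
};

// Each view embeds the state by value and adds its own fields.
const ToyListView = struct {
    state: ToyState,
    len: usize,
};

const ToyBitView = struct {
    state: ToyState,
    bit_len: usize,
};
```

With the embedded struct, a call such as `view.state.setRoot(...)` is checked against `ToyState`'s declared methods at the call site, whereas `getRootDuck` accepts anything and reports a missing field from inside the helper.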

## Test plan

- [x] `zig build test:ssz` — all 194 tests pass
- [ ] `zig build test:spec_tests -Dpreset=minimal`


🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GrapeBaBa (Contributor) commented Mar 17, 2026

For the non-container view types that still use TreeViewState, the allocation pattern has been inverted: previously TreeViewData was heap-allocated (pointer) while the view itself was a stack value wrapping it; now TreeViewState is embedded as a value inside the view, and the view itself is heap-allocated (pointer), along with type-specific fields like _len, children_data, etc.

This makes it harder to pool allocations with a mempool — the old *TreeViewData was a uniform-sized struct shared across all view types, making it a natural candidate for a fixed-size pool. Now each view type (ListBasicTreeView, BitVectorTreeView, ...) has a different size, requiring per-type pools or arenas instead.

Is it necessary to perform this paradigm shift to achieve the related performance optimizations?
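
The pooling concern can be illustrated with a small sketch. The struct layouts and the `FixedPool` helper are invented for this example and are not the project's types; the sketch only shows that once views differ in size, a fixed-size pool has to be instantiated per view type.

```zig
const std = @import("std");

// Hypothetical view layouts: same leading fields, different extras.
const NewListView = struct { root: u64, pool: usize, len: usize };
const NewContainerView = struct { root: u64, pool: usize, child_data: [8]?u64 };

comptime {
    // Different sizes, so one fixed-size slot cannot serve both types.
    std.debug.assert(@sizeOf(NewListView) != @sizeOf(NewContainerView));
}

// A toy fixed-capacity pool, instantiated once per view type.
fn FixedPool(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();

        slots: [capacity]T = undefined,
        used: [capacity]bool = [_]bool{false} ** capacity,

        // Returns a free slot, or null when the pool is exhausted.
        fn create(self: *Self) ?*T {
            for (&self.slots, &self.used) |*slot, *in_use| {
                if (!in_use.*) {
                    in_use.* = true;
                    return slot;
                }
            }
            return null;
        }

        // Marks the slot holding `ptr` as free again.
        fn destroy(self: *Self, ptr: *T) void {
            for (&self.slots, &self.used) |*slot, *in_use| {
                if (slot == ptr) in_use.* = false;
            }
        }
    };
}
```

`FixedPool(NewListView, 2)` and `FixedPool(NewContainerView, 2)` are distinct types with distinct slot sizes, which is the per-type pooling cost described above.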

wemeetagain (Member)

> transformed from ... treeviewstate pointer ... to the view itself as a pointer
> allocated memory may be more than before
> it is easier to pool the treeviewstate pointers using a mempool before

Yeah I think those are all accurate observations. And I suppose they are tradeoffs for having ssz-type-specific view optimizations and a coherent data model?

> Now each view type has a different size
> Is it necessary to perform this paradigm shift to achieve the related performance optimizations?

Having the comptime tuple-based ContainerTreeView (unique size based on the ssz type) necessarily breaks the reusably sized state paradigm, and lends itself towards converting views (at least ContainerTreeView) to be pointer types. And mixing pointer and value types imo is not feasible.

GrapeBaBa (Contributor)

> > transformed from ... treeviewstate pointer ... to the view itself as a pointer
> > allocated memory may be more than before
> > it is easier to pool the treeviewstate pointers using a mempool before
>
> Yeah I think those are all accurate observations. And I suppose tradeoffs to have ssz-type-specific view optimizations and a coherent data model?
>
> > Now each view type has a different size
> > Is it necessary to perform this paradigm shift to achieve the related performance optimizations?
>
> Having the comptime tuple-based ContainerTreeView (unique size based on the ssz type) necessarily breaks the reusably sized state paradigm, and lends itself towards converting views (at least ContainerTreeView) to be pointer types. And mixing pointer and value types imo is not feasible.

Yes, the comptime tuple-based ContainerTreeView gives a huge performance improvement; if it needs a paradigm change, we should go with it.

@wemeetagain wemeetagain merged commit ce6370e into main Mar 17, 2026
12 checks passed
@wemeetagain wemeetagain deleted the te/refactor_treeview_2 branch March 17, 2026 16:47