Skip to content

tablets: store TabletInfo and TabletEntry as values#834

Open
dkropachev wants to merge 3 commits intoscylladb:masterfrom
dkropachev:dk/tablet-info-list-values
Open

tablets: store TabletInfo and TabletEntry as values#834
dkropachev wants to merge 3 commits intoscylladb:masterfrom
dkropachev:dk/tablet-info-list-values

Conversation

@dkropachev
Copy link
Copy Markdown
Collaborator

@dkropachev dkropachev commented Apr 9, 2026

Summary

  • replace *TabletInfo with TabletInfo across the tablets code path
  • replace *TabletEntry with TabletEntry across the tablets code path
  • update the queue, builders, metadata integration, tests, and benchmarks to use value semantics consistently
  • add a GC-focused benchmark for measuring retained heap objects, retained heap bytes, and GC pause on a large long-lived CowTabletList

Verification

  • go test ./...
  • go test -tags unit ./tablets

GC benchmark

Compared 0c623bd (before) vs 5d43487 (after) with the included benchmark:

go test -tags unit ./tablets -run '^$' -bench '^BenchmarkCowTabletListGCShape$' -benchmem -benchtime=32x -count 5

Results:

  • heap_objects: 200.6k -> 100.6k (-49.88%, significant)
  • heap_bytes: 14.09M -> 12.44M (-11.74%, significant)
  • gc_pause_ns: 164.0k -> 141.0k (directionally better)
  • sec/op: 294.7µs -> 235.3µs (directionally better)
  • gc_cycles: unchanged (6 -> 6)

The GC benchmark is included in this PR as tablets/gcshape_bench_test.go.

@dkropachev dkropachev changed the title tablets: store TabletInfoList as values tablets: store TabletInfo and TabletEntry as values Apr 9, 2026
@dkropachev dkropachev marked this pull request as draft April 9, 2026 15:27
@dkropachev dkropachev force-pushed the dk/tablet-info-list-values branch from 9a74854 to fd3677b Compare April 9, 2026 17:38
@dkropachev dkropachev force-pushed the dk/tablet-info-list-values branch from 4677e79 to 9873b4b Compare April 9, 2026 19:44
@dkropachev dkropachev requested a review from nikagra April 9, 2026 19:45
@dkropachev dkropachev marked this pull request as ready for review April 9, 2026 19:45
@dkropachev dkropachev self-assigned this Apr 10, 2026
@dkropachev
Copy link
Copy Markdown
Collaborator Author

@sylwiaszunejko , @nikagra , please take a look

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the tablets routing metadata path to use value semantics (TabletInfo/TabletEntry stored as values rather than pointers) to reduce heap allocations/GC pressure, and adds a benchmark intended to measure retained-heap/GC characteristics for a large long-lived CowTabletList.

Changes:

  • Switched tablets core types and APIs from pointer-based (*TabletInfo, *TabletEntry) to value-based (TabletInfo, TabletEntry) across queueing, builders, lookups, and metadata integration.
  • Updated unit tests and benchmarks to the new (value, ok) lookup patterns and value-based slices.
  • Added a GC-shape benchmark for CowTabletList and updated benchmarks to the new APIs.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tablets/tablets.go Core refactor to value semantics across tablet info/entry types, queue ops, and lookup APIs.
tablets/tablets_test.go Updated findEntryForToken tests to new (entry, ok) return form.
tablets/cow_tablet_list_test.go Updated COW list tests for value-based add/find/bulk-add behavior and non-pointer mutation expectations.
tablets/tablet_utils.go Updated tablet generation helpers to return TabletInfoList values.
tablets/tablets_bench_test.go Added GC-shape benchmark and updated existing benchmarks to value-based APIs.
metadata_scylla.go Updated metadata describer tablet ingestion API to accept TabletInfo values.
conn.go Updated tablet hint unmarshalling to return TabletInfo values and propagate into metadata.
Comments suppressed due to low confidence (1)

tablets/tablets.go:409

  • TabletInfo.toEntry() assigns replicas: t.replicas without cloning. Now that tablets are passed around by value, callers may reasonably assume the driver takes an independent copy; mutating the original TabletInfo.replicas slice (or its elements) after calling AddTablet/BulkAddTablets could corrupt internal state and create data races. Consider either documenting the immutability requirement on the write APIs, or defensively cloning the replica slice at ingestion time (tradeoff: extra allocations).
// toEntry converts a TabletInfo to a TabletEntry.
func (t TabletInfo) toEntry() TabletEntry {
	return TabletEntry{
		replicas:   t.replicas,
		firstToken: t.firstToken,
		lastToken:  t.lastToken,
	}

Comment thread tablets/tablets.go
Comment on lines 342 to 350
// findEntryForToken performs a binary search within [l, r) to find the entry
// covering the given token. Returns nil if no such entry exists.
func (t TabletEntryList) findEntryForToken(token int64, l int, r int) *TabletEntry {
// covering the given token. Returns false if no such entry exists.
func (t TabletEntryList) findEntryForToken(token int64, l int, r int) (TabletEntry, bool) {
if l < 0 || r > len(t) || l > r {
return nil
return TabletEntry{}, false
}
if l == r {
return nil
return TabletEntry{}, false
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findEntryForToken claims to search within the half-open range [l, r), but after the binary search it can return t[l] even when l == originalR (e.g., when searching a subrange and the token is beyond all entries in that subrange). This can return an entry outside the requested bounds. Consider preserving the original r and returning (TabletEntry{}, false) when the computed index equals the original upper bound before accessing t[l].

Copilot uses AI. Check for mistakes.
Comment thread tablets/tablets.go
Comment on lines 196 to 206
@@ -202,25 +202,25 @@ type TabletEntry struct {
lastToken int64
}

type TabletEntryList []*TabletEntry
type TabletEntryList []TabletEntry

Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing TabletInfoList/TabletEntryList from slices of pointers to slices of values is a breaking change for any external code importing github.com/gocql/gocql/tablets (and for any gocql public APIs that expose these types). If this package is intended to have a stable public API, consider providing compatibility shims (e.g., new value-based types under new names and/or wrapper helpers) or clearly scoping this change to a major-version bump.

Copilot uses AI. Check for mistakes.
Comment thread tablets/tablets.go
Comment on lines +886 to +889
// FindTabletForToken locates the tablet covering the given token. Returns false if not found.
func (c *CowTabletList) FindTabletForToken(keyspace, table string, token int64) (TabletEntry, bool) {
if c == nil {
return nil
return TabletEntry{}, false
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FindTabletForToken now returns (TabletEntry, bool) instead of *TabletEntry, which is another breaking API change for external callers. If backwards compatibility is required, consider keeping the old signature as a deprecated wrapper (or adding a second method) to ease migration.

Copilot uses AI. Check for mistakes.
b.ResetTimer()
for i := 0; i < b.N; i++ {
for j := 0; j < lookupBatchSize; j++ {
token := int64((i*lookupBatchSize + j) % tabletsPerTable)
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In BenchmarkCowTabletListGCShape, token := int64((i*lookupBatchSize + j) % tabletsPerTable) performs the multiplication/modulo in int. On 32-bit platforms (or very large b.N), i*lookupBatchSize can overflow, skewing results and potentially causing negative modulo behavior. Consider doing the arithmetic in int64 (or using uint64) before the cast.

Suggested change
token := int64((i*lookupBatchSize + j) % tabletsPerTable)
token := (int64(i)*int64(lookupBatchSize) + int64(j)) % tokenRangeCount

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants