perf: optimize column metadata parsing and readTypeInfo() by mykaul · Pull Request #780 · scylladb/gocql

mykaul · 2026-03-16T18:08:37Z

Summary

Optimize column metadata parsing by reading keyspace/table once per result set (when flagGlobalTableSpec is set) instead of per-column, and use skipString() to skip redundant per-column keyspace/table reads in readColWithSpec()
Add a fast-path in readTypeInfo() for simple native types (the common case), avoiding the switch statement overhead for types that need no further parsing
Fix a bug in prepared batch metadata parsing where metadata was only parsed from the last statement instead of all statements

Extracted from PR #699 (rec_perf_improvements branch). Part 3 of 3 — the other two are:

PR perf: streamline readHeader() function #778 (streamline_readHeader)
PR perf: eliminate reflection overhead in RowData() and type instantiation #779 (rowdata_reflection_elimination)

Commits

frame: optimize column metadata parsing by reading keyspace/table once — When the global table spec flag is set in result metadata, read keyspace and table name once and share them across all columns instead of re-reading per column. Add skipString() helper. Reduces per-column allocations.
Optimize readTypeInfo() with fast-path for simple native types — For the most common type info kinds (varchar, int, bigint, blob, etc.), return a NativeType directly without falling through the full switch. This avoids unnecessary branching for the ~15 simple types that need no sub-type parsing.
Fix prepared batch metadata parsing — When preparing a batch, parse result metadata from all statements (not just the last one) so that column metadata is correctly captured for multi-table batches. Includes a unit test covering this case.

Benchmark Results

Benchmarks measured with a temporary metadata_bench_test.go exercising parseResultMetadata and parsePreparedMetadata with synthetic CQL frames.

`perColSpec` (per-column keyspace/table — biggest improvement)

Benchmark	Before (ns/op)	After (ns/op)	Before (B/op)	After (B/op)	Before (allocs)	After (allocs)
10cols perColSpec	~2,600	~1,000	1,424	1,136	41	23
50cols perColSpec	~7,300	~3,500	7,056	5,488	201	103

~44% fewer allocations and ~55-60% faster for per-column spec results.

The reduction in allocations comes from skipString() which avoids allocating strings that are immediately discarded (the per-column keyspace/table names when global spec is not set are still read but now skipped efficiently).

`globalSpec` (global table spec — modest improvement from readTypeInfo fast-path)

Benchmark	Before (ns/op)	After (ns/op)	Before (B/op)	After (B/op)
10cols globalSpec	~660	~720	1,136	1,136
50cols globalSpec	~3,100	~3,400	5,488	5,488
100cols globalSpec	~9,000	~5,700	10,560	10,560

For globalSpec, allocations are unchanged (already optimal). The 100-column case shows clear improvement from the readTypeInfo fast-path; smaller sizes are within noise.

PreparedMetadata

Benchmark	Before (ns/op)	After (ns/op)
ParsePreparedMetadata	~1,300	~1,270

Within noise — prepared metadata parsing was already efficient; the fix in commit 3 is a correctness change, not a performance one.

Testing

go test -tags unit -count=1 ./... — all tests pass
go vet ./... — clean
New test TestPrepareBatchMetadataMultipleKeyspaceTables covers the batch metadata fix

Notes

The globalSpec optimization in commit 1 assumes CQL protocol v3+ where the global table spec flag is standard. All supported protocol versions (v3, v4, v5) use this flag.

Copilot

Pull request overview

This PR optimizes result/prepare metadata parsing on the receive path in the GoCQL driver by reducing redundant string allocations and adding a fast-path in type parsing, while also adding a unit test for prepared batch metadata involving multiple keyspaces/tables.

Changes:

Optimize column metadata parsing by reusing keyspace/table where possible and adding skipString() to avoid unnecessary allocations.
Add a fast-path in readTypeInfo() for common “simple” native CQL types.
Add a unit test and test-server support for preparing a BEGIN BATCH ... APPLY BATCH statement with variables spanning multiple keyspaces/tables.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`frame.go`	Adds `readTypeInfo()` fast-path, introduces `skipString()`, and refactors column metadata parsing via `readColWithSpec()` plus metadata struct changes.
`conn_test.go`	Adds unit test for batch prepare metadata across multiple keyspaces/tables and updates test server `PREPARE` handling to support batch statements.

Copilot

Pull request overview

This PR improves metadata parsing performance and correctness by reducing redundant keyspace/table reads, adding a fast-path for common native types in readTypeInfo(), and fixing prepared batch metadata parsing to read all statements’ metadata.

Changes:

Cache keyspace/table at the result-metadata level and skip redundant per-column reads via skipString().
Add a fast-path in readTypeInfo() for simple scalar native types.
Fix prepared batch metadata parsing to capture column metadata across all batch statements, with a new unit test.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
frame.go	Adds type-info fast-path, optimizes metadata parsing by caching keyspace/table and skipping redundant per-column strings, and introduces `skipString()` plus prepared-metadata tracking for mixed table specs.
conn_test.go	Adds a unit test for prepared batch metadata across multiple keyspace/tables and extends the test server’s prepare handling to emit batch-metadata responses.

mykaul · 2026-03-17T17:54:36Z

Addressed review feedback:

Fixed (this push):

Grammar fix in skipString() panic message: "require" → "requires" (line 1554)
Strengthened the !globalSpec comment in parseResultMetadata() to explicitly explain why the single-table assumption is safe: CQL SELECT results always come from a single table. Also clarified that parsePreparedMetadata() intentionally does NOT use this optimization (it uses readPerColumnSpec for the result-set part, where prepared batch metadata may reference multiple tables).

Already fixed in prior revision:

Magic number 0x0015 was already replaced with uint16(TypeDuration) in a previous push.

Not changing (style preference):

The readColWithSpec high-arity signature suggestion: while a struct could reduce parameters, the current approach is straightforward and all call sites are within the same file. The function is internal and the parameters are well-documented. Refactoring to a struct would add indirection without clear benefit for 2 call sites.

Not a bug (Copilot false positive):

The repeated concern about !globalSpec mixed-table metadata in ROWS results: CQL protocol ROWS responses always contain columns from a single table (SELECT cannot join tables). The per-column encoding exists in the protocol spec but is never exercised for ROWS in practice. The parsePreparedMetadata() path correctly handles the multi-table case for prepared batches.

Copilot

Pull request overview

This PR improves receive-path efficiency and correctness in frame parsing for the GoCQL driver by reducing redundant allocations/reads in result metadata parsing, adding a fast-path for common type info parsing, and fixing prepared batch metadata handling (with a new unit test).

Changes:

Optimize column metadata parsing by reusing keyspace/table at the result-metadata level and skipping redundant per-column strings via skipString() and readColWithSpec().
Add a fast-path in readTypeInfo() for simple native scalar types to avoid extra branching/work.
Fix/cover prepared batch metadata parsing for multi-table batches via a new unit test + test-server behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`frame.go`	Adds `readTypeInfo()` fast-path, introduces `skipString()`, refactors column parsing into `readColWithSpec()`, and adjusts prepared/result metadata parsing behavior.
`conn_test.go`	Adds a unit test for multi-keyspace/table prepared batch metadata and updates the test server’s PREPARE handling to return appropriate synthetic metadata.

Copilot

Pull request overview

This PR improves receive-path performance in frame metadata parsing and fixes correctness for prepared batch metadata, especially when multiple keyspace/table pairs are involved.

Changes:

Optimize column metadata parsing by reusing keyspace/table across columns (and skipping redundant per-column strings without allocating) when possible.
Add a fast-path in readTypeInfo() for native types that require no additional parsing.
Fix prepared batch metadata handling to correctly retain per-column keyspace/table information across mixed-table batches, with new unit coverage.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
`frame.go`	Adds `skipString()`, reworks column spec parsing to avoid redundant allocations, adds `readTypeInfo()` fast-path, and improves prepared metadata handling for mixed batches.
`frame_test.go`	Adds a unit test validating per-column table spec parsing + `skipString()` alignment (buffer fully consumed).
`conn_test.go`	Adds a unit test ensuring prepared batch metadata preserves per-column keyspace/table for mixed-table batches; extends test server PREPARE handling to support it.

…hort-circuit tracker

Read keyspace/table once before the column loop instead of per-column, add skipString() to skip wire strings without allocation, and handle the !globalSpec path correctly for prepared batch metadata where columns may reference different tables. - Add keyspaceTableTracker for prepared batch per-column spec - Replace readCol() with readColWithSpec() supporting both paths - Remove duplicate keyspace/table fields from resultMetadata - Add TestPrepareBatchMetadataMultipleKeyspaceTables - Add TestParseResultMetadata_PerColumnSpec Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Add early return for native type IDs (0x0001 through TypeDuration), skipping the TypeCustom check and the collection/UDT/tuple switch. Also convert single-case type switch to a type assertion. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

mykaul · 2026-04-02T14:37:30Z

Error:
Failed to truncate test table:
exceptions::invalid_request_exception
(Another global topology request is ongoing, please retry.)
Verdict: Flaky test — unrelated to the PR changes.
Key reasons:

The error is a transient Scylla server-side issue (topology contention during parallel test execution), not a driver bug.
The PR only modifies frame parsing code (frame.go) and test server logic (conn_test.go/frame_test.go) — it doesn't touch serialization tests or truncation logic.
All other integration test jobs (Scylla LATEST, PRIOR, LTS-LATEST, Cassandra 4/5-LATEST) pass.
The same job passes on the latest master commit.

mykaul marked this pull request as draft March 16, 2026 18:09

mykaul requested a review from Copilot March 16, 2026 18:10

Copilot started reviewing on behalf of mykaul March 16, 2026 18:11 View session

Copilot AI reviewed Mar 16, 2026

View reviewed changes

Comment thread frame.go

Comment thread frame.go Outdated

mykaul requested a review from Copilot March 16, 2026 19:46

Copilot started reviewing on behalf of mykaul March 16, 2026 19:46 View session

Copilot AI reviewed Mar 16, 2026

View reviewed changes

Comment thread frame.go Outdated

Comment thread frame.go Outdated

Comment thread frame.go

Comment thread frame.go Outdated

mykaul force-pushed the optimize_column_metadata branch from 23c8dfd to 467d091 Compare March 17, 2026 17:54

mykaul requested a review from Copilot March 18, 2026 07:42

Copilot started reviewing on behalf of mykaul March 18, 2026 07:42 View session

Copilot AI reviewed Mar 18, 2026

View reviewed changes

Comment thread frame.go Outdated

Comment thread frame.go

mykaul force-pushed the optimize_column_metadata branch from 467d091 to 52764ca Compare March 24, 2026 19:00

mykaul force-pushed the optimize_column_metadata branch from 52764ca to 9fc5382 Compare April 1, 2026 19:23

mykaul requested a review from Copilot April 1, 2026 19:24

Copilot started reviewing on behalf of mykaul April 1, 2026 19:25 View session

Copilot AI reviewed Apr 1, 2026

View reviewed changes

Comment thread frame.go

Comment thread frame.go Outdated

mykaul added a commit to mykaul/gocql that referenced this pull request Apr 1, 2026

fix: address PR scylladb#780 review feedback — correct comments and s…

459700b

…hort-circuit tracker

mykaul added 2 commits April 1, 2026 23:05

mykaul force-pushed the optimize_column_metadata branch from 459700b to c94bbe1 Compare April 1, 2026 20:05

dkropachev approved these changes Apr 3, 2026

View reviewed changes

dkropachev marked this pull request as ready for review April 3, 2026 11:33

dkropachev approved these changes Apr 3, 2026

View reviewed changes

dkropachev merged commit 04c41b0 into scylladb:master Apr 3, 2026
15 of 16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimize column metadata parsing and readTypeInfo()#780

perf: optimize column metadata parsing and readTypeInfo()#780
dkropachev merged 2 commits intoscylladb:masterfrom
mykaul:optimize_column_metadata

mykaul commented Mar 16, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mykaul commented Mar 17, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

mykaul commented Apr 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

mykaul commented Mar 16, 2026

Summary

Commits

Benchmark Results

perColSpec (per-column keyspace/table — biggest improvement)

globalSpec (global table spec — modest improvement from readTypeInfo fast-path)

PreparedMetadata

Testing

Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mykaul commented Mar 17, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

mykaul commented Apr 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

`perColSpec` (per-column keyspace/table — biggest improvement)

`globalSpec` (global table spec — modest improvement from readTypeInfo fast-path)