Second-round review after PR #517 (merged 2026-04-21). Five parallel architect agents did a fresh pass across core parsing, foundation, public API, advanced features, and cross-cutting infra. This is a delta document: what landed cleanly, what slipped through, what's newly surfaced.
Previous review: docs/ARCHITECT_REVIEW_2026-04-16.md.
PR #517 delivered all 11 claimed fixes to a solid baseline. But:
- One class-C1 defect was missed: subquery leaks in
PutExpression(see §1). - The new infrastructure is unused:
pkg/sql/dialect.Capabilitieshas zero production callers; legacyParsedoesn't wrap errors in the new sentinels; README still shows the old Parse pattern. You built the engine but left the signage. - Structural debt grew:
ast.gois +136 lines,pool.go+500 lines,gosqlx.gois now a 694-line candidate for the next god-file. 6 of 9 pre-v2.0 items are still open. - CI health is operationally fragile: two follow-up commits (timeout bump, staticcheck fix) after #517 are symptoms of an un-sharded race suite.
Net assessment: PR #517 was additive, not transformative. The "strangler fig" pattern needs actual strangling to follow through.
Defect: pkg/sql/ast/pool.go lines 1344-1455 — every expression type that embeds a *SelectStatement subquery sets e.Subquery = nil without releasing the statement first.
Affected paths:
InExpression(line 1344-1358)SubqueryExpression(line 1360-1362)ExistsExpression(line 1404-1406)AnyExpression(line 1408-1415)AllExpression(line 1417-1424)ArrayConstructorExpression.Subquery(line 1446-1455)
Every WHERE id IN (SELECT ...), EXISTS (SELECT ...), col = ANY(SELECT ...) leaks the full inner SelectStatement including Columns, From, Where, Joins, Windows. Same defect class C1/C2 fixed, just at expression level. pool_leak_test.go doesn't cover these constructs — that's why round-1 tests didn't catch it.
Fix: call releaseStatement(e.Subquery) before e.Subquery = nil in all six sites. Add test: parse SELECT x FROM t WHERE id IN (SELECT y FROM u WHERE z = 1) 1000 times, assert stable heap.
Severity: CRITICAL — same production perf claim at stake.
| Item | Status | Notes |
|---|---|---|
| C1/C2 statement-level leaks | Solid | releaseTableReference helper factoring is right; no double-release risk |
| C3 metrics DoS | Solid | Lock-free atomic buckets; 4 allocs/op GetStats proven |
C5 linter ast.Walk migration |
Solid | Consistent pattern; L029 latent bug found & fixed in flight |
| C6 Children() coverage | Partial | Test has structural gap (see §3) |
| H6 errors immutability | Solid | All callers audited; return-value pattern universal |
| H7 structured parser errors | Solid | Zero fmt.Errorf remaining in parser |
| H8 config loader | Solid | Schema complete; walk-up safe; ** glob properly implemented |
| H9 LSP type switch | Solid | 15 statement kinds; default fallback correct |
| H10 LSP real ranges | Solid | Semicolon fallback only hits rare DDL-no-Pos case |
| H11 keyword conflicts | Solid | keywordsEquivalent is the right semantic gate; resolutions preserve runtime |
Eight of eleven fixes are production-quality. Three have material issues addressed below.
pkg/sql/ast/children_coverage_test.go:286-288 explicitly admits: "concrete-typed fields like *CommonTableExpr cannot accept a bare sentinel." But the AST references concrete pointer types throughout (*WithClause, *TopClause, *MergeWhenClause, *OnConflict, *ConnectByClause, *SampleClause). The test only exercises interface-typed fields. Plus childrenCoverageAllowlist (line 233) is declared but never consulted — it's misleading documentation.
Fix: (a) generate zero-value concrete mocks via reflection for each concrete pointer type, or (b) register a per-type fixture table. And either wire the allowlist into the test skip logic or delete it.
pkg/sql/ast/pool.go:541 — no case for statement types defined in dml.go (*Select, *Insert, *Update, *Delete — the legacy duplicates), nor for *PragmaStatement wrapper variants. If the parser ever stores these in AST.Statements, ReleaseAST silently drops without returning to pool. Low severity because they may never be pooled today, but the dispatch should mirror pool declarations.
pool.go:1257 — workQueue := make([]Expression, 0, 32) runs on every call. In the hot path, that's one heap alloc per PutExpression, called transitively 10-100× per parse. Pool the work queue itself via sync.Pool, or use a fixed-size stack array with spillover.
pkg/metrics/metrics.go:408 — retained "to serialize rare reset paths" but reset() uses atomic stores that already race-safe against concurrent increments. The mutex protects nothing. Delete.
Round-1 fix changed map keys from err.Error() strings to ErrorCode strings. Grafana/dashboard users bound to the old keys are silently broken. Needs a CHANGELOG note and ideally a deprecation window with dual-emission.
Round 1 introduced new infrastructure that hasn't been adopted:
Grep confirms: p.Capabilities() / p.DialectTyped() / p.IsSnowflake() etc. used nowhere in parser production code. Meanwhile p.dialect string is referenced 88 times (up from 72 at round 1) across 17 files. The old pattern is growing; the new one is read-only.
Parser now has three representations coexisting:
p.dialect string— actual storage, 88 readskeywords.DialectXxxstring constants — cast asstring(keywords.DialectOracle)in 60+ placesdialect.Dialecttyped — zero production usage
Recommendation: Stop doing big-bang migrations. Cache p.dialectTyped dialect.Dialect in the Parser struct populated by WithDialect. Migrate 3-5 pure-capability sites per release (QUALIFY, PREWHERE, ARRAY JOIN, ILIKE, BRACKET_QUOTING). Add a CI grep gate: new production code may not introduce p.dialect ==; must use p.Is<Dialect>() or p.Capabilities(). Without this gate, v2.0 arrives with more, not fewer, string comparisons.
gosqlx.ErrSyntax, ErrTokenize, ErrTimeout, ErrUnsupportedDialect work for ParseTree / ParseReader. Legacy Parse / ParseWithContext / ParseWithDialect still return fmt.Errorf("tokenization failed: %w", err) without the sentinel. So:
ast, err := gosqlx.Parse(sql)
if errors.Is(err, gosqlx.ErrSyntax) { ... } // NEVER MATCHESFix: 4-line retrofit per legacy function, purely additive to the error chain. Unifies the story.
README.md lines 63-90: canonical example is still the old Parse + type-assertion pattern. Zero mentions of ParseTree, Tree, WithDialect, ErrSyntax, FormatTree. Package doc.go lists legacy functions as "primary entry points" — Tree isn't mentioned at all.
This is the single largest DX leak remaining. Round-1's criticism ("new users reach for Parse and type-assert") is still architecturally true on the surface.
pkg/compatibility/api_stability_test.go covers only legacy surface (ast.Node, pools, token types). The Tree API is unprotected. If someone refactors ParseTree to drop ctx or change Option to a struct, compat suite won't flag it.
Round-1 added Tree; usability audit surfaces what's missing for real workflows:
- No typed walkers —
WalkSelects(func(*ast.SelectStatement) bool),WalkExpressions(...). Users still writeif stmt, ok := n.(*ast.SelectStatement); okdance. sqlparser-rs and vitess both ship typed visitors. - No
Rewrite(pre, post)— closes parity gap with vitess. - No
Tree.Clone()for copy-on-write experiments. - No
Tree.Subqueries() []*Tree/Tree.CTEs()— common SQL analysis need. Release()is a documented no-op — aspirational for future pooling, but creates a training hazard today.- Tree carries full source string — 1MB SQL doubles memory.
Fixing (1) and (2) takes the Tree from "viable" to "competitive."
pkg/gosqlx/reader.go:67 — io.ReadAll(r) unconditionally. A 100MB SQL dump allocates 200MB (ReadAll + string conversion). Real-world exposures: migration files, data dumps, HTTP POST bodies. Add WithMaxBytes(n) + ErrTooLarge sentinel.
Correctly handles: single quotes, double-quoted identifiers, line comments, block comments. Misses:
- PostgreSQL dollar-quoted strings (
$$...;...$$) — will split on inner;, producing garbage - MySQL/ClickHouse backtick identifiers containing
; - SQL Server bracketed identifiers containing
; - PostgreSQL E-string backslash escapes (
E'\'') - PostgreSQL nested block comments (
/* /* nested */ */)
For a library advertising 8 dialects this is a shipping hole. Expose SplitStatements(sql, dialect) with dialect-aware handling.
Line counts after PR #517:
| File | Before | After | Δ |
|---|---|---|---|
| pkg/sql/ast/ast.go | 2327 | 2463 | +136 |
| pkg/sql/ast/pool.go | ~1500 | 2030 | +500 |
| pkg/sql/ast/sql.go | 1853 | 1853 | 0 |
| pkg/sql/tokenizer/tokenizer.go | 1842 | 1842 | 0 |
| pkg/sql/parser/parser.go | 1186 | 1195 | +9 |
| pkg/gosqlx/gosqlx.go | — | 694 | new |
Every core file exceeds the 400-line ceiling in the project's own coding-style.md by 3-6×. PR #517 added to ast.go and pool.go rather than splitting. gosqlx.go is trending toward god-file status.
Natural split seams:
ast.go→ast_statements.go+ast_expressions.go+ast_literals.go+ast_clauses.gopool.go→pool.go(declarations) +pool_statement_release.go+pool_expression_release.gosql.go(String()/SQL() serializer) along the same node-category axis
Do this before v2.0 breaking changes — refactoring 4KLOC while also breaking APIs is a merge-conflict nightmare.
PR #517 follow-ups (e0f0992 increase race detector timeout to 120s, c01edeb resolve staticcheck and race detector failures) are patches on a bigger problem:
task test:raceruns the entire tree under-racewith 3-5× overhead. No sharding.pool_leak_test.gousesruntime.GC()+ heap measurement — will be flaky on shared runners. Gate behind-shortor build tag.- Task install not cached across jobs — each of 4 race/cbinding jobs reinstalls it.
perf-regressionstillcontinue-on-error: truewith 60-65% tolerance — decorative.
Expect another timeout bump within 1-2 sprints unless split into test:race:fast + test:race:integration.
| # | Item | Round 1 | Round 2 |
|---|---|---|---|
| 1 | God-file splits | Open | Worse (+136L) |
| 2 | ConversionResult.PositionMapping removal | Open | Open (still Deprecated) |
| 3 | Merge/delete pkg/sql/token | Open | Open (still imported 18× ) |
| 4 | Move non-API packages to internal/ | Open | Open (no internal/ at root) |
| 5 | DialectRegistry replacing keywords switch | Open | Open |
| 6 | gosqlx.Tree opaque wrapper | Open | Done |
| 7 | Functional options | Open | Done |
| 8 | Structured errors in parser | Open | Done (H7) |
| 9 | Logger interface injection | Open | Open (fmt.Println in 41 files) |
Progress: 3 of 9 complete. 1 worse. 5 unchanged.
Ordered by leverage:
Sprint A — "Fix the misses" (3-4 days)
- Subquery leak in
PutExpression(§1) — critical Children()coverage test gap (§3.1)releaseStatementdispatch completeness (§3.2)- Retrofit legacy error wrappers with sentinels (§4.2)
- Delete vestigial
metrics.errorsMutex(§3.4)
Sprint B — "Close the narrative" (3-4 days)
6. README rewrite leading with ParseTree example
7. doc.go rewrite promoting Tree as primary entry
8. docs/MIGRATION.md Tree migration section + deprecation timeline
9. Add Tree API to pkg/compatibility/api_stability_test.go
Sprint C — "Tree competitive" (1 week)
10. Typed walkers (WalkSelects, WalkExpressions, generics-based)
11. Tree.Rewrite(pre, post) for transformation
12. Tree.Clone() for COW workflows
13. Dialect-aware SplitStatements (dollar-quoting, backticks, brackets)
14. WithMaxBytes + ErrTooLarge on ParseReader
Sprint D — "Begin the strangling" (1 week)
15. Cache p.dialectTyped field in Parser
16. Migrate 5 pure-capability sites to p.Capabilities() (QUALIFY, PREWHERE, ARRAY JOIN, ILIKE, BRACKET_QUOTING)
17. Add CI grep gate forbidding new p.dialect == in production code
18. Allocate workQueue from pool in putExpressionImpl (§3.3)
Sprint E — "Structural debt" (1-2 weeks)
19. Split ast.go by node category
20. Split pool.go by responsibility
21. Shard task test:race into fast + integration
22. tools/tools.go for dev-tool pinning
23. Delete examples/cmd/cmd committed binary
Sprints A+B are 1 week. Add C and you have v1.16 with a credible adoption story. D+E belong in v1.17 or a dedicated structural PR.
Trend: net-better on DX surface (Tree, options, sentinels), net-worse on structure (god files grew, 2-model coexistence).
What PR #517 really was: a correctness & API-expansion PR. It wasn't a refactor. Treating it as if it closed the architectural debt would be wrong — every structural punch-list item except three is still open, and the new code compounds some of it.
The "adoption still stuck in round 1" observation from the public API agent is the single most important line in this review. The Tree API exists but the library's outer layer (README, doc.go, compat tests) still treats legacy Parse as canonical. Until that flips, the adoption story hasn't moved.
For HN launch / v1.15 release: do Sprint A + B minimum. That's one week of work, and it turns "we added Tree" into "Tree is how you use this library."