sumak-bench

Compile-time microbenchmark harness: sumak vs drizzle-orm vs kysely.

The harness only measures how long it takes each library to turn a builder expression into { sql, params } — the network round-trip is deliberately excluded so the numbers reflect the library's own overhead rather than whatever Postgres is doing. Query complexity is matched across libraries (same columns, predicates, joins) so the comparison is like-for-like.

Prisma is intentionally not included: Prisma is a code-gen + engine layer rather than a pure query builder, so a compile-time comparison would be a category error.

Run

pnpm install
pnpm vitest bench --run bench/compile.bench.ts

Smoke test

Before changing or adding a scenario, run the smoke test:

pnpm vitest run bench/_scenarios.test.ts

It snapshots every scenario's compiled SQL across all three libraries and asserts that WHERE-bearing queries actually carry their parameters through. It exists because for >7 months the bench was running with a silent-no-op bug where sumak's typed builder accepted .where("col", "=", val) (kysely's three-arg form) at runtime, dropped the operator and value, and produced SQL without a WHERE clause — making every WHERE scenario unfairly favorable to sumak. The smoke test would have caught that the moment it landed.

Scenarios (48 total)

Cross-library compile-throughput benchmarks. The smoke test in bench/_scenarios.test.ts asserts every scenario's SQL is structurally equivalent across sumak / kysely / drizzle so the bench compares like-for- like work.

name	shape
select-all	`SELECT * FROM users`
select-where-eq	`SELECT id, name FROM users WHERE id = $1`
select-where-and	`SELECT * FROM posts WHERE author_id = $1 AND published > $2`
join-2-tables	`SELECT … FROM posts JOIN users ON posts.author_id = users.id`
insert-values	`INSERT INTO users (id, name, email, created_at) VALUES (...)`
update-where	`UPDATE users SET name = $1 WHERE id = $2`
delete-where	`DELETE FROM users WHERE id = $1`
select-where-or	`SELECT id, name FROM users WHERE id = $1 OR name = $2`
select-where-in-small	`SELECT * FROM users WHERE id IN ($1..$5)`
select-where-in-large	`SELECT * FROM users WHERE id IN ($1..$100)`
select-order-limit	`SELECT * FROM users ORDER BY name ASC LIMIT 10 OFFSET 20`
select-aggregate	`SELECT COUNT(*) AS total, MAX(id) AS hi, AVG(id) AS avg FROM users`
select-group-having	`SELECT author_id, COUNT(*) FROM posts GROUP BY author_id HAVING …`
select-distinct	`SELECT DISTINCT name FROM users`
left-join-3-tables	`SELECT … FROM comments LEFT JOIN posts LEFT JOIN users`
select-subquery-in	`SELECT * FROM posts WHERE author_id IN (SELECT id FROM users …)`
insert-many-100	`INSERT INTO users VALUES (…), (…) × 100`
select-where-deep-and	5-clause AND chain on posts
select-order-desc-limit	`SELECT * FROM posts ORDER BY published DESC LIMIT 20`
cte-single	`WITH active AS (SELECT id, name FROM users WHERE id > 0) SELECT * FROM users`
cte-with-join	CTE definition + INNER JOIN against the CTE
select-union	`SELECT id, name FROM users UNION SELECT id, name FROM users`
select-union-all	same with `UNION ALL`
window-row-number	`SELECT id, ROW_NUMBER() OVER (PARTITION BY author_id ORDER BY id) FROM posts`
upsert-do-update	`INSERT … ON CONFLICT (email) DO UPDATE SET name = ?`
insert-returning	`INSERT … RETURNING id, name`
select-case-when	`SELECT id, CASE WHEN published > 0 THEN 'published' ELSE 'draft' END FROM posts`
select-exists-subquery	`SELECT … FROM users WHERE EXISTS (SELECT … FROM posts WHERE …)`
select-count-distinct	`SELECT COUNT(DISTINCT author_id) FROM posts`
select-window-rank	`SELECT id, RANK() OVER (PARTITION BY author_id ORDER BY id) FROM posts`
select-percentile	`SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY published) FROM posts`
select-named-window	`SELECT … OVER w, … OVER w FROM posts WINDOW w AS (PARTITION BY … ORDER BY …)`
select-json-value	`SELECT JSON_VALUE(body, '$.name') FROM posts`
select-is-json	`SELECT * FROM posts WHERE body IS JSON`
select-count-any-value	`SELECT author_id, ANY_VALUE(title) FROM posts GROUP BY author_id`
merge-not-matched-by-source-bench	`MERGE INTO users USING comments … WHEN MATCHED … WHEN NOT MATCHED … WHEN NOT MATCHED BY SOURCE`
select-regex-replace	`SELECT REGEXP_REPLACE(name, '[^a-z]', '', 'g') FROM users`
select-extract-month	`SELECT EXTRACT(MONTH FROM created_at) FROM users`
select-date-trunc	`SELECT DATE_TRUNC('day', created_at) FROM users`
select-stddev-group	`SELECT author_id, STDDEV(published) FROM posts GROUP BY author_id`
select-position	`SELECT POSITION('@' IN email) FROM users`
select-array-length	`SELECT array_length(body, 1) FROM posts`
select-power	`SELECT POWER(published, 2) FROM posts`

The last seven (select-window-rank through merge-not-matched-by-source-bench) cover the SQL features added in PRs #142–151: SQL:2003 named WINDOW, SQL:2003 ordered-set aggregates (PERCENTILE_CONT), SQL:2016 JSON_VALUE and IS JSON, SQL:2023 ANY_VALUE, and the SQL:2008/2023 three-branch MERGE with WHEN NOT MATCHED BY SOURCE. Where competitors lack a first- class API for the feature (PERCENTILE_CONT, named WINDOW, JSON_VALUE, IS JSON, ANY_VALUE, MERGE) they fall back to raw SQL templates — sumak's typed builder is doing more work, but the AST it builds is more analyzable downstream (plugins, transformers, audit hooks). The bench keeps that tradeoff visible.

The trailing seven (select-regex-replace through select-power) cover scalar / aggregate function builders shipped after the previous bench wave (PRs #155, #156, #162, #164, #165, #166): REGEXP_REPLACE, EXTRACT, DATE_TRUNC, STDDEV, POSITION-IN, PG array_length, and POWER. Each exercises a typed builder where sumak emits a dedicated AST node (extract-field, position-IN keyword form, inlined pattern / unit literals) and the competitors fall back to raw template literals.

Plugin overhead microbench

bench/plugin.bench.ts measures the per-compile cost of each built-in plugin against a baseline (no plugins). softDelete and multiTenant add the most (~2.4× over baseline) because they rewrite the SELECT AST; camelCase and subjectType are nearly free because they only transform result rows. See PR #112 for the full table.

Results (post-`.where()` fix, 2026-05-18)

Compile throughput on an Apple M-series laptop, Node 24, vitest 4.1. Numbers drift across machines — treat the relative ordering as the signal, not the absolute hz.

scenario	sumak (hz)	kysely (hz)	drizzle (hz)	sumak vs kysely	sumak vs drizzle
`select-all`	729,186	671,765	49,520	1.09×	14.72×
`select-where-eq`	270,550	295,165	44,452	0.92×	6.09×
`select-where-and`	164,992	259,477	22,942	0.64×	7.19×
`join-2-tables`	215,362	141,255	28,054	1.52×	7.68×
`insert-values`	304,936	221,208	40,223	1.38×	7.58×
`update-where`	279,549	324,720	44,504	0.86×	6.28×
`delete-where`	329,836	454,393	88,985	0.73×	3.71×
`select-where-or`	192,942	145,543	31,883	1.33×	6.05×
`select-where-in-small`	188,276	191,000	12,300	0.98×	15.34×
`select-where-in-large`	61,142	77,400	1,150	0.79×	53.02×
`select-order-limit`	429,450	251,116	35,800	1.71×	11.99×
`select-aggregate`	211,892	157,000	40,400	1.35×	5.24×
`select-group-having`	179,372	141,200	33,700	1.27×	5.32×
`select-distinct`	508,317	406,700	80,900	1.25×	6.28×
`left-join-3-tables`	130,461	98,800	20,900	1.32×	6.25×
`select-subquery-in`	179,720	182,000	17,300	0.99×	10.40×
`insert-many-100`	11,774	8,011	608	1.47×	19.36×
`select-where-deep-and`	76,021	139,950	15,600	0.54×	4.87×
`select-order-desc-limit`	373,047	298,050	30,700	1.25×	12.15×
`cte-single`	224,000	178,795	16,956	1.25×	13.21×
`cte-with-join`	147,157	97,078	14,047	1.52×	10.48×
`select-union`	320,438	208,701	27,115	1.54×	11.82×
`select-union-all`	308,801	206,165	27,813	1.50×	11.10×

Where sumak wins (≥ 15 of 23): select-all, select-where-eq, join-2-tables, insert-values, select-where-or, select-where-in-small, select-order-limit, select-aggregate, select-group-having, select-distinct, left-join-3-tables, select-subquery-in, insert-many-100, select-order-desc-limit, cte-single, cte-with-join, select-union, select-union-all. (select-where-eq and select-where-in-small flipped to sumak-winning after the PR #104 identity-preserving recurse landed.)

Where kysely wins (5 of 23): select-where-and (1.35×), update-where (1.09×), delete-where (1.28×), select-where-in-large (1.25×), select-where-deep-and (1.46×). All five involve binary WHERE-chain traversal; closing them to parity needs a flat n-ary logical_op AST node so the visitor / printer don't have to walk a left-leaning binary tree. Backlog item A2 — invasive, deferred past v0.0.15.

Against drizzle, sumak is 4.87×–47× faster across the board — drizzle's template-literal-heavy internal representation costs a lot per call.

What changed in this baseline

Re-baselined on 2026-05-18 after fixing a silent-no-op bug where sumak's typed builder was accepting kysely-style .where("col", "=", val) at runtime and dropping the predicate — every WHERE-bearing scenario was generating sumak SQL without a WHERE clause. Pre-fix numbers had sumak winning every WHERE scenario by 1.5–3.5×, but it was comparing SELECT * FROM users to the competitors' parameterized SELECT … WHERE id = $1.

The fix is a runtime guard (unwrapPredicate in src/ast/typed-expression.ts) — .where("col", "=", 1) now throws a TypeError pointing at the callback form. The bench scenarios were rewritten to use the callback form .where(({col}) => col.eq(1)), and a snapshot smoke test (bench/_scenarios.test.ts) was added so the same silent-divergence can't happen again.

Per-compile wall time

Same numbers, inverted to nanoseconds per compile — useful for sanity-checking whether the compile path is ever going to show up on a trace.

scenario	sumak	kysely	drizzle
`select-all`	1.4 µs	1.5 µs	20.2 µs
`select-where-eq`	3.7 µs	3.4 µs	22.5 µs
`join-2-tables`	4.6 µs	7.1 µs	35.6 µs
`select-where-in-large`	16.4 µs	12.9 µs	869.4 µs

Even the slowest sumak scenario (select-where-in-large, ~28µs) compiles two orders of magnitude below a local Postgres round-trip (~1ms). Compile cost is not where your end-to-end latency lives — but it is what shows up on a Lambda cold start.

Why compile-time only?

Query builders live or die on the hot path between the TypeScript call and the SQL string. A benchmark that also spins up a real database would measure Postgres plus the network, not the library. On a Lambda cold path or a serverless edge runtime, compile time is the dominant overhead and the right thing to optimise.

The harness is not a substitute for end-to-end performance testing against a real database — it's a regression guard for sumak's compiler. Numbers drift between machines; treat only the relative ordering as signal.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

sumak-bench

Run

Smoke test

Scenarios (48 total)

Plugin overhead microbench

Results (post-`.where()` fix, 2026-05-18)

What changed in this baseline

Per-compile wall time

Why compile-time only?

Uh oh!

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

sumak-bench

Run

Smoke test

Scenarios (48 total)

Plugin overhead microbench

Results (post-.where() fix, 2026-05-18)

What changed in this baseline

Per-compile wall time

Why compile-time only?

Results (post-`.where()` fix, 2026-05-18)