Commit f0fe232

leifericf and claude committed
prim: route reduce through an unboxed int-acc accumulator
When (reduce <op> [init] coll) is called with one of the canonical numeric reducers (+, *, -, bit-and, bit-or, bit-xor) and a tagged-int accumulator, the walker now stays in long long arithmetic for the whole walk, falling back to reduce_step only on overflow or at the first non-int element.

The unboxed path is shared by reduce_int_range (which now covers all six kinds via hoisted per-kind inner loops), reduce_vec_apply (vec and set reduces), reduce_pipeline_walk (fused map/filter/take chains), and prim_reduce's generic seq_iter fallback.

vec / set / list reduces at 100k size: -24 to -49% vs v0.163.0. Pipeline-reduce rows are flat (per-stage apply_callable dispatch already dominates the boxing cost on tagged-int operands). Range reduce stays at the pre-existing floor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 33d7dd2 commit f0fe232

3 files changed

Lines changed: 393 additions & 84 deletions

CHANGELOG.md

Lines changed: 48 additions & 0 deletions
@@ -1,5 +1,53 @@
# Changelog

## v0.164.0 — Unboxed Int-Acc Reducer Fast Lane

Adds an unboxed `long long` accumulator path to the canonical numeric
reducers (`+`, `*`, `-`, `bit-and`, `bit-or`, `bit-xor`). When
`(reduce <op> [init] coll)` is invoked with a tagged-int accumulator
and the collection iterates tagged-int elements, the inner walker
runs entirely in `long long` arithmetic, falling back to the generic
`reduce_step` path on the first overflow or non-int element so the
numeric tower stays Clojure-correct.
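
The fast lane described above can be pictured with a minimal sketch. This is not mino's code: `demo_val_t` and `demo_sum_unboxed` are hypothetical names, the tagged value is reduced to a flag plus a `long long` payload, and overflow detection assumes the GCC/Clang `__builtin_add_overflow` builtin. The walker consumes elements while they are ints and the sum fits, then reports where it stopped so a caller could hand the remainder to a generic step function.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tagged value: an int, or "something else". */
typedef struct {
    int is_int;    /* 1 when the payload is a tagged int */
    long long i;   /* int payload, meaningful only when is_int != 0 */
} demo_val_t;

/*
 * Sum elems[0..n) into *acc using plain long long arithmetic.
 * Returns the index of the first element that was NOT consumed
 * (a non-int element, or one whose addition would overflow),
 * or n when the whole walk stayed unboxed.
 * On return, *acc holds the exact sum of the consumed prefix.
 */
static size_t demo_sum_unboxed(const demo_val_t *elems, size_t n, long long *acc)
{
    long long a = *acc;
    size_t i;
    for (i = 0; i < n; i++) {
        long long next;
        if (!elems[i].is_int)
            break;                                   /* non-int: leave the fast lane */
        if (__builtin_add_overflow(a, elems[i].i, &next))
            break;                                   /* overflow: leave the fast lane */
        a = next;
    }
    *acc = a;
    return i;
}
```

A caller would resume from the returned index on the boxed path, so the result is the same as if every step had gone through the generic reducer.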

The unboxed-acc machinery is plumbed through every walker entry
point a `(reduce ...)` call can reach:

- `reduce_int_range` — already specialised for `+`; now also covers
  `*`, `-`, `bit-and`, `bit-or`, `bit-xor` via per-kind hoisted inner
  loops so the switch is loop-invariant.
- `reduce_vec_apply` / `reduce_vec_trie_walk` — vector and set
  reduces (the latter routes through `key_order`'s vector spine).
- `reduce_pipeline_walk` — fused `(reduce f (->> src (map ...)
  (filter ...) (take ...)))` chains.
- `prim_reduce`'s generic `seq_iter` fallback for cons / lazy /
  custom collection shapes.
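
As an illustration of the "per-kind hoisted inner loops" idea in the first bullet, the sketch below dispatches on the reducer kind once and then runs a dedicated loop per kind, so no `switch` executes per element. All names (`demo_kind_t`, `demo_reduce_range`) are hypothetical, the range is a plain half-open `[start, end)` of `long long`, and the overflow checks assume the GCC/Clang `__builtin_*_overflow` builtins; the actual `reduce_int_range` may be structured differently.

```c
typedef enum {
    DEMO_ADD, DEMO_MUL, DEMO_SUB, DEMO_AND, DEMO_OR, DEMO_XOR
} demo_kind_t;

/*
 * Reduce the ints start, start+1, ..., end-1 into *acc.
 * Returns 1 on success. Returns 0 if an arithmetic step would overflow,
 * leaving *acc untouched so a caller could redo the reduce on a boxed path.
 */
static int demo_reduce_range(demo_kind_t kind, long long start, long long end,
                             long long *acc)
{
    long long a = *acc;
    long long i;

    /* The switch runs once; each case owns its own tight loop. */
    switch (kind) {
    case DEMO_ADD:
        for (i = start; i < end; i++)
            if (__builtin_add_overflow(a, i, &a)) return 0;
        break;
    case DEMO_MUL:
        for (i = start; i < end; i++)
            if (__builtin_mul_overflow(a, i, &a)) return 0;
        break;
    case DEMO_SUB:
        for (i = start; i < end; i++)
            if (__builtin_sub_overflow(a, i, &a)) return 0;
        break;
    case DEMO_AND:
        for (i = start; i < end; i++) a &= i;   /* bitwise ops cannot overflow */
        break;
    case DEMO_OR:
        for (i = start; i < end; i++) a |= i;
        break;
    case DEMO_XOR:
        for (i = start; i < end; i++) a ^= i;
        break;
    }
    *acc = a;
    return 1;
}
```

Keeping the dispatch outside the loop trades some code duplication for loop bodies the compiler can optimise without a per-iteration branch on `kind`.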

All of these entry points share a single `reduce_ctx_t` struct plus the
`reduce_ctx_init` / `reduce_ctx_step` / `reduce_ctx_finalize` helpers.
The `reduce_step` primitive is retained as the box-mode fallback so the
numeric tower (BigInt promotion, float coercion, user reducers) is
unchanged.
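
One plausible shape for an init / step / finalize trio is sketched below, for `+` only. Everything here is assumed for illustration: the `demo_` names, the struct fields, and especially the use of a `double` as the "boxed" accumulator, which merely stands in for the generic numeric tower (the real fallback goes through `reduce_step` and would promote to BigInt on overflow rather than to a float). Overflow detection again assumes the GCC/Clang `__builtin_add_overflow` builtin.

```c
/* Hypothetical number: a tagged int, or a double standing in for "boxed". */
typedef struct {
    int is_int;
    long long i;   /* valid when is_int != 0 */
    double d;      /* valid when is_int == 0 */
} demo_num_t;

typedef struct {
    int unboxed;       /* 1 while the accumulator lives in acc_ll */
    long long acc_ll;  /* unboxed accumulator */
    double acc_box;    /* stand-in for the boxed accumulator */
} demo_ctx_t;

static void demo_ctx_init(demo_ctx_t *c, demo_num_t init)
{
    c->unboxed = init.is_int;
    c->acc_ll  = init.is_int ? init.i : 0;
    c->acc_box = init.is_int ? 0.0 : init.d;
}

static void demo_ctx_step(demo_ctx_t *c, demo_num_t x)
{
    if (c->unboxed) {
        long long next;
        if (x.is_int && !__builtin_add_overflow(c->acc_ll, x.i, &next)) {
            c->acc_ll = next;          /* fast lane: no boxing at all */
            return;
        }
        /* Leave the fast lane: box the accumulator once, then fall through. */
        c->acc_box = (double)c->acc_ll;
        c->unboxed = 0;
    }
    /* Box mode: stand-in for the generic reduce step. */
    c->acc_box += x.is_int ? (double)x.i : x.d;
}

static demo_num_t demo_ctx_finalize(const demo_ctx_t *c)
{
    demo_num_t r;
    r.is_int = c->unboxed;
    r.i = c->unboxed ? c->acc_ll : 0;
    r.d = c->unboxed ? 0.0 : c->acc_box;
    return r;
}
```

The point of the pattern is that the mode switch is one-way: once a walk drops to box mode it stays there, so every walker can call the same step helper without tracking the mode itself.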

### Measured impact (v0.163.0 → v0.164.0)

| Bench | v0.163.0 | v0.164.0 | Δ |
|---|---:|---:|---:|
| `(reduce + vec-100k)` | 505 µs | 264 µs | **-48%** |
| `(reduce + list-100k)` | 856 µs | 654 µs | **-24%** |
| `(reduce + set-100k)` | 466 µs | 235 µs | **-49%** |
| `(reduce + vec-1k)` | 10.6 µs | 9.1 µs | -14% |
| `(reduce + (range 1m))` | 252 µs | 251 µs | flat (already optimal via `reduce_int_range`) |
| `(reduce bit-or 0 vec-1k)` | 12.4 µs | 10.3 µs | -17% |
| `(reduce + (map inc (range 100k)))` | 6.92 ms | 7.03 ms | flat (per-stage overhead dominates) |

Pipeline-reduce rows stay near baseline because tagged-int boxing
is already free in current mino — the per-stage `apply_callable`
dispatch is the residual cost there, not the reducer-side box. The
remaining pipeline-reduce ceiling is a separate cycle's target.

Matrix neutral elsewhere within 2% noise. ASan clean. 1659 tests,
7690 assertions all green.

## v0.163.0 — IC Resolve Path Consolidation

Pure refactor. Consolidates the three IC-cache consumers in the

src/mino.h

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
  * rebuilding the runtime) is available at runtime via mino_version_string().
  */
 #define MINO_VERSION_MAJOR 0
-#define MINO_VERSION_MINOR 163
+#define MINO_VERSION_MINOR 164
 #define MINO_VERSION_PATCH 0
 
 /*
