Commit ea1a386

docs: session wrapup -- elreal facade (#1079) + constant perf (#1061 Ph3b) (#1089)
CHANGELOG: elreal class facade epic (#1079, Phases 1-5, closed) under Added; online-e / Abel-euler_gamma constant perf (#1061 Phase 3b, #1087/#1088) under Changed. docs/sessions/2026-06-21_elreal_facade_and_constant_perf.md: session log. Docs-only.
1 parent 79272eb commit ea1a386

2 files changed

Lines changed: 156 additions & 0 deletions

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -9,6 +9,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

* **elreal class facade -- plug-in arithmetic number system + lazy state-machine API (Epic [#1079] CLOSED)** -- `elreal` (McCleeary LFPERA lazy exact-real over the `ZBCL<FpType>` block co-list) previously had only free functions; it now has a `class elreal<FpType=double>` with the standard Universal facade, so it drops into templated/plug-in kernels like every other number system while keeping its lazy incremental-precision edge. Five phases:
- **Phase 1 -- facade scaffold** ([#1080](https://github.com/stillwater-sc/universal/pull/1080)) -- `elreal_impl.hpp` (native ctors, conversions, fully-lazy `+ - * /` storing unforced streams, unary `-`, free operators, depth-bounded `== < ...`, `abs`/`fabs`), `traits/elreal_traits.hpp`, `elreal_fwd.hpp` + umbrella + aliases `elreal64`/`elreal32`. Design decisions: runtime `_depth` member + `.precision()` + thread-local default with an RAII `elreal_precision_guard`; evaluation forced only at a boundary (conversion / compare / I/O); comparison is depth-bounded (exact ordering when the difference's leading limb is nonzero -- exact equality of distinct irrationals is undecidable)
- **Phase 2 -- lazy API hardening** ([#1081](https://github.com/stillwater-sc/universal/pull/1081)) -- counter-instrumented memoisation regression (pull depth `d` then `d+1`, assert the tail is reused not recomputed); precision-honest `approx<T>` summed *in* `T`
- **Phase 3 -- traits / limits / manipulators** ([#1082](https://github.com/stillwater-sc/universal/pull/1082)) -- `numeric_limits` (precision-parametrised against the nominal default), `attributes.hpp` (`sign` / `scale` returning `int64_t` to avoid narrowing the unbounded `integer<256>` exponent / `significand`), `manipulators.hpp` (`type_tag` / `to_components` / `to_binary` / `to_triple` emitting the significand in `[1,2)` / `operator<<` / `operator>>`). CodeRabbit fixes: `to_triple` significand, `denorm_min()==min()` under `denorm_absent`
- **Phase 4 -- math facade** ([#1083](https://github.com/stillwater-sc/universal/pull/1083)) -- `mathlib.hpp` lifts the ZBCL-level `math/*.hpp` to class `elreal`: unary `sqrt`/`exp`/`log`/`sin`/`cos`/`tan`/`asin`/`acos`/`atan`/`sinh`/`cosh`/`tanh` (refine to the operand's `precision()`), binary `pow`/`hypot`, and `elreal_pi`/`e`/`ln2`/`ln10`/`log2_10`/`sqrt2`/`sqrt3`/`sqrt5`/`phi`/`euler_gamma` constants
- **Phase 5 -- non-finite state + conversion/logic/arithmetic suites** ([#1084](https://github.com/stillwater-sc/universal/pull/1084)) -- per the settled policy decision, class `elreal` gained a dedicated IEEE-style non-finite classification (`elreal_class {finite, pinf, ninf, qnan}` + `_cls` member; this revised the Phase 3 finite-only `numeric_limits`). `operator=(double)` classifies `NaN`/`+-Inf`; the tag propagates by IEEE-754 rules through arithmetic (`inf+finite=inf`, `inf-inf=nan`, `inf*0=nan`, `x/0=+-inf`, `0/0=nan`, `finite/inf=0`, NaN propagates), comparison (new `elreal_order_of`: NaN unordered, `-inf < finite < +inf`), conversion, unary minus, and `abs`. `numeric_limits` `has_infinity`/`has_quiet_NaN` now true; free `isnan`/`isinf`/`isfinite`/`signbit`; manipulators render `nan`/`+-inf`. Class-level `conversion/`, `logic/`, `arithmetic/` suites + `el_conversion`/`el_logic` CMake targets ([#1079](https://github.com/stillwater-sc/universal/issues/1079))
* **bfloat16 exponent-manipulation functions** -- `ldexp`, `frexp`, `scalbn`, `logb` (returns `bfloat16`) and `ilogb` (returns `int`) added to `bfloat16/math/functions/exponent.hpp`, completing the `<cmath>` exponent family that `cfloat<>` already exposed. They marshal through `float` rather than `double`: a bfloat16 is the high 16 bits of an IEEE float32 and shares its 8-bit exponent field, so `bfloat16 -> float` is exact, covers bfloat16's entire exponent range, and is cheaper than widening to double (per PR #1015 review). New `static/float/bfloat16/math/exponent.cpp` verifies the functions by their mathematical properties (frexp/ldexp roundtrip, `|fraction|` in `[0.5,1)`, `scalbn == ldexp`, `ilogb == frexp_exp-1`) plus IEEE special cases (`+/-0`, `+/-inf`, `NaN`). Unblocks adding bfloat16 to the elreal Phase 4 FpType sweep ([#941](https://github.com/stillwater-sc/universal/issues/941), [#1015](https://github.com/stillwater-sc/universal/pull/1015), [#1017](https://github.com/stillwater-sc/universal/pull/1017))
* **ereal parse-cost benchmark** -- `benchmark/performance/arithmetic/ereal/parse.cpp` tracks `ereal<2|8|19>::parse` cost across 32..440-digit strings with a catastrophic-regression guard (fails only if a 320-digit parse exceeds 2 s, vs the >120 s blowup it guards against and the ~5 ms actual cost), and counts any `parse()` failures on valid input. Closes the remaining acceptance item from parse-complexity issue #913 ([#1013](https://github.com/stillwater-sc/universal/issues/1013), [#1016](https://github.com/stillwater-sc/universal/pull/1016))
* **Epic [#835] CLOSED -- decimal string parsing API across all number systems** -- the elastic family (`einteger`, `edecimal`, `erational`, `efloat`, `ereal`) now has a working `parse()` API that accepts decimal, scientific notation, and -- where applicable -- hex / binary / octal / p/q forms, plus `nan` / `inf` / `infinity` token routing. `operator>>` consistently sets `failbit` on parse failure and guards against extraction failure across every type. Foundations were laid earlier ([#838](https://github.com/stillwater-sc/universal/issues/838) `scan_decimal_float`, [#841](https://github.com/stillwater-sc/universal/issues/841) `decimal_to_binary::convert` with mantissa + binary_scale + guard/sticky, [#848](https://github.com/stillwater-sc/universal/issues/848)/[#851](https://github.com/stillwater-sc/universal/issues/851) distillation algorithm). Per-number-system PRs landed in sequence:
@@ -62,6 +68,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed

* **elreal math-constant performance -- online `e`, faster `euler_gamma` ([#1061] Phase 3b)** -- sped up the two remaining eager constant generators in the elreal math layer, both surfaced while profiling the Phase 4 math facade (`e` ~16 s, `euler_gamma` ~11 s at depth 16):
- **`e_zbcl` made online** ([#1087](https://github.com/stillwater-sc/universal/pull/1087)) -- `e = sum 1/n!` has the same shape as the already-online exp/atan series, so it now uses the streaming form (a lazy term co-list `term_n = term_{n-1}/n` via exact integer div, each term significance-windowed with `take_while_above`, folded by `infsum`). **16165 ms -> 83 ms at depth 16 (~195x)**, value-identical to the 320-digit reference. The same PR dropped `euler_gamma`'s redundant Pass-1 peak-finding recurrence in favour of the analytic Stirling peak `2n*log2(e) - log2(2*pi*n)`, and re-enabled the `elreal_e()` check in the Phase 4 math test now that e is fast
- **`euler_gamma_zbcl` Abel reduction** ([#1088](https://github.com/stillwater-sc/universal/pull/1088)) -- Brent-McMillan B1's `A(n) = sum_k w_k H_k` (a full per-term `ZBCL x ZBCL` multiply + per-term harmonic maintenance) was replaced by Abel summation `sum_k w_k H_k == sum_{k>=0} tail_k/(k+1)` with `tail_k = sum_{j>k} w_j = B - B_k`: all-positive accumulation, no harmonic numbers, per-term cost now a single-block scalar division instead of a full multiply. **depth 8: 4408 -> 1890 ms (~2.3x); depth 16: 10986 -> 3578 ms (~3.1x)**, validated to 305 digits vs the 320-digit reference (`REGRESSION_LEVEL_4`) on gcc + clang. Adds `docs/design/elreal-euler-gamma.md` (algorithm, derivation, the convergent-vs-naive cancellation analysis, and the deferred binary-splitting alternative). Brent-McMillan remains inherently O(n) products; the asymptotic `O(M(D) log^2 D)` binary-splitting rewrite (einteger P/Q/B/T) is tracked as a separate effort ([#1061](https://github.com/stillwater-sc/universal/issues/1061))
* **CI build time restored (~20 min -> ~8 min) -- ccache eviction** -- the cmake CI `ccache` was keyed per-`matrix.artifact` with no `save` gating, so every PR run saved a ~500 MB cache; under GitHub's 10 GB per-repo LRU limit a burst of CI evicted the warm caches, dropping the hit rate to ~1% and turning cached ~1 min builds into ~9 min cold rebuilds. Fix: `save: only on pushes to main` (PR runs restore but no longer create caches) plus `max-size: 1G`, on both ccache blocks. Diagnosed from the build-step duration (55 s -> 566 s) and ccache hit/miss stats; not a code regression (CI_LITE does not even build ereal) ([#1009](https://github.com/stillwater-sc/universal/issues/1009), [#1010](https://github.com/stillwater-sc/universal/pull/1010))
* **ereal mathlib regression tests tiered by level** -- the property-fuzz ran a heavy fixed count (x100/x50) only at L1, the sanity tier that CI's Debug-instrumented jobs (ASan/UBSan/coverage, ~40x slower at `-O0`) run -- so those jobs took ~1 h. The fuzz count now scales with the regression level (L1 smoke x15-20, up to x2000 at L4), keeping CI fast while preserving (and extending) stress-tier coverage; hand-curated correctness tests unchanged ([#1007](https://github.com/stillwater-sc/universal/issues/1007), [#1008](https://github.com/stillwater-sc/universal/pull/1008))
* **Epic [#723] CLOSED -- constexpr support across Universal number systems** -- the umbrella Epic tracking full constexpr promotion across the library is complete: all 32 sub-issues (5 Tier-1 primary types, 22 Tier-2 additional fixed-size types, 5 Tier-3 elastic types) are closed. Final close-out: sub-Epic [#745](https://github.com/stillwater-sc/universal/issues/745) (zfpblock umbrella) closed after PRs [#814](https://github.com/stillwater-sc/universal/pull/814) (accessor subset), [#830](https://github.com/stillwater-sc/universal/pull/830) (codec), and [#831](https://github.com/stillwater-sc/universal/pull/831) (zfparray container) landed. Universal now fulfills the "plug-in" promise: any expression in any fixed-size Universal type can be evaluated at compile time, drop-in parity with `int`/`float`/`double` in any constexpr context. Cross-cutting prerequisites all complete: `blockbinary` arithmetic (#716), `blocksignificand`/`blocktriple` arithmetic (#718/#719), `blockdecimal` arithmetic (#729/#730), `floatcascade` (#728/#739/#742), `twoSum`/`twoProd` (#727/#738), `sw::math::constexpr_math` providing `cm::log2`/`cm::exp2` (Epic #763, replacing the original #423)
docs/sessions/2026-06-21_elreal_facade_and_constant_perf.md

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
# Development Session: elreal Class Facade (Epic #1079) + Constant Performance (#1061 Phase 3b)

**Date:** 2026-06-18 .. 2026-06-21
**Branch:** per-phase feature branches off `main` (all merged)
**Focus:** Give `elreal` (McCleeary LFPERA lazy exact-real) the standard Universal
plug-in facade, and make the last two eager math constants fast.
**Status:** Complete -- Epic #1079 closed; #1061 Phase 3b constant work merged.

## Session Overview

`elreal` is the library's lazy exact-real type: a real is a (possibly infinite)
co-list of `block<FpType>` (the `ZBCL`), and arithmetic is online -- each consumer
pull drives exactly as much work as the requested precision needs. Before this
session it existed only as free functions over `ZBCL`. This session built the
`class elreal<FpType>` facade so it plugs into templated kernels like every other
number system, then closed out the remaining constant-generator performance gap.
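
The pull-driven, memoised behaviour described above can be illustrated with a small stand-alone sketch. It is not the `ZBCL` implementation: `LazyStream`, `pull`, `evaluations`, and `check_tail_reuse` are hypothetical names, and plain `double` elements stand in for `block<FpType>`. It only shows the shape of the idea, that a deeper pull reuses what a shallower pull already computed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memoised co-list: element i is produced by
// calling gen(i) at most once, then cached and shared between copies.
class LazyStream {
public:
    explicit LazyStream(std::function<double(std::size_t)> gen)
        : state_(std::make_shared<State>(std::move(gen))) {}

    // Ensure at least `depth` elements are cached, computing only the missing
    // ones. The returned cache may hold more than `depth` elements.
    const std::vector<double>& pull(std::size_t depth) const {
        while (state_->cache.size() < depth) {
            state_->cache.push_back(state_->gen(state_->cache.size()));
            ++state_->evaluations;
        }
        return state_->cache;
    }

    // Number of generator calls made so far (the instrumentation counter).
    std::size_t evaluations() const { return state_->evaluations; }

private:
    struct State {
        explicit State(std::function<double(std::size_t)> g) : gen(std::move(g)) {}
        std::function<double(std::size_t)> gen;
        std::vector<double> cache;
        std::size_t evaluations = 0;
    };
    std::shared_ptr<State> state_;  // shared, so copies see the same cache
};

// Regression in the spirit of "pull depth d, then d+1": the second pull must
// add exactly one evaluation instead of recomputing the prefix.
inline void check_tail_reuse() {
    LazyStream s([](std::size_t i) { return 1.0 / static_cast<double>(i + 1); });
    s.pull(4);
    assert(s.evaluations() == 4);
    s.pull(5);
    assert(s.evaluations() == 5);
}
```

Because the state sits behind a `shared_ptr`, a copy of the stream shares the cache, which is also why a type built this way is not trivially copyable.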

Two threads:

1. **Epic #1079 -- the class facade** (Phases 1-5; Phases 1-2 landed at the very
   start, Phases 3-5 are the bulk of this session). Five PRs, all merged.
2. **#1061 Phase 3b -- eager-constant conversion** (online `e`, Abel-summed
   `euler_gamma`). Two PRs, both merged.

All work validated on gcc + clang (Release), gcc Debug-with-assertions for the new
suites, cppcheck (Codacy gate), ASCII guard, and -- for the constants -- the
320-digit mpmath oracle at `REGRESSION_LEVEL_4`.

### Goals Achieved

- Phase 1: facade scaffold -- `class elreal`, native ctors/conversions, lazy
  operators, depth-bounded comparison (#1080)
- Phase 2: lazy API hardening -- memoisation regression, precision-honest approx (#1081)
- Phase 3: `numeric_limits` / `attributes` / `manipulators` (#1082)
- Phase 4: `mathlib.hpp` math facade -- functions + constants (#1083)
- Phase 5: dedicated IEEE non-finite state + conversion/logic/arithmetic suites (#1084)
- #1061 Phase 3b: online `e_zbcl` (~195x); Abel-summed `euler_gamma` (~2.3-3.1x) (#1087, #1088)
- Epic #1079 closed; `docs/design/elreal-euler-gamma.md` written

## Architecture Decisions

### The facade is elastic, not trivial

`elreal` holds a `ZBCL` (a `shared_ptr`-backed memoised co-list), so it is **not**
trivially copyable. The #925 hardware-shareable triviality rule applies to the
static `block`, not to this elastic facade -- same posture as `ereal`/`einteger`.
The facade is modelled on `ereal_impl.hpp` (the eager Priest/Shewchuk sibling).

### Three settled facade design decisions (Phase 1)

| Concern | Decision |
|---------|----------|
| Precision | runtime `_depth` member + `.precision()` + thread-local default with an RAII `elreal_precision_guard` scoped override |
| Operators | **fully lazy** -- `+ - * /` store unforced `add`/`mul_online`/`div_online` streams; evaluation happens only at a boundary (conversion / compare / I/O / explicit `approx`) |
| Comparison | **depth-bounded** -- compare `a-b` to the deeper of the two depths; exact ordering when the difference's leading limb is nonzero (exact equality of distinct irrationals is undecidable) |
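
The precision row combines a thread-local default with a scoped RAII override. A minimal sketch of that pattern follows, under stated assumptions: `default_depth`, `precision_guard`, and `lazy_value` are hypothetical names in a `sketch` namespace, the initial depth of 8 is arbitrary, and none of this is the actual `elreal_precision_guard` code.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical per-thread default depth. The initial value 8 is an arbitrary
// choice for this illustration.
inline std::size_t& default_depth() {
    thread_local std::size_t depth = 8;
    return depth;
}

// RAII override: sets a new default on construction and restores the previous
// one on destruction, so nested scopes unwind in the right order.
class precision_guard {
public:
    explicit precision_guard(std::size_t depth) : saved_(default_depth()) {
        assert(depth > 0);
        default_depth() = depth;
    }
    ~precision_guard() { default_depth() = saved_; }
    precision_guard(const precision_guard&) = delete;
    precision_guard& operator=(const precision_guard&) = delete;

private:
    std::size_t saved_;
};

// Toy value type: captures the thread-local default when it is constructed
// and keeps it as a runtime member afterwards.
struct lazy_value {
    std::size_t depth = default_depth();
    std::size_t precision() const { return depth; }
};

}  // namespace sketch
```

A value created inside a guard's scope keeps the depth it captured even after the guard is destroyed, which is the usual reason for pairing a per-object member with a scoped default.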

### Non-finite policy (Phase 5) -- a deliberately reversed decision

Phase 3 shipped a **finite-only** `numeric_limits` (the LFPERA model is exact over
the finite reals). For Phase 5 the user chose the larger option: a **dedicated
IEEE-style non-finite state** (`elreal_class {finite, pinf, ninf, qnan}` + a `_cls`
member), which *revised* the Phase 3 contract. Rationale: plug-in kernels can
produce `NaN`/`+-Inf` (`x/0`, `log` of a negative, conversion overflow), and a
predictable IEEE classification beats silently mapping them to 0. The tag
propagates by IEEE-754 rules through arithmetic, comparison (`elreal_order_of`:
NaN unordered, `-inf < finite < +inf`), conversion, unary minus, and `abs`.
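
How a classification tag can propagate independently of the lazy value is easiest to see on addition, subtraction, and ordering. The sketch below is an illustration only: `num_class`, `add_class`, `sub_class`, and `class_less` are hypothetical names rather than the library's `elreal_class` API, and multiplication and division are left out because they also need the sign and zero-ness of finite operands.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical classification tag mirroring {finite, pinf, ninf, qnan}.
enum class num_class { finite, pinf, ninf, qnan };

inline bool is_inf(num_class c) {
    return c == num_class::pinf || c == num_class::ninf;
}

inline num_class negate_class(num_class c) {
    if (c == num_class::pinf) return num_class::ninf;
    if (c == num_class::ninf) return num_class::pinf;
    return c;  // finite and qnan are unchanged by negation
}

// Class of a + b under IEEE-754 rules.
inline num_class add_class(num_class a, num_class b) {
    if (a == num_class::qnan || b == num_class::qnan) return num_class::qnan;
    if (is_inf(a) && is_inf(b)) return a == b ? a : num_class::qnan;  // inf + (-inf)
    if (is_inf(a)) return a;  // inf + finite = inf
    if (is_inf(b)) return b;
    return num_class::finite;
}

// Class of a - b, expressed as a + (-b), so inf - inf yields qnan.
inline num_class sub_class(num_class a, num_class b) {
    return add_class(a, negate_class(b));
}

// Rank used for ordering: -inf < finite < +inf. Never called with qnan.
inline int rank(num_class c) {
    if (c == num_class::ninf) return -1;
    if (c == num_class::pinf) return 1;
    return 0;
}

// True only when the class tags alone decide a < b. NaN is unordered, and two
// finite operands fall through to a value comparison that is not modelled here.
inline bool class_less(num_class a, num_class b) {
    if (a == num_class::qnan || b == num_class::qnan) return false;
    return rank(a) < rank(b);
}

}  // namespace sketch
```

Keeping the tag separate means the non-finite cases are resolved before any stream is forced; only the finite/finite case needs the depth-bounded comparison.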

### euler_gamma: Brent-McMillan + Abel, binary splitting deferred

`euler_gamma` has no elementary series, so it uses Brent-McMillan B1
(`gamma = A/B - ln(n)`). The dominant cost was the `A = sum_k w_k H_k`
accumulation. Abel summation rewrites it as `A = sum_{k>=0} tail_k/(k+1)`
(`tail_k = B - B_k`), replacing the per-term full multiply with a single-block
scalar division and removing the harmonic numbers entirely. The asymptotically
better approach (binary splitting with `einteger` P/Q/B/T) was **investigated and
deliberately deferred**: it needs an arbitrary-precision-integer dependency the
elreal math layer does not have, a new `einteger -> ZBCL` bridge, and a
harmonic-aware splitting tuple, with no repo precedent. Full reasoning and the
algorithm are in `docs/design/elreal-euler-gamma.md`.
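
The Abel rewrite can be checked numerically in plain `double` arithmetic. The sketch below assumes the standard Brent-McMillan B1 weights `w_k = (n^k/k!)^2` and truncates the sums at a fixed term count; `bm_weights`, `gamma_result`, and `euler_gamma_b1` are hypothetical names, and nothing here reflects how `euler_gamma_zbcl` is structured internally. It computes `A` both ways and returns both estimates of gamma so they can be compared.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

namespace sketch {

// Weights w_k = (n^k / k!)^2 for k = 0 .. terms-1.
inline std::vector<double> bm_weights(unsigned n, std::size_t terms) {
    assert(n > 0 && terms > 0);
    std::vector<double> w(terms);
    double r = 1.0;  // running value of n^k / k!
    for (std::size_t k = 0; k < terms; ++k) {
        w[k] = r * r;
        r *= static_cast<double>(n) / static_cast<double>(k + 1);
    }
    return w;
}

struct gamma_result {
    double direct;  // gamma from A = sum_k w_k * H_k
    double abel;    // gamma from A = sum_k tail_k / (k+1)
};

inline gamma_result euler_gamma_b1(unsigned n, std::size_t terms) {
    const std::vector<double> w = bm_weights(n, terms);

    // Direct form: one multiply per term plus harmonic-number maintenance.
    double a_direct = 0.0, b = 0.0, h = 0.0;
    for (std::size_t k = 0; k < terms; ++k) {
        if (k > 0) h += 1.0 / static_cast<double>(k);  // H_k, with H_0 = 0
        a_direct += w[k] * h;
        b += w[k];
    }

    // Abel form: tail_k = sum_{j>k} w_j, accumulated from the back so every
    // partial sum is positive; each term costs one scalar division.
    double a_abel = 0.0, tail = 0.0;
    for (std::size_t k = terms - 1; k-- > 0;) {
        tail += w[k + 1];
        a_abel += tail / static_cast<double>(k + 1);
    }

    const double ln_n = std::log(static_cast<double>(n));
    return {a_direct / b - ln_n, a_abel / b - ln_n};
}

}  // namespace sketch
```

With `n = 10` and 60 terms, both forms should agree with each other and with gamma = 0.5772156649... to roughly double precision, since the B1 truncation error is of order `e^{-4n}`. The double-precision sketch builds the tails as suffix sums rather than as `B - B_k`, purely to avoid cancellation in fixed precision.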

## Notable Bugs / Gotchas Surfaced

- **CodeRabbit (Critical), Phase 5** -- `to_triple(-inf)` printed `"(-, -inf)"`:
  the sign was emitted twice (the `(sign, ...)` prefix plus a signed tag). Fixed
  with an unsigned-tag option on `nonfinite_tag`; regression checks added.
- **`a + (-a)` is value-zero but not `iszero()`** -- lazy exact cancellation yields
  a zero-*valued* stream, not the structurally-empty canonical zero. `iszero()` is
  a cheap structural test (the right semantics for a semi-decidable lazy real), so
  it is correctly `false` there. A test assertion was relaxed and documented.
- **cppcheck `duplicateCondition` folds mirror operators** -- `a<b` and `b>a`
  normalise to the same condition, so the logic suite's `if (...) ++n;` checks read
  as duplicates. Rewrote as `n += !(...)` accumulation statements on opaque
  (volatile-seeded) operands -- exercises every operator with no duplicate
  if-conditions.
- **Codacy `noExplicitConstructor` counts per-instantiation/per-ctor-edit** -- the
  intentional implicit plug-in ctors (same as ereal/cfloat/posit) get re-counted
  when a new instantiation appears (first `elreal<float>` test) or the ctor region
  is edited. Accepted as non-blocking (Codacy is not a required check); making them
  `explicit` would break plug-in semantics.
- **The 320-digit constant oracle is `REGRESSION_LEVEL_4`-gated** -- a default
  (`LEVEL_1`) build compiles the high-precision check out and the test "PASSes" as
  a no-op. The euler_gamma cancellation analysis was only truly validated after
  rebuilding with `-DREGRESSION_LEVEL_4=1` (305 digits, gcc + clang). Recorded so
  the next person is not fooled by the no-op pass.
- **Eager constants were pathologically slow, not the functions** -- profiling the
  Phase 4 facade showed every transcendental *function* < 100 ms, but `e_zbcl(16)`
  ~16 s and `euler_gamma_zbcl` ~11 s, because those two *constants* had not been
  moved onto the online series path. (`exp(1)` via the online path is 1 ms.)

## Performance Results

| Item | Before | After | Speedup |
|------|-------:|------:|--------:|
| `e_zbcl(16)` | 16165 ms | 83 ms | ~195x |
| `euler_gamma(8)` | 4408 ms | 1890 ms | ~2.3x |
| `euler_gamma(16)` | 10986 ms | 3578 ms | ~3.1x |

All value-identical to the 320-digit reference (e: 307 digits, euler_gamma: 305).
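
The `e_zbcl` row comes from switching `e = sum 1/n!` to a streaming recurrence, `term_n = term_{n-1}/n`, with terms dropped once they fall below the significance window. The same shape in fixed `double` precision looks like the sketch below; `e_streaming` and its `threshold` parameter are hypothetical, and the real generator works on lazy block co-lists with `take_while_above` and `infsum` rather than on a scalar loop.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Sums e = sum_{n>=0} 1/n! with the recurrence term_n = term_{n-1} / n,
// stopping at the first term below `threshold` (a scalar analogue of a
// significance window). Optionally reports how many terms were summed.
inline double e_streaming(double threshold, unsigned* terms_used = nullptr) {
    assert(threshold > 0.0);
    double sum = 0.0;
    double term = 1.0;  // 1/0!
    unsigned n = 0;
    while (term >= threshold) {
        sum += term;
        ++n;
        term /= static_cast<double>(n);  // one scalar division per term
    }
    if (terms_used) *terms_used = n;
    return sum;
}

}  // namespace sketch
```

Each new term costs a single division by a small integer and no factorial is ever formed explicitly, which is what makes the series cheap to stream.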

## Pull Requests

| PR | Title | Merge |
|----|-------|-------|
| [#1080](https://github.com/stillwater-sc/universal/pull/1080) | facade scaffold (#1079 Phase 1) | 22857fb7 |
| [#1081](https://github.com/stillwater-sc/universal/pull/1081) | lazy API hardening (#1079 Phase 2) | d679ebe8 |
| [#1082](https://github.com/stillwater-sc/universal/pull/1082) | numeric_limits/attributes/manipulators (#1079 Phase 3) | 1ccdf224 |
| [#1083](https://github.com/stillwater-sc/universal/pull/1083) | math facade (#1079 Phase 4) | 6444f6b2 |
| [#1084](https://github.com/stillwater-sc/universal/pull/1084) | non-finite state + suites (#1079 Phase 5) | 823dff27 |
| [#1087](https://github.com/stillwater-sc/universal/pull/1087) | online e_zbcl + drop gamma peak pass (#1061 Ph3b) | ace597bd |
| [#1088](https://github.com/stillwater-sc/universal/pull/1088) | Abel-summed euler_gamma (#1061 Ph3b) | 79272eb5 |

## Process Notes

- Per-phase workflow: branch off `main`, draft PR -> fast tier (gcc+clang CI_LITE)
  -> resolve CodeRabbit/Codacy -> `gh pr ready` -> full tier (11 platforms +
  ASan/UBSan + Coverage + Clang-Tidy) -> admin-squash-merge `--delete-branch` ->
  sync main -> check off the phase on #1079.
- When a phase started before the prior merged, it branched off the prior tip and
  was rebased onto `main` (`git rebase --onto main <prior-tip> <branch>`) after the
  prior squash-merge.

## Follow-on Work (tracked, not blocking)

- **Binary-splitting `euler_gamma`** -- the asymptotic `O(M(D) log^2 D)` win;
  needs einteger integration + an einteger->ZBCL bridge (see the design doc).
- **`to_hex` + high-precision decimal printer** for elreal manipulators (deferred
  in Phase 3, as in `ereal` where both are `tbd` stubs).
