docs: session wrapup -- elreal facade (#1079) + constant perf (#1061 Ph3b) (#1089)

Ravenwater · web-flow · commit ea1a38623ead · 2026-06-21T21:22:22.000-04:00
CHANGELOG: elreal class facade epic (#1079, Phases 1-5, closed) under Added; online-e / Abel-euler_gamma constant perf (#1061 Phase 3b, #1087/#1088) under Changed. docs/sessions/2026-06-21_elreal_facade_and_constant_perf.md: session log. Docs-only.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -9,6 +9,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+* **elreal class facade -- plug-in arithmetic number system + lazy state-machine API (Epic [#1079] CLOSED)** -- `elreal` (McCleeary LFPERA lazy exact-real over the `ZBCL<FpType>` block co-list) previously had only free functions; it now has a `class elreal<FpType=double>` with the standard Universal facade, so it drops into templated/plug-in kernels like every other number system while keeping its lazy incremental-precision edge.  Five phases:
+  - **Phase 1 -- facade scaffold** ([#1080](https://github.com/stillwater-sc/universal/pull/1080)) -- `elreal_impl.hpp` (native ctors, conversions, fully-lazy `+ - * /` storing unforced streams, unary `-`, free operators, depth-bounded `== < ...`, `abs`/`fabs`), `traits/elreal_traits.hpp`, `elreal_fwd.hpp` + umbrella + aliases `elreal64`/`elreal32`.  Design decisions: runtime `_depth` member + `.precision()` + thread-local default with an RAII `elreal_precision_guard`; evaluation forced only at a boundary (conversion / compare / I/O); comparison is depth-bounded (exact ordering when the difference's leading limb is nonzero -- exact equality of distinct irrationals is undecidable)
+  - **Phase 2 -- lazy API hardening** ([#1081](https://github.com/stillwater-sc/universal/pull/1081)) -- counter-instrumented memoisation regression (pull depth `d` then `d+1`, assert the tail is reused not recomputed); precision-honest `approx<T>` summed *in* `T`
+  - **Phase 3 -- traits / limits / manipulators** ([#1082](https://github.com/stillwater-sc/universal/pull/1082)) -- `numeric_limits` (precision-parametrised against the nominal default), `attributes.hpp` (`sign` / `scale` returning `int64_t` to avoid narrowing the unbounded `integer<256>` exponent / `significand`), `manipulators.hpp` (`type_tag` / `to_components` / `to_binary` / `to_triple` emitting the significand in `[1,2)` / `operator<<` / `operator>>`).  CodeRabbit fixes: `to_triple` significand, `denorm_min()==min()` under `denorm_absent`
+  - **Phase 4 -- math facade** ([#1083](https://github.com/stillwater-sc/universal/pull/1083)) -- `mathlib.hpp` lifts the ZBCL-level `math/*.hpp` to class `elreal`: unary `sqrt`/`exp`/`log`/`sin`/`cos`/`tan`/`asin`/`acos`/`atan`/`sinh`/`cosh`/`tanh` (refine to the operand's `precision()`), binary `pow`/`hypot`, and `elreal_pi`/`e`/`ln2`/`ln10`/`log2_10`/`sqrt2`/`sqrt3`/`sqrt5`/`phi`/`euler_gamma` constants
+  - **Phase 5 -- non-finite state + conversion/logic/arithmetic suites** ([#1084](https://github.com/stillwater-sc/universal/pull/1084)) -- per the settled policy decision, class `elreal` gained a dedicated IEEE-style non-finite classification (`elreal_class {finite, pinf, ninf, qnan}` + `_cls` member; this revised the Phase 3 finite-only `numeric_limits`).  `operator=(double)` classifies `NaN`/`+-Inf`; the tag propagates by IEEE-754 rules through arithmetic (`inf+finite=inf`, `inf-inf=nan`, `inf*0=nan`, `x/0=+-inf`, `0/0=nan`, `finite/inf=0`, NaN propagates), comparison (new `elreal_order_of`: NaN unordered, `-inf < finite < +inf`), conversion, unary minus, and `abs`.  `numeric_limits` `has_infinity`/`has_quiet_NaN` now true; free `isnan`/`isinf`/`isfinite`/`signbit`; manipulators render `nan`/`+-inf`.  Class-level `conversion/`, `logic/`, `arithmetic/` suites + `el_conversion`/`el_logic` CMake targets ([#1079](https://github.com/stillwater-sc/universal/issues/1079))
 * **bfloat16 exponent-manipulation functions** -- `ldexp`, `frexp`, `scalbn`, `logb` (returns `bfloat16`) and `ilogb` (returns `int`) added to `bfloat16/math/functions/exponent.hpp`, completing the `<cmath>` exponent family that `cfloat<>` already exposed.  They marshal through `float` rather than `double`: a bfloat16 is the high 16 bits of an IEEE float32 and shares its 8-bit exponent field, so `bfloat16 -> float` is exact, covers bfloat16's entire exponent range, and is cheaper than widening to double (per PR #1015 review).  New `static/float/bfloat16/math/exponent.cpp` verifies the functions by their mathematical properties (frexp/ldexp roundtrip, `|fraction|` in `[0.5,1)`, `scalbn == ldexp`, `ilogb == frexp_exp-1`) plus IEEE special cases (`+/-0`, `+/-inf`, `NaN`).  Unblocks adding bfloat16 to the elreal Phase 4 FpType sweep ([#941](https://github.com/stillwater-sc/universal/issues/941), [#1015](https://github.com/stillwater-sc/universal/pull/1015), [#1017](https://github.com/stillwater-sc/universal/pull/1017))
 * **ereal parse-cost benchmark** -- `benchmark/performance/arithmetic/ereal/parse.cpp` tracks `ereal<2|8|19>::parse` cost across 32..440-digit strings with a catastrophic-regression guard (fails only if a 320-digit parse exceeds 2 s, vs the >120 s blowup it guards against and the ~5 ms actual cost), and counts any `parse()` failures on valid input.  Closes the remaining acceptance item from parse-complexity issue #913 ([#1013](https://github.com/stillwater-sc/universal/issues/1013), [#1016](https://github.com/stillwater-sc/universal/pull/1016))
 * **Epic [#835] CLOSED -- decimal string parsing API across all number systems** -- the elastic family (`einteger`, `edecimal`, `erational`, `efloat`, `ereal`) now has a working `parse()` API that accepts decimal, scientific notation, and -- where applicable -- hex / binary / octal / p/q forms, plus `nan` / `inf` / `infinity` token routing.  `operator>>` consistently sets `failbit` on parse failure and guards against extraction failure across every type.  Foundations were laid earlier ([#838](https://github.com/stillwater-sc/universal/issues/838) `scan_decimal_float`, [#841](https://github.com/stillwater-sc/universal/issues/841) `decimal_to_binary::convert` with mantissa + binary_scale + guard/sticky, [#848](https://github.com/stillwater-sc/universal/issues/848)/[#851](https://github.com/stillwater-sc/universal/issues/851) distillation algorithm).  Per-number-system PRs landed in sequence:
@@ -62,6 +68,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Changed
 
+* **elreal math-constant performance -- online `e`, faster `euler_gamma` ([#1061] Phase 3b)** -- the two remaining eager constant generators in the elreal math layer (surfaced while profiling the Phase 4 math facade: `e` ~16 s, `euler_gamma` ~11 s at depth 16).
+  - **`e_zbcl` made online** ([#1087](https://github.com/stillwater-sc/universal/pull/1087)) -- `e = sum 1/n!` has the same shape as the already-online exp/atan series, so it now uses the streaming form (a lazy term co-list `term_n = term_{n-1}/n` via exact integer div, each term significance-windowed with `take_while_above`, folded by `infsum`).  **16165 ms -> 83 ms at depth 16 (~195x)**, value-identical to the 320-digit reference.  The same PR dropped `euler_gamma`'s redundant Pass-1 peak-finding recurrence in favour of the analytic Stirling peak `2n*log2(e) - log2(2*pi*n)`, and re-enabled the `elreal_e()` check in the Phase 4 math test now that e is fast
+  - **`euler_gamma_zbcl` Abel reduction** ([#1088](https://github.com/stillwater-sc/universal/pull/1088)) -- Brent-McMillan B1's `A(n) = sum_k w_k H_k` (a full per-term `ZBCL x ZBCL` multiply + per-term harmonic maintenance) was replaced by Abel summation `sum_k w_k H_k == sum_{k>=0} tail_k/(k+1)` with `tail_k = sum_{j>k} w_j = B - B_k`: all-positive accumulation, no harmonic numbers, per-term cost now a single-block scalar division instead of a full multiply.  **depth 8: 4408 -> 1890 ms (~2.3x); depth 16: 10986 -> 3578 ms (~3.1x)**, validated to 305 digits vs the 320-digit reference (`REGRESSION_LEVEL_4`) on gcc + clang.  Adds `docs/design/elreal-euler-gamma.md` (algorithm, derivation, the convergent-vs-naive cancellation analysis, and the deferred binary-splitting alternative).  Brent-McMillan remains inherently O(n) products; the asymptotic `O(M(D) log^2 D)` binary-splitting rewrite (einteger P/Q/B/T) is tracked as a separate effort ([#1061](https://github.com/stillwater-sc/universal/issues/1061))
 * **CI build time restored (~20 min -> ~8 min) -- ccache eviction** -- the cmake CI `ccache` was keyed per-`matrix.artifact` with no `save` gating, so every PR run saved a ~500 MB cache; under GitHub's 10 GB per-repo LRU limit a burst of CI evicted the warm caches, dropping the hit rate to ~1% and turning cached ~1 min builds into ~9 min cold rebuilds.  Fix: `save: only on pushes to main` (PR runs restore but no longer create caches) plus `max-size: 1G`, on both ccache blocks.  Diagnosed from the build-step duration (55 s -> 566 s) and ccache hit/miss stats; not a code regression (CI_LITE does not even build ereal) ([#1009](https://github.com/stillwater-sc/universal/issues/1009), [#1010](https://github.com/stillwater-sc/universal/pull/1010))
 * **ereal mathlib regression tests tiered by level** -- the property-fuzz ran a heavy fixed count (x100/x50) only at L1, the sanity tier that CI's Debug-instrumented jobs (ASan/UBSan/coverage, ~40x slower at `-O0`) run -- so those jobs took ~1 h.  The fuzz count now scales with the regression level (L1 smoke x15-20, up to x2000 at L4), keeping CI fast while preserving (and extending) stress-tier coverage; hand-curated correctness tests unchanged ([#1007](https://github.com/stillwater-sc/universal/issues/1007), [#1008](https://github.com/stillwater-sc/universal/pull/1008))
 * **Epic [#723] CLOSED -- constexpr support across Universal number systems** -- the umbrella Epic tracking full constexpr promotion across the library is complete: all 32 sub-issues (5 Tier-1 primary types, 22 Tier-2 additional fixed-size types, 5 Tier-3 elastic types) are closed.  Final close-out: sub-Epic [#745](https://github.com/stillwater-sc/universal/issues/745) (zfpblock umbrella) closed after PRs [#814](https://github.com/stillwater-sc/universal/pull/814) (accessor subset), [#830](https://github.com/stillwater-sc/universal/pull/830) (codec), and [#831](https://github.com/stillwater-sc/universal/pull/831) (zfparray container) landed.  Universal now fulfills the "plug-in" promise: any expression in any fixed-size Universal type can be evaluated at compile time, drop-in parity with `int`/`float`/`double` in any constexpr context.  Cross-cutting prerequisites all complete: `blockbinary` arithmetic (#716), `blocksignificand`/`blocktriple` arithmetic (#718/#719), `blockdecimal` arithmetic (#729/#730), `floatcascade` (#728/#739/#742), `twoSum`/`twoProd` (#727/#738), `sw::math::constexpr_math` providing `cm::log2`/`cm::exp2` (Epic #763, replacing the original #423)
diff --git a/docs/sessions/2026-06-21_elreal_facade_and_constant_perf.md b/docs/sessions/2026-06-21_elreal_facade_and_constant_perf.md
@@ -0,0 +1,147 @@
+# Development Session: elreal Class Facade (Epic #1079) + Constant Performance (#1061 Phase 3b)
+
+**Date:** 2026-06-18 .. 2026-06-21
+**Branch:** per-phase feature branches off `main` (all merged)
+**Focus:** Give `elreal` (McCleeary LFPERA lazy exact-real) the standard Universal
+plug-in facade, and make the last two eager math constants fast.
+**Status:** Complete -- Epic #1079 closed; #1061 Phase 3b constant work merged.
+
+## Session Overview
+
+`elreal` is the library's lazy exact-real type: a real is a (possibly infinite)
+co-list of `block<FpType>` (the `ZBCL`), and arithmetic is online -- each consumer
+pull drives exactly as much work as the requested precision needs. Before this
+session it existed only as free functions over `ZBCL`. This session built the
+`class elreal<FpType>` facade so it plugs into templated kernels like every other
+number system, then closed out the remaining constant-generator performance gap.
+
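The memoised, pull-driven behaviour described above can be illustrated with a toy
stream. This is a minimal sketch, not the library's `ZBCL`: `sketch::lazy_stream`,
`force`, and `evaluations` are hypothetical names, and the terms are plain `double`s
rather than `block<FpType>`. It shows the property the Phase 2 regression checks:
pulling depth `d` and then `d+1` computes exactly one new term.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical memoised lazy stream: term k is produced by gen(k) on the
// first pull that needs it, then served from the shared cache afterwards.
class lazy_stream {
public:
    explicit lazy_stream(std::function<double(std::size_t)> gen)
        : state_(std::make_shared<state>(state{std::move(gen), {}, 0})) {
        assert(state_->gen);  // a generator must be supplied
    }

    // Force at least the first `depth` terms; only missing terms are computed.
    // The returned cache may be longer than `depth` if a deeper pull came first.
    const std::vector<double>& force(std::size_t depth) const {
        while (state_->cache.size() < depth) {
            state_->cache.push_back(state_->gen(state_->cache.size()));
            ++state_->evaluations;
        }
        return state_->cache;
    }

    // Counter instrumentation: how many terms were actually generated.
    std::size_t evaluations() const { return state_->evaluations; }

private:
    struct state {
        std::function<double(std::size_t)> gen;
        std::vector<double> cache;
        std::size_t evaluations;
    };
    std::shared_ptr<state> state_;  // copies share one memo table
};

}  // namespace sketch
```

Because the state sits behind a `shared_ptr`, copying a `lazy_stream` shares the
cache, which is also why a facade built this way cannot be trivially copyable.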
+Two threads:
+
+1. **Epic #1079 -- the class facade** (Phases 1-5; Phases 1-2 landed at the very
+   start, Phases 3-5 are the bulk of this session). Five PRs, all merged.
+2. **#1061 Phase 3b -- eager-constant conversion** (online `e`, Abel-summed
+   `euler_gamma`). Two PRs, both merged.
+
+All work was validated on gcc + clang (Release), gcc Debug-with-assertions for the new
+suites, cppcheck (Codacy gate), the ASCII guard, and -- for the constants -- the
+320-digit mpmath oracle at `REGRESSION_LEVEL_4`.
+
+### Goals Achieved
+
+- Phase 1: facade scaffold -- `class elreal`, native ctors/conversions, lazy
+  operators, depth-bounded comparison (#1080)
+- Phase 2: lazy API hardening -- memoisation regression, precision-honest approx (#1081)
+- Phase 3: `numeric_limits` / `attributes` / `manipulators` (#1082)
+- Phase 4: `mathlib.hpp` math facade -- functions + constants (#1083)
+- Phase 5: dedicated IEEE non-finite state + conversion/logic/arithmetic suites (#1084)
+- #1061 Phase 3b: online `e_zbcl` (~195x); Abel-summed `euler_gamma` (~2.3-3.1x) (#1087, #1088)
+- Epic #1079 closed; `docs/design/elreal-euler-gamma.md` written
+
+## Architecture Decisions
+
+### The facade is elastic, not trivial
+
+`elreal` holds a `ZBCL` (a `shared_ptr`-backed memoised co-list), so it is **not**
+trivially copyable. The #925 hardware-shareable triviality rule applies to the
+static `block`, not to this elastic facade -- same posture as `ereal`/`einteger`.
+The facade is modelled on `ereal_impl.hpp` (the eager Priest/Shewchuk sibling).
+
+### Three settled facade design decisions (Phase 1)
+
+| Concern | Decision |
+|---------|----------|
+| Precision | runtime `_depth` member + `.precision()` + thread-local default with an RAII `elreal_precision_guard` scoped override |
+| Operators | **fully lazy** -- `+ - * /` store unforced `add`/`mul_online`/`div_online` streams; evaluation happens only at a boundary (conversion / compare / I/O / explicit `approx`) |
+| Comparison | **depth-bounded** -- compare `a-b` to the deeper of the two depths; the ordering is exact when the difference's leading limb is nonzero (equality of two lazy reals is undecidable in general, so it can only be confirmed up to the chosen depth) |
+
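The precision row of the table (a thread-local default with a scoped RAII override)
can be sketched as follows. The source names `elreal_precision_guard` but does not
show its signature, so everything here is an assumption for illustration:
`sketch::default_depth`, `sketch::precision_guard`, `sketch::lazy_value`, and the
initial default of 4 are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for the thread-local default depth.
inline std::size_t& default_depth() {
    thread_local std::size_t depth = 4;  // assumed initial default
    return depth;
}

// RAII scoped override: sets the default on entry, restores it on exit.
class precision_guard {
public:
    explicit precision_guard(std::size_t depth) : saved_(default_depth()) {
        assert(depth > 0);
        default_depth() = depth;
    }
    ~precision_guard() { default_depth() = saved_; }

    precision_guard(const precision_guard&) = delete;
    precision_guard& operator=(const precision_guard&) = delete;

private:
    std::size_t saved_;
};

// Toy value type: captures the current default as its runtime depth member.
struct lazy_value {
    std::size_t depth = default_depth();
    std::size_t precision() const { return depth; }
};

}  // namespace sketch
```

A value constructed inside the guard's scope picks up the overridden depth; values
constructed before or after keep the thread's default, and other threads are
unaffected because the default is `thread_local`.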
+### Non-finite policy (Phase 5) -- a deliberately reversed decision
+
+Phase 3 shipped a **finite-only** `numeric_limits` (the LFPERA model is exact over
+the finite reals). For Phase 5 the user chose the larger option: a **dedicated
+IEEE-style non-finite state** (`elreal_class {finite, pinf, ninf, qnan}` + a `_cls`
+member), which *revised* the Phase 3 contract. Rationale: plug-in kernels can
+produce `NaN`/`+-Inf` (`x/0`, `log` of a negative, conversion overflow), and a
+predictable IEEE classification beats silently mapping them to 0. The tag
+propagates by IEEE-754 rules through arithmetic, comparison (`elreal_order_of`:
+NaN unordered, `-inf < finite < +inf`), conversion, unary minus, and `abs`.
+
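The tag propagation rules listed in the CHANGELOG entry (`inf+finite=inf`,
`inf-inf=nan`, `inf*0=nan`, `x/0=+-inf`, `0/0=nan`, `finite/inf=0`) can be sketched
on the class tags alone. This is an illustrative model, not the `elreal` code: the
enumerator names follow the text, while `operand`, `add_class`, `mul_class`, and
`div_class` are hypothetical, and zero is assumed to be unsigned (treated as `+0`).

```cpp
#include <cassert>

namespace sketch {

enum class cls { finite, pinf, ninf, qnan };  // enumerator names as in the text

// Hypothetical operand summary: class tag plus the sign of a finite value
// (-1, 0, +1). Treating zero as unsigned is an assumption of this sketch.
struct operand {
    cls c;
    int sgn;  // only meaningful when c == cls::finite
};

inline bool is_inf(const operand& x) { return x.c == cls::pinf || x.c == cls::ninf; }
inline bool is_nan(const operand& x) { return x.c == cls::qnan; }
inline cls inf_with_sign(int s) { return s < 0 ? cls::ninf : cls::pinf; }

inline int sign_of(const operand& x) {
    if (x.c == cls::pinf) return +1;
    if (x.c == cls::ninf) return -1;
    assert(x.sgn >= -1 && x.sgn <= 1);
    return x.sgn;
}

inline cls add_class(const operand& a, const operand& b) {
    if (is_nan(a) || is_nan(b)) return cls::qnan;                     // NaN propagates
    if (is_inf(a) && is_inf(b)) return a.c == b.c ? a.c : cls::qnan;  // inf-inf=nan
    if (is_inf(a)) return a.c;                                        // inf+finite=inf
    if (is_inf(b)) return b.c;
    return cls::finite;
}

inline cls mul_class(const operand& a, const operand& b) {
    if (is_nan(a) || is_nan(b)) return cls::qnan;
    if (is_inf(a) || is_inf(b)) {
        const int s = sign_of(a) * sign_of(b);
        return s == 0 ? cls::qnan : inf_with_sign(s);  // inf*0=nan
    }
    return cls::finite;
}

inline cls div_class(const operand& a, const operand& b) {
    if (is_nan(a) || is_nan(b)) return cls::qnan;
    if (is_inf(a)) {
        if (is_inf(b)) return cls::qnan;                           // inf/inf=nan
        return inf_with_sign(sign_of(a) * (b.sgn < 0 ? -1 : +1));  // zero counts as +0
    }
    if (is_inf(b)) return cls::finite;                             // finite/inf=0
    if (b.sgn == 0) return a.sgn == 0 ? cls::qnan : inf_with_sign(a.sgn);  // 0/0, x/0
    return cls::finite;
}

}  // namespace sketch
```

Keeping the classification separate from the value stream means a non-finite
result never has to force the lazy co-list to be detected.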
+### euler_gamma: Brent-McMillan + Abel, binary splitting deferred
+
+`euler_gamma` has no rapidly convergent elementary series of the kind `e` has, so
+it uses Brent-McMillan B1 (`gamma = A/B - ln(n)`, with `w_k = (n^k/k!)^2`,
+`B = sum_k w_k`, and `H_k` the k-th harmonic number). The dominant cost was the
+`A = sum_k w_k H_k` accumulation. Abel summation rewrites it as
+`A = sum_{k>=0} tail_k/(k+1)` (`tail_k = B - B_k`, where `B_k` is the partial sum
+`w_0 + ... + w_k`), replacing the per-term full multiply with a single-block
+scalar division and removing the harmonic numbers entirely. The asymptotically
+better approach (binary splitting with `einteger` P/Q/B/T) was **investigated and
+deliberately deferred**: it needs an arbitrary-precision-integer dependency the
+elreal math layer does not have, a new `einteger -> ZBCL` bridge, and a
+harmonic-aware splitting tuple, with no repo precedent. Full reasoning and the
+algorithm are in `docs/design/elreal-euler-gamma.md`.
+
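The Abel rewrite can be checked numerically in ordinary `double` arithmetic. The
sketch below is not the `ZBCL` implementation: `sketch::brent_mcmillan_b1` and
`sketch::gamma_check` are hypothetical names, and the series is truncated at a
caller-chosen `K` on the assumption that the remaining weights are negligible. It
evaluates `A` both ways (harmonic form and tail form) and forms `A/B - ln(n)`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

namespace sketch {

struct gamma_check {
    double a_direct;  // A = sum_k w_k * H_k
    double a_abel;    // A = sum_k tail_k / (k+1), with tail_k = B - B_k
    double gamma;     // A/B - ln(n), using the Abel form of A
};

// Brent-McMillan B1 with weights w_k = (n^k / k!)^2, truncated at k = K.
inline gamma_check brent_mcmillan_b1(int n, int K) {
    assert(n > 0 && K > 0);
    std::vector<double> w(static_cast<std::size_t>(K) + 1);
    w[0] = 1.0;
    double t = 1.0;  // running n^k / k!
    double B = 1.0;  // running sum of the weights
    for (int k = 1; k <= K; ++k) {
        t *= static_cast<double>(n) / k;
        w[k] = t * t;
        B += w[k];
    }

    // Direct form: maintain the harmonic number H_k and multiply per term.
    double H = 0.0, a_direct = 0.0;
    for (int k = 1; k <= K; ++k) {
        H += 1.0 / k;
        a_direct += w[k] * H;
    }

    // Abel form: no harmonic numbers, one scalar division per term.
    double Bk = 0.0, a_abel = 0.0;
    for (int k = 0; k < K; ++k) {
        Bk += w[k];                      // B_k = w_0 + ... + w_k
        a_abel += (B - Bk) / (k + 1);    // tail_k / (k+1), every term >= 0
    }

    return {a_direct, a_abel, a_abel / B - std::log(static_cast<double>(n))};
}

}  // namespace sketch
```

For a finite truncation the two forms are algebraically identical, so any
difference between `a_direct` and `a_abel` is rounding only. With `n = 8` and
`K = 80`, the resulting `gamma` should land close to 0.5772156649.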
+## Notable Bugs / Gotchas Surfaced
+
+- **CodeRabbit (Critical), Phase 5** -- `to_triple(-inf)` printed `"(-, -inf)"`:
+  the sign was emitted twice (the `(sign, ...)` prefix plus a signed tag). Fixed
+  with an unsigned-tag option on `nonfinite_tag`; regression checks added.
+- **`a + (-a)` is value-zero but not `iszero()`** -- lazy exact cancellation yields
+  a zero-*valued* stream, not the structurally-empty canonical zero. `iszero()` is
+  a cheap structural test (the right semantics for a semi-decidable lazy real), so
+  it is correctly `false` there. A test assertion was relaxed and documented.
+- **cppcheck `duplicateCondition` folds mirror operators** -- `a<b` and `b>a`
+  normalise to the same condition, so the logic suite's `if (...) ++n;` checks read
+  as duplicates. The checks were rewritten as `n += !(...)` accumulation statements
+  on opaque (volatile-seeded) operands, which exercises every operator with no
+  duplicate if-conditions.
+- **Codacy `noExplicitConstructor` counts per-instantiation/per-ctor-edit** -- the
+  intentional implicit plug-in ctors (same as ereal/cfloat/posit) get re-counted
+  when a new instantiation appears (first `elreal<float>` test) or the ctor region
+  is edited. Accepted as non-blocking (Codacy is not a required check); making them
+  `explicit` would break plug-in semantics.
+- **The 320-digit constant oracle is `REGRESSION_LEVEL_4`-gated** -- a default
+  (`LEVEL_1`) build compiles the high-precision check out and the test "PASSes" as
+  a no-op. The euler_gamma cancellation analysis was only truly validated after
+  rebuilding with `-DREGRESSION_LEVEL_4=1` (305 digits, gcc + clang). Recorded so
+  the next person is not fooled by the no-op pass.
+- **Eager constants were pathologically slow, not the functions** -- profiling the
+  Phase 4 facade showed every transcendental *function* under 100 ms, but `e_zbcl(16)`
+  at ~16 s and `euler_gamma_zbcl` at ~11 s (depth 16), because those two *constants*
+  had not been moved onto the online series path. (`exp(1)` via the online path
+  takes 1 ms.)
+
+## Performance Results
+
+| Item | Before | After | Speedup |
+|------|-------:|------:|--------:|
+| `e_zbcl(16)` | 16165 ms | 83 ms | ~195x |
+| `euler_gamma(8)` | 4408 ms | 1890 ms | ~2.3x |
+| `euler_gamma(16)` | 10986 ms | 3578 ms | ~3.1x |
+
+All results match the 320-digit reference to the validated digit counts (`e`: 307 digits, `euler_gamma`: 305 digits).
+
+## Pull Requests
+
+| PR | Title | Merge |
+|----|-------|-------|
+| [#1080](https://github.com/stillwater-sc/universal/pull/1080) | facade scaffold (#1079 Phase 1) | 22857fb7 |
+| [#1081](https://github.com/stillwater-sc/universal/pull/1081) | lazy API hardening (#1079 Phase 2) | d679ebe8 |
+| [#1082](https://github.com/stillwater-sc/universal/pull/1082) | numeric_limits/attributes/manipulators (#1079 Phase 3) | 1ccdf224 |
+| [#1083](https://github.com/stillwater-sc/universal/pull/1083) | math facade (#1079 Phase 4) | 6444f6b2 |
+| [#1084](https://github.com/stillwater-sc/universal/pull/1084) | non-finite state + suites (#1079 Phase 5) | 823dff27 |
+| [#1087](https://github.com/stillwater-sc/universal/pull/1087) | online e_zbcl + drop gamma peak pass (#1061 Ph3b) | ace597bd |
+| [#1088](https://github.com/stillwater-sc/universal/pull/1088) | Abel-summed euler_gamma (#1061 Ph3b) | 79272eb5 |
+
+## Process Notes
+
+- Per-phase workflow: branch off `main`, draft PR -> fast tier (gcc+clang CI_LITE)
+  -> resolve CodeRabbit/Codacy -> `gh pr ready` -> full tier (11 platforms +
+  ASan/UBSan + Coverage + Clang-Tidy) -> admin-squash-merge `--delete-branch` ->
+  sync main -> check off the phase on #1079.
+- When a phase started before the prior phase had merged, it branched off the prior
+  phase's tip and was rebased onto `main`
+  (`git rebase --onto main <prior-tip> <branch>`) after the prior squash-merge.
+
+## Follow-on Work (tracked, not blocking)
+
+- **Binary-splitting `euler_gamma`** -- the asymptotic `O(M(D) log^2 D)` win;
+  needs einteger integration + an einteger->ZBCL bridge (see the design doc).
+- **`to_hex` + high-precision decimal printer** for elreal manipulators (deferred
+  in Phase 3, as in `ereal` where both are `tbd` stubs).