|
| 1 | +# Development Session: elreal Class Facade (Epic #1079) + Constant Performance (#1061 Phase 3b) |
| 2 | + |
| 3 | +**Date:** 2026-06-18 .. 2026-06-21 |
| 4 | +**Branch:** per-phase feature branches off `main` (all merged) |
| 5 | +**Focus:** Give `elreal` (McCleeary LFPERA lazy exact-real) the standard Universal |
| 6 | +plug-in facade, and make the last two eager math constants fast. |
| 7 | +**Status:** Complete -- Epic #1079 closed; #1061 Phase 3b constant work merged. |
| 8 | + |
| 9 | +## Session Overview |
| 10 | + |
| 11 | +`elreal` is the library's lazy exact-real type: a real is a (possibly infinite) |
| 12 | +co-list of `block<FpType>` (the `ZBCL`), and arithmetic is online -- each consumer |
| 13 | +pull drives exactly as much work as the requested precision needs. Before this |
| 14 | +session it existed only as free functions over `ZBCL`. This session built the |
| 15 | +`class elreal<FpType>` facade so it plugs into templated kernels like every other |
| 16 | +number system, then closed out the remaining constant-generator performance gap. |
| 17 | + |
| 18 | +Two threads: |
| 19 | + |
| 20 | +1. **Epic #1079 -- the class facade** (Phases 1-5; Phases 1-2 landed at the very |
| 21 | + start, Phases 3-5 are the bulk of this session). Five PRs, all merged. |
| 22 | +2. **#1061 Phase 3b -- eager-constant conversion** (online `e`, Abel-summed |
| 23 | + `euler_gamma`). Two PRs, both merged. |
| 24 | + |
| 25 | +All work validated on gcc + clang (Release), gcc Debug-with-assertions for the new |
| 26 | +suites, cppcheck (Codacy gate), ASCII guard, and -- for the constants -- the |
| 27 | +320-digit mpmath oracle at `REGRESSION_LEVEL_4`. |
| 28 | + |
| 29 | +### Goals Achieved |
| 30 | + |
| 31 | +- Phase 1: facade scaffold -- `class elreal`, native ctors/conversions, lazy |
| 32 | + operators, depth-bounded comparison (#1080) |
| 33 | +- Phase 2: lazy API hardening -- memoisation regression, precision-honest approx (#1081) |
| 34 | +- Phase 3: `numeric_limits` / `attributes` / `manipulators` (#1082) |
| 35 | +- Phase 4: `mathlib.hpp` math facade -- functions + constants (#1083) |
| 36 | +- Phase 5: dedicated IEEE non-finite state + conversion/logic/arithmetic suites (#1084) |
| 37 | +- #1061 Phase 3b: online `e_zbcl` (~195x); Abel-summed `euler_gamma` (~2.3-3.1x) (#1087, #1088) |
| 38 | +- Epic #1079 closed; `docs/design/elreal-euler-gamma.md` written |
| 39 | + |
| 40 | +## Architecture Decisions |
| 41 | + |
| 42 | +### The facade is elastic, not trivial |
| 43 | + |
| 44 | +`elreal` holds a `ZBCL` (a `shared_ptr`-backed memoised co-list), so it is **not** |
| 45 | +trivially copyable. The #925 hardware-shareable triviality rule applies to the |
| 46 | +static `block`, not to this elastic facade -- same posture as `ereal`/`einteger`. |
| 47 | +The facade is modelled on `ereal_impl.hpp` (the eager Priest/Shewchuk sibling). |
| 48 | + |
| 49 | +### Three settled facade design decisions (Phase 1) |
| 50 | + |
| 51 | +| Concern | Decision | |
| 52 | +|---------|----------| |
| 53 | +| Precision | runtime `_depth` member + `.precision()` + thread-local default with an RAII `elreal_precision_guard` scoped override | |
| 54 | +| Operators | **fully lazy** -- `+ - * /` store unforced `add`/`mul_online`/`div_online` streams; evaluation happens only at a boundary (conversion / compare / I/O / explicit `approx`) | |
| 55 | +| Comparison | **depth-bounded** -- compare `a-b` to the deeper of the two depths; exact ordering when the difference's leading limb is nonzero (exact equality of distinct irrationals is undecidable) | |
| 56 | + |
| 57 | +### Non-finite policy (Phase 5) -- a deliberately reversed decision |
| 58 | + |
| 59 | +Phase 3 shipped a **finite-only** `numeric_limits` (the LFPERA model is exact over |
| 60 | +the finite reals). For Phase 5 the user chose the larger option: a **dedicated |
| 61 | +IEEE-style non-finite state** (`elreal_class {finite, pinf, ninf, qnan}` + a `_cls` |
| 62 | +member), which *revised* the Phase 3 contract. Rationale: plug-in kernels can |
| 63 | +produce `NaN`/`+-Inf` (`x/0`, `log` of a negative, conversion overflow), and a |
| 64 | +predictable IEEE classification beats silently mapping them to 0. The tag |
| 65 | +propagates by IEEE-754 rules through arithmetic, comparison (`elreal_order_of`: |
| 66 | +NaN unordered, `-inf < finite < +inf`), conversion, unary minus, and `abs`. |
| 67 | + |
| 68 | +### euler_gamma: Brent-McMillan + Abel, binary splitting deferred |
| 69 | + |
| 70 | +`euler_gamma` has no elementary series, so it uses Brent-McMillan B1 |
| 71 | +(`gamma = A/B - ln(n)`). The dominant cost was the `A = sum_k w_k H_k` |
| 72 | +accumulation. Abel summation rewrites it as `A = sum_{k>=0} tail_k/(k+1)` |
| 73 | +(`tail_k = B - B_k`), replacing the per-term full multiply with a single-block |
| 74 | +scalar division and removing the harmonic numbers entirely. The asymptotically |
| 75 | +better approach (binary splitting with `einteger` P/Q/B/T) was **investigated and |
| 76 | +deliberately deferred**: it needs an arbitrary-precision-integer dependency the |
| 77 | +elreal math layer does not have, a new `einteger -> ZBCL` bridge, and a |
| 78 | +harmonic-aware splitting tuple, with no repo precedent. Full reasoning and the |
| 79 | +algorithm are in `docs/design/elreal-euler-gamma.md`. |
| 80 | + |
| 81 | +## Notable Bugs / Gotchas Surfaced |
| 82 | + |
| 83 | +- **CodeRabbit (Critical), Phase 5** -- `to_triple(-inf)` printed `"(-, -inf)"`: |
| 84 | + the sign was emitted twice (the `(sign, ...)` prefix plus a signed tag). Fixed |
| 85 | + with an unsigned-tag option on `nonfinite_tag`; regression checks added. |
| 86 | +- **`a + (-a)` is value-zero but not `iszero()`** -- lazy exact cancellation yields |
| 87 | + a zero-*valued* stream, not the structurally-empty canonical zero. `iszero()` is |
| 88 | + a cheap structural test (the right semantics for a semi-decidable lazy real), so |
| 89 | + it is correctly `false` there. A test assertion was relaxed and documented. |
| 90 | +- **cppcheck `duplicateCondition` folds mirror operators** -- `a<b` and `b>a` |
| 91 | + normalise to the same condition, so the logic suite's `if (...) ++n;` checks read |
| 92 | + as duplicates. Rewrote as `n += !(...)` accumulation statements on opaque |
| 93 | + (volatile-seeded) operands -- exercises every operator with no duplicate |
| 94 | + if-conditions. |
| 95 | +- **Codacy `noExplicitConstructor` counts per-instantiation/per-ctor-edit** -- the |
| 96 | + intentional implicit plug-in ctors (same as ereal/cfloat/posit) get re-counted |
| 97 | + when a new instantiation appears (first `elreal<float>` test) or the ctor region |
| 98 | + is edited. Accepted as non-blocking (Codacy is not a required check); making them |
| 99 | + `explicit` would break plug-in semantics. |
| 100 | +- **The 320-digit constant oracle is `REGRESSION_LEVEL_4`-gated** -- a default |
| 101 | + (`LEVEL_1`) build compiles the high-precision check out and the test "PASSes" as |
| 102 | + a no-op. The euler_gamma cancellation analysis was only truly validated after |
| 103 | + rebuilding with `-DREGRESSION_LEVEL_4=1` (305 digits, gcc + clang). Recorded so |
| 104 | + the next person is not fooled by the no-op pass. |
| 105 | +- **Eager constants were pathologically slow, not the functions** -- profiling the |
| 106 | + Phase 4 facade showed every transcendental *function* < 100 ms, but `e_zbcl(32)` |
| 107 | + ~16 s and `euler_gamma_zbcl` ~11 s, because those two *constants* had not been |
| 108 | + moved onto the online series path. (`exp(1)` via the online path is 1 ms.) |
| 109 | + |
| 110 | +## Performance Results |
| 111 | + |
| 112 | +| Item | Before | After | Speedup | |
| 113 | +|------|-------:|------:|--------:| |
| 114 | +| `e_zbcl(16)` | 16165 ms | 83 ms | ~195x | |
| 115 | +| `euler_gamma(8)` | 4408 ms | 1890 ms | ~2.3x | |
| 116 | +| `euler_gamma(16)` | 10986 ms | 3578 ms | ~3.1x | |
| 117 | + |
| 118 | +All value-identical to the 320-digit reference (e: 307 digits, euler_gamma: 305). |
| 119 | + |
| 120 | +## Pull Requests |
| 121 | + |
| 122 | +| PR | Title | Merge | |
| 123 | +|----|-------|-------| |
| 124 | +| [#1080](https://github.com/stillwater-sc/universal/pull/1080) | facade scaffold (#1079 Phase 1) | 22857fb7 | |
| 125 | +| [#1081](https://github.com/stillwater-sc/universal/pull/1081) | lazy API hardening (#1079 Phase 2) | d679ebe8 | |
| 126 | +| [#1082](https://github.com/stillwater-sc/universal/pull/1082) | numeric_limits/attributes/manipulators (#1079 Phase 3) | 1ccdf224 | |
| 127 | +| [#1083](https://github.com/stillwater-sc/universal/pull/1083) | math facade (#1079 Phase 4) | 6444f6b2 | |
| 128 | +| [#1084](https://github.com/stillwater-sc/universal/pull/1084) | non-finite state + suites (#1079 Phase 5) | 823dff27 | |
| 129 | +| [#1087](https://github.com/stillwater-sc/universal/pull/1087) | online e_zbcl + drop gamma peak pass (#1061 Ph3b) | ace597bd | |
| 130 | +| [#1088](https://github.com/stillwater-sc/universal/pull/1088) | Abel-summed euler_gamma (#1061 Ph3b) | 79272eb5 | |
| 131 | + |
| 132 | +## Process Notes |
| 133 | + |
| 134 | +- Per-phase workflow: branch off `main`, draft PR -> fast tier (gcc+clang CI_LITE) |
| 135 | + -> resolve CodeRabbit/Codacy -> `gh pr ready` -> full tier (11 platforms + |
| 136 | + ASan/UBSan + Coverage + Clang-Tidy) -> admin-squash-merge `--delete-branch` -> |
| 137 | + sync main -> check off the phase on #1079. |
| 138 | +- When a phase started before the prior merged, it branched off the prior tip and |
| 139 | + was rebased onto `main` (`git rebase --onto main <prior-tip> <branch>`) after the |
| 140 | + prior squash-merge. |
| 141 | + |
| 142 | +## Follow-on Work (tracked, not blocking) |
| 143 | + |
| 144 | +- **Binary-splitting `euler_gamma`** -- the asymptotic `O(M(D) log^2 D)` win; |
| 145 | + needs einteger integration + an einteger->ZBCL bridge (see the design doc). |
| 146 | +- **`to_hex` + high-precision decimal printer** for elreal manipulators (deferred |
| 147 | + in Phase 3, as in `ereal` where both are `tbd` stubs). |
0 commit comments