perf(elreal): Phase L.1 -- depth-1 refinement for division by Ravenwater · Pull Request #917 · stillwater-sc/universal

Ravenwater · 2026-05-22T03:05:41Z

Summary

Phase L.1 of follow-up epic #903. Lifts the "elreal / is depth-0 only" caveat that has held since Phase G.

Algorithm

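Let `r = a/b` be the true value, with `a = a0 + Δa` and `b = b0 + Δb` split into leading doubles `a0`, `b0` and tails `Δa`, `Δb`, and let `c0 = fl(a0/b0)` be the IEEE-rounded leading quotient. Expanding at the leading doubles:

```
r = c0 + Δa/b0 - (a0/b0^2) Δb + (a0 - b0*c0)/b0 + O(eps^2)
```

where the IEEE residual `a0 - b0*c0` is recoverable exactly from the leading doubles via EFTs (error-free transformations):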

```
two_prod(b0, c0) -> (prod_hi, prod_err) with b0*c0 = prod_hi + prod_err
two_diff(a0, prod_hi) -> (diff_hi, diff_err)
ieee_residual = (diff_hi + diff_err) - prod_err
```
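The three EFT steps above can be sketched in standalone C++. This is an illustration only, not the code in `elreal_impl.hpp`: the `eft_pair` struct, the free-function signatures, and the FMA-based `two_prod` are assumptions, and the library's own EFT helpers may be shaped differently.

```cpp
#include <cmath>  // std::fma

// Hypothetical result type: the exact value is hi + err.
struct eft_pair {
    double hi;
    double err;
};

// Error-free product: a*b == hi + err (barring overflow/underflow).
// Assumption: an FMA-based variant; a Dekker split would also work.
inline eft_pair two_prod(double a, double b) {
    const double hi = a * b;
    const double err = std::fma(a, b, -hi);  // exact rounding error of a*b
    return {hi, err};
}

// Error-free difference (branch-free two-diff): a - b == hi + err.
inline eft_pair two_diff(double a, double b) {
    const double hi = a - b;
    const double bvirt = a - hi;
    const double avirt = hi + bvirt;
    const double bround = bvirt - b;
    const double around = a - avirt;
    return {hi, around + bround};
}

// Residual a0 - b0*c0 of the rounded quotient c0 = a0/b0,
// following the three steps listed above.
inline double ieee_residual(double a0, double b0, double c0) {
    const eft_pair prod = two_prod(b0, c0);
    const eft_pair diff = two_diff(a0, prod.hi);
    return (diff.hi + diff.err) - prod.err;
}
```

For `a0 = 1.0`, `b0 = 3.0`, the product `3 * c0` rounds to exactly `1.0` with `prod_err = -2^-54`, so the residual is `2^-54` and dividing it by `b0` gives the `1.85e-17` quoted in the Validation section.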

The depth-1 component is then `c1 = (ieee_residual + a.at(1) - c0 * b.at(1)) / b0`, which fits the existing `gen_binary_linear` variant alternative with `constant = ieee_residual / b0`, `ca = 1/b0`, `cb = -c0/b0`. No new generator shape needed; just populate `gen_binary_linear` in `operator/`.
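As a rough sketch of how the three coefficients combine, the snippet below evaluates `c1 = constant + ca * a.at(1) + cb * b.at(1)` on plain doubles. The `linear_coeffs` struct and both function names are hypothetical stand-ins, not the library's `gen_binary_linear` layout. It also swaps the `two_prod`/`two_diff` sequence for a single `std::fma`, which yields the same residual here because `a0 - b0*c0` is exactly representable when nothing underflows.

```cpp
#include <cmath>  // std::fma

// Hypothetical coefficient payload for a linear depth-1 generator:
// c1 = constant + ca * a1 + cb * b1.
struct linear_coeffs {
    double constant;
    double ca;
    double cb;
};

// Coefficients for the division a/b, built from the leading doubles only.
inline linear_coeffs division_depth1_coeffs(double a0, double b0) {
    const double c0 = a0 / b0;                           // depth-0 quotient
    const double ieee_residual = std::fma(-b0, c0, a0);  // a0 - b0*c0, one rounding
    return {ieee_residual / b0, 1.0 / b0, -c0 / b0};
}

// Depth-1 component given the operands' depth-1 components a1 and b1.
inline double depth1_component(const linear_coeffs& k, double a1, double b1) {
    return k.constant + k.ca * a1 + k.cb * b1;
}
```

When both operands are plain doubles (`a1 = b1 = 0`), the depth-1 component reduces to `constant`, which is the 1/3 case reported under Validation.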

Files

- `include/sw/universal/number/elreal/elreal_impl.hpp` -- depth-1 generator added to `operator/` (~40 lines changed).
- `docs/number-systems/elreal.md` -- known-limitations entry on "depth-1 ceiling" updated.
- `docs/algorithmic-details/lazy-real-arithmetic.md` -- Section 6 depth-1 generator table updated with the `/` row's actual formula.
- `docs/algorithmic-details/elreal-performance-baseline.md` -- division cost-shape section rewritten for post-L.1.
- `docs/algorithmic-details/multi-component-arithmetic.md` -- section 7.1 picker table updated: `elreal /` now beats `ereal /` by ~ 19x at matched precision.
- `docs/multi-component/exact-lazy-arithmetic.md` -- "depth-0-only for /" caveat removed.

Validation

- 1/3 sanity check: `c0 = 0.333...331` (IEEE rounded), `c1 = 1.85e-17` (the exact positive residual). The sum `c0 + c1` matches 1/3 to ~ 32 digits.
- All 30 elreal regression tests PASS under gcc 13.3 and clang 18.1.
- Phase J oracle sweep PASS under both compilers.
- `benchmark_elreal_performance`: division now runs at ~ 13 Mops/s, the same range as `+`, `-`, and `*`. The pre-L.1 figure of ~ 1 Gops/s was an artifact of gcc inlining the entire depth-0-only operator away.

Picker shift

| Op | `elreal` post-L.1 | `ereal<2>` | Winner |
|---|---|---|---|
| `/` | 13 Mops/s | 680 Kops/s | `elreal` (~ 19x) |

With L.1 in place, `elreal` matches `ereal<2>` at depth 1 across the four elementary arithmetic operators, AND beats it on division by ~ 19x. `ereal /` runs the iterative `expansion_quotient` algorithm; `elreal /` produces depth-1 in a single generator-emplace.

What this PR does NOT do

- Newton iteration for depth 2+. That's Phase L.2.
- Depth-2+ for sqrt. Also Phase L.2 (sqrt already has depth 1 via `gen_sqrt`).

Part of #906 (Phase L of follow-up epic #903).

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- elreal division now uses depth‑1 refinement, improving division precision and yielding ~13–16 Mops/s; elreal remains the only provider for sqrt/exp/log at ~36–43 Mops/s. Comparison clarifies other implementations’ iterative division behavior (~680 Kops/s).
Documentation
- Updated algorithmic and performance docs to reflect Phase L.1 depth‑1 behavior and defer deeper (depth‑2+) refinement to Phase L.2.

… + IEEE residual

Phase L.1 of follow-up epic #903 (#906). Lifts the "elreal / is depth-0 only" caveat that has held since Phase G.

Algorithm
---------

Let r = a/b be the true value. At the leading doubles:

    r = c0 + Δa/b0 - (a0/b0^2) Δb + (a0 - b0*c0)/b0 + O(eps^2)

where the IEEE residual `a0 - b0*c0` is recoverable exactly from the leading doubles via EFTs:

    two_prod(b0, c0) -> (prod_hi, prod_err) with b0*c0 = prod_hi + prod_err
    two_diff(a0, prod_hi) -> (diff_hi, diff_err)
    ieee_residual = (diff_hi + diff_err) - prod_err

The depth-1 component is then

    c1 = (ieee_residual + a.at(1) - c0 * b.at(1)) / b0

which fits the existing gen_binary_linear variant alternative with:

    constant = ieee_residual / b0
    ca = 1/b0
    cb = -c0/b0

No new generator shape needed; just populate gen_binary_linear in operator/. The PR is small (~ 40 lines changed in elreal_impl.hpp).

Validation
----------

- 1/3 produces c0 = 0.333...331 (IEEE rounded) and c1 = 1.85e-17, the exact positive residual. Sanity check: c0 + c1 ~= 1/3 to ~ 32 digits.
- All 30 elreal regression tests PASS under gcc 13.3 and clang 18.1.
- Phase J oracle sweep PASS under both compilers.
- benchmark_elreal_performance: division now in the same 13 Mops/s range as +/-/* (it was an artifact ~ 1 Gops/s pre-L.1 from gcc inlining the entire depth-0-only operator away).

Doc updates
-----------

- docs/number-systems/elreal.md: known-limitations entry on "depth-1 ceiling" updated to mention `/` is now included alongside the other operators.
- docs/algorithmic-details/lazy-real-arithmetic.md: Section 6 depth-1 generator table updated with the / row's actual formula.
- docs/algorithmic-details/elreal-performance-baseline.md: division cost-shape section rewritten to describe the post-L.1 reality (1 Gops/s artifact is gone; throughput now ~ 13 Mops/s, the same range as the other binary operators).
- docs/algorithmic-details/multi-component-arithmetic.md: section 7.1 picker table updated -- `elreal /` now beats `ereal<N> /` by ~ 19x at matched precision (was apples-to-oranges pre-L.1).
- docs/multi-component/exact-lazy-arithmetic.md: design narrative no longer says "depth-0-only for /".

What this does NOT do
---------------------

- Newton iteration for depth 2+ -- that's Phase L.2.
- Depth-2+ for sqrt -- also Phase L.2 (sqrt already has depth 1).

Part of #906 (Phase L of follow-up epic #903).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-22T03:05:59Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c9ea929a-3d23-4451-a76c-98329cfce62c

📥 Commits

Reviewing files that changed from the base of the PR and between 35e8d5d and 8ed3622.

📒 Files selected for processing (1)

include/sw/universal/number/elreal/elreal_impl.hpp

📝 Walkthrough

Walkthrough

This PR implements depth-1 refinement for elreal division and updates documentation across the codebase. The operator/ now computes an IEEE residual via EFT primitives and installs a gen_binary_linear generator for depth-1 correction, replacing the prior single-double limitation. Documentation updates reflect the Phase L.1 completion, new throughput metrics (~13–16 Mops/s), and deferral of deeper Newton refinement to Phase L.2.

Changes

Phase L.1 Depth-1 Division Refinement

| Layer / File(s) | Summary |
|---|---|
| Depth-1 division generator implementation `include/sw/universal/number/elreal/elreal_impl.hpp` | `operator/` now computes leading quotient and conditionally installs a `gen_binary_linear` depth-1 correction generator using EFT-derived residuals and operand depth-1 Taylor partials, replacing the prior leading-only behavior. |
| Algorithmic specification for depth-1 division `docs/algorithmic-details/lazy-real-arithmetic.md` | The depth-1 generator formula is documented as combining IEEE division residual with Taylor-partial operand corrections. Phase L.1 covers depth-1 for arithmetic operators; Phase L.2 defers deeper Newton refinement and lazy-division walking. |
| Performance metrics and design narrative updates `docs/algorithmic-details/elreal-performance-baseline.md`, `docs/algorithmic-details/multi-component-arithmetic.md`, `docs/number-systems/elreal.md`, `docs/multi-component/exact-lazy-arithmetic.md` | Performance tables and narratives updated to show `elreal /` at depth-1 post-L.1 (~13–16 Mops/s) and expanded comparison rows; roadmap reference changed to Phase L.2 / Phase M epic (`#903`) for depth-2+ refinement. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Epic: elreal follow-up work -- depth-2+ refinement, range reduction, perf, oracle sweep #903: Phase L.2/M follow-up epic for depth-2+ Newton refinement, lazy-division walking, and lazily-loaded transcendental pi support—tracks the deferral of deeper refinement started in this PR.
elreal Phase L: Depth-2+ arithmetic refinement -- Newton for / and sqrt #906: Phase L.2 depth-2+ refinement for division/sqrt via Newton iteration—directly follows Phase L.1's depth-1 generator completion.

Possibly related PRs

stillwater-sc/universal#916: Refactored tagged-union lazy_generator and gen_binary_linear wiring leveraged by this PR.
stillwater-sc/universal#885: Prior Phase C division groundwork related to / operator behavior.
stillwater-sc/universal#884: Introduced generator materialization protocol (at()/refine_to()) used by the depth-1 division generator.

Suggested labels

enhancement

Poem

🐰 A rabbit refines the division with care,
Installing generators, layer by layer,
EFT residuals and Taylor dreams,
Depth-1 precision in elegant schemes,
Phase L.1 hops toward the next frontier! 🌙

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: introducing depth-1 refinement for the division operator in elreal as part of Phase L.1, which is the primary implementation focus of this PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@include/sw/universal/number/elreal/elreal_impl.hpp`:
- Around line 949-977: Compute the depth-1 coefficients (ieee_residual/b0,
1.0/b0, -c0/b0) and verify each is finite before assigning result._generator =
gen_binary_linear; if any coefficient is not finite (e.g. due to tiny non-zero
b0 producing inf/NaN) do not install the generator and simply return result
as-is. Update the operator/… division code around variables c0, b0, prod_err,
ieee_residual and the gen_binary_linear assignment to perform std::isfinite
checks on the three coefficients (or on ca/cb/constant) and only set
result._generator when all three pass. Ensure this also prevents
evaluate_generator(gen_binary_linear) from receiving inf/NaN multipliers
(affecting elreal::at(1) usage).


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9e9a647c-7ef1-467e-b9c1-9b9070ee1268

📥 Commits

Reviewing files that changed from the base of the PR and between e666ed5 and 35e8d5d.

📒 Files selected for processing (6)

docs/algorithmic-details/elreal-performance-baseline.md
docs/algorithmic-details/lazy-real-arithmetic.md
docs/algorithmic-details/multi-component-arithmetic.md
docs/multi-component/exact-lazy-arithmetic.md
docs/number-systems/elreal.md
include/sw/universal/number/elreal/elreal_impl.hpp

Add finite-check guard on the depth-1 coefficients of operator/.

CodeRabbit caught the edge case: if b0 is a denormal whose reciprocal overflows to inf, the computed ca = 1/b0, cb = -c0/b0, and constant = ieee_residual/b0 can each become non-finite even though c0 = a0/b0 itself was finite. Without a guard, installing the generator would propagate inf/NaN into every depth-1 walk that touches at(1).

Fix: precompute ca, cb, cconst into named locals; std::isfinite check all three; bail out to depth-0-only if any is non-finite. The bail-out preserves the leading double (which is correct per IEEE-754) and returns 0.0 for at(k >= 1).

Verified:

- denorm / denorm: at(0) = 1.0 (correct), at(1) = 0.0 (would have been inf without the guard).
- Normal case 2.0/3.0: at(0) = 0.666...663, at(1) = +3.7e-17 (correct positive residual). Behavior unchanged for finite-coefficient cases.
- All sampled elreal tests + Phase J oracle sweep PASS under gcc 13.3 and clang 18.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Ravenwater · 2026-05-22T03:16:46Z

Addressed the CodeRabbit nitpick in 8ed3622.

Edge case caught: if b0 is a denormal whose reciprocal overflows to inf, the computed ca = 1/b0, cb = -c0/b0, and cconst = ieee_residual/b0 can each become non-finite even though c0 = a0/b0 itself was finite. Without a guard, installing the generator would propagate inf/NaN into every depth-1 walk that touches at(1).

Fix: precompute ca, cb, cconst into named locals; std::isfinite check all three; bail out to depth-0-only if any is non-finite. The bail-out preserves the leading double (which is correct per IEEE-754) and returns 0.0 for at(k >= 1).

Verified:

denorm / denorm: at(0) = 1.0 (correct), at(1) = 0.0 (would have been inf without the guard)
Normal case 2.0/3.0: at(0) = 0.666...663, at(1) = +3.7e-17 (correct positive residual). Behavior unchanged for finite-coefficient cases.
All sampled elreal tests + Phase J oracle sweep PASS under gcc 13.3 and clang 18.1.
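The guard described in this comment can be sketched roughly as follows. The struct, the function name `try_depth1_coeffs`, and the bool-plus-out-parameter shape are illustrative assumptions rather than the code committed in 8ed3622; the sketch only shows the "compute three coefficients, install nothing unless all are finite" pattern, and it uses a single `std::fma` in place of the EFT sequence to obtain the residual.

```cpp
#include <cmath>  // std::fma, std::isfinite

// Hypothetical coefficient payload: c1 = constant + ca * a1 + cb * b1.
struct linear_coeffs {
    double constant;
    double ca;
    double cb;
};

// Returns true and fills `out` only when all three coefficients are finite.
// On false, the caller would leave the result at depth 0 (no generator),
// so that at(k >= 1) reads as 0.0.
inline bool try_depth1_coeffs(double a0, double b0, linear_coeffs& out) {
    const double c0 = a0 / b0;
    const double ieee_residual = std::fma(-b0, c0, a0);  // a0 - b0*c0
    const double cconst = ieee_residual / b0;
    const double ca = 1.0 / b0;   // overflows to inf for very small denormal b0
    const double cb = -c0 / b0;
    if (!std::isfinite(cconst) || !std::isfinite(ca) || !std::isfinite(cb)) {
        return false;  // bail out: keep the depth-0 quotient only
    }
    out = {cconst, ca, cb};
    return true;
}
```

With `a0 = b0 = denorm_min`, the quotient `c0` is `1.0` but `1/b0` overflows, so the function returns `false`. For `2.0/3.0` all three coefficients are finite and `constant` comes out near `3.7e-17`, in line with the values reported above.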

coveralls · 2026-05-22T03:50:59Z

Coverage Report for CI Build 26266561859

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage decreased (-0.01%) to 84.233%

Details

Coverage decreased (-0.01%) from the base build.
Patch coverage: No coverable lines changed in this PR.
10 coverage regressions across 1 file.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

10 previously-covered lines in 1 file lost coverage.

| File | Lines Losing Coverage | Coverage |
|---|---|---|
| include/sw/universal/verification/test_suite_randoms.hpp | 10 | 31.07% |

Coverage Stats


Relevant Lines:	55729
Covered Lines:	46942
Line Coverage:	84.23%
Coverage Strength:	5288105.08 hits per line

💛 - Coveralls

Ravenwater self-assigned this May 22, 2026

Ravenwater added this to Universal Number Library May 22, 2026

Ravenwater added this to the V4 milestone May 22, 2026

coderabbitai Bot reviewed May 22, 2026

View reviewed changes

Comment thread include/sw/universal/number/elreal/elreal_impl.hpp

Ravenwater marked this pull request as ready for review May 22, 2026 03:22

Ravenwater merged commit e673df8 into main May 22, 2026
32 checks passed

Ravenwater deleted the feat/elreal-phase-l1 branch May 22, 2026 03:36

github-project-automation Bot moved this to Done in Universal Number Library May 22, 2026

Ravenwater mentioned this pull request May 22, 2026

fix(elreal): rename _inline member to _inl_buf for MSVC compatibility #919

Merged

5 tasks

coderabbitai Bot mentioned this pull request May 22, 2026

perf(elreal): Phase L.2.a -- depth-2 refinement for division #918

Merged

6 tasks

coderabbitai Bot mentioned this pull request Jun 1, 2026

feat(elreal): negation, multiplication, division (Phase 6, #930) #1037

Merged

8 tasks

Ravenwater merged 2 commits into main from feat/elreal-phase-l1
