feat(elreal): Phase B -- rational and string construction with lazy refinement#884
Conversation
…inement Phase B of epic #873 (resolves #875): adds the construction-from-source machinery and activates the lazy refinement protocol for rational inputs. Generator slot promoted from reserved-comment to real member: - mutable std::function<double(std::size_t)> _generator - at(k) walks the generator on cache miss, appending to _components - refine_to(precision_bits) iterates at(0..ceil(p/53)) - An absent generator (e.g. for elreal(double) values) leaves at(k>=depth) returning the implicit-zero extension, same as Phase A Rational constructor elreal(long long p, long long q): - Component 0: double(p) / double(q) (IEEE-754 rounded) - Component 1 (via generator): exact residual p - c0 * q computed with two_prod + two_sum, divided by q. Exact for |p| < 2^53 (i.e. any long long whose absolute value is within the 53-bit double-significand envelope); incurs one extra rounding for larger magnitudes. - Components 2+: 0.0 with a documented limitation. Arbitrary-depth refinement via McCleeary long division is deferred to Phase E/F. - elreal(p, 0) -> NaN; elreal(0, q) -> canonical zero. String constructor elreal(const std::string&): - Accepts decimal ("3.14"), scientific ("6.022e23"), and rational ("1/3") forms, plus the special tokens nan / inf / infinity (case-insensitive, with optional sign). - The parser normalises decimal / scientific to (p, q) form and delegates to the rational constructor. The acceptance criterion of #875 -- elreal("1/3").at(1) != 0 -- is met because the rational path returns a non-trivial correction. - Overflow fallback: scientific-notation values whose normalised numerator or denominator exceeds long long range fall back to a double-precision reconstruction (mantissa * pow(10, exp)). The leading component is faithful; exact-rational refinement past component 0 is unavailable. This handles e.g. "6.022e23" gracefully. - Silent-failure pattern matching dfloat / ereal: the std::string ctor swallows parse errors and returns canonical zero. parse(str, elreal&) is the bool-returning variant for callers that want to observe errors. operator double() fix: start the sum at _components[0] instead of 0.0 so that single-component -0.0 round-trips with its IEEE sign bit preserved. Tests under elastic/elreal/conversion/: - native_types.cpp: round-trip int/float/double/subnormal/inf/nan/-0.0 - rational.cpp: elreal(1, 3) delivers a non-zero correction; refinement is observably distinct from elreal(1.0/3.0); 1/2 and 1/4 are exact in a single component; -3/4 is exact; 1/0 -> NaN; 0/7 -> zero - string.cpp: decimal, scientific, rational, special-token, and parse- failure cases Cross-construction from dd / qd / ereal is left as future work; #875 marks it optional and the include-graph cost is non-trivial. Verified clean under gcc 13 and clang on dual-build-dir setup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughImplements ELREAL Phase B: adds rational and string constructors, a lazy generator for residual components, updates component materialization and double conversion to preserve IEEE semantics, and adds test programs validating native-type round-trips, rational refinement, and string parsing. ChangesELREAL Phase B: Constructors, Parsing, and Lazy Refinement
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
elastic/elreal/conversion/string.cpp (2)
71-94: ⚡ Quick winCover optional sign and surrounding whitespace for special-token parsing.
Line 71 tests case-insensitivity well, but parser behavior also claims optional sign and whitespace handling. Add token cases like
" +inf "(and optionally" -inf ") to prevent regressions in that path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@elastic/elreal/conversion/string.cpp` around lines 71 - 94, Add test inputs that cover optional leading/trailing whitespace and optional sign for special-token parsing: extend the nan and inf blocks that construct elreal from std::string (symbols: elreal, isnan(), isinf(), double(elreal)) to also create and assert parsing of strings like " +inf ", " -inf " (and similarly " NaN " / " +NaN " if desired); ensure the new +inf/inf variants call isinf() and that negative variants produce negative infinity by checking double(...) < 0.0, and treat whitespace-wrapped tokens as valid parsed values just like the existing cases.
34-52: ⚡ Quick winAdd an explicit negative-zero parsing assertion.
Line 35 onwards validates numeric equality, but
-0.0 == 0.0so this does not verify sign-bit preservation for string input. Add a dedicated"-0"case usingstd::signbit(double(...))to lock the-0.0contract.Proposed test addition
@@ `#include` <universal/number/elreal/elreal.hpp> `#include` <universal/verification/test_suite.hpp> +#include <cmath> @@ // --- decimal integers ------------------------------------------------ nrOfFailedTestCases += check_string_value("0", 0.0); @@ nrOfFailedTestCases += check_string_value("+12", 12.0); + { + elreal neg_zero{ std::string("-0") }; + if (!std::signbit(double(neg_zero))) { + std::cerr << "FAIL: elreal(\"-0\") did not preserve negative zero\n"; + ++nrOfFailedTestCases; + } + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@elastic/elreal/conversion/string.cpp` around lines 34 - 52, Add a dedicated test for the "-0" input to verify negative-zero preservation: after parsing the string "-0" using the same parser used by check_string_value (locate the parsing call inside or adjacent to check_string_value), assert that std::signbit(parsed_double) is true (rather than relying on == comparison). Keep the existing numeric-equality tests and add this extra signbit assertion so the "-0.0" sign-bit contract is locked in.elastic/elreal/conversion/native_types.cpp (1)
62-69: 💤 Low valueOptional: prefer
!std::signbit(...)over!= true.The logic is correct, but line 65 could be slightly more idiomatic:
if (!std::signbit(double(x))) {instead of
if (std::signbit(double(x)) != true) {Both work; the former is the more common C++ style.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@elastic/elreal/conversion/native_types.cpp` around lines 62 - 69, Replace the explicit comparison to true in the signbit check with the idiomatic negation: in the block that defines nz and elreal x (symbols: nz, x, elreal) change the condition using std::signbit(double(x)) != true to use !std::signbit(double(x)); leave the rest of the failure handling (nrOfFailedTestCases increment and error message) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/sw/universal/number/elreal/elreal_impl.hpp`:
- Around line 561-567: The code negates p when computing pmag (long long pmag =
p < 0 ? -p : p), which is UB if p == LLONG_MIN; change pmag to an unsigned type
and compute its magnitude without signed negation — e.g. use unsigned long long
pmag and set pmag = p < 0 ? static_cast<unsigned long long>(-(p + 1)) + 1ULL :
static_cast<unsigned long long>(p); then compare pmag against
std::numeric_limits<unsigned long long>::max() / 10 and update overflowed and p
(p *= 10) accordingly; update the loop in the q_pow < 0 branch (symbols: p,
pmag, overflowed, q_pow) so no signed overflow can occur.
- Around line 476-496: Parsing of the rational numerator/denominator (variables
num and den in the parsing block that checks slash != std::string::npos) does
not detect overflow when doing num = num * 10 + digit and den = den * 10 +
digit; modify both accumulation loops to check for overflow before
multiplying/adding (e.g., compare against std::numeric_limits<long long>::max():
if num > (LLONG_MAX - digit) / 10 then return false, and similarly for den) so
that the function returns false on overflow instead of silently wrapping; keep
the rest of the logic (found_num/found_den, den != 0, applying negative flag to
out) unchanged.
- Around line 498-522: The loop that accumulates mantissa (variable mantissa)
must detect overflow like the rational parser does: add a bool overflowed flag
and before doing mantissa = mantissa * 10 + (c - '0') check (e.g. compare
against LLONG_MAX/10 and digit) and set overflowed when it would wrap; continue
scanning digits but do not let undefined wrap occur. After the digit loop and
before handling the exponent ('e'/'E') branch, check overflowed and handle it
the same way as the rational parsing code path (e.g. reject/adjust parse or
branch to the overflow handling logic) so overflow is not silently ignored.
Ensure references to found_digit, seen_dot, frac_digits and pos remain
consistent.
---
Nitpick comments:
In `@elastic/elreal/conversion/native_types.cpp`:
- Around line 62-69: Replace the explicit comparison to true in the signbit
check with the idiomatic negation: in the block that defines nz and elreal x
(symbols: nz, x, elreal) change the condition using std::signbit(double(x)) !=
true to use !std::signbit(double(x)); leave the rest of the failure handling
(nrOfFailedTestCases increment and error message) unchanged.
In `@elastic/elreal/conversion/string.cpp`:
- Around line 71-94: Add test inputs that cover optional leading/trailing
whitespace and optional sign for special-token parsing: extend the nan and inf
blocks that construct elreal from std::string (symbols: elreal, isnan(),
isinf(), double(elreal)) to also create and assert parsing of strings like "
+inf ", " -inf " (and similarly " NaN " / " +NaN " if desired); ensure
the new +inf/inf variants call isinf() and that negative variants produce
negative infinity by checking double(...) < 0.0, and treat whitespace-wrapped
tokens as valid parsed values just like the existing cases.
- Around line 34-52: Add a dedicated test for the "-0" input to verify
negative-zero preservation: after parsing the string "-0" using the same parser
used by check_string_value (locate the parsing call inside or adjacent to
check_string_value), assert that std::signbit(parsed_double) is true (rather
than relying on == comparison). Keep the existing numeric-equality tests and add
this extra signbit assertion so the "-0.0" sign-bit contract is locked in.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ac8b4613-8979-4623-968f-2591168dc2c6
📒 Files selected for processing (4)
elastic/elreal/conversion/native_types.cppelastic/elreal/conversion/rational.cppelastic/elreal/conversion/string.cppinclude/sw/universal/number/elreal/elreal_impl.hpp
Coverage Report for CI Build 26201081295Warning Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes. Coverage increased (+0.01%) to 84.249%Details
Uncovered ChangesNo uncovered changes found. Coverage Regressions2 previously-covered lines in 1 file lost coverage.
Coverage Stats
💛 - Coveralls |
Address CodeRabbit findings on PR #884. Three correctness fixes in elreal_impl.hpp: 1. Rational parser num / den accumulation now rejects on signed-overflow (num * 10 + digit > LLONG_MAX) before the multiply happens. Without the guard, 19+ digit inputs silently wrapped. 2. Decimal / scientific parser mantissa accumulation gets the same guard. 3. The q_pow < 0 multiplication loop replaces the `pmag = (p < 0 ? -p : p)` magnitude with unsigned arithmetic so LLONG_MIN does not trigger signed-negation UB. With the mantissa overflow guard above, p cannot actually reach LLONG_MIN in this path, but the unsigned form is defensive against a future caller of the rational ctor with a hand-rolled value. One consistency fix: parse("-0"), parse("-0.0"), parse("-0/7") now all preserve the negative sign through to the resulting elreal. Previously the parser collapsed these to canonical (sign-less) zero, inconsistent with elreal(-0.0) which preserves the IEEE-754 sign bit. Sign now survives across all "construct from -0" paths. Tests under elastic/elreal/conversion/string.cpp gain three new cases (elreal("-0"), elreal("-0.0"), elreal("-0/7") all asserted via std::signbit) so the contract is locked in. Trivial style cleanup in elastic/elreal/conversion/native_types.cpp: std::signbit(...) != true -> !std::signbit(...). Declining one CodeRabbit nitpick (whitespace-wrapped tokens like " +inf "): the parser claims leading-whitespace-only support (matching ereal's parse() in include/sw/universal/number/ereal/ ereal_impl.hpp:327). Trailing-whitespace handling is not part of the contract on either type; adding it without ereal also adopting it would create an asymmetric API surface. Left as future work if a consumer needs it on both types. Verified clean under gcc 13 and clang. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Review status after CodeRabbit's first pass (commit 3ead8ea):
Validated under gcc 13 and clang; all four |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/sw/universal/number/elreal/elreal_impl.hpp`:
- Around line 522-524: The loop that parses digits currently returns false on
mantissa overflow; instead record the overflow (e.g., set a bool like
mantissa_overflow = true) and continue scanning remaining characters to fully
validate the literal without updating mantissa further, so you don't mutate
mantissa when it would overflow. Keep the overflow detection using the same
check around (std::numeric_limits<long long>::max() - digit) / 10, but when it
trips simply flip the flag and skip the mantissa update rather than returning
false; this allows the later fallback logic (the existing code around the
current fallback at lines ~596-603 that builds an approximate leading double) to
run and produce the faithful approximate value.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e8075744-7af7-4b48-8130-c2a0f38b38fe
📒 Files selected for processing (3)
elastic/elreal/conversion/native_types.cppelastic/elreal/conversion/string.cppinclude/sw/universal/number/elreal/elreal_impl.hpp
Address CodeRabbit second-round finding on PR #884. The previous commit added overflow guards on the mantissa accumulator that returned false immediately, which had an unintended consequence: valid scientific literals with 20+ significant digits (e.g., "12345678901234567890e5") silently parsed as canonical zero through the std::string ctor. The code already had an `overflowed` flag and a fallback path for the q_pow scaling overflow case, but the mantissa overflow was hitting return false before reaching the fallback. Fix: on mantissa overflow, set a mantissa_overflow flag and keep scanning to validate the rest of the literal (rejecting on invalid characters, exhausting the exponent, etc.). After the parse loop completes, if mantissa_overflow is set, use std::stod() on the original input string to get the correctly-rounded leading double and construct the elreal from that. Exact-rational refinement past depth 0 is unavailable in this overflow path -- a documented Phase B limitation matching the existing q_pow overflow fallback. Tests in elastic/elreal/conversion/string.cpp gain two cases: - "12345678901234567890" (20-digit integer mantissa) - "1.234567890123456789012345e5" (25-digit scientific mantissa) Both must produce non-zero results matching std::stod on the same string. Verified clean under gcc 13 and clang. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/sw/universal/number/elreal/elreal_impl.hpp`:
- Around line 518-535: The exponent parsing currently accumulates into int
exponent without guarding against signed overflow; add a bool exponent_overflow
(like mantissa_overflow) and, inside the exponent digit loop that updates
exponent = exponent * 10 + (c - '0'), check before updating whether exponent >
(std::numeric_limits<int>::max() - digit) / 10 and set exponent_overflow instead
of performing the update; stop updating exponent once overflow would occur but
continue scanning, and ensure the existing fallback that calls std::stod(str) is
triggered when (mantissa_overflow || exponent_overflow) rather than only
mantissa_overflow so large exponents route to the fallback safely.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fc1fcfed-68e5-4c3f-af9a-db393d88aec9
📒 Files selected for processing (2)
elastic/elreal/conversion/string.cppinclude/sw/universal/number/elreal/elreal_impl.hpp
… fallback Address CodeRabbit third-round finding on PR #884. The exponent accumulation in parse_into was vulnerable to signed integer overflow UB for adversarial inputs like "1e9999999999" -- the second-round mantissa fix only covered mantissa overflow, leaving the exponent loop exposed. CodeRabbit caught this in their third pass. Two changes: 1. Mirror the mantissa overflow guard on the exponent loop. Track exponent_overflow with the same `exp > (INT_MAX - digit) / 10` test before each multiply-add. Stop updating exponent on overflow but keep scanning to validate format. Route `mantissa_overflow || exponent_overflow` to the existing fallback. 2. Switch the fallback from std::stod to std::strtod. std::stod throws std::out_of_range when the value is unrepresentable as double (e.g. for "1e9999999999"); my catch swallowed the exception and returned false, yielding canonical zero through the std::string ctor. std::strtod sets errno = ERANGE and returns +/-HUGE_VAL (overflow) or 0.0 (underflow), which matches IEEE-754 saturation semantics -- the right behavior for inputs the rational track cannot represent anyway. Tests in elastic/elreal/conversion/string.cpp gain two cases: - "1e9999999999" must saturate to +inf (not silently zero) - "1e-9999999999" must underflow to zero Caught the std::stod-throws issue during local testing of the first attempt -- the initial fix used std::stod and failed under both gcc and clang. Switching to std::strtod made both compilers green. Verified all four elreal_* binaries PASS on gcc 13 and clang. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase B of epic #873. Implements #875.
Activates the lazy refinement protocol that Phase A reserved a slot for, and adds the construction-from-source machinery: rational
elreal(p, q)and stringelreal("1/3")/elreal("3.14")/elreal("6.022e23").The acceptance criterion of #875 --
elreal("1/3").at(1) != 0, observably distinct fromelreal(1.0/3.0)-- is met because the rational path computes the exact residualp - c0 * qvia the existing EFTs (two_prod+two_sum).What changed
mutable std::function<double(std::size_t)>member.at(k)calls it on cache miss;refine_to(precision_bits)iteratesat(0..ceil(p/53)).elreal(long long, long long):double(p) / double(q)|p| < 2^53q == 0-> NaN;p == 0-> canonical zero.elreal(const std::string&):"3.14","-2.5""6.022e23","-1.0e-10""1/3","-22/7""nan","inf","infinity", with optional sign(p, q)form and delegates to the rational ctor. When the normalised numerator or denominator exceedslong longrange (typical for6.022e23), falls back todoublereconstruction -- leading component faithful, exact-rational refinement unavailable.dfloatandereal;parse(str, elreal&)is the bool-returning variant for callers that want to observe errors.operator double()fix: starts the sum at_components[0]instead of0.0so single-component-0.0round-trips with its sign bit preserved (otherwise0.0 + (-0.0) == +0.0swallows the sign).Cross-construction from
dd/qd/erealMarked optional in #875 and deferred in this PR. The include-graph cost of pulling in those types' headers from inside
elreal_impl.hppis non-trivial; a follow-up PR can land them targeted, once their value is clearer.Test plan
Four test binaries under
elastic/elreal/:elreal_apielreal_native_typeselreal_rationalelreal(1,3).at(1) != 0; observably distinct fromelreal(1.0/3.0); 1/2 and 1/4 exact in one component; -3/4 exact; 1/0 -> NaN; 0/7 -> zeroelreal_stringbuild_elreal/): all four PASSbuild_clang_elreal/): all four PASSCanary excerpt:
at(1) = 1.85e-17-- the exact correction1/3 - double(1/3), which is what the acceptance criterion of #875 asks for.What does not land in this PR
dd/qd/ereal<N>(future targeted PR)k >= 2) via McCleeary long division (Phase E/F)Related
Summary by CodeRabbit
New Features
Tests