Skip to content

Add ellipke for simultaneous K(m) and E(m) computation#510

Open
mgyoo86 wants to merge 14 commits intoJuliaMath:masterfrom
mgyoo86:feat/ellipke
Open

Add ellipke for simultaneous K(m) and E(m) computation#510
mgyoo86 wants to merge 14 commits intoJuliaMath:masterfrom
mgyoo86:feat/ellipke

Conversation

@mgyoo86
Copy link

@mgyoo86 mgyoo86 commented Jan 6, 2026

Depends on: #509 (perf/elliptic-optimization) — this PR builds on the refactored elliptic integral infrastructure.

Motivation

In many physics and engineering applications, both complete elliptic integrals K(m) and E(m) are needed together—for example, when calculating magnetic fields from coils (as in VacuumFields.jl).

Currently, users have to call ellipk and ellipe separately, which duplicates overhead for input validation and index calculation. MATLAB provides ellipke for exactly this reason, and it makes sense for SpecialFunctions.jl to offer a similar optimized path.

Implementation

This implementation computes both integrals in a single pass. The speedup comes from two sources:

  1. Shared indexing: The polynomial table index (idx = trunc(m * 10)) is computed once and reused for both K and E.
  2. Shared logarithm: For m ≥ 0.9 (where computation is heaviest), the expensive log term is computed once and shared between K and E.

Benchmark

Environment
julia> versioninfo()
Julia Version 1.12.3
Commit 966d0af0fdf (2025-12-15 11:20 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: macOS (arm64-apple-darwin24.0.0)
  CPU: 10 × Apple M1 Pro
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, apple-m1)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 8 virtual cores)

(SpecialFunctions) pkg> st
Project SpecialFunctions v2.7.0
Status `~/.julia/dev/SpecialFunctions/Project.toml`
  [92d709cd] IrrationalConstants v0.2.6
  [2ab3a3ac] LogExpFunctions v0.3.29
  [efe28fd5] OpenSpecFun_jll v0.5.6+0
  [05823500] OpenLibm_jll v0.8.7+0

Calling ellipke(m) is consistently faster than (ellipk(m), ellipe(m)). The gain is most significant (~2.3x) near m = 1 due to shared log calculation.

m ellipk + ellipe (ns) ellipke (ns) speedup
-19.0 43.5 21.9 1.99x
-4.7 13.1 10.0 1.32x
-0.33 13.1 7.2 1.82x
0.05 11.5 6.2 1.85x
0.55 11.8 9.0 1.31x
0.875 12.3 9.6 1.29x
0.95 42.1 18.5 2.27x
Avg 15.9 9.8 1.62x

Note: The (ellipk + ellipe) baseline already includes the optimizations from PR #509.

Usage

The function returns a tuple, behaving exactly like the standalone counterparts:

K, E = ellipke(0.5)
# K == ellipk(0.5), E == ellipe(0.5)  (bit-wise equal)

# Missing propagation follows Base.sincos convention
ellipke(missing)  # (missing, missing)

Tests

  • Bit-wise equality against separate ellipk/ellipe calls across all polynomial branches
  • Edge cases: 0, 1, -Inf, NaN, and negative values
  • DomainError for m > 1

mgyoo86 added 12 commits January 5, 2026 16:06
Replace O(n) linear if-elseif chain with O(log n) binary tree branching
for polynomial table selection. Key optimizations:

- Add _ellipk_core and _ellipe_core with binary tree structure
- Use integer index (idx = unsafe_trunc(Int, m*10)) for faster icmp vs fcmp
- Simplify wrapper functions to handle only negative m and special cases
- Add special handling for extremely negative m where x rounds to 1.0

Performance improvement:
- ellipk: 1.33x average speedup (10.5ns -> 7.9ns)
- ellipe: 1.46x average speedup (12.0ns -> 8.2ns)
- Largest gains for m in [0.65, 0.875]: 1.5x-1.85x speedup

All polynomial coefficients unchanged - only branching structure optimized.
Add boundary value tests at each polynomial table transition (m = 0.0, 0.1, ..., 0.9)
with prevfloat/nextfloat continuity checks. Add negative m transformation tests
and special value edge cases (NaN, Inf, -Inf, DomainError).
Extract polynomial lookup into separate _ellipk_horner_table function
with linear if-else structure (LLVM optimizes to switch/jump table).
Simplify _ellipk_core to handle only edge cases and delegate to table.

Phase 1 of ellip.jl refactoring plan.
Extract polynomial lookup into separate _ellipe_horner_table function
with linear if-else structure (LLVM optimizes to switch/jump table).
Simplify _ellipe_core to handle only edge cases and delegate to table.

Phase 2 of ellip.jl refactoring plan.
Rename internal functions to better reflect their domain (m >= 0).
Also reorder functions: docstring → public API → wrapper → impl → table.
Remove separate _ellipk_horner_table/_ellipe_horner_table functions
and inline the polynomial lookup directly into _ellipk_core/_ellipe_core.
Rename _ellipk_nonneg/_ellipe_nonneg back to _ellipk_core/_ellipe_core.

Simpler code structure without performance regression.
- Add @BoundsCheck to horner_table functions for debug safety
- Use @inbounds when calling horner_table from nonneg functions
- Change else to explicit elseif idx == 9 for clarity
- Optimize ellipe idx=9: call _ellipk_horner_table directly instead
  of _ellipk_nonneg to avoid redundant edge case checks
Using 'else' as fallback instead of explicit 'elseif idx == 9'
allows LLVM to generate a cleaner switch/jump table with a
default case, resulting in ~1ns faster execution.
Computes both complete elliptic integrals as (K, E) tuple by reusing
existing _ellipk_horner_table and _ellipe_horner_table functions.
More efficient than separate ellipk/ellipe calls when both are needed.

- Support for Float16, Float32, Float64
- Handle edge cases: m=0, m=1, m<0, NaN, -Inf
- Comprehensive test coverage with bit-wise equality checks
- ellipke(missing) returns (missing, missing) following Base.sincos convention
- Add test for missing behavior
For idx == 9 (0.9 <= m < 1.0), compute K and E together in
_ellipke_near_one to avoid duplicate log(qd) calls.
This gives ~1.8x speedup over naive (ellipk + ellipe) in this range.
@codecov
Copy link

codecov bot commented Jan 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.16%. Comparing base (b846391) to head (ea1673d).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #510      +/-   ##
==========================================
- Coverage   94.17%   94.16%   -0.02%     
==========================================
  Files          14       14              
  Lines        2969     2997      +28     
==========================================
+ Hits         2796     2822      +26     
- Misses        173      175       +2     
Flag Coverage Δ
unittests 94.16% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant