[BUG] ChiSquared._energy_x uses wrong chi-squared identity (k+1 instead of k+2), up to 48% error #964

@ANANYA542

Describe the bug
ChiSquared._energy_x in skpro/distributions/chi_squared.py uses a mathematically incorrect identity in its closed-form energy (CRPS) formula. The code on line 186 computes chi2.cdf(xi, k + 1), but the correct partial expectation identity for the chi-squared distribution requires chi2.cdf(xi, k + 2).
This causes the cross-energy E[|X - x|] to return silently wrong values with errors up to 48.8%, directly corrupting CRPS scores for any chi-squared-based probabilistic prediction.
The _energy_self method (which uses numerical quadrature) is not affected — only the closed-form _energy_x is wrong.

To Reproduce

import numpy as np
from scipy.integrate import quad
from scipy.stats import chi2

from skpro.distributions.chi_squared import ChiSquared

k, x_val = 5, 5.0
d = ChiSquared(dof=k)

# Closed-form cross-energy E|X - x| as returned by skpro
skpro_val = float(np.asarray(d._energy_x(np.array([[x_val]]))).flat[0])

# Reference value: direct numerical integration of |t - x| * f(t; k)
truth, _ = quad(lambda t: abs(t - x_val) * chi2.pdf(t, k), 0, np.inf, limit=500)

print(f"skpro _energy_x: {skpro_val:.6f}")   # 1.279329
print(f"numerical truth: {truth:.6f}")         # 2.440830
print(f"error: {abs(skpro_val - truth)/truth*100:.1f}%")  # 47.6%

Full reproduction with multiple (dof, x) pairs attached as screenshot below.

[Screenshot: reproduction output for multiple (dof, x) pairs]

Expected behavior

_energy_x should match direct numerical integration. The corrected formula (using k+2) matches to machine precision (< 1e-12 relative error).
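
For anyone who wants to check this without installing skpro or SciPy, here is a minimal standard-library sketch. It is not the skpro implementation: every function name below is hypothetical, and the closed form x(2F(x; k) - 1) + k(1 - 2F(x; k + dof_shift)) is derived from the partial expectation identity in this issue, with `dof_shift` exposed only so the k+1 and k+2 variants can be compared. The quadrature is a plain composite Simpson rule truncated at an assumed upper limit, so agreement is only checked to about 1e-6, not to machine precision.

```python
import math


def chi2_pdf(t, k):
    """Chi-squared density f(t; k); this sketch assumes k > 2 so that f(0) = 0."""
    if t <= 0.0:
        return 0.0
    a = k / 2.0
    log_f = (a - 1.0) * math.log(t) - t / 2.0 - a * math.log(2.0) - math.lgamma(a)
    return math.exp(log_f)


def chi2_cdf(x, k, terms=500):
    """F(x; k) from the power series of the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of sum_n z^n / (a (a+1) ... (a+n))
    total = term
    for n in range(1, terms):
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def energy_x_closed(x, k, dof_shift=2):
    """Closed form for E|X - x|; dof_shift=2 is the variant derived from the identity."""
    return x * (2.0 * chi2_cdf(x, k) - 1.0) + k * (1.0 - 2.0 * chi2_cdf(x, k + dof_shift))


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def energy_x_numeric(x, k, upper=200.0):
    """E|X - x| by quadrature, split at the kink t = x and truncated at `upper`."""
    left = simpson(lambda t: (x - t) * chi2_pdf(t, k), 0.0, x)
    right = simpson(lambda t: (t - x) * chi2_pdf(t, k), x, upper)
    return left + right


if __name__ == "__main__":
    for k, x in [(4, 2.0), (5, 5.0), (10, 12.0)]:
        closed = energy_x_closed(x, k)                 # uses F(x; k + 2)
        shifted = energy_x_closed(x, k, dof_shift=1)   # uses F(x; k + 1)
        numeric = energy_x_numeric(x, k)
        print(f"k={k:2d} x={x:5.1f}  k+2: {closed:.6f}  k+1: {shifted:.6f}  quadrature: {numeric:.6f}")
```

For k = 5, x = 5 the k+2 variant should land close to the 2.440830 obtained by numerical integration in the reproduction, while `dof_shift=1` should land close to the 1.279329 that skpro returns, which is consistent with the k+1 argument being the only defect.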

Environment

  • OS: macOS
  • Python: 3.11
  • skpro: latest main branch

Additional context
The bug is on [line 186 of chi_squared.py](https://github.com/sktime/skpro/blob/main/skpro/distributions/chi_squared.py#L186):

cdf_k1 = chi2.cdf(xi, k + 1)   # WRONG: should be k + 2

Why k+2 is correct:
The chi-squared PDF satisfies:

t · f(t; k) = k · f(t; k + 2)

This follows from Γ(k/2 + 1) = (k/2) · Γ(k/2) and 2^((k+2)/2) = 2 · 2^(k/2). Therefore:

∫₀ˣ t · f(t; k) dt = k · F(x; k+2)    ← NOT k+1

The code uses F(x; k+1), which has no mathematical justification.
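
For completeness, the identity and the resulting closed form for the cross-energy can be written out step by step. This is a sketch of the derivation for x ≥ 0, using only the standard chi-squared density and E[X] = k; F denotes the chi-squared CDF as above. It is not taken from the skpro source.

```latex
\begin{align*}
f(t;k) &= \frac{t^{k/2-1}\, e^{-t/2}}{2^{k/2}\,\Gamma(k/2)}, \qquad t > 0,\\[4pt]
t\,f(t;k) &= \frac{t^{(k+2)/2-1}\, e^{-t/2}}{2^{k/2}\,\Gamma(k/2)}
  = k\,\frac{t^{(k+2)/2-1}\, e^{-t/2}}{2^{(k+2)/2}\,\Gamma\!\left(\tfrac{k+2}{2}\right)}
  = k\,f(t;k+2),\\[4pt]
\mathbb{E}\,|X-x| &= \int_0^x (x-t)\,f(t;k)\,dt + \int_x^\infty (t-x)\,f(t;k)\,dt\\
  &= x\,\bigl(2F(x;k)-1\bigr) + \mathbb{E}[X] - 2\int_0^x t\,f(t;k)\,dt\\
  &= x\,\bigl(2F(x;k)-1\bigr) + k\,\bigl(1 - 2F(x;k+2)\bigr).
\end{align*}
```

The degrees-of-freedom argument enters only through the last term, so replacing k+2 with k+1 there changes the result for every x > 0.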
