fix(stattests): honor n_bins in get_binned_data for numeric features (#1400)#1873
Open
jbbqqf wants to merge 1 commit into
Open
fix(stattests): honor n_bins in get_binned_data for numeric features (#1400)#1873jbbqqf wants to merge 1 commit into
jbbqqf wants to merge 1 commit into
Conversation
…videntlyai#1400) The numeric branch of get_binned_data() in legacy stattests previously ignored the caller-supplied n parameter and always used Sturges' rule via np.histogram_bin_edges(combined, bins='sturges'). This made the n_bins argument of the PSI, Jensen-Shannon and KL stat tests effectively a no-op on numeric features. Pass n through to np.histogram_bin_edges(bins=n) so the documented contract holds. Add a regression test that exercises four different bin counts on numeric data with > 20 unique values (to force the numeric branch). Note: this changes PSI / JS / KL numeric values for users who relied on the implicit Sturges binning. The default n_bins=30 now produces 30 bins instead of ~Sturges(\~log2(N)+1) bins. Cited in PR body for reviewer awareness. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Good catch on the n_bins regression. One design note: quantile-based bin |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #1400.
The numeric branch of
get_binned_data()insrc/evidently/legacy/calculations/stattests/utils.pypreviously ignored thecaller-supplied
nparameter and always used Sturges' rule vianp.histogram_bin_edges(combined, bins=\"sturges\"). The PSI / Jensen-Shannon /KL stat tests all pass
n_binsthrough to this function, so theirn_binsargument was effectively a no-op on numeric features (>20 unique values).
The fix passes
nthrough tonp.histogram_bin_edges(bins=n). A regressiontest (
test_get_binned_data_honors_n_for_numeric) parametrised on four bincounts asserts the returned percent arrays have shape
(n,). The test failson
origin/main(gets ~11 bins via Sturges for the 5-bin case) and passes onthis branch.
Reproduce BEFORE/AFTER yourself (copy-paste)
```bash
Run from the repo root in a clean checkout.
git fetch origin && git fetch https://github.com/jbbqqf/evidently.git fix/1400-psi-n-bins:_pr1400
run_check() {
python - <<'PY'
import numpy as np, pandas as pd
from evidently.legacy.calculations.stattests.utils import get_binned_data
from evidently.legacy.core import ColumnType
rng = np.random.default_rng(0)
ref = pd.Series(rng.normal(size=500))
cur = pd.Series(rng.normal(size=500))
for n in (5, 10, 30, 50):
r, _ = get_binned_data(ref, cur, ColumnType.Numerical, n, feel_zeroes=False)
print(f"requested n={n}, got {len(r)} bins")
PY
}
BEFORE — origin/main: n is ignored, Sturges is used.
git checkout origin/main -- src/evidently/legacy/calculations/stattests/utils.py
pip install -q -e . >/dev/null
run_check
Expected: 'got 11 bins' (or similar) for every requested n.
AFTER — this branch: n is honored.
git checkout _pr1400 -- src/evidently/legacy/calculations/stattests/utils.py
pip install -q -e . >/dev/null
run_check
Expected: 'got N bins' matching the requested n in every line.
Restore.
git checkout origin/main -- src/evidently/legacy/calculations/stattests/utils.py
```
What I ran locally
with `assert (11,) == (5,)`, passes on this branch.
Edge cases / behavior change to flag for review
This is a behavior change for numeric features in PSI / Jensen-Shannon /
KL: users will see different drift values because the bin count now actually
matches what they configured. The previous behavior contradicted the
documented `n_bins` parameter and the issue reporter explicitly flagged this
contradiction. I'm flagging it here so a maintainer can decide whether to
ship this as a fix or guard with a deprecation toggle.
AI disclosure
This pull request was authored with assistance from Anthropic's Claude (an AI
coding assistant) running under my direction. I read the diff and reproduced
the BEFORE/AFTER behavior locally before submitting.