@wtn wtn commented Dec 3, 2025

Fixes #20951.

When using group_by() with a quantile parameter that varies per group (e.g., pl.col.quantile.first()), all groups incorrectly received the same quantile value instead of each group using its own.

Reproduction

import polars as pl

df = pl.DataFrame({
    "value": [1, 2, 1, 2],
    "quantile": [0, 0, 1, 1],
})
df.group_by("quantile").agg(pl.col("value").quantile(pl.col("quantile").first()))
# Expected: quantile=0 -> 1.0, quantile=1 -> 2.0
# Actual: both groups returned 1.0

Cause

AggQuantileExpr::evaluate_on_groups() always called get_quantile(), which evaluates the quantile expression against the full DataFrame and returns a single scalar. This worked for literal quantile values but failed when the quantile expression varied per group (e.g., a first() aggregation).
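
To make the cause concrete, here is a minimal pure-Python model of the two evaluation orders. It does not call Polars: `quantile_nearest` and `group_rows` are hypothetical helpers written for this illustration, and the nearest-rank rule is a simplification rather than Polars' exact interpolation logic.

```python
# Pure-Python model of the bug; no Polars calls are used here.
rows = [
    {"value": 1, "quantile": 0},
    {"value": 2, "quantile": 0},
    {"value": 1, "quantile": 1},
    {"value": 2, "quantile": 1},
]


def quantile_nearest(values, q):
    """Simplified nearest-rank quantile (an assumption, not Polars' exact rule)."""
    ordered = sorted(values)
    return float(ordered[round(q * (len(ordered) - 1))])


def group_rows(rows, key):
    """Group rows by the value stored under `key`, keeping first-seen order."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    return groups


groups = group_rows(rows, "quantile")

# Buggy order: first() is evaluated once against the full frame,
# so every group shares the same scalar quantile.
global_q = rows[0]["quantile"]
buggy = {
    key: quantile_nearest([r["value"] for r in grp], global_q)
    for key, grp in groups.items()
}

# Intended order: first() is evaluated inside each group.
fixed = {
    key: quantile_nearest([r["value"] for r in grp], grp[0]["quantile"])
    for key, grp in groups.items()
}

print(buggy)  # {0: 1.0, 1: 1.0}
print(fixed)  # {0: 1.0, 1: 2.0}
```

The `buggy` result matches the "Actual" output in the reproduction, and `fixed` matches the "Expected" output.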

Fix

Added agg_varying_quantile which accepts a slice of quantile values (one per group) and computes quantile per group using the existing aggregation helpers.

polars-core changes:

  • Added agg_helper_idx_on_all_with_idx and _agg_helper_slice_with_idx helpers that pass the group index to closures
  • Added agg_varying_quantile_generic that iterates over groups with their corresponding quantile values
  • Added agg_varying_quantile methods to Float32Chunked, Float64Chunked, integer ChunkedArray, Series, and Column
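
The shape of these helpers can be sketched in pure Python. This is only a model of the idea described above, assuming groups are given as lists of row indices; the function names mirror the PR's wording, but the signatures and the nearest-rank rule are assumptions, not the Rust implementation.

```python
from typing import Callable, List, Optional, Sequence


def agg_helper_with_idx(
    groups: Sequence[Sequence[int]],
    f: Callable[[int, Sequence[int]], Optional[float]],
) -> List[Optional[float]]:
    """Apply f(group_index, row_indices) to each group (hypothetical analogue)."""
    return [f(i, idx) for i, idx in enumerate(groups)]


def agg_varying_quantile(
    values: Sequence[float],
    groups: Sequence[Sequence[int]],
    quantiles: Sequence[float],
) -> List[Optional[float]]:
    """Compute one quantile per group, using quantiles[i] for group i."""
    if len(quantiles) != len(groups):
        raise ValueError("expected exactly one quantile per group")

    def per_group(i: int, idx: Sequence[int]) -> Optional[float]:
        if not idx:
            return None  # empty group has no quantile
        q = quantiles[i]
        if not 0.0 <= q <= 1.0:
            raise ValueError("quantile must be between 0 and 1")
        ordered = sorted(values[j] for j in idx)
        # Simplified nearest-rank selection (assumption for this sketch).
        return float(ordered[round(q * (len(ordered) - 1))])

    return agg_helper_with_idx(groups, per_group)


values = [1, 2, 1, 2]
groups = [[0, 1], [2, 3]]
result = agg_varying_quantile(values, groups, [0.0, 1.0])
print(result)  # [1.0, 2.0]
```

Passing the group index into the closure is what lets each group look up its own quantile; a closure that only receives the row indices has no way to know which quantile applies.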

polars-expr changes:

  • AggQuantileExpr::evaluate_on_groups() now detects whether the quantile is uniform (literal/scalar) or varies per group, and dispatches to the appropriate path
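
A rough Python model of that dispatch is shown below. The PR description does not say how uniformity is detected, so the length check here (one value means uniform, one value per group means varying) is an assumption for illustration, and every name in the block is hypothetical.

```python
from typing import List, Sequence


def _quantile_nearest(values: Sequence[float], q: float) -> float:
    """Simplified nearest-rank quantile (assumption, not Polars' exact rule)."""
    ordered = sorted(values)
    return float(ordered[round(q * (len(ordered) - 1))])


def evaluate_quantile_on_groups(
    values: Sequence[float],
    groups: Sequence[Sequence[int]],
    quantile_values: Sequence[float],
) -> List[float]:
    """Pick the uniform or per-group path based on the evaluated quantile."""
    if len(quantile_values) == 1:
        # Uniform path: a literal or scalar quantile is shared by all groups.
        q = quantile_values[0]
        return [_quantile_nearest([values[j] for j in idx], q) for idx in groups]
    if len(quantile_values) == len(groups):
        # Varying path: group i uses quantile_values[i].
        return [
            _quantile_nearest([values[j] for j in idx], q)
            for idx, q in zip(groups, quantile_values)
        ]
    raise ValueError("quantile must be a scalar or have one value per group")


values = [1, 2, 3, 4]
groups = [[0, 1], [2, 3]]
uniform = evaluate_quantile_on_groups(values, groups, [1.0])
varying = evaluate_quantile_on_groups(values, groups, [0.0, 1.0])
print(uniform)  # [2.0, 4.0]
print(varying)  # [1.0, 4.0]
```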

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Dec 3, 2025
@wtn wtn marked this pull request as ready for review December 3, 2025 18:00
@wtn wtn force-pushed the quantile branch 2 times, most recently from 1269250 to ea1dbcd Compare December 3, 2025 18:27
codecov bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 95.36082% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.11%. Comparing base (5dd9b23) to head (7c4fce6).

Files with missing lines                                 Patch %   Lines
...s-core/src/frame/group_by/aggregations/dispatch.rs    90.56%    5 Missing ⚠️
...polars-core/src/frame/group_by/aggregations/mod.rs    95.50%    4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #25606      +/-   ##
==========================================
+ Coverage   78.35%   81.11%   +2.75%     
==========================================
  Files        1777     1777              
  Lines      241720   241898     +178     
  Branches     3085     3085              
==========================================
+ Hits       189406   196205    +6799     
+ Misses      51517    44896    -6621     
  Partials      797      797              

@wtn wtn force-pushed the quantile branch 6 times, most recently from 652fb47 to 2fffe21 Compare December 10, 2025 19:52
@aparna2198
Contributor

@orlp @alexander-beedie @ritchie46 @reswqa @c-peters @MarcoGorelli Can we get a review here? We have a lot of duplicate issues arising from this one. Thanks.

@wtn wtn force-pushed the quantile branch 2 times, most recently from 8251863 to 9b2f5dc Compare December 29, 2025 22:14
@wtn wtn force-pushed the quantile branch 3 times, most recently from 0c543cc to cf61dd8 Compare December 30, 2025 21:19
wtn commented Dec 30, 2025

OK, I've pushed my changes. 🏓

@wtn wtn force-pushed the quantile branch 4 times, most recently from 87e08ed to 101917c Compare January 1, 2026 22:14
@wtn wtn requested a review from orlp January 1, 2026 22:15
@wtn wtn force-pushed the quantile branch 2 times, most recently from 790f349 to e4c5b7b Compare January 2, 2026 19:54
wtn commented Jan 7, 2026

Consolidated the three separate tests into one parametrized test test_quantile_varying_by_group:

  1. Removed test_quantile_varying_by_group_basic, since the original issue case is covered by the parametrized test with different per-group quantile values
  2. Renamed test_quantile_varying_by_group_parametrized to test_quantile_varying_by_group
  3. Merged temporal types (Datetime, Duration, Time) into the dtype parametrization

@wtn wtn force-pushed the quantile branch 8 times, most recently from 8c5e4ab to a9cdec2 Compare January 9, 2026 06:17
@wtn wtn requested a review from orlp January 9, 2026 08:29
@wtn wtn force-pushed the quantile branch 8 times, most recently from eb2a7b3 to 5f80e32 Compare January 16, 2026 17:36

Development

Successfully merging this pull request may close these issues.

Varying quantile by group is broken

4 participants