Add validation for CardinalityEffect::equal when executing a plan #20875
Open
xanderbailey wants to merge 3 commits into apache:main from
xanderbailey commented Mar 11, 2026
/// number of output rows as their input. This is a post-execution
/// sanity check useful for debugging correctness issues.
/// Disabled by default as it adds a small amount of overhead.
pub verify_cardinality_effect: bool, default = false
Contributor
Author
Happy to have this just be on by default... The cost IMO is pretty small: it's a single tree walk and an extra `Arc::clone`. It may also be that some operators' current `CardinalityEffect::Equal` declarations are buggy?
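For context on the cost claim, cloning an `Arc` bumps a reference count and copies a pointer; it does not copy the value behind it. A minimal std-only illustration follows. `Plan` and `keep_handle` are hypothetical names used only here, not anything from the PR, and the real handle would presumably be a trait object such as `Arc<dyn ExecutionPlan>`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an executed plan (not a DataFusion type).
struct Plan {
    operators: Vec<String>,
}

/// Keep a second handle to the plan so it can be inspected after execution.
/// Only the reference count changes; the `Plan` itself is not copied.
fn keep_handle(plan: &Arc<Plan>) -> Arc<Plan> {
    Arc::clone(plan)
}
```

Both handles point at the same allocation, so the extra clone costs one atomic increment regardless of how large the plan is.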
Which issue does this PR close?
N/A. This is a preventive measure inspired by #20683 / #20672, where `RepartitionExec` silently dropped rows when spilling under memory pressure.

Rationale for this change
The bug fixed in #20672 (repartition dropping data when spilling) was a silent correctness issue: queries returned wrong results with no error. This class of bug is particularly dangerous because it can go undetected.
DataFusion operators already declare their expected cardinality effect via `CardinalityEffect::Equal` (meaning "I output exactly as many rows as I receive"), but nothing actually verified this at runtime. This PR adds a post-execution sanity check that catches any such violation, so bugs like #20683 can be detected immediately rather than silently producing wrong results.

What changes are included in this PR?
- `datafusion.execution.verify_cardinality_effect` (default: `false`) in `ExecutionOptions`
- `cardinality_check` module in `datafusion-physical-plan` with a `validate_cardinality_effect()` function that walks the executed plan tree using `ExecutionPlanVisitor` and verifies that every operator declaring `CardinalityEffect::Equal` produced the same number of output rows as its input (based on post-execution metrics)
- `collect()` and `collect_partitioned()` run the validation after all streams are consumed, when the config flag is enabled
- `#[derive(Debug, Clone, Copy)]` on `CardinalityEffect`; the enum was missing these basic derives

The check gracefully skips operators where metrics are unavailable, where `fetch` limits are set, or where the operator is not unary (one in, one out). When a violation is detected, it returns `DataFusionError::Internal` with the operator name and mismatched row counts.

Are these changes tested?
Yes. Six unit tests cover:
`fetch` set

Are there any user-facing changes?
New config option `datafusion.execution.verify_cardinality_effect` (default `false`). When enabled via `SET execution.verify_cardinality_effect = true`, DataFusion will validate row count invariants after query execution and return an error if any operator that should preserve row counts (like `RepartitionExec`, `ProjectionExec`, `CoalesceBatchesExec`, `SortExec`, etc.) produces a different number of rows than its input.
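The behavior described in this PR (a flag-gated, post-execution walk that compares row counts for operators declaring `CardinalityEffect::Equal`) can be sketched roughly as below. This is a simplified standalone model, not the PR's code. `PlanNode`, `output_rows`, `has_fetch`, `validate`, and `finish_collect` are hypothetical stand-ins for the executed plan, its metrics, `validate_cardinality_effect()`, and the `collect()` hook. The enum lists only two variants, the options struct carries only the one flag, and the error is a plain `String` rather than `DataFusionError::Internal`.

```rust
/// Simplified stand-in for DataFusion's `CardinalityEffect` (two variants only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CardinalityEffect {
    Unknown,
    Equal,
}

/// Hypothetical stand-in for the relevant part of `ExecutionOptions`.
#[derive(Debug, Clone, Copy, Default)]
struct ExecutionOptions {
    /// `Default` yields `false`, matching the default stated in the PR.
    verify_cardinality_effect: bool,
}

/// Hypothetical executed-plan node carrying post-execution row counts.
#[derive(Debug)]
struct PlanNode {
    name: &'static str,
    effect: CardinalityEffect,
    /// Rows produced by this operator; `None` when no metrics were recorded.
    output_rows: Option<usize>,
    /// True when a `fetch` limit is set on the operator.
    has_fetch: bool,
    children: Vec<PlanNode>,
}

/// Walk the tree and fail on the first checkable `Equal` operator whose
/// output row count differs from its single child's output row count.
fn validate(node: &PlanNode) -> Result<(), String> {
    let checkable = node.effect == CardinalityEffect::Equal
        && !node.has_fetch
        && node.children.len() == 1;
    if checkable {
        // Skip silently when either side has no metrics.
        if let (Some(out), Some(input)) = (node.output_rows, node.children[0].output_rows) {
            if out != input {
                return Err(format!(
                    "{} declares CardinalityEffect::Equal but produced {} rows from {} input rows",
                    node.name, out, input
                ));
            }
        }
    }
    node.children.iter().try_for_each(validate)
}

/// Run the check after execution, and only when the flag is enabled.
fn finish_collect(opts: ExecutionOptions, plan: &PlanNode) -> Result<(), String> {
    if opts.verify_cardinality_effect {
        validate(plan)?;
    }
    Ok(())
}
```

In this model, a `RepartitionExec` node that reports 90 output rows over a child reporting 100 produces an error only when `verify_cardinality_effect` is `true`. With the flag off, or with `has_fetch` set on the node, the mismatch passes through unchecked.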