Conversation


@sgrebnov sgrebnov commented Sep 21, 2025

What changes are proposed in this pull request?

This PR implements re-enable file skipping on timestamp columns #1002 by subtracting 999 microseconds from the predicate value used for filtering, keeping potentially matching entries. Timestamp max stats are truncated to millisecond precision, so the true per-file maximum can exceed the recorded stat by up to 999 microseconds; relaxing the comparison value by that amount compensates. Note: DataSkippingPredicateEvaluator only performs initial filtering; final filtering is done based on the actual data content/metadata, so it is valid to relax the filter here.

This is an important optimization for fetching only the data that falls in a specific time window, for example the last X days.

The following approach was also reviewed/considered: instead of modifying val, generate a comparison expression that adds 999 µs to the max value. However, this ran into a limitation: DataFusion does not support Timestamp + Int64 operations (it requires an Interval). Since the Delta Kernel/Delta spec does not have an Interval type, introducing a new type for Expr seemed unreasonable.
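
To illustrate the idea with plain numbers, here is a minimal, self-contained sketch; the i64 microsecond values and the function name are hypothetical, whereas the actual patch works on Scalar values via a timestamp_subtract helper:

fn max_stat_might_match_gt(truncated_max_us: i64, val_us: i64) -> bool {
    // Max stats for timestamps are truncated down to millisecond precision,
    // so the true per-file max can exceed the stored stat by up to 999 us.
    // Relaxing the comparison value by 999 us compensates: it can only keep
    // extra files, never skip a file whose true max satisfies `col > val`.
    truncated_max_us > val_us - 999
}

fn main() {
    // A true max of 2500 us is stored truncated as 2000 us.
    assert!(max_stat_might_match_gt(2000, 2250));  // kept: true max 2500 > 2250
    assert!(!max_stat_might_match_gt(2000, 3500)); // skipped, and safely so
}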

How was this change tested?

Updated unit tests and tested manually against a Delta Lake table (a Databricks liquid clustering table).

@sgrebnov (Author)

@zachschuermann - I’ve implemented the improvement item you created; could you review it? We’ve found this functionality to be really critical for our use case, and I think it’s a very common one, so it would be great to improve.

#1002

@zachschuermann zachschuermann requested review from nicklan, scovich and zachschuermann and removed request for zachschuermann September 25, 2025 23:22
@zachschuermann (Member)

awesome, thanks for tackling this @sgrebnov! Should be able to get to a review today :)

Comment on lines +854 to +858
let max_ts_adjusted = timestamp_subtract(val, 999);
tracing::debug!(
"Adjusted timestamp value for col {col} for max stat comparison from {val:?} to {max_ts_adjusted:?}"
);
return self.eval_partial_cmp(ord, max, &max_ts_adjusted, inverted);
Collaborator

should this always be subtracting from val?

Suppose the max is 2500, but was truncated to 2000.
Suppose the value is 2250 => adjusted to 1251.

for expression:     value < max
actual:             2250  < 2500 == true
adjusted/truncated: 1251  < 2000 == true.
=> correctly not filtered

for expression:     value > max
actual:             2250  > 2500 == false
adjusted/truncated: 1251  > 2000 == false
=> correctly filtered


now suppose value is 2750 => adjusted to 1751

for expression:     value < max:
actual:             2750  < 2500 == false
adjusted/truncated: 1751  < 2000 == true 
=> not filtered (safe)

for expression:     value > max:
actual:             2750  > 2500 == true
adjusted/truncated: 1751  > 2000 == false 
=> filtered (not safe)❗ ❗ 

Am I missing something here? Could you add some extra details/context in comments to serve as a proof?

@OussamaSaoudi (Collaborator) Sep 26, 2025

My guess is that we're never doing a comparison value > max since we only ever want to check value <= max => ! (value < max).

EDIT: should be value <= max => ! (value > max).

@sgrebnov (Author) Sep 29, 2025

@OussamaSaoudi - thank you for the deep dive and review! Yeah, I was thinking the same when working on the fix: we only do max > value or !(max < value), and decreasing val helps keep more records in both cases. Please let me know if you would like me to be more specific and match only ord=Greater, inverted=false and ord=Less, inverted=true.


Other reasoning I used:

  1. min is always truncated down and needs no special logic, so the symmetric approach should work for max if we increase it (and we effectively increase max by decreasing the val used for the comparison).
  2. If increasing max is valid (keeps more records), i.e. (max + 999) operator val is a safe relaxation, then shifting both sides down by 999 gives (max + 999 - 999) operator (val - 999), which is equally valid; that is exactly our current logic, max operator (val - 999).

Collaborator

Please let me know if you would like me to be more specific and add match for ord=Greater, inverted=false and ord=Less, inverted=true only

Good idea! Let's handle it case by case.

  1. max > value or max >= value. Represented by (ord=Greater, inverted=false) and (ord=Less, inverted=true).
    • Proof: if max >= value + 999, it must be the case that max >= value.
  2. max < value or max <= value. Represented by (ord=Less, inverted=false) and (ord=Greater, inverted=true).
    • Proof: if max <= value - 999, then it must be the case that max <= value.
    • We can adjust this to max + 999 <= value to avoid underflow.
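
A self-contained sketch of that case split (a hypothetical function over plain i64 microseconds; the real code lives in partial_cmp_max_stat and works on Scalar values):

use std::cmp::Ordering;

// Evaluate a comparison of the truncated max stat against the predicate value.
fn eval_max_stat_cmp(max: i64, val: i64, ord: Ordering, inverted: bool) -> Option<bool> {
    let result = match (ord, inverted) {
        // Case 1: max > val / max >= val. Relax val downward: if even the
        // relaxed check fails, the true max (at most 999 us above the stat)
        // fails the original check too.
        (Ordering::Greater, false) => max > val - 999,
        (Ordering::Less, true) => !(max < val - 999),
        // Case 2: max < val / max <= val. Tighten instead, written as
        // max + 999 to avoid underflow: a TRUE answer is then reliable
        // even though the stat is truncated.
        (Ordering::Less, false) => max + 999 < val,
        (Ordering::Greater, true) => !(max + 999 > val),
        // A truncated stat can never prove microsecond equality.
        (Ordering::Equal, _) => return None,
    };
    Some(result)
}

fn main() {
    // Stat 2000 us (true max somewhere in 2000..=2999), checking max >= 2250:
    assert_eq!(eval_max_stat_cmp(2000, 2250, Ordering::Less, true), Some(true));
}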

Collaborator

CC @scovich Would love your eyes on this.

@scovich (Collaborator) Sep 30, 2025

AFAIK, we only have four rewrites for inequalities:

  • col < val => stats.min.col < val
  • NOT(col < val) => NOT(stats.max.col < val)
  • col > val => stats.max.col > val
  • NOT(col > val) => NOT(stats.min.col > val)

If I understand correctly, the two involving max stats would be changed to:

  • col > val => stats.max.col > val - 999
  • NOT(col < val) => NOT(stats.max.col < val - 999)

And because this is data skipping, we only care whether the new rewrite might produce a FALSE where the old rewrite did not, because that corresponds to wrongly skipping a file.

For the first: if stats.max.col > val - 999 is FALSE, then the max-value is "too small" and stats.max.col > val must also return FALSE.

For the second, let's simplify a bit by pushing down the NOT:

  • NOT(stats.max.col < val - 999) => stats.max.col >= val - 999

If stats.max.col >= val - 999 is FALSE, then the max-value is again "too small" and stats.max.col >= val must also return FALSE.

AFAICT, the rewrite is sound, because any time it returns FALSE the original predicate also returned FALSE.
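
As a concrete check of that argument, reusing the numbers from the earlier example (true max 2500 µs stored truncated as 2000 µs; the plain i64s are only for illustration):

fn main() {
    let (stats_max, true_max) = (2000i64, 2500i64);

    // col > val  =>  stats.max.col > val - 999
    let val = 2250;
    assert!(stats_max > val - 999); // kept, correctly: true max 2500 > 2250

    let val = 3500;
    assert!(!(stats_max > val - 999)); // FALSE => file skipped...
    assert!(!(true_max > val));        // ...soundly: the true max fails `> val` too
}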

@OussamaSaoudi OussamaSaoudi removed the request for review from nicklan September 29, 2025 21:58

codecov bot commented Sep 29, 2025

Codecov Report

❌ Patch coverage is 89.47368% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.39%. Comparing base (c1c301b) to head (068be65).
⚠️ Report is 7 commits behind head on main.

Files with missing lines              Patch %   Lines
kernel/src/kernel_predicates/mod.rs   85.71%    2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1333   +/-   ##
=======================================
  Coverage   84.38%   84.39%           
=======================================
  Files         112      112           
  Lines       27773    27766    -7     
  Branches    27773    27766    -7     
=======================================
- Hits        23437    23433    -4     
+ Misses       3202     3199    -3     
  Partials     1134     1134           


@scovich scovich left a comment

A couple of high-level thoughts:

  1. My gut feeling is that it would be better to put the checks directly in the DataSkippingPredicateEvaluator::eval_pred_[le|gt|eq] methods, because the Ordering is hard-wired in each of those three places. Helper functions to reduce boilerplate are always encouraged when appropriate. What do others think of that idea?
  2. What about strings? IIRC, Delta-spark truncates string stats to some maximum number of characters, which causes similar headaches for data skipping over the max stat. Do we even try to handle that today in kernel? I didn't see any obvious code, and I'm not sure there's even an issue tracking it. Meanwhile, I wonder if we could use a similar trick, by ensuring that the val we compare is always at least one character shorter than the max-stat value? Except we don't know that length beforehand, and it could vary from file to file. Hmm.

return self.eval_partial_cmp(ord, max, &max_ts_adjusted, inverted);
}
// Equality comparisons can't be applied, as the max stat is truncated to milliseconds, so the actual microsecond value is unknown.
Ordering::Equal => return None,
Collaborator

Equality is anyway rewritten as a pair of inequalities (one on min-stat, and another on max-stat).
See DataSkippingPredicateEvaluator::eval_pred_eq.

@sgrebnov (Author)

This still seems to be used for the != predicate (eval_pred_eq with inverted: true):
https://github.com/delta-io/delta-kernel-rs/blob/main/kernel/src/kernel_predicates/mod.rs#L898

Collaborator

Yes, so we never actually use ==, only !=. Also, IIRC, the != case can be expressed as a pair of range inequalities, which would simplify things: eval_pred_eq would just reuse whatever eval_pred_lt and eval_pred_gt do, as needed.
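
A rewrite along those lines might look like the following (a hypothetical composition over plain i64 microseconds; the real eval_pred_* methods operate on stats expressions):

// col != val  <=>  col < val OR col > val, so a file may contain matching
// rows iff stats.min.col < val OR stats.max.col > val - 999 (the max side
// relaxed for millisecond-truncated timestamp stats).
fn file_might_contain_ne(min: i64, truncated_max: i64, val: i64) -> bool {
    min < val || truncated_max > val - 999
}

fn main() {
    // val = 2500: a file whose min stat is 1000 clearly holds rows != 2500.
    assert!(file_might_contain_ne(1000, 2000, 2500));
    // min stat == val and max stat just below: still kept, because the true
    // max may exceed the truncated stat by up to 999 us.
    assert!(file_might_contain_ne(2500, 2000, 2500));
}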

@sgrebnov (Author)

A couple of high-level thoughts:

  1. My gut feeling is that it would be better to put the checks directly in the DataSkippingPredicateEvaluator::eval_pred_[le|gt|eq] methods, because the Ordering is hard-wired in each of those three places. Helper functions to reduce boilerplate are always encouraged when appropriate. What do others think of that idea?
  2. What about strings? IIRC, Delta-spark truncates string stats to some maximum number of characters, which causes similar headaches for data skipping over the max stat. Do we even try to handle that today in kernel? I didn't see any obvious code, and I'm not sure there's even an issue tracking it. Meanwhile, I wonder if we could use a similar trick, by ensuring that the val we compare is always at least one character shorter than the max-stat value? Except we don't know that length beforehand, and it could vary from file to file. Hmm.

Thank you for the review @scovich 🙏

1. I'll rely on the team here for guidance/a decision. One concern with moving this logic directly into eval_pred_[le|gt|eq] is that we would need to duplicate it in 3 different places (column type matching, etc.) - I see that partial_cmp_max_stat is used in eval_pred_eq, eval_pred_lt, and eval_pred_gt.

@scovich (Collaborator) commented Sep 30, 2025

One concern with moving this logic directly into eval_pred_[le|gt|eq] is that we would need to duplicate it in 3 different places (column type matching, etc.) - I see that partial_cmp_max_stat is used in eval_pred_eq, eval_pred_lt, and eval_pred_gt.

Yes, but each site is significantly simpler and clearer. My intuition is that it will be a net win.

@zachschuermann (Member) left a comment

@sgrebnov any updates here? how can we help?

@sgrebnov (Author)

@zachschuermann, @scovich - thank you for the additional details and suggestions - I’ll go through them in detail and follow up over the next few days.

@zachschuermann (Member)

@zachschuermann, @scovich - thank you for the additional details and suggestions - I’ll go through them in detail and follow up over the next few days.

awesome thanks! just wanted to check in :)
