Skip to content

Conversation

jonhehir
Copy link

@jonhehir jonhehir commented Jul 25, 2025

The QuantileDigestIterator claims to perform a post-order traversal, but it actually has been performing in-order traversal. This is troublesome for most Q-Digest algorithms, which depend on post-order traversal.

The upperBoundQuantile() and lowerBoundQuantile() functions were also not correct and have been removed and replaced with a simple cumulativeProportion(). (To obtain lower bound and upper bound quantiles correctly, you need to separately perform forward and reverse iteration.)

These are breaking changes, but the functions have no known usage.

See #115 for details and example.

@jonhehir jonhehir requested a review from a team as a code owner July 25, 2025 21:55
@jonhehir jonhehir force-pushed the qdigest-iterator-fix branch from 8c19c5d to b55370e Compare July 25, 2025 22:01
The QuantileDigestIterator claims to perform a post-order traversal,
but it actually has been performing in-order traversal. This is
troublesome for most Q-Digest algorithms, which depend on post-order
traversal.

The `upperBoundQuantile()` and `lowerBoundQuantile()` functions were also
not correct and have been removed and replaced with a simple
`cumulativeProportion()`. (To obtain lower bound and upper bound quantiles,
you really need to separately perform forward and reverse iteration.)

These are breaking changes, but the functions have no known usage.
@jonhehir jonhehir force-pushed the qdigest-iterator-fix branch from b55370e to bac66dd Compare July 25, 2025 22:04
@tdcmeehan tdcmeehan self-assigned this Jul 30, 2025
@tdcmeehan
Copy link

Thanks a lot for fixing these bugs. I'll have time to review this properly in the couple of weeks. Thank you for writing test cases that demonstrate your fix.

@jonhehir
Copy link
Author

Thanks, @tdcmeehan! Do you happen to know if this is being used anywhere? I want to make sure I'm not breaking anything, but I don't see any usage in OSS Presto.

@tdcmeehan
Copy link

I don't think so. IIRC, I had an idea to generate approximate histograms using Q-Digest using this iterator, but I don't think I ever wrote the function that would use it.

@jonhehir
Copy link
Author

Thanks! In essence that's what I was looking to do here too, but with empirical distribution functions instead of histograms. Since the postOrderTraversal function is private, I wanted to use QuantileDigestIterator instead.

@jonhehir
Copy link
Author

@tdcmeehan: Any chance you can review this soon? Thank you!

@tdcmeehan
Copy link

Thanks @jonhehir

@tdcmeehan tdcmeehan merged commit c0e4022 into prestodb:master Aug 20, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants