Skip to content

linear interpolation leads to some crazy estimates #14

@astolman

Description

@astolman

The linear interpolation you wrote for estimating the biases seems like a reasonable approach, but I found a case where it leads to absurd estimates. I had a counter of precision 5 with a raw estimate of 128 point 6 something and the two closest raw estimates libcount found in the table were like 128.3464 and 128.3462, and their respective biases are about 1 apart. The linear interpolation said that it's bias should be in the thousands since the straight line connecting those two closest points is so steep. As a result, I got back a negative cardinality estimation, which is definitely not right. The spec in the paper just says "interpolate" without specifying how, so I would instead of just looking for the two nearest neighbors, look for the interval which contains the raw estimate and find where the estimate lies on that line. That seems like another reasonable interpolation and avoids the bias estimate for a given value to be wildly larger than that of its nearest neighbors.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions