Feature Request: Handle Tied Distances in k-Nearest Neighbor Search

In scenarios where multiple neighbors have the same distance to a query point (ties), the current implementation deterministically returns the same k neighbors based on the tree traversal or data ordering. This is problematic for datasets where ties are common, such as those with duplicate points or rounded values.

**Problem:**
When ties occur and the number of tied neighbors exceeds k, the library always selects the same k neighbors. This can lead to biased results and limits diversity in downstream applications like stochastic modeling or simulations.

For example, a query point with 1000 equidistant neighbors (all at distance 0) and k = 10 will always return the same 10 neighbors. Users currently have no way to randomize the selection among tied points, which reduces flexibility and fairness.

**Proposed Solution:**
Introduce an option to handle ties during k-NN queries:

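1. Detect tied distances at the k-th neighbor boundary, i.e. candidates whose distance equals that of the k-th nearest neighbor.
2. When that tied group is larger than the number of remaining slots, fill the slots by sampling randomly from the tied group; strictly closer neighbors are always kept.
3. Add an optional parameter (e.g., `resolve_ties = TRUE`) to enable or disable this behavior.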
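
To make the intended behavior concrete, here is a minimal sketch of the selection step in Python (chosen only for brevity; it is not libnabo code). The function name `select_k_with_ties`, the `rng` argument, and the `(distance, index)` candidate format are all hypothetical, and the sketch assumes the candidate list already contains every point tied at the k-th distance, which a real tree search would have to guarantee. Ties are detected by exact equality; a tolerance may be preferable for floating-point distances.

```python
import random


def select_k_with_ties(candidates, k, resolve_ties=True, rng=None):
    """Pick k nearest candidates, sampling randomly among ties at the cutoff.

    candidates: list of (distance, index) pairs in any order (hypothetical format).
    """
    if k <= 0:
        return []
    ordered = sorted(candidates)  # by distance, then by index
    if len(ordered) <= k or not resolve_ties:
        # Deterministic behavior: always the same first k entries.
        return ordered[:k]
    if rng is None:
        rng = random.Random()
    cutoff = ordered[k - 1][0]  # distance of the k-th nearest candidate
    closer = [c for c in ordered if c[0] < cutoff]   # always kept
    tied = [c for c in ordered if c[0] == cutoff]    # compete for remaining slots
    picked = rng.sample(tied, k - len(closer))
    return closer + sorted(picked)


# 1000 duplicates at distance 0 plus one farther point, k = 10
candidates = [(0.0, i) for i in range(1000)] + [(1.0, 1000)]
neighbors = select_k_with_ties(candidates, 10, rng=random.Random(1))
```

With `resolve_ties=False` the function falls back to the current deterministic behavior, so existing results would be unchanged unless the option is enabled. Passing a seeded generator keeps randomized runs reproducible.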

**Why This Matters:**
Tied distances are common in real-world datasets due to:

1. Exact matches or duplicate points.
2. Rounded or discretized data.

Without tie handling, deterministic selection introduces bias and limits the applicability of libnabo for use cases requiring diverse or randomized neighbor selection.

**Request:**
Would it be possible to add support for resolving ties as an optional feature in libnabo? This would improve the library’s flexibility and usability for datasets where ties frequently occur.

Thank you for your work on this library!

Feature Request: Handle Tied Distances in k-Nearest Neighbor Search #143