Skip to content

Conversation

@lucagrementieri
Copy link
Contributor

@lucagrementieri lucagrementieri commented Nov 27, 2025

The definition of group_distance is not very clear. This PR tries to improve it and it also changes the code to be more internally coherent. Feel free to reject the PR if the original intent of the function is the one currently implemented.

The current description of group_distance says

Find groups of points which have neighbours closer than radius, where no two points in a group are farther than distance apart.

If we accept overlapping groups the implementation is straightforward and it's sufficient to use a ball query search for every point to find the neighborhood of every point.

On the other hand, if the grouping is meant to create distinct and separate groups the task is more difficult to define because for the point A, B, C we can have ||A-B|| < distance; ||B-C|| < distance and ||A-C|| > distance.
In that case B can be both in the group of A and in the group of C and we could even decide to merge the two groups in a single group since the share a common element.

Merging the group would break any guarantee on the radius of the group so it's probably not the aim of this function, also trimesh.grouping.clusters does exactly that.

The current implementation in this situation assigns B to both the group of A and the group of C allowing for overlapping groups, but it avoids creating the group of B that could include all the points A, B, C. This design does not seem very coherent because overlapping groups are allowed only if the center is not part of an existing group.

The PR makes the group non-overlapping assigning B to the group of A and C in a separate group. The new implementation depends on the order of points like the previous one.

This example show the proposed change.

import numpy as np
import trimesh

points = np.array([[-0.9, 0.0], [0.0, 0.0], [0.9, 0]])
grouped_points, group_indices = trimesh.grouping.group_distance(points, 1.0)

print(grouped_points)
print(group_indices)

With the current implementation the result is

[[-0.45  0.  ]
 [ 0.45  0.  ]]
[array([0, 1]), array([1, 2])]

The new code would produce instead non-overlapping groups

[[-0.45  0.  ]
 [ 0.9   0.  ]]
[array([0, 1]), array([2])]

Maybe to make the function permutation-invariant we could favor bigger groups and remove elements from the other groups. For example in this case a single group with all the 3 points could be preferable and closer to the intent of the user.

To implement that variant favoring big groups the function should compute all the neighborhoods and sort them by size before assigning every point to a single group.

I'm happy to hear your thoughts!

@mikedh mikedh changed the base branch from main to release/precise December 3, 2025 19:58
@mikedh mikedh merged commit c5c70fb into mikedh:release/precise Dec 4, 2025
11 checks passed
@mikedh
Copy link
Owner

mikedh commented Dec 4, 2025

Makes sense to me, thanks for the PR! Yeah I agree group_distance isn't very well defined, it's basically "lazy clustering." Returning groups that overlap doesn't seem great, just looking at some simple 2D points the behavior with your change looks a lot nicer:

group_distance on main

Total grouped points: 128 / 100
Group Count: 38
Group Length: 3.3684210526315788 mean 1 min 7 max
group_distance_main

group_distance in this PR

Total grouped points: 100 / 100
Group Count: 41
Group Length: 2.4390243902439024 mean 1 min 8 max
group_distance_post

Test script

import matplotlib.pyplot as plt
import numpy as np

import trimesh

if __name__ == "__main__":
    points = np.random.random((100, 2))

    d = trimesh.grouping.group_distance(points, 0.1)[1]

    for g in d:
        plt.scatter(*points[g].T)

    stats = np.array([len(g) for g in d])

    print(f"Total grouped points: {stats.sum()} / {len(points)}")
    print(f"Group Count: {len(d)}")
    print(f"Group Length: {stats.mean()} mean {stats.min()} min {stats.max()} max")

    plt.show()

@lucagrementieri lucagrementieri deleted the fix-group-distance branch December 4, 2025 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants