Expose kmeans to python #729

Open
wants to merge 25 commits into
base: branch-25.06

Conversation

benfred (Member) commented Feb 26, 2025

No description provided.

benfred added the improvement and non-breaking labels Feb 26, 2025
benfred self-assigned this Feb 26, 2025

copy-pr-bot bot commented Feb 26, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

cjnolet (Member) commented Feb 27, 2025

@benfred this looks great, but one of the things we're being asked for quite a bit today is to expose the hierarchical kmeans to Python. Any chance we can also expose those functions? I don't mind doing it as a follow-up, given that this PR is already feature complete.

benfred (Member, Author) commented Feb 27, 2025

/ok to test

benfred marked this pull request as ready for review February 27, 2025 22:20
benfred requested review from a team as code owners February 27, 2025 22:20
benfred changed the base branch from branch-25.04 to branch-25.06 April 10, 2025 20:53

rmm::device_uvector<char> workspace(n_samples * sizeof(IndexT), stream);

rmm::device_uvector<DataT> x_norms(n_samples, stream);
Contributor

Can the newer mdarray/mdspan API be used here, both for the memory allocation and for the calls to raft functions that accept it?

Member Author

I've used the newer mdarray functions where possible in the last commit. There are some cases where a device_uvector is expected, such as the workspace, so I've left those as is.
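
For readers unfamiliar with the distinction raised in this thread, here is a minimal host-side C++ sketch. It is not RAFT or cuVS code. `MatrixView`, `VectorView`, and `row_sq_norms` are hypothetical names invented for illustration. They only mimic the idea behind mdspan-style arguments, where a view carries its own shape instead of a raw pointer plus separately passed sizes. The sketch also assumes that a buffer like `x_norms` holds one squared L2 norm per sample, which the snippet under review does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning, row-major 2-D view (illustration only, not a RAFT type).
template <typename T>
struct MatrixView {
  T* data;
  std::size_t rows;
  std::size_t cols;
  T& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Hypothetical non-owning 1-D view (illustration only, not a RAFT type).
template <typename T>
struct VectorView {
  T* data;
  std::size_t size;
  T& operator[](std::size_t i) const { return data[i]; }
};

// Writes the squared L2 norm of each row of x into out.
// The shapes travel with the views, so the size check lives in one place.
template <typename T>
void row_sq_norms(MatrixView<const T> x, VectorView<T> out) {
  assert(out.size == x.rows);
  for (std::size_t i = 0; i < x.rows; ++i) {
    T acc = T(0);
    for (std::size_t j = 0; j < x.cols; ++j) {
      acc += x(i, j) * x(i, j);
    }
    out[i] = acc;
  }
}
```

Calling `row_sq_norms` on a 2x2 row-major matrix holding {3, 4, 1, 2} with a two-element output fills the output with 25 and 5.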

benfred requested a review from lowener April 24, 2025 04:54
Labels: CMake, cpp, improvement, non-breaking, Python
Projects
Status: In Progress
3 participants