[DO NOT MERGE] Tracking scikit-learn failure fixes #6451
Draft: csadorf wants to merge 22 commits into rapidsai:branch-25.04 from csadorf:do-not-merge/all-scikit-learn-test-failure-related-changes
This update introduces a new function, `support_array_like`, to handle conversion of list and tuple inputs to NumPy arrays when the accelerator is active. Additionally, the `CumlArray` class has been modified to support this conversion during initialization, ensuring compatibility with array-like data types.

Changes include:
- New `support_array_like` function in `api_decorators.py`.
- Updated `CumlArray` to convert list/tuple inputs to NumPy arrays when the accelerator is active.
- Detect lists and tuples as array-like in `input_utils.py` in accel mode.
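The conversion described in this commit can be illustrated with a small decorator. This is only a sketch under stated assumptions: `support_array_like_sketch`, `ACCEL_ACTIVE`, and `describe` are hypothetical names introduced here, and the actual `support_array_like` in `api_decorators.py` may be structured differently.

```python
import functools

import numpy as np

# Hypothetical stand-in for the accelerator state (assumption, not cuML's API).
ACCEL_ACTIVE = True


def _to_array_if_sequence(value):
    """Convert lists and tuples to NumPy arrays; pass everything else through."""
    if isinstance(value, (list, tuple)):
        return np.asarray(value)
    return value


def support_array_like_sketch(func):
    """Convert list/tuple arguments to NumPy arrays while the accelerator is on."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not ACCEL_ACTIVE:
            return func(*args, **kwargs)
        args = tuple(_to_array_if_sequence(a) for a in args)
        kwargs = {k: _to_array_if_sequence(v) for k, v in kwargs.items()}
        return func(*args, **kwargs)

    return wrapper


@support_array_like_sketch
def describe(X, y=None):
    """Toy function that reports what type and shape it actually received."""
    return type(X).__name__, getattr(X, "shape", None)
```

With the flag on, `describe([[1, 2], [3, 4]])` receives an `ndarray` of shape `(2, 2)`, while NumPy inputs are passed through unchanged.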
This commit introduces a new test file, `test_array_like_input.py`, which includes tests for the `support_array_like` function and the `CumlArray` class. The tests validate the conversion of list and tuple inputs to NumPy arrays, as well as the handling of NumPy and CuPy arrays when the accelerator is active.
- Introduced a new method in KMeans to dispatch to the CPU implementation when sparse arrays are detected during fitting.
- Updated the `is_sparse` function to use cupyx's and scipy's `issparse` functions for better compatibility.
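A sparse check that combines both libraries might look like the following. This is a sketch, not the code from this PR: `is_sparse_sketch` is a hypothetical name, and the optional import is an assumption made so the example also runs on machines without CuPy.

```python
import numpy as np
import scipy.sparse

try:
    # CuPy is optional here so the sketch also runs without a GPU stack.
    import cupyx.scipy.sparse as cupyx_sparse
except ImportError:
    cupyx_sparse = None


def is_sparse_sketch(X) -> bool:
    """Return True if X is a SciPy or CuPy sparse matrix."""
    if scipy.sparse.issparse(X):
        return True
    return cupyx_sparse is not None and bool(cupyx_sparse.issparse(X))


dense = np.eye(3)
sparse = scipy.sparse.csr_matrix(dense)
```

An estimator could call such a helper at the start of `fit` and fall back to the CPU implementation when it returns `True`.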
- Introduced a new test to verify that KMeans correctly dispatches to CPU when fitting with sparse input.
- Ensured that the model's attributes and predictions are validated as numpy arrays when using sparse data.
- Changed default solver from 'eig' to 'auto', allowing automatic selection of 'eig'.
- Updated documentation to reflect new solver options: 'auto', 'eig', 'svd', and 'cd'.
- Refactored solver selection logic into a new method `_select_solver` for better clarity and maintainability.
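The solver selection described above can be sketched as a small standalone function. The names `VALID_SOLVERS` and `select_solver_sketch` are hypothetical, and the only rule encoded is the one stated in the commit message ('auto' resolves to 'eig'); the real `_select_solver` may apply further conditions.

```python
VALID_SOLVERS = ("auto", "eig", "svd", "cd")


def select_solver_sketch(solver: str = "auto") -> str:
    """Resolve the user-facing solver option to a concrete solver name."""
    if solver not in VALID_SOLVERS:
        raise ValueError(
            f"Unknown solver {solver!r}; expected one of {VALID_SOLVERS}"
        )
    # Per the commit message, 'auto' currently selects 'eig'.
    return "eig" if solver == "auto" else solver
```

Keeping the resolution in one place means the default can change later without touching the fitting code.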
- Check for non-scalar intercept values and ensure correct coefficient array shapes for matrix multiplication.
- Dispatch Ridge to CPU for multi-target training.
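To make the shape handling concrete, here is a minimal sketch that follows the scikit-learn convention for linear models: `coef_` has shape `(n_features,)` for a single target or `(n_targets, n_features)` for several, and `intercept_` is a scalar or an array of shape `(n_targets,)`. The function names are hypothetical and this is not the code from this PR.

```python
import numpy as np


def is_multi_target(y) -> bool:
    """True when y has more than one target column."""
    y = np.asarray(y)
    return y.ndim == 2 and y.shape[1] > 1


def predict_linear_sketch(X, coef, intercept):
    """Compute X @ coef + intercept for single- or multi-target coefficients."""
    X = np.asarray(X, dtype=float)
    coef = np.asarray(coef, dtype=float)
    intercept = np.asarray(intercept, dtype=float)
    if coef.ndim == 2:
        # (n_samples, n_features) @ (n_features, n_targets); a non-scalar
        # intercept of shape (n_targets,) broadcasts across the rows.
        return X @ coef.T + intercept
    return X @ coef + intercept
```

A dispatcher could use `is_multi_target(y)` to decide that a fit should run on the CPU implementation.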
…-scikit-learn-test-failure-related-changes
…o do-not-push/all-scikit-learn-test-failure-related-changes
…ot-push/all-scikit-learn-test-failure-related-changes
…-test-failure-related-changes
…arn-test-failure-related-changes
Updated the `args_to_cpu` method to allow NoneType as a valid argument alongside numbers and strings.

Fixes skl tests:
- test_inplace_data_preprocessing[42-False-csr_matrix]
- test_linear_regression_sample_weight_consistency[42-False-csr_matrix]
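The pass-through rule can be sketched as follows. The helper names and the `to_host` callable are hypothetical; the real `args_to_cpu` converts device data with cuML's own machinery, which is not reproduced here.

```python
import numbers


def is_passthrough_arg(value) -> bool:
    """Numbers, strings, and None need no host conversion."""
    return value is None or isinstance(value, (numbers.Number, str))


def args_to_cpu_sketch(args, to_host):
    """Apply `to_host` only to arguments that are not plain pass-through values."""
    return tuple(a if is_passthrough_arg(a) else to_host(a) for a in args)
```

For example, `args_to_cpu_sketch((1, "a", None, [1, 2]), to_host=tuple)` leaves the first three values untouched and converts only the list.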
Updated the `input_to_cuml_array` function to convert scalar values into arrays of appropriate shape, so that scalar inputs are handled alongside array inputs.

Fixes skl tests:
- test_linear_regression_sample_weight_consistency[42-False-None]
- test_linear_regression_sample_weight_consistency[42-True-None]
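A scalar-to-array promotion of this kind might look like the sketch below. The commit message does not say which shape counts as appropriate, so the one-element 1-D array used here is an assumption, and `scalar_to_array_sketch` is a hypothetical name.

```python
import numpy as np


def scalar_to_array_sketch(value):
    """Promote a scalar to a 1-D array; return other inputs unchanged."""
    if np.isscalar(value):
        # Assumption: a scalar becomes a one-element 1-D array.
        return np.atleast_1d(value)
    return value
```

Note that `np.isscalar(None)` is `False`, so `None` is left alone and still has to be handled by the caller.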
Updated the Ridge class to include a check for wide datasets (n_features > n_samples) when using the SVD solver. This addition improves the decision-making process for dispatching to the CPU implementation during fit-related functions, ensuring better performance for specific data shapes.

Fixes skl tests:
- test_ridge_sample_weight_consistency[42-svd-wide-None-False]
- test_ridge_sample_weight_consistency[42-svd-wide-None-True]
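The wide-dataset condition reduces to a comparison on the input shape. The function below is a hypothetical illustration of that check only; the actual dispatch logic in the Ridge class may consider additional conditions.

```python
def should_dispatch_to_cpu_sketch(shape, solver: str) -> bool:
    """True for the SVD solver on wide data (n_features > n_samples)."""
    n_samples, n_features = shape
    return solver == "svd" and n_features > n_samples
```

For instance, a `(10, 50)` input with `solver="svd"` would be routed to the CPU, while the same input with `solver="eig"` would not.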
Labels
- bug: Something isn't working
- Cython / Python: Cython or Python issue
- DO NOT MERGE: Hold off on merging; see PR for details
- non-breaking: Non-breaking change
No description provided.