Support weights="distance" for KNeighbors* in cuml.accel #6554

Merged: 2 commits merged into rapidsai:branch-25.06 on Apr 24, 2025

Conversation

@jcrist (Member) commented Apr 18, 2025

Previously we would fail if the user specified weights="distance" to KNeighborsClassifier/KNeighborsRegressor. This fixes that and adds a test.
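For context, `weights="distance"` means each neighbor's vote is weighted by the inverse of its distance to the query point, with zero-distance neighbors taking all the weight (scikit-learn's convention). A minimal pure-Python sketch of distance-weighted KNN regression, illustrative only and not cuml's or scikit-learn's implementation:

```python
def knn_predict(X_train, y_train, x, k=3, weights="distance"):
    """Distance-weighted k-nearest-neighbors regression (illustrative sketch)."""
    # Euclidean distance from the query point to every training point
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) ** 0.5 for row in X_train]
    # Indices of the k nearest training points
    order = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    if weights == "uniform":
        w = [1.0] * k
    else:  # weights="distance": closer neighbors get proportionally more weight
        # scikit-learn convention: zero-distance neighbors take all the weight
        exact = [i for i in order if dists[i] == 0.0]
        if exact:
            return sum(y_train[i] for i in exact) / len(exact)
        w = [1.0 / dists[i] for i in order]
    return sum(wi * y_train[i] for wi, i in zip(w, order)) / sum(w)
```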

Part of fixing this required changing the logic in dispatch_func to not special-case inference methods. Previously we would always run inference on the GPU, even if _gpuaccel was False (meaning the specified hyperparameters weren't supported by cuml). I don't believe that was the desired logic: if cuml doesn't support the specified hyperparameters for fit, we can't be sure it supports them for predict. Further, since the fitted state is already stored on the CPU estimator, running inference on CPU makes more sense anyway IMO. It also makes it clearer where something runs:

  • If the hyperparameters aren't supported by cuml, then everything runs on CPU
  • If the arguments provided to a method aren't supported by cuml, then that method will dispatch to CPU
  • Otherwise we run on GPU

Fixes #6545.
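The dispatch rules above can be sketched roughly as follows (hypothetical helper and parameter names; the real logic lives in dispatch_func in cuml):

```python
from enum import Enum


class DeviceType(Enum):
    host = "cpu"
    device = "gpu"


def choose_device(gpuaccel, method_args_supported):
    """Pick where to run a method under the revised dispatch rules.

    gpuaccel: whether the estimator's hyperparameters are supported by cuml.
    method_args_supported: whether the arguments passed to this particular
    method are supported by cuml.
    """
    # Unsupported hyperparameters: everything runs on CPU.
    if not gpuaccel:
        return DeviceType.host
    # Supported hyperparameters but unsupported call arguments:
    # just this method falls back to CPU.
    if not method_args_supported:
        return DeviceType.host
    # Otherwise, run on GPU.
    return DeviceType.device
```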

copy-pr-bot bot commented Apr 18, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@jcrist jcrist marked this pull request as ready for review April 18, 2025 19:06
@jcrist jcrist requested a review from a team as a code owner April 18, 2025 19:06
@jcrist jcrist requested review from teju85 and divyegala April 18, 2025 19:06
@github-actions bot added the Cython / Python label Apr 18, 2025
@jcrist jcrist added the improvement, non-breaking, and cuml-accel labels Apr 18, 2025
@jcrist jcrist self-assigned this Apr 18, 2025
@jcrist jcrist requested a review from csadorf April 21, 2025 14:56
@csadorf (Contributor) commented Apr 22, 2025

I ran the scikit-learn test suite against this using the regression-testing setup I'm currently implementing in #6553. This is what I found:

Test Summary:
  Total Tests:             36096
  Passed:                  30994
  Failed:                      1
  XFailed:                  1279
  XPassed (strict):            9
  XPassed (non-strict):        0
  Errors:                      0
  Skipped:                  3813
  Pass Rate:              85.87%
  Total Time:            267.86s

Failed Tests:
  test_regression_criterion[absolute_error-RandomForestRegressor]

Potential Improvements (Strict XPASS):
  test_knn_imputer_weight_distance[nan]
  test_knn_imputer_weight_distance[-1]
  test_neighbors_regressors_zero_distance
  test_neighbors_metrics[float64-42-mahalanobis]
  test_valid_brute_metric_for_auto_algorithm[float64-csr_matrix-mahalanobis]
  test_kneighbors_brute_backend[float64-42-mahalanobis]
  test_valid_brute_metric_for_auto_algorithm[float64-csr_array-mahalanobis]
  test_ovo_consistent_binary_classification
  test_unsupervised_model_fit[2]

That's overall very positive!

However, it looks like the test_regression_criterion[absolute_error-RandomForestRegressor] regression might be real, because it goes away when I revert the change to the base.pyx module.

Traceback
ensemble/tests/test_forest.py::test_regression_criterion[absolute_error-RandomForestRegressor] FAILED                                                                                   [100%]

========================================================================================== FAILURES ===========================================================================================
_______________________________________________________________ test_regression_criterion[absolute_error-RandomForestRegressor] _______________________________________________________________

name = 'RandomForestRegressor', criterion = 'absolute_error'

    @pytest.mark.parametrize("name", FOREST_REGRESSORS)
    @pytest.mark.parametrize(
        "criterion", ("squared_error", "absolute_error", "friedman_mse")
    )
    def test_regression_criterion(name, criterion):
        # Check consistency on regression dataset.
        ForestRegressor = FOREST_REGRESSORS[name]
    
        reg = ForestRegressor(n_estimators=5, criterion=criterion, random_state=1)
        reg.fit(X_reg, y_reg)
>       score = reg.score(X_reg, y_reg)

../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/sklearn/ensemble/tests/test_forest.py:173: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/cuml/internals/api_decorators.py:219: in wrapper
    return func(*args, **kwargs)
../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/nvtx/nvtx.py:122: in inner
    result = func(*args, **kwargs)
randomforestregressor.pyx:691: in cuml.ensemble.randomforestregressor.RandomForestRegressor.score
    ???
../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/cuml/internals/api_decorators.py:217: in wrapper
    ret = func(*args, **kwargs)
../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/nvtx/nvtx.py:122: in inner
    result = func(*args, **kwargs)
../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/cuml/internals/api_decorators.py:369: in dispatch
    return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
../../../miniforge3/envs/cuml-work0/lib/python3.12/site-packages/cuml/internals/api_decorators.py:219: in wrapper
    return func(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   TypeError: ForestRegressor.predict() got an unexpected keyword argument 'algo'

base.pyx:757: TypeError
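The failure above comes from a GPU-only keyword argument (`algo`) being forwarded to sklearn's CPU `predict`, which doesn't accept it. One way to guard against this kind of mismatch is to filter keyword arguments against the target's signature before dispatching; the helper below is a hypothetical sketch, not the actual fix in this PR:

```python
import inspect


def call_cpu(func, *args, **kwargs):
    """Call a CPU implementation, dropping keyword arguments it doesn't accept.

    Assumes `func` does not itself take **kwargs; GPU-only options such as
    `algo` are silently discarded instead of raising TypeError.
    """
    accepted = inspect.signature(func).parameters
    filtered = {k: v for k, v in kwargs.items() if k in accepted}
    return func(*args, **filtered)
```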

@csadorf (Contributor) left a comment:

Great! We will need to address one regression.

# if using accelerator and doing inference, always use GPU
elif func_name not in ['fit', 'fit_transform', 'fit_predict']:
    device_type = DeviceType.device

Contributor commented on the quoted code:
This particular change appears to introduce a regression, see #6554 (comment) .

@dantegd can you comment on the initial motivation for this?

Member commented:

Technically speaking we can run inference on GPU in many cases even if training was done on CPU, but I agree with the change done in this PR after analyzing the behavior that @jcrist mentions. The CPU to GPU transfer eats a lot of the time that the inference acceleration gains.

jcrist added 2 commits April 22, 2025 14:34
@jcrist jcrist force-pushed the fix-kneighbors-weights branch from 5a8fd83 to 5d5c2c8 Compare April 22, 2025 21:54
@jcrist (Member, Author) commented Apr 22, 2025

Regression should be fixed.

@jcrist jcrist requested a review from csadorf April 22, 2025 21:58
@jcrist jcrist dismissed csadorf’s stale review April 24, 2025 16:12

The regression has been resolved.

@jcrist (Member, Author) commented Apr 24, 2025

/merge

@rapids-bot rapids-bot bot merged commit f8496e3 into rapidsai:branch-25.06 Apr 24, 2025
72 of 73 checks passed
@jcrist jcrist deleted the fix-kneighbors-weights branch April 24, 2025 16:13
Labels
cuml-accel (Issues related to cuml.accel), Cython / Python (Cython or Python issue), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)
Development

Successfully merging this pull request may close these issues.

[BUG] KNeighborsClassifier fails with weights="distance" in cuml.accel
3 participants