Description
Currently, cuml.accel does not support scikit-learn's set_output() API, which was introduced in scikit-learn 1.2.0 to control the output format of transformers. This feature lets users specify whether transform() and fit_transform() return NumPy arrays (the default), pandas DataFrames, or other supported formats.
Expected Behavior
All cuml.accel estimators that implement transform() or predict() methods should support the set_output() API. This includes:
- PCA
- TruncatedSVD
- KNeighborsClassifier
- KNeighborsRegressor
- NearestNeighbors
- And other relevant estimators
Dependencies
- scikit-learn >= 1.2.0 (for set_output API)
- pandas (for DataFrame output support)
Related Issues
- Document that cuml.accel does not support the sklearn set_output() API among known limitations #6605
Acceptance Criteria
- All relevant estimators support set_output()
- Tests pass for both numpy and pandas output formats
- Documentation is updated to reflect the new functionality
- No regression in existing functionality