Skip to content

cuml-cpu: fix import issues, enable conda import tests #6400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Mar 19, 2025

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Mar 5, 2025

Fixes #6403

This project publishes a conda package, cuml-cpu, which does what it sounds like... allows the use of cuML on systems without a GPU.

This proposes some updates to packaging for cuml-cpu:

Notes for Reviewers

Why all these changes in Python code?

See some of the challenges I faced documented in #6400 (comment).

In short, import cuml when it was installed via cuml-cpu will break at import time whenever modules imported with cuml.internals.safe_imports.gpu_only_import() are used in any of the following ways:

  • type hints
  • decorators
  • any other module-level direct use

Like this:

cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations

How long has this been broken? What's the root cause?

It seems like something changed within 25.04... earlier versions of cuML are not affected by these issues: #6403 (comment)

I don't know what the root cause is. Maybe some changes to cuml's top-level imports in 25.04 is now pulling in the modules with these problems at runtime, when previously it wasn't? I'm really not sure.

Benefits of these Changes

This adds a bit of test coverage in CI, minimally verifying that cuml-cpu is installable and that import cuml works in an environment without a GPU.

Inspired by:

How I tested this

Saw stuff like this in conda-python-build jobs, confirming that the import tests were running and passing:

BUILD START: ['cuml-cpu-25.04.00a137-py310_250312_g153b21870_137.conda']
...
import: 'cuml'
...
Resource usage statistics from testing cuml-cpu:
...
   Time elapsed: 0:00:10.0
...
TEST END: /tmp/conda-bld-output/linux-64/cuml-cpu-25.04.00a137-py310_250312_g153b21870_137.conda

@jameslamb jameslamb added 2 - In Progress Currenty a work in progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 5, 2025

This comment was marked as resolved.

@github-actions github-actions bot added conda conda issue ci labels Mar 5, 2025
@jameslamb jameslamb changed the title WIP: cuml-cpu: enable conda import tests, update dependencies to match cuml WIP: cuml-cpu: enable conda import tests, update scikit-learn and hdbscan to match cuml Mar 6, 2025
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Mar 7, 2025
@jameslamb jameslamb self-assigned this Mar 10, 2025
@@ -13,6 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
from __future__ import annotations
Copy link
Member Author

@jameslamb jameslamb Mar 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, using from __future__ import annotations like this seems to not to be enough to avoid these errors for return type hints.

Traceback (most recent call last):
  File "lib/python3.10/site-packages/cuml/internals/base_return_types.py", line 72, in _get_base_return_type
    type_hints = typing.get_type_hints(attr)
  File "lib/python3.10/typing.py", line 1871, in get_type_hints
    value = _eval_type(value, globalns, localns)
  File "lib/python3.10/typing.py", line 327, in _eval_type
    return t._evaluate(globalns, localns, recursive_guard)
  File "lib/python3.10/typing.py", line 694, in _evaluate
    eval(self.__forward_code__, globalns, localns),
  File "<string>", line 1, in <module>
  File "lib/python3.10/site-packages/cuml/internals/safe_imports.py", line 86, in __getattr__
    raise UnavailableError(cls._msg)
cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations

During handling of the above exception, another exception occurred:
...
Traceback (most recent call last):
  File "/opt/conda/conda-bld/test_tmp/run_test.py", line 2, in <module>
    import cuml
  File "lib/python3.10/site-packages/cuml/__init__.py", line 116, in <module>
    from cuml.linear_model.linear_regression import LinearRegression
  File "lib/python3.10/site-packages/cuml/linear_model/__init__.py", line 21, in <module>
    from cuml.linear_model.logistic_regression import LogisticRegression
  File "logistic_regression.pyx", line 27, in init cuml.linear_model.logistic_regression
  File "lib/python3.10/site-packages/cuml/preprocessing/__init__.py", line 17, in <module>
    from cuml.preprocessing.LabelEncoder import LabelEncoder
  File "lib/python3.10/site-packages/cuml/preprocessing/LabelEncoder.py", line 56, in <module>
    class LabelEncoder(Base):
  File "lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 118, in __new__
    classDict[attributeName] = _wrap_attribute(
  File "lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 40, in _wrap_attribute
    return_type = _get_base_return_type(class_name, attribute)
  File "lib/python3.10/site-packages/cuml/internals/base_return_types.py", line 100, in _get_base_return_type
    assert False, "Shouldn't get here"
AssertionError: Shouldn't get here

This is being reached:

except Exception:
assert False, "Shouldn't get here"
return None

I think from processing this return type hint:

def transform(self, y) -> cudf.Series:

Which triggers this:

type_hints = typing.get_type_hints(attr)

Which I think leads to it looking like the cudf import is getting "used", leading to:

cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations

That return type of -> cudf.Series has been there for 6+ years, since it was added back in cfa7f8d ... so I'm not sure what's changed here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem instead using string-literal type hints to defer evaluation like this:

def inverse_transform(self, y: "cudf.Series") -> "cudf.Series":

I think what we have here is that something like this is being run:

import typing

typing.get_type_hints(cuml.preprocessing.LabelEncoder.LabelEncoder, "transform")

And that's triggering an attribute access in GPU-only import cudf:

def __getattr__(cls, name):
raise UnavailableError(cls._msg)

@jameslamb jameslamb changed the title WIP: cuml-cpu: enable conda import tests, update scikit-learn and hdbscan to match cuml WIP: cuml-cpu: fix import issues, enable conda import tests, update scikit-learn and hdbscan to match cuml Mar 10, 2025
@@ -27,7 +27,6 @@
from dask_cudf import Series as dcSeries
from dask.dataframe import Series as daskSeries
from dask.dataframe import DataFrame as daskDataFrame
from cudf import Series
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cudf is not a run: dependency of cuml-cpu, so it can't be imported unconditionally here. This change and the one below uses the gpu_only_import() wrapped version instead.

cudf = gpu_only_import("cudf")

return None
raise AssertionError(
f"Failed to determine return type for {attr} (class = '${class_name}'). This is a bug in cuML, please report it."
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hit this during testing, whenever the return type hint on a function was from a module wrapped in gpu_only_import().

Proposing here to make the error message more informative. I think RuntimeError is a more appropriate choice here, but I chose AssertionError to preserve the existing behavior that this raises an AssertionError.

@@ -97,7 +97,8 @@ def _get_base_return_type(class_name, attr):
if attr.__annotations__["return"].replace("'", "") == class_name:
return "base"
except Exception:
assert False, "Shouldn't get here"
return None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that this return None was unreachable because of the assert False before it.

@@ -258,7 +257,7 @@ def fit_transform(self, y, z=None) -> cudf.Series:

return y.cat.codes

def inverse_transform(self, y: cudf.Series) -> cudf.Series:
def inverse_transform(self, y: "cudf.Series"):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using strings or from __future__ import annotations for is enough to avoid an import-time attribute access for argument type hints... but not for return types. See #6400 (comment)

@jameslamb jameslamb changed the title WIP: cuml-cpu: fix import issues, enable conda import tests, update scikit-learn and hdbscan to match cuml cuml-cpu: fix import issues, enable conda import tests Mar 13, 2025
@jameslamb jameslamb removed the 2 - In Progress Currenty a work in progress label Mar 13, 2025
@jameslamb jameslamb requested review from dantegd, csadorf and jcrist March 13, 2025 19:52
@jameslamb jameslamb marked this pull request as ready for review March 13, 2025 19:52
@jameslamb jameslamb requested review from a team as code owners March 13, 2025 19:52
@jameslamb jameslamb requested a review from gforsyth March 13, 2025 19:52
@csadorf
Copy link
Contributor

csadorf commented Mar 13, 2025

This PR is further evidence why we need to revise and simplify the type reflection system and to not use type hints at runtime to inform behavior. This effort is captured in #5022 .

Copy link
Contributor

@gforsyth gforsyth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work tracking down all these nitty failure points, @jameslamb !

Copy link
Contributor

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! How can we make sure that we are catching those issues automatically in the future? Are the conda-python-build tests sufficient?

@gforsyth
Copy link
Contributor

Thanks a lot! How can we make sure that we are catching those issues automatically in the future? Are the conda-python-build tests sufficient?

They're a big improvement, certainly. The import cuml in the test will trigger all the imports in the top-level cuml/__init__.py -- it might be worth doing an audit of all the modules that should be CPU importable and making sure those imports get tested in the conda build, too.

I think that's out of scope for this PR but would be a good followup if cuml wants to continue supporting CPU-only workloads.

@csadorf
Copy link
Contributor

csadorf commented Mar 17, 2025

I think that's out of scope for this PR but would be a good followup if cuml wants to continue supporting CPU-only workloads.

Yes, we are pursuing a decision on this.

@jameslamb jameslamb requested a review from betatim March 17, 2025 16:15
@csadorf
Copy link
Contributor

csadorf commented Mar 19, 2025

/merge

@rapids-bot rapids-bot bot merged commit 744c848 into rapidsai:branch-25.04 Mar 19, 2025
73 of 75 checks passed
@jameslamb jameslamb deleted the ci/import-tests branch March 19, 2025 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci conda conda issue Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] cuml-cpu 25.04 fails to import without a GPU present
4 participants