-
Notifications
You must be signed in to change notification settings - Fork 568
cuml-cpu: fix import issues, enable conda import tests #6400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
@@ -13,6 +13,7 @@ | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
# | |||
from __future__ import annotations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, using from __future__ import annotations
like this seems to not to be enough to avoid these errors for return type hints.
Traceback (most recent call last):
File "lib/python3.10/site-packages/cuml/internals/base_return_types.py", line 72, in _get_base_return_type
type_hints = typing.get_type_hints(attr)
File "lib/python3.10/typing.py", line 1871, in get_type_hints
value = _eval_type(value, globalns, localns)
File "lib/python3.10/typing.py", line 327, in _eval_type
return t._evaluate(globalns, localns, recursive_guard)
File "lib/python3.10/typing.py", line 694, in _evaluate
eval(self.__forward_code__, globalns, localns),
File "<string>", line 1, in <module>
File "lib/python3.10/site-packages/cuml/internals/safe_imports.py", line 86, in __getattr__
raise UnavailableError(cls._msg)
cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations
During handling of the above exception, another exception occurred:
...
Traceback (most recent call last):
File "/opt/conda/conda-bld/test_tmp/run_test.py", line 2, in <module>
import cuml
File "lib/python3.10/site-packages/cuml/__init__.py", line 116, in <module>
from cuml.linear_model.linear_regression import LinearRegression
File "lib/python3.10/site-packages/cuml/linear_model/__init__.py", line 21, in <module>
from cuml.linear_model.logistic_regression import LogisticRegression
File "logistic_regression.pyx", line 27, in init cuml.linear_model.logistic_regression
File "lib/python3.10/site-packages/cuml/preprocessing/__init__.py", line 17, in <module>
from cuml.preprocessing.LabelEncoder import LabelEncoder
File "lib/python3.10/site-packages/cuml/preprocessing/LabelEncoder.py", line 56, in <module>
class LabelEncoder(Base):
File "lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 118, in __new__
classDict[attributeName] = _wrap_attribute(
File "lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 40, in _wrap_attribute
return_type = _get_base_return_type(class_name, attribute)
File "lib/python3.10/site-packages/cuml/internals/base_return_types.py", line 100, in _get_base_return_type
assert False, "Shouldn't get here"
AssertionError: Shouldn't get here
This is being reached:
cuml/python/cuml/cuml/internals/base_return_types.py
Lines 99 to 101 in c6a03b7
except Exception: | |
assert False, "Shouldn't get here" | |
return None |
I think from processing this return type hint:
def transform(self, y) -> cudf.Series: |
Which triggers this:
type_hints = typing.get_type_hints(attr) |
Which I think leads to it looking like the cudf
import is getting "used", leading to:
cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations
That return type of -> cudf.Series
has been there for 6+ years, since it was added back in cfa7f8d ... so I'm not sure what's changed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem instead using string-literal type hints to defer evaluation like this:
def inverse_transform(self, y: "cudf.Series") -> "cudf.Series":
I think what we have here is that something like this is being run:
import typing
typing.get_type_hints(cuml.preprocessing.LabelEncoder.LabelEncoder, "transform")
And that's triggering an attribute access in GPU-only import cudf
:
cuml/python/cuml/cuml/internals/safe_imports.py
Lines 85 to 86 in 29871eb
def __getattr__(cls, name): | |
raise UnavailableError(cls._msg) |
@@ -27,7 +27,6 @@ | |||
from dask_cudf import Series as dcSeries | |||
from dask.dataframe import Series as daskSeries | |||
from dask.dataframe import DataFrame as daskDataFrame | |||
from cudf import Series |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudf
is not a run:
dependency of cuml-cpu
, so it can't be imported unconditionally here. This change and the one below uses the gpu_only_import()
wrapped version instead.
cudf = gpu_only_import("cudf") |
return None | ||
raise AssertionError( | ||
f"Failed to determine return type for {attr} (class = '${class_name}'). This is a bug in cuML, please report it." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hit this during testing, whenever the return type hint on a function was from a module wrapped in gpu_only_import()
.
Proposing here to make the error message more informative. I think RuntimeError
is a more appropriate choice here, but I chose AssertionError
to preserve the existing behavior that this raises an AssertionError
.
@@ -97,7 +97,8 @@ def _get_base_return_type(class_name, attr): | |||
if attr.__annotations__["return"].replace("'", "") == class_name: | |||
return "base" | |||
except Exception: | |||
assert False, "Shouldn't get here" | |||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that this return None
was unreachable because of the assert False
before it.
@@ -258,7 +257,7 @@ def fit_transform(self, y, z=None) -> cudf.Series: | |||
|
|||
return y.cat.codes | |||
|
|||
def inverse_transform(self, y: cudf.Series) -> cudf.Series: | |||
def inverse_transform(self, y: "cudf.Series"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using strings or from __future__ import annotations
for is enough to avoid an import-time attribute access for argument type hints... but not for return types. See #6400 (comment)
This PR is further evidence why we need to revise and simplify the type reflection system and to not use type hints at runtime to inform behavior. This effort is captured in #5022 . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work tracking down all these nitty failure points, @jameslamb !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! How can we make sure that we are catching those issues automatically in the future? Are the conda-python-build tests sufficient?
They're a big improvement, certainly. The I think that's out of scope for this PR but would be a good followup if cuml wants to continue supporting CPU-only workloads. |
Yes, we are pursuing a decision on this. |
/merge |
Fixes #6403
This project publishes a conda package,
cuml-cpu
, which does what it sounds like... allows the use of cuML on systems without a GPU.This proposes some updates to packaging for
cuml-cpu
:Notes for Reviewers
Why all these changes in Python code?
See some of the challenges I faced documented in #6400 (comment).
In short,
import cuml
when it was installed viacuml-cpu
will break at import time whenever modules imported withcuml.internals.safe_imports.gpu_only_import()
are used in any of the following ways:Like this:
How long has this been broken? What's the root cause?
It seems like something changed within 25.04... earlier versions of cuML are not affected by these issues: #6403 (comment)
I don't know what the root cause is. Maybe some changes to
cuml
's top-level imports in 25.04 is now pulling in the modules with these problems at runtime, when previously it wasn't? I'm really not sure.Benefits of these Changes
This adds a bit of test coverage in CI, minimally verifying that
cuml-cpu
is installable and thatimport cuml
works in an environment without a GPU.Inspired by:
cuvs
: cuvs-bench-cpu: avoid 'mkl' dependency on aarch64 cuvs#750How I tested this
Saw stuff like this in
conda-python-build
jobs, confirming that the import tests were running and passing: