ImportError after installing cudf that is calling dask_cudf which is looking for a dask.dataframe dependency and failing to find it #9019

@jacobtomlinson

Dear dask community,

It is 2025 and I am running into an ImportError after installing cudf: it calls dask_cudf, which looks for a dask.dataframe dependency and fails to find it. I installed cudf and dask from the rapidsai and conda-forge channels using conda:
conda install -c rapidsai -c conda-forge -c nvidia cudf cuml 'cuda-version=12.6'
And that installed the following packages:

$ conda list 'dask|cuml|cudf|distributed'
# packages in environment at ~/anaconda3/envs/rnntf2:
#
# Name                    Version                   Build  Channel
cudf                      24.12.00        cuda12_py312_241211_gff41ecf473_0    rapidsai
cuml                      24.12.00        cuda12_py312_241211_ge79cd670a_0    rapidsai
dask                      2024.11.2          pyhff2d567_1    conda-forge
dask-core                 2024.11.2          pyhff2d567_1    conda-forge
dask-cuda                 24.12.00        py312_241211_g3b3b356_0    rapidsai
dask-cudf                 24.12.00        cuda12_py312_241211_gff41ecf473_0    rapidsai
dask-expr                 1.1.19             pyhd8ed1ab_0    conda-forge
distributed               2024.11.2          pyhff2d567_1    conda-forge
distributed-ucxx          0.41.00         py3.12_241211_gd355f9c_0    rapidsai
libcudf                   24.12.00        cuda12_241211_gff41ecf473_0    rapidsai
libcuml                   24.12.00        cuda12_241211_ge79cd670a_0    rapidsai
libcumlprims              24.12.00        cuda12_241211_g8df6c7e_0    rapidsai
pylibcudf                 24.12.00        cuda12_py312_241211_gff41ecf473_0    rapidsai
raft-dask                 24.12.00        cuda12_py312_241211_geaf9cc72_0    rapidsai
rapids-dask-dependency    24.12.00                   py_0    rapidsai
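
One way to narrow down which link in this import chain fails is to try each module separately. The following is only a minimal sketch: the module names (dask.dataframe, dask_cudf, cudf) are the ones mentioned above, while the helper name check_imports is hypothetical and not part of any of these libraries.

```python
import importlib


def check_imports(names):
    """Try to import each module; map its name to None on success or to the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError is the expected case; other errors are reported too
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Ordered from the innermost dependency outward (assumed order, based on the description above).
    for name, error in check_imports(["dask.dataframe", "dask_cudf", "cudf"]).items():
        print(f"{name}: {'ok' if error is None else error}")
```

Run inside the affected conda environment, the first module that reports an error is likely the one whose installation needs attention.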

My idea is to use dask on a Slurm-based HPC system, and I saw in the documentation that you recommend the dask-jobqueue package. I found two similarly named packages:

$ conda search -c conda-forge 'dask*jobqueue'
dask-gateway-server-jobqueue           0.9.0  py38h578d9bd_2  conda-forge         
dask-gateway-server-jobqueue           0.9.0  py39hf3d152e_2  conda-forge         
dask-gateway-server-jobqueue        2022.4.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2022.6.1      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue       2022.10.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2023.1.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2023.1.1      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2023.9.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2024.1.0      ha770c72_0  conda-forge         
dask-jobqueue                  0.8.0    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.8.1    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.8.2    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.8.5    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.9.0    pyhd8ed1ab_0  conda-forge         

What is the difference between them? Which one should I pick?

Thanks!

Originally posted by @ovalerio in #962
