Potential solution to RuntimeError: No such operator fbgemm::jagged_2d_to_dense #3168

Open
@Jary-lrj

Description

My environment BEFORE (raises the error):

  • in conda virtual env
  • NVIDIA Tesla V100
  • python: 3.10
  • torch 2.1.0 + cu118
  • cudnn 8.7.0
  • fbgemm_gpu 0.7.0

My environment AFTER (error resolved):

  • in conda virtual env
  • NVIDIA Tesla V100
  • python: 3.10 -> 3.12
  • torch 2.1.0 + cu118 -> torch 2.3.0 + cu118 (the key change)
  • cudnn 8.7.0
  • fbgemm_gpu 0.7.0

PS: With fbgemm_gpu 0.8.0 I get a different error: AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'merge_pooled_embeddings'. I don't know why the newer version raises it.
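To see which of the two operators mentioned above actually resolve in a given environment, a small probe like the following may help. It is only a sketch: `probe_fbgemm_ops` and `OPS_TO_CHECK` are names made up for this example, and it assumes that importing `fbgemm_gpu` is what registers the operators under `torch.ops.fbgemm`. It catches both `AttributeError` and `RuntimeError`, since both appear in the errors reported here.

```python
import importlib
import importlib.util

# Operator names taken from the two error messages in this issue.
OPS_TO_CHECK = ("jagged_2d_to_dense", "merge_pooled_embeddings")


def probe_fbgemm_ops(op_names=OPS_TO_CHECK):
    """Return {op_name: bool}, or {"error": message} if the setup itself fails."""
    for package in ("torch", "fbgemm_gpu"):
        if importlib.util.find_spec(package) is None:
            return {"error": f"{package} is not installed"}
    try:
        torch = importlib.import_module("torch")
        # Assumption: this import registers the fbgemm operators with torch.
        importlib.import_module("fbgemm_gpu")
    except Exception as exc:  # e.g. a shared library that fails to load
        return {"error": f"import failed: {exc!r}"}

    results = {}
    for name in op_names:
        try:
            getattr(torch.ops.fbgemm, name)
            results[name] = True
        except (AttributeError, RuntimeError):
            results[name] = False
    return results


if __name__ == "__main__":
    for key, value in probe_fbgemm_ops().items():
        print(f"{key}: {value}")
```

Running it in both the BEFORE and AFTER environments should show whether the operator lookup itself is what changes between them.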

Hint: If you hit similar errors, check your configuration in the following order:
(1) GPU device: My NVIDIA RTX 4090 doesn't work with the same configuration as the AFTER environment. It seems only V- and A-series devices (such as the V100 above) work.
(2) PyTorch and CUDA: If possible, try running fbgemm in a conda virtual environment instead of Docker or bare Linux. CUDA 11.8 or 12.1 is recommended, and use torch 2.3.0 or later, NOT 2.1.0. As for libnvidia_ml.so and libtorch.so, they are installed whether you install torch with pip or with conda.
(3) fbgemm_gpu version: Try 0.7.0 rather than 0.8.0.
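The version checks in (2) and (3) can be sketched as a small script. This is not an official compatibility table: the thresholds only encode what worked on the one machine described in this issue, and `parse_version`, `diagnose`, and `installed_versions` are hypothetical helper names. The package lookup assumes the distributions are named `torch` and `fbgemm_gpu`; nightly or CPU-only builds may be published under other names.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.3.0+cu118' into (2, 3, 0); the local tag after '+' is dropped."""
    core = text.split("+", 1)[0]
    numbers = []
    for piece in core.split(".")[:3]:
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:
            break
        numbers.append(int(leading))
    return tuple(numbers)


def diagnose(torch_version, fbgemm_version):
    """Return notes based only on the observations reported in this issue."""
    notes = []
    if parse_version(torch_version) < (2, 3):
        notes.append("torch is older than 2.3.0; 2.1.0 hit the missing-operator error here")
    if parse_version(fbgemm_version) >= (0, 8):
        notes.append("fbgemm_gpu is 0.8.0 or newer; 0.8.0 hit the merge_pooled_embeddings error here")
    return notes


def installed_versions():
    """Look up installed versions; a package that is not found maps to None."""
    found = {}
    for package in ("torch", "fbgemm_gpu"):
        try:
            found[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            found[package] = None
    return found


if __name__ == "__main__":
    versions = installed_versions()
    print(versions)
    if all(versions.values()):
        for note in diagnose(versions["torch"], versions["fbgemm_gpu"]):
            print("note:", note)
```

The GPU check in (1) is left out on purpose, because the issue only reports which cards worked and not a rule for deciding it.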
