ninja -v command fails, so transformer_inference.so is never built #12

Open
@Debouter

Description

Hi,
I got the following error when running demo.py:

Traceback (most recent call last):
  File "/mnt/petrelfs/klk/anaconda3/envs/ds/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/mnt/petrelfs/klk/anaconda3/envs/ds/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
    ......
ImportError: /mnt/petrelfs/klk/.cache/torch_extensions/py310_cu118/transformer_inference/transformer_inference.so: cannot open shared object file: No such file or directory

My initial guess is that the ninja -v command fails, so the shared object file transformer_inference.so is never generated.

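I have already tried the various fixes suggested online for Command '['ninja', '-v']' returned non-zero exit status 1, such as installing or disabling the ninja package and downgrading PyTorch, but none of them resolved the problem.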
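The CalledProcessError above only reports the exit status, not the compiler message that caused it. A minimal sketch for surfacing that message, assuming the build directory from the traceback still contains the build.ninja file written by PyTorch's JIT extension builder: it checks which artifacts exist and re-runs ninja -v there with its output captured. The helper names inspect_extension_dir and rerun_ninja are hypothetical, and the py310_cu118 folder name is taken from the path in the traceback.

```python
import os
import shutil
import subprocess
from pathlib import Path


def inspect_extension_dir(ext_dir):
    """Report which build artifacts exist in one extension's build directory."""
    ext_dir = Path(ext_dir)
    return {
        "dir_exists": ext_dir.is_dir(),
        "has_build_ninja": (ext_dir / "build.ninja").is_file(),
        # Assumption: the .so is named after its directory, as in the traceback.
        "has_shared_object": (ext_dir / f"{ext_dir.name}.so").is_file(),
    }


def rerun_ninja(ext_dir):
    """Re-run `ninja -v` in ext_dir; return (exit_code, combined stdout/stderr)."""
    if shutil.which("ninja") is None:
        return None, "ninja executable not found on PATH"
    proc = subprocess.run(
        ["ninja", "-v"],
        cwd=ext_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so compiler errors are kept
        text=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Assumption: default cache location; TORCH_EXTENSIONS_DIR overrides it.
    default_root = Path.home() / ".cache" / "torch_extensions" / "py310_cu118"
    root = Path(os.environ.get("TORCH_EXTENSIONS_DIR", default_root))
    ext_dir = root / "transformer_inference"
    state = inspect_extension_dir(ext_dir)
    print(state)
    # Only rebuild when a build script exists but the library does not.
    if state["has_build_ninja"] and not state["has_shared_object"]:
        code, output = rerun_ninja(ext_dir)
        print(f"ninja exit code: {code}")
        print(output)
```

If ninja exits with a non-zero status, the captured output should contain the failing nvcc or g++ command and its error text, which is more useful to post here than the Python traceback alone.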

My environment is as follows:

  • python==3.10.12
  • torch/cuda/deepspeed versions all match your environment
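Since "match your environment" does not pin down exact numbers, a short script can print them for comparison. This is a sketch using only the standard library. It assumes the pip distribution names torch, deepspeed, and ninja, and that nvcc and g++ are the compilers the build would pick up from PATH. The function name collect_build_environment is hypothetical.

```python
import platform
import shutil
from importlib import metadata


def collect_build_environment(packages=("torch", "deepspeed", "ninja")):
    """Collect interpreter, package, and build-tool details relevant to the build."""
    report = {"python": platform.python_version()}
    for pkg in packages:
        try:
            report[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = None  # package is not installed in this environment
    # Paths of the executables the extension build is assumed to rely on.
    for tool in ("ninja", "nvcc", "g++"):
        report[f"{tool}_path"] = shutil.which(tool)
    return report


if __name__ == "__main__":
    for key, value in collect_build_environment().items():
        print(f"{key}: {value}")
```

A value of None for ninja_path or nvcc_path would mean the tool is not on PATH in the shell that runs demo.py.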

Have you run into this problem? If not, could you share your transformer_inference.so file? It should be located at roughly <user_path>/.cache/torch_extensions/pyXX_cuXX/transformer_inference.

Thanks!
