
PermissionError and Segmentation Fault with torch.compile (Inductor) during Model Warm-up when --compile is set #932

@MaxRubby

Description

Self Checks

  • This template is only for bug reports. For questions, please visit Discussions.
  • I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

  • OS: WSL Ubuntu 22.04
  • Python Version: 3.10.16
  • PyTorch Version: 2.4.1+cu121
  • CUDA Version:
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2021 NVIDIA Corporation
    Built on Thu_Nov_18_09:45:30_PST_2021
    Cuda compilation tools, release 11.5, V11.5.119
    Build cuda_11.5.r11.5/compiler.30672275_0
  • GPU: NVIDIA GeForce RTX 3060
  • Fish-Speech Version: 1.5
  • torch version: 2.4.1
  • torchvision version: 0.19.1+cu121
  • torchaudio version: 0.19.1+cu121
  • torchtext version: 2.4.1+cu121

Steps to Reproduce

I have started the server successfully with --compile before and it worked fine, but I don't know why it doesn't work this time:

  1. Set up the fish-speech environment (including dependencies).

  2. Run the API server with the --compile flag:

    python -m tools.api_server \
        --listen 0.0.0.0:7865 \
        --llama-checkpoint-path checkpoints/fish-speech-1.5 \
        --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
        --decoder-config-name firefly_gan_vq \
        --compile

✔️ Expected Behavior

The application starts up successfully.

❌ Actual Behavior

I'm encountering a PermissionError followed by a Segmentation Fault when running a TTS model (fish-speech) with torch.compile using the Inductor backend. The error occurs during the model warm-up phase (model_manager.warm_up). The issue seems related to file-access permissions in the temporary directory used by TorchInductor, even after attempting to adjust permissions and to use shutil.move as a workaround, as referenced in unslothai/unsloth#1999.
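
For reference, the workaround attempted was roughly along these lines (a minimal sketch, not the exact code; TORCHINDUCTOR_CACHE_DIR is PyTorch's environment variable for relocating the Inductor cache, and /tmp/torchinductor_koma is the path that appears in the log below):

    import os
    import shutil

    # Remove the stale Inductor cache so it is recreated with the right ownership,
    # and/or point Inductor at a fresh, user-writable location instead.
    stale_cache = "/tmp/torchinductor_koma"  # path taken from the error log below
    if os.path.isdir(stale_cache):
        shutil.rmtree(stale_cache, ignore_errors=True)

    # Must be set before the first torch.compile call in the process.
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = os.path.expanduser("~/.cache/torchinductor")

Relocating the cache avoids the default /tmp/torchinductor_<user> directory entirely, which is where the failing rename in the log below is writing.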

Error Log:

Run as a regular user:

(tts) koma@LAPTOP-UFED71OD:~/fish-speech$ python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO:     Started server process [1330]
INFO:     Waiting for application startup.
2025-03-23 11:28:20.578 | INFO     | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-03-23 11:28:20.578 | INFO     | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-03-23 11:28:20.578 | INFO     | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-03-23 11:28:22.062 | INFO     | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
2025-03-23 11:28:27.765 | INFO     | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-03-23 11:28:27.766 | INFO     | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-03-23 11:28:27.795 | INFO     | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-03-23 11:28:27.797 | INFO     | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
  0%|                                                                                  | 0/1023 [00:00<?, ?it/s]/home/koma/.conda/envs/tts/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
  self.gen = func(*args, **kwds)
W0323 11:30:52.749000 140116945266240 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttir.tmp.pid_1465_a4aea7d5-c578-456b-a4e9-f2c3298911a3 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttgir.tmp.pid_1465_15dc99d5-a380-438c-b1da-c05e1e98490f -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/lo/.1330.140116945266240.tmp -> /tmp/torchinductor_koma/lo/clogj3r7bsakyus6wz3yqefmjlhto65qemf2qfe4ns7mp52pxd6n.py
  0%|                                                                                  | 0/1023 [02:41<?, ?it/s]
ERROR:    Traceback (most recent call last):
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
    await result
  File "/home/koma/fish-speech/tools/api_server.py", line 82, in initialize_app
    app.state.model_manager = ModelManager(
  File "/home/koma/fish-speech/tools/server/model_manager.py", line 65, in __init__
    self.warm_up(self.tts_inference_engine)
  File "/home/koma/fish-speech/tools/server/model_manager.py", line 121, in warm_up
    list(inference(request, tts_inference_engine))
  File "/home/koma/fish-speech/tools/server/inference.py", line 25, in inference_wrapper
    raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [Errno 13] Permission denied: \\\'/tmp/torchinductor_koma/lo/.1330.140116945266240.tmp\\\' -> \\\'/tmp/torchinductor_koma/lo/clogj3r7bsakyus6wz3yqefmjlhto65qemf2qfe4ns7mp52pxd6n.py\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\\n\\nYou can suppress this exception and fall back to eager by setting:\\n    import torch._dynamo\\n    torch._dynamo.config.suppress_errors = True\\n\'')

Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.llir.tmp.pid_1465_7cc0f06c-2609-4d20-a45e-ae625a161c39 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.llir
ERROR:    Application startup failed. Exiting.
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ptx.tmp.pid_1465_aaab7612-5f9d-4669-bf5e-1f685a9eb20a -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.cubin.tmp.pid_1465_7e66aa04-7c5f-4624-a304-84f1c6297651 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json.tmp.pid_1465_c426a30f-3d7c-4382-ae50-3281c8bfc682 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/__grp__triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json.tmp.pid_1465_3c4ed8ff-5f40-46aa-bad2-06ea8a68ff97 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/__grp__triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir.tmp.pid_1465_8b7a4a8b-f4fe-4a5d-b211-dcb7b802a116 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir.tmp.pid_1465_09c6789f-443a-4516-8364-6ed1dca9a77e -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir.tmp.pid_1463_63dde67e-2ce1-4689-9dd6-201faa09fbbd -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir.tmp.pid_1463_937aec16-747f-4c7d-ad14-03e631c2c8fb -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir.tmp.pid_1465_c3a159f6-b382-4c1a-a151-ad6ec86f6930 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx.tmp.pid_1465_b79ce8db-924e-43b3-8d93-efc4e1b15264 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir.tmp.pid_1463_ab90bde9-214a-4ff5-a32a-3d6a7ef50ced -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx.tmp.pid_1463_7a0e89a4-53ff-4fd9-835b-1a8e88d92ee6 -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin.tmp.pid_1463_2b7ccc2a-537c-4bc7-9c26-1428a6db4caa -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_f222d114-66c1-4eca-a03a-0bb33e9bd35d -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_739c28d6-8df0-4417-b096-a23454ab5497 -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin.tmp.pid_1465_65ecf108-ed86-4aca-83ff-afb63f98f017 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_87b5dcad-2b6f-4cf1-93ef-7e3748fde4cc -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_82aa0069-a027-4475-be22-dcd0977f4a30 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir.tmp.pid_1463_7efde184-dd6f-445f-bbbe-be559b7db26f -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir.tmp.pid_1463_e4bcbad9-34eb-4fe8-a5ef-990efcb22576 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir.tmp.pid_1465_0fac9c6a-a91a-4b3c-ba58-42e239e93e9f -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir.tmp.pid_1465_bc0f6074-6542-4b19-a7e6-0d09e123720f -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir.tmp.pid_1463_1c8ec382-65e8-4066-b82c-ccdfec37d317 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx.tmp.pid_1463_a6b78858-ec93-4366-b195-adadf53f3575 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir.tmp.pid_1465_6ca938da-2740-4e83-9180-7f9e2199d222 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx.tmp.pid_1465_abae5c09-5d18-4d8b-ad58-52b30fc1b2b9 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin.tmp.pid_1463_058654c1-bc84-4853-84e0-46b42ac6931c -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_eeb3c09f-052e-422e-ad6e-981dc43232d4 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_10e1f3f6-e8af-4ce6-b46a-68061ee4355c -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin.tmp.pid_1465_9e493fbf-9a1f-4b56-b7b6-261a2b6bead1 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_49237752-09ac-450d-b429-970e476f09af -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_75cf45bf-0969-43d4-b9f1-c2107e319764 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
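
As the exception text above suggests, the server can at least be started while this is debugged by suppressing compile errors and falling back to eager mode (this sidesteps the permission problem rather than fixing it):

    # From the error message above: suppress Inductor errors and fall back to eager.
    # Placed before the model is compiled, e.g. near the top of tools/api_server.py.
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True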

Run with sudo:

(tts) koma@DESKTOP-O5IAPM2:~/fish-speech$ sudo /home/koma/.conda/envs/tts/bin/python  -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO:     Started server process [8528]
INFO:     Waiting for application startup.
2025-03-23 10:45:56.124 | INFO     | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-03-23 10:45:56.124 | INFO     | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-03-23 10:45:56.124 | INFO     | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-03-23 10:45:56.842 | INFO     | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
Traceback (most recent call last):
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_inductor/compile_worker/__main__.py", line 7, in <module>
    from torch._inductor.async_compile import pre_fork_setup
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/__init__.py", line 2263, in <module>
    _logging._init_logs()
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 884, in _init_logs
    _update_log_state_from_env()
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 716, in _update_log_state_from_env
    log_state = _parse_log_settings(log_setting)
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 660, in _parse_log_settings
    raise ValueError(_invalid_settings_err_msg(settings))
ValueError:
Invalid log settings: torch._dynamo=DEBUG, must be a comma separated list of fully
qualified module names, registered log names or registered artifact names.
For more info on various settings, try TORCH_LOGS="help"
Valid settings:
all, dynamo, aot, autograd, inductor, dynamic, torch, distributed, c10d, ddp, pp, fsdp, onnx, export, aot_graphs, graph_sizes, bytecode, graph_code, not_implemented, custom_format_test_artifact, graph_breaks, cudagraphs, kernel_code, fusion, recompiles, output_code, onnx_diagnostics, recompiles_verbose, trace_bytecode, compiled_autograd, schedule, trace_source, overlap, perf_hints, trace_call, sym_node, ddp_graphs, verbose_guards, graph, compiled_autograd_verbose, guards, aot_joint_graph, post_grad_graphs

/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
2025-03-23 10:45:57.498 | INFO     | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-03-23 10:45:57.499 | INFO     | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-03-23 10:45:57.511 | INFO     | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-03-23 10:45:57.511 | INFO     | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
  0%|                                                                                                                                                              | 0/1023 [00:00<?, ?it/s]/home/koma/.conda/envs/tts/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
  self.gen = func(*args, **kwds)
W0323 10:46:39.124000 139948787234368 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Segmentation fault (core dumped) /tmp/torchinductor_root/q7/.8528.139948787234368.tmp -> /tmp/torchinductor_root/q7/cq7aqs2ot34rpqjm36euezlogdt6eptsfb2ihhipmgx4f3prrecf.py
  0%|                                                                                                                                                              | 0/1023 [00:47<?, ?it/s]
ERROR:    Traceback (most recent call last):
  File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
    await result
  File "/home/koma/fish-speech/tools/api_server.py", line 100, in initialize_app
    app.state.model_manager = ModelManager(
  File "/home/koma/fish-speech/tools/server/model_manager.py", line 65, in __init__
    self.warm_up(self.tts_inference_engine)
  File "/home/koma/fish-speech/tools/server/model_manager.py", line 121, in warm_up
    list(inference(request, tts_inference_engine))
  File "/home/koma/fish-speech/tools/server/inference.py", line 36, in inference_wrapper
    raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [Errno 13] Permission denied: \\\'/tmp/torchinductor_root/q7/.8528.139948787234368.tmp\\\' -> \\\'/tmp/torchinductor_root/q7/cq7aqs2ot34rpqjm36euezlogdt6eptsfb2ihhipmgx4f3prrecf.py\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\'')

ERROR:    Application startup failed. Exiting.
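
Note that the sudo run also hit a ValueError in the Inductor compile worker, apparently from a TORCH_LOGS value of torch._dynamo=DEBUG left in the environment. The extra diagnostics recommended by the first traceback would instead be enabled like this (a sketch; the variables must be in the environment before torch is imported, or exported in the shell before launching the server):

    import os

    # Settings recommended by the PermissionError traceback above.
    os.environ["TORCH_LOGS"] = "+dynamo"
    os.environ["TORCHDYNAMO_VERBOSE"] = "1"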
