"Negative code found" error for short input texts. #804

Open
@twocode

Description

Self Checks

  • This template is only for bug reports. For questions, please visit Discussions.
  • I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
  • Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Docker)

Environment Details

Tesla T4

Steps to Reproduce

1. Start the API server (the Docker CMD, shown here as the equivalent shell invocation):

    python -m tools.api \
        --listen 0.0.0.0:8080 \
        --llama-checkpoint-path checkpoints/fish-speech-1.4 \
        --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
        --decoder-config-name firefly_gan_vq \
        --compile \
        --half

2. POST short input texts (e.g. "hi.", "he.") to /v1/tts.
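For reference, the failing requests can be reproduced with a plain POST to the /v1/tts endpoint shown in the logs. This is a minimal sketch; the field names in the payload ("text", "format") are assumptions about the request schema for this fish-speech version, not something confirmed by the log, so adjust them to your installed version:

```python
import json
import urllib.request

# Server started with --listen 0.0.0.0:8080
API_URL = "http://127.0.0.1:8080/v1/tts"


def build_payload(text: str) -> bytes:
    # Field names are assumed; check them against your fish-speech
    # version's TTS request schema before relying on this.
    return json.dumps({"text": text, "format": "wav"}).encode("utf-8")


def tts(text: str) -> bytes:
    req = urllib.request.Request(
        API_URL,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # raw audio bytes on HTTP 200


if __name__ == "__main__":
    # Short inputs such as "hi." trigger the failure intermittently.
    audio = tts("hi.")
    print(len(audio))
```

Repeating the same short input several times should surface the intermittent 500 responses seen below.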

✔️ Expected Behavior

When the input is short (e.g. "hi" or "he"), the correct audio should be generated reliably.

❌ Actual Behavior

Generation randomly succeeds or fails on short inputs. The error is AssertionError: Negative code found:

2025-01-04 15:59:09.779 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: hi.
2025-01-04 15:59:09.779 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  1%|          | 11/1023 [00:00<00:11, 87.47it/s]
2025-01-04 15:59:09.989 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.21 seconds
2025-01-04 15:59:09.989 | INFO     | tools.llama.generate:generate_long:832 - Generated 13 tokens in 0.21 seconds, 62.07 tokens/sec
2025-01-04 15:59:09.989 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 30.69 GB/s
2025-01-04 15:59:09.990 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO:     10.0.3.136:35684 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
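The check that trips here is `assert (codes >= 0).all(), "Negative code found"` in tools/llama/generate.py. Since the same short input succeeds on a later request (see the 200 OK responses below), one possible client-side stopgap is simply retrying generation when the check fails. A minimal sketch of that idea, where `generate_fn` and `fake_generate` are hypothetical stand-ins for the real sampling call, not part of the fish-speech API:

```python
def has_negative_code(codes: list[int]) -> bool:
    # Mirrors the failed check in tools/llama/generate.py:
    #   assert (codes >= 0).all(), "Negative code found"
    return any(c < 0 for c in codes)


def generate_with_retry(generate_fn, text: str, attempts: int = 5) -> list[int]:
    # generate_fn stands in for the real LLaMA sampling call. Because the
    # failure is intermittent, re-running generation for the same text is a
    # plausible workaround until the root cause is fixed.
    last_error = None
    for _ in range(attempts):
        codes = generate_fn(text)
        if not has_negative_code(codes):
            return codes
        last_error = RuntimeError("Negative code found")
    raise last_error


# Example: a fake generator that fails once, then succeeds.
_outputs = iter([[5, -1, 7], [5, 3, 7]])


def fake_generate(text: str) -> list[int]:
    return next(_outputs)
```

This only masks the symptom; the underlying question is why very short prompts sometimes yield negative codes at all.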
2025-01-04 15:59:16.891 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 15:59:16.894 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: he.
2025-01-04 15:59:16.894 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  0%|          | 2/1023 [00:00<00:15, 64.00it/s]
2025-01-04 15:59:17.010 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.12 seconds
2025-01-04 15:59:17.010 | INFO     | tools.llama.generate:generate_long:832 - Generated 4 tokens in 0.12 seconds, 34.65 tokens/sec
2025-01-04 15:59:17.010 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 17.13 GB/s
2025-01-04 15:59:17.011 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
[traceback identical to the first, ending in AssertionError: Negative code found]
INFO:     10.0.3.136:45164 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 15:59:21.064 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 15:59:21.066 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: Hee
2025-01-04 15:59:21.067 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  1%|          | 12/1023 [00:00<00:11, 87.44it/s]
2025-01-04 15:59:21.289 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.22 seconds
2025-01-04 15:59:21.289 | INFO     | tools.llama.generate:generate_long:832 - Generated 14 tokens in 0.22 seconds, 63.06 tokens/sec
2025-01-04 15:59:21.289 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 31.18 GB/s
2025-01-04 15:59:21.290 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
[traceback identical to the first, ending in AssertionError: Negative code found]
INFO:     10.0.3.136:45178 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 16:01:01.233 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 16:01:01.236 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: what
2025-01-04 16:01:01.236 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  1%|▏         | 15/1023 [00:00<00:11, 89.25it/s]
2025-01-04 16:01:01.488 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.25 seconds
2025-01-04 16:01:01.489 | INFO     | tools.llama.generate:generate_long:832 - Generated 17 tokens in 0.25 seconds, 67.47 tokens/sec
2025-01-04 16:01:01.489 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 33.36 GB/s
2025-01-04 16:01:01.489 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:01.490 | INFO     | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 16])
INFO:     10.0.3.136:52978 - "POST /v1/tts HTTP/1.1" 200 OK
2025-01-04 16:01:10.114 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 16:01:10.116 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: Oh .
2025-01-04 16:01:10.117 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  2%|▏         | 18/1023 [00:00<00:11, 89.46it/s]
2025-01-04 16:01:10.403 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.29 seconds
2025-01-04 16:01:10.403 | INFO     | tools.llama.generate:generate_long:832 - Generated 20 tokens in 0.29 seconds, 69.87 tokens/sec
2025-01-04 16:01:10.404 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 34.55 GB/s
2025-01-04 16:01:10.404 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:10.405 | INFO     | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 19])
INFO:     10.0.3.136:52980 - "POST /v1/tts HTTP/1.1" 200 OK
2025-01-04 16:01:22.751 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 16:01:22.754 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: Hi.
2025-01-04 16:01:22.755 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  2%|▏         | 24/1023 [00:00<00:11, 90.80it/s]
2025-01-04 16:01:23.104 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.35 seconds
2025-01-04 16:01:23.104 | INFO     | tools.llama.generate:generate_long:832 - Generated 26 tokens in 0.35 seconds, 74.49 tokens/sec
2025-01-04 16:01:23.104 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 36.83 GB/s
2025-01-04 16:01:23.105 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:23.106 | INFO     | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 25])
INFO:     10.0.3.136:46966 - "POST /v1/tts HTTP/1.1" 200 OK

Labels: bug (Something isn't working), stale