
v1.4.1 Index out of bounds #943

Open
@Kingdroper

Description


Self Checks

  • This template is only for bug reports. For questions, please visit Discussions.
  • I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English; otherwise they will be closed. Thank you! :)
  • Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

Official environment configuration

Steps to Reproduce

Start the API server under uvicorn with workers=4.
Run inference over a large batch of texts (for example, 50,000 texts); the error below always occurs eventually. A minimal client loop reproducing the load pattern is sketched below.
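For reference, here is a hypothetical client loop matching that load pattern. The /v1/tts route and the payload field are assumptions about the stock API server, not something verified against this deployment; substitute the route and request schema your fish-speech 1.4.1 server actually exposes.

```python
import requests

# Hypothetical endpoint and payload; adjust to the actual route and
# request schema of your fish-speech 1.4.1 API server.
URL = "http://127.0.0.1:8000/v1/tts"

texts = [f"Sample sentence number {i}." for i in range(50_000)]
for text in texts:
    resp = requests.post(URL, json={"text": text})
    # Once the device-side assert fires in a worker, every later
    # request served by that worker fails as well.
    resp.raise_for_status()
```

Server-side log at the point of failure: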
-----generate_long-----
llama-236 ------generate------
llama-186 num_new_tokens:1023

0%| | 0/1023 [00:00<?, ?it/s]
2%|▏ | 25/1023 [00:00<00:04, 249.48it/s]
5%|▍ | 50/1023 [00:00<00:03, 249.70it/s]
7%|▋ | 68/1023 [00:00<00:03, 246.03it/s]
2025-04-04 23:07:17.245 | INFO | tools.llama.generate:generate_long:507 - Compilation time: 0.35 seconds
2025-04-04 23:07:17.245 | INFO | tools.llama.generate:generate_long:516 - Generated 70 tokens in 0.35 seconds, 197.90 tokens/sec
2025-04-04 23:07:17.246 | INFO | tools.llama.generate:generate_long:519 - Bandwidth achieved: 97.86 GB/s
2025-04-04 23:07:17.246 | INFO | tools.llama.generate:generate_long:524 - GPU Memory used: 3.34 GB
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [67,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [8,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 259, in call
await wrap(partial(self.listen_for_disconnect, receive))
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 255, in wrap
await func()
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 232, in listen_for_disconnect
message = await receive()
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 555, in receive
await self.message_event.wait()
File "/root/miniconda3/envs/fish142/lib/python3.10/asyncio/locks.py", line 214, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f95185496f0

During handling of the above exception, another exception occurred:

  + Exception Group Traceback (most recent call last):
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
    | result = await app( # type: ignore[func-returns-value]
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in call
    | return await self.app(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in call
    | await super().call(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/applications.py", line 113, in call
    | await self.middleware_stack(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in call
    | raise exc
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in call
    | await self.app(scope, receive, _send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in call
    | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    | raise exc
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    | await app(scope, receive, sender)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 715, in call
    | await self.middleware_stack(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
    | await route.handle(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
    | await self.app(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
    | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    | raise exc
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    | await app(scope, receive, sender)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
    | await response(scope, receive, send)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 252, in call
    | async with anyio.create_task_group() as task_group:
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 680, in aexit
    | raise BaseExceptionGroup(
    | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
    +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    | File "/data/fish_speech/fish-speech-1.4.1/tools/main.py", line 656, in inference
    | fake_audios = decode_vq_tokens(
    | File "/data/fish_speech/fish-speech-1.4.1/tools/main.py", line 357, in decode_vq_tokens
    | return decoder_model.decode(
    | File "/data/fish_speech/fish-speech-1.4.1/fish_speech/models/vqgan/modules/firefly.py", line 582, in decode
    | z = self.quantizer.decode(indices) * mel_masks_float_conv
    | File "/data/fish_speech/fish-speech-1.4.1/fish_speech/models/vqgan/modules/fsq.py", line 114, in decode
    | z_q = self.residual_fsq.get_output_from_indices(indices)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 248, in get_output_from_indices
    | outputs = tuple(rvq.get_output_from_indices(chunk_indices) for rvq, chunk_indices in zip(self.rvqs, indices))
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 248, in
    | outputs = tuple(rvq.get_output_from_indices(chunk_indices) for rvq, chunk_indices in zip(self.rvqs, indices))
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 134, in get_output_from_indices
    | codes = self.get_codes_from_indices(indices)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 116, in get_codes_from_indices
    | all_codes = get_at('q [c] d, b n q -> q b n d', self.codebooks, indices)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/einx/traceback_util.py", line 71, in func_with_reraise
    | raise e.with_traceback(tb) from None
    | File "", line 3, in op1
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 480, in _flat_vmap
    | batched_outputs = func(*batched_inputs, **kwargs)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 48, in fn
    | return f(*args, **kwargs)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 331, in vmap_impl
    | return _flat_vmap(
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/apis.py", line 201, in wrapped
    | return vmap_impl(
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 480, in _flat_vmap
    | batched_outputs = func(*batched_inputs, **kwargs)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 48, in fn
    | return f(*args, **kwargs)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 331, in vmap_impl
    | return _flat_vmap(
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/apis.py", line 201, in wrapped
    | return vmap_impl(
    | File "", line 10, in op0
    | RuntimeError: CUDA error: device-side assert triggered
    | Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
    |
    |
    | During handling of the above exception, another exception occurred:
    |
    | Traceback (most recent call last):
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 255, in wrap
    | await func()
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 244, in stream_response
    | async for chunk in self.body_iterator:
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/concurrency.py", line 62, in iterate_in_threadpool
    | yield await anyio.to_thread.run_sync(_next, as_iterator)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    | return await get_async_backend().run_sync_in_worker_thread(
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    | return await future
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    | result = context.run(func, *args)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/concurrency.py", line 51, in _next
    | return next(iterator)
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 57, in generator_context
    | response = gen.send(request)
    | File "/data/fish_speech/fish-speech-1.4.1/tools/main.py", line 733, in inference
    | torch.cuda.empty_cache()
    | File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/cuda/memory.py", line 170, in empty_cache
    | torch._C._cuda_emptyCache()
    | RuntimeError: CUDA error: device-side assert triggered
    | Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
    |
    +------------------------------------
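
Triage notes on the trace above: the assert fires inside the codebook gather (the get_at(...) call in vector_quantize_pytorch's residual_fsq), which means the LLaMA stage handed the decoder VQ indices outside the valid range [0, codebook_size). Note also that a CUDA device-side assert is sticky: every later CUDA call in the same process fails with the same error, which is why the follow-up torch.cuda.empty_cache() in the second traceback raises too. Running with CUDA_LAUNCH_BLOCKING=1 (or a PyTorch build compiled with TORCH_USE_CUDA_DSA, as the message suggests) helps pin down the first faulting kernel.

Below is a minimal defensive sketch for keeping a worker alive while investigating. Everything in it is an assumption, not the project's fix: the codebook_size parameter would have to be read from your quantizer config, and the decode() call is simplified (in 1.4.1 it also takes feature lengths).

```python
import torch

def decode_with_index_check(decoder_model, indices: torch.Tensor,
                            codebook_size: int):
    # The IndexKernel.cu assert means some entry of `indices` lies
    # outside [0, codebook_size). Check before the CUDA gather so the
    # process does not die on a device-side assert.
    bad = (indices < 0) | (indices >= codebook_size)
    if bad.any():
        # Clamping keeps the worker alive, but these frames carry
        # corrupted tokens; the root cause is upstream (e.g. shared
        # state between requests under workers=4).
        print(f"warning: clamping {int(bad.sum())} out-of-range VQ indices")
        indices = indices.clamp(0, codebook_size - 1)
    return decoder_model.decode(indices)
```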

✔️ Expected Behavior

Inference over the full batch completes without the CUDA device-side assert. How can we solve this bug?

❌ Actual Behavior

No response

Metadata

Labels: bug (Something isn't working)