Description
Self Checks
- This template is only for bug reports. For questions, please visit Discussions.
- I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- I have searched for existing issues, including closed ones. Search issues
- I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
- Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
Official environment configuration
Steps to Reproduce
Run the API server with uvicorn (workers=4).
When running inference over many texts (for example, 50,000 texts), the following error always occurs at some point:
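To localize which kernel actually faults (device-side asserts are reported asynchronously, so the Python stack below can point at the wrong call), the run can be repeated with launch blocking enabled. This is a standard PyTorch debugging step, not project-specific:

```python
import os

# CUDA_LAUNCH_BLOCKING must be set before torch initializes CUDA, so it
# has to happen before the first `import torch` in the entry point, or be
# exported in the shell before uvicorn is started.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

With this set, the RuntimeError is raised at the exact op that triggered the assert instead of at a later, unrelated CUDA call.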
-----generate_long-----
llama-236 ------generate------
llama-186 num_new_tokens:1023
0%| | 0/1023 [00:00<?, ?it/s]
2%|▏ | 25/1023 [00:00<00:04, 249.48it/s]
5%|▍ | 50/1023 [00:00<00:03, 249.70it/s]
7%|▋ | 68/1023 [00:00<00:03, 246.03it/s]
2025-04-04 23:07:17.245 | INFO | tools.llama.generate:generate_long:507 - Compilation time: 0.35 seconds
2025-04-04 23:07:17.245 | INFO | tools.llama.generate:generate_long:516 - Generated 70 tokens in 0.35 seconds, 197.90 tokens/sec
2025-04-04 23:07:17.246 | INFO | tools.llama.generate:generate_long:519 - Bandwidth achieved: 97.86 GB/s
2025-04-04 23:07:17.246 | INFO | tools.llama.generate:generate_long:524 - GPU Memory used: 3.34 GB
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [67,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [8,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
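The assertion above means an index into the FSQ codebook fell outside `[0, codebook_size)`. As a hedged sketch (the function and parameter names here are my own, not the project's API), the generated VQ indices could be range-checked before they are handed to the decoder, so the failure surfaces as a catchable Python exception instead of a device-side assert:

```python
def check_vq_indices(indices, codebook_size):
    """Raise a catchable error for out-of-range VQ codes.

    `indices` can be a torch.Tensor or a flat sequence of ints; the
    device-side assert in IndexKernel.cu fires when any code falls
    outside [0, codebook_size) during the codebook lookup.
    """
    flat = indices.flatten().tolist() if hasattr(indices, "flatten") else list(indices)
    lo, hi = min(flat), max(flat)
    if lo < 0 or hi >= codebook_size:
        raise ValueError(
            f"VQ indices out of range [0, {codebook_size}): min={lo}, max={hi}"
        )
```

With such a check in place, one malformed generation fails only its own request, instead of corrupting the CUDA context for the whole worker.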
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 259, in __call__
await wrap(partial(self.listen_for_disconnect, receive))
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 255, in wrap
await func()
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 232, in listen_for_disconnect
message = await receive()
File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 555, in receive
await self.message_event.wait()
File "/root/miniconda3/envs/fish142/lib/python3.10/asyncio/locks.py", line 214, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f95185496f0
During handling of the above exception, another exception occurred:
- Exception Group Traceback (most recent call last):
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
| result = await app( # type: ignore[func-returns-value]
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
| return await self.app(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
| await super().__call__(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
| raise exc
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
| await self.app(scope, receive, _send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
| await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| raise exc
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
| await route.handle(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
| await self.app(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
| await wrap_app_handling_exceptions(app, request)(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| raise exc
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
| await response(scope, receive, send)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 252, in __call__
| async with anyio.create_task_group() as task_group:
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
| raise BaseExceptionGroup(
| exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/data/fish_speech/fish-speech-1.4.1/tools/main.py", line 656, in inference
| fake_audios = decode_vq_tokens(
| File "/data/fish_speech/fish-speech-1.4.1/tools/main.py", line 357, in decode_vq_tokens
| return decoder_model.decode(
| File "/data/fish_speech/fish-speech-1.4.1/fish_speech/models/vqgan/modules/firefly.py", line 582, in decode
| z = self.quantizer.decode(indices) * mel_masks_float_conv
| File "/data/fish_speech/fish-speech-1.4.1/fish_speech/models/vqgan/modules/fsq.py", line 114, in decode
| z_q = self.residual_fsq.get_output_from_indices(indices)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 248, in get_output_from_indices
| outputs = tuple(rvq.get_output_from_indices(chunk_indices) for rvq, chunk_indices in zip(self.rvqs, indices))
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 248, in <genexpr>
| outputs = tuple(rvq.get_output_from_indices(chunk_indices) for rvq, chunk_indices in zip(self.rvqs, indices))
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 134, in get_output_from_indices
| codes = self.get_codes_from_indices(indices)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/vector_quantize_pytorch/residual_fsq.py", line 116, in get_codes_from_indices
| all_codes = get_at('q [c] d, b n q -> q b n d', self.codebooks, indices)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/einx/traceback_util.py", line 71, in func_with_reraise
| raise e.with_traceback(tb) from None
| File "", line 3, in op1
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 480, in _flat_vmap
| batched_outputs = func(*batched_inputs, **kwargs)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 48, in fn
| return f(*args, **kwargs)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 331, in vmap_impl
| return _flat_vmap(
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/apis.py", line 201, in wrapped
| return vmap_impl(
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 480, in _flat_vmap
| batched_outputs = func(*batched_inputs, **kwargs)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 48, in fn
| return f(*args, **kwargs)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/vmap.py", line 331, in vmap_impl
| return _flat_vmap(
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/_functorch/apis.py", line 201, in wrapped
| return vmap_impl(
| File "", line 10, in op0
| RuntimeError: CUDA error: device-side assert triggered
| Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
|
|
| During handling of the above exception, another exception occurred:
|
| Traceback (most recent call last):
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 255, in wrap
| await func()
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/responses.py", line 244, in stream_response
| async for chunk in self.body_iterator:
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/concurrency.py", line 62, in iterate_in_threadpool
| yield await anyio.to_thread.run_sync(_next, as_iterator)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
| return await get_async_backend().run_sync_in_worker_thread(
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
| return await future
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
| result = context.run(func, *args)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/starlette/concurrency.py", line 51, in _next
| return next(iterator)
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 57, in generator_context
| response = gen.send(request)
| File "/data/fish_speech/fish-speech-1.4.1/tools/main.py", line 733, in inference
| torch.cuda.empty_cache()
| File "/root/miniconda3/envs/fish142/lib/python3.10/site-packages/torch/cuda/memory.py", line 170, in empty_cache
| torch._C._cuda_emptyCache()
| RuntimeError: CUDA error: device-side assert triggered
| Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
|
+------------------------------------
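Note the second traceback above: once a device-side assert fires, the CUDA context of that process is corrupted, so even the later `torch.cuda.empty_cache()` in tools/main.py raises the same error, and the worker keeps failing every subsequent request. A hedged recovery sketch (helper names are hypothetical, not part of the project) is to detect this state and let the worker exit so the uvicorn supervisor respawns a clean process:

```python
import os
import signal

def is_cuda_context_poisoned(exc: BaseException) -> bool:
    # After "device-side assert triggered", every later CUDA call in the
    # same process fails too (as seen with torch.cuda.empty_cache above).
    return "device-side assert" in str(exc)

def restart_worker() -> None:
    # Exiting lets the uvicorn --workers supervisor spawn a fresh process
    # instead of serving every later request from a poisoned context.
    os.kill(os.getpid(), signal.SIGTERM)
```

An exception handler around the inference call could then call `restart_worker()` when `is_cuda_context_poisoned(exc)` is true, rather than returning errors indefinitely.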
✔️ Expected Behavior
Inference completes successfully for the whole batch of texts. How can we solve this bug?
❌ Actual Behavior
A CUDA device-side assert ("index out of bounds" in the VQ codebook lookup) is triggered partway through the batch, after which the worker process can no longer execute any CUDA call.