GitHub

F5-TTS的onnxruntime和tensorrt-llm推理

环境配置

安装pytorch, cuda, onnxruntime, tensor

需要保证版本兼容，我本地环境是

python==3.10

nvidia版本
nvidia-smi 驱动版本12.9
nvidia nvcc版本12.4

torch版本
torch                                    2.6.0+cu124
torchaudio                               2.6.0+cu124
torchdiffeq                              0.2.5
torchprofile                             0.0.4
torchvision                              0.21.0+cu124

onnxruntime版本
onnxruntime-gpu                          1.20.0

tensorrt版本
tensorrt                                 10.9.0.34
tensorrt_cu12                            10.9.0.34
tensorrt_cu12_bindings                   10.9.0.34
tensorrt_cu12_libs                       10.9.0.34
tensorrt-llm                             0.18.0

vocos
vocos                                    0.1.0

显卡配置, RTX4060 8G（8G消费显卡也可以学习大模型）

python的包兼容性很差，换个版本可能跑不通了

本地安装F5-TTS

F5-TTS 来自

git clone https://github.com/SWivid/F5-TTS.git -b 0.6.2
cd F5-TTS
pip install -e .

由于python模块变化很快，因此在本地也放了F5-TTS的代码，避免以后找不到

运行F5-TTS f5-tts_infer-cli

onnxruntime推理

设置.env文件里的环境变量
将 torch模型转为onnx模型

python ./export_onnx/Export_F5.py

会覆盖f5-tts和 vocos下的文件

# f5-tts模块
dit.py
modules.py
util_infer.py

# vocos 模块
heads.py
models.py
modules.py
pretrained.py

shutil.copyfile(modified_path + '/vocos/heads.py', python_package_path + '/vocos/heads.py')
shutil.copyfile(modified_path + '/vocos/models.py', python_package_path + '/vocos/models.py')
shutil.copyfile(modified_path + '/vocos/modules.py', python_package_path + '/vocos/modules.py')
shutil.copyfile(modified_path + '/vocos/pretrained.py', python_package_path + '/vocos/pretrained.py')
shutil.copyfile(modified_path + '/F5/modules.py', F5_project_path + '/f5_tts/model/modules.py')
shutil.copyfile(modified_path + '/F5/dit.py', F5_project_path + '/f5_tts/model/backbones/dit.py')
shutil.copyfile(modified_path + '/F5/utils_infer.py', F5_project_path + '/f5_tts/infer/utils_infer.py')

覆盖的原因后续研究下

可以直接执行python ./export_onnx/F5-TTS-ONNX-Inference.py 推理

时间和torch时间都是4s左右，但需要注意由于覆盖了文件，这和原本的f5-tts代码不同了。后续可以深入分析下

tensorrt-llm加速

在tensorrt-llm 增加f5-tts的模块

找到tensorrt_llm的安装模块pip show tensorrt-llm 在~/miniconda3/envs/py310/lib/python3.10/site-packages/tensorrt_llm/models，创建f5tts目录。

将F5_TTS_Faster/export_trtllm/model下面的文件拷贝到~/miniconda3/envs/py310/lib/python3.10/site-packages/tensorrt_llm/models

在tensorrt_llm/models/__init__.py中增加

from .f5tts.model import F5TTS

__all__ = [..., 'F5TTS']

# 并且在 `MODEL_MAP` 添加模型：
MODEL_MAP = {..., 'F5TTS': F5TTS}

执行convert-checkpoint.py

python ./export_trtllm/convert_checkpoint.py \
        --timm_ckpt "./ckpts/F5TTS_Base/model_1200000.pt" \
        --output_dir "./ckpts/trtllm_ckpt"

build_engine

trtllm-build --checkpoint_dir ./ckpts/trtllm_ckpt \
             --remove_input_padding disable \
             --bert_attention_plugin disable \
             --output_dir ./ckpts/engine_outputs

执行推理

python ./export_trtllm/sample.py \
        --tllm_model_dir "./ckpts/engine_outputs"

推理耗时,1.2s

TODO

export_onnx/modeling_modified目录下都改了什么文件，作用是什么，为什么要改
/export_trtllm/model 目录下加的文件有什么作用，为什么tensorrt-llm将推理时间从4s提升到1.2s，关键优化是什么

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
F5-TTS		F5-TTS
F5-TTS-2		F5-TTS-2
assets		assets
export_onnx		export_onnx
export_trtllm		export_trtllm
output		output
.env		.env
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

F5-TTS的onnxruntime和tensorrt-llm推理

环境配置

onnxruntime推理

tensorrt-llm加速

TODO

About

Uh oh!

Releases

Packages

Languages

larrystd/F5_TTS_learn

Folders and files

Latest commit

History

Repository files navigation

F5-TTS的onnxruntime和tensorrt-llm推理

环境配置

onnxruntime推理

tensorrt-llm加速

TODO

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages