
[Performance] kokoro onnx performance issues #23384

Open
@MithrilMan

Description

Describe the issue

Hello.
I'm trying to use a Kokoro ONNX model, and I see a large performance difference between PyTorch on CPU and ONNX Runtime on CPU (no special execution provider specified).
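To put a number on that difference, both runtimes can be timed with the same harness. The following is a minimal sketch using only the Python standard library; the `benchmark` helper and its `warmup` and `runs` parameters are hypothetical names introduced here for illustration, not part of ONNX Runtime or kokoro-onnx.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in milliseconds."""
    # Warm-up calls absorb one-time costs (lazy initialization, first allocations)
    # so they do not distort the measured runs.
    for _ in range(warmup):
        fn()

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with a call into the model under test.
    print(benchmark(lambda: sum(range(100_000))))
```

Wrapping the PyTorch call and the ONNX Runtime call in separate lambdas, with the same input text, should give comparable medians, assuming both run on the same machine with nothing else competing for the CPU.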

I've seen that https://www.ui.perfetto.dev/ is useful for investigating performance issues.
I'm attaching a trace of 3 inferences in a row; I don't have the skills to understand what the problem is.
I ran on CPU without any other SessionOptions specified, in C# with onnxruntime.

onnxruntime_profile__2025-01-15_15-22-14.zip

I ran the following query in Perfetto:

select name, (dur/1000000) as ms, ts from slice where parent_id=3 AND category = 'Node' order by dur desc

Here 3 is the slice id of the SequentialExecutor slice for a single inference run, which I used to restrict the results to that run (I don't know if there is a better way to do this). With this I was able to sort the node executions by time spent, but I can't go further because I don't know how to compare these timings against what should be expected.
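The same per-node ranking can also be computed directly from the profile file, grouped by operator type so that a few dominant operators stand out. This is a rough sketch under assumptions about the profile layout: that the file is a JSON array of trace events, that node events carry `"cat": "Node"`, a name ending in `_kernel_time`, a `dur` in microseconds, and an `args.op_name` field. The function names are hypothetical and should be checked against an actual profile before relying on them.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def summarize_node_events(events: Iterable[dict]) -> List[Tuple[str, float, int]]:
    """Return (op_name, total_ms, call_count) rows sorted by total time, descending."""
    total_us: Dict[str, int] = defaultdict(int)
    calls: Dict[str, int] = defaultdict(int)

    for event in events:
        # Keep only node-level events (assumed category name: "Node").
        if event.get("cat") != "Node":
            continue
        # Assumption: kernel execution events end in "_kernel_time"; other node
        # events (e.g. fences) are skipped to avoid counting time twice.
        if not event.get("name", "").endswith("_kernel_time"):
            continue
        op_name = event.get("args", {}).get("op_name", "unknown")
        total_us[op_name] += event.get("dur", 0)  # assumed to be microseconds
        calls[op_name] += 1

    rows = [(op, total_us[op] / 1000.0, calls[op]) for op in total_us]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def summarize_profile(path: str) -> List[Tuple[str, float, int]]:
    """Load a profile JSON file and summarize its node events."""
    with open(path, "r", encoding="utf-8") as f:
        return summarize_node_events(json.load(f))


if __name__ == "__main__":
    # Synthetic events for illustration only; real values come from the profile file.
    sample = [
        {"cat": "Node", "name": "a_kernel_time", "dur": 3000, "args": {"op_name": "Conv"}},
        {"cat": "Node", "name": "b_kernel_time", "dur": 500, "args": {"op_name": "MatMul"}},
    ]
    for op, total_ms, count in summarize_node_events(sample):
        print(f"{op}: {total_ms:.2f} ms over {count} call(s)")
```

Note that this sums over every run recorded in the file, whereas the Perfetto query above filters to one run via `parent_id`. Dividing the totals by the number of runs gives a per-inference figure.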

I'd like someone to point me to a resource on how to detect a model's bottlenecks, or someone with the skills to help with this issue.

To reproduce

Thanks to @thewh1teagle for the script below.

"""
pip install kokoro-onnx soundfile
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json
ONNX_PROVIDER=CoreMLExecutionProvider LOG_LEVEL=DEBUG uv run main.py
ONNX_PROVIDER=CPUExecutionProvider LOG_LEVEL=DEBUG uv run main.py
"""

import soundfile as sf
from kokoro_onnx import Kokoro

kokoro = Kokoro("kokoro-v0_19.onnx", "voices.json")
samples, sample_rate = kokoro.create(
    "Hello. This audio generated by kokoro!", voice="af_sarah", speed=1.0, lang="en-us"
)
sf.write("audio.wav", samples, sample_rate)
print("Created audio.wav")

Urgency

Could you please suggest how to properly understand what a model's performance problems are?
It's hard for me to find documentation that helps me understand in depth how things work. Any links to docs or tutorials are appreciated (keep in mind I'm mainly a C# guy).

Thanks

Platform

Windows

OS Version

Windows 11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

nuget package 1.20.1

ONNX Runtime API

C#

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

Labels

.NET, api:CSharp, performance
