[Performance] kokoro onnx performance issues #23384

Open
@MithrilMan

Description

Describe the issue

Hello.
I'm trying to use a Kokoro ONNX model and I see a large performance difference between PyTorch on CPU and ONNX Runtime on CPU (no special execution provider specified).

I've seen that https://www.ui.perfetto.dev/ is useful for investigating performance issues.
I'm attaching a trace of 3 inferences in a row; I don't have the skills to work out what the problem is.
I used the CPU with no other SessionOptions specified, from C# with onnxruntime.

onnxruntime_profile__2025-01-15_15-22-14.zip

I used this query in Perfetto:

select name, (dur/1000000) as ms, ts from slice where parent_id=3 AND category = 'Node' order by dur desc

Here 3 is the slice id of the SequentialExecutor slice for a single inference run, which I used to filter the information down to that run (I don't know if there is a better way to select it). With it I was able to sort the node executions by time spent, but I can't go further because I don't know what timings to expect and so can't judge these ones.
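The same ranking can probably be done outside Perfetto by reading the profile JSON directly. The sketch below assumes the profile is a JSON array of Chrome-trace-style events in which node timings have `cat` set to `"Node"`, a `name` ending in `_kernel_time`, a `dur` in microseconds, and an `args.op_name` field; that layout is my understanding of the file and may differ between versions. `summarize_profile` and `load_profile` are hypothetical helper names, not part of onnxruntime.

```python
import json
import sys
from collections import defaultdict


def summarize_profile(events, top=10):
    """Sum kernel time per operator type and return the slowest ones.

    Returns a list of (op_name, total_ms, call_count), slowest first.
    """
    totals = defaultdict(lambda: [0, 0])  # op_name -> [total_us, calls]
    for ev in events:
        # Assumption: only "Node" events carry per-operator timings.
        if ev.get("cat") != "Node":
            continue
        # Assumption: the kernel execution entry ends with "_kernel_time";
        # other per-node entries (fences etc.) are skipped.
        if not ev.get("name", "").endswith("_kernel_time"):
            continue
        op = ev.get("args", {}).get("op_name", "unknown")
        totals[op][0] += ev.get("dur", 0)
        totals[op][1] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(op, us / 1000.0, calls) for op, (us, calls) in ranked[:top]]


def load_profile(path):
    """Read a profile file that is assumed to be a single JSON array."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__" and len(sys.argv) > 1:
    for op, ms, calls in summarize_profile(load_profile(sys.argv[1])):
        print(f"{op:<24}{ms:>10.3f} ms{calls:>8} calls")
```

Grouping by operator type rather than by individual node makes it easier to see whether one kind of operator (for example a convolution or a recurrent layer) dominates the run. Note that this sums over every run in the file, so with 3 inferences in one trace the totals cover all three.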

I'd like someone to point me to a resource on how to detect a model's bottlenecks, or someone with the skills to help with this issue.

To reproduce

thanks @thewh1teagle

"""
pip install kokoro-onnx soundfile
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json
ONNX_PROVIDER=CoreMLExecutionProvider LOG_LEVEL=DEBUG uv run main.py
ONNX_PROVIDER=CPUExecutionProvider LOG_LEVEL=DEBUG uv run main.py
"""

import soundfile as sf
from kokoro_onnx import Kokoro

kokoro = Kokoro("kokoro-v0_19.onnx", "voices.json")
samples, sample_rate = kokoro.create(
    "Hello. This audio generated by kokoro!", voice="af_sarah", speed=1.0, lang="en-us"
)
sf.write("audio.wav", samples, sample_rate)
print("Created audio.wav")
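To put numbers on the PyTorch vs ONNX Runtime difference, a small timing harness along these lines might help. It is only a sketch: `benchmark` is a hypothetical helper, the warm-up and run counts are arbitrary, and the callable passed in is assumed to wrap one full inference (for example the `kokoro.create(...)` call above).

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=10):
    """Time a zero-argument callable and return latency stats in milliseconds."""
    # Warm-up calls are not timed, so one-off costs (lazy initialization,
    # memory allocation) do not distort the measurement.
    for _ in range(warmup):
        fn()

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "median_ms": statistics.median(timings_ms),
        "min_ms": min(timings_ms),
        "max_ms": max(timings_ms),
        "runs": runs,
    }
```

For example, `benchmark(lambda: kokoro.create("Hello.", voice="af_sarah", speed=1.0, lang="en-us"))` would time the ONNX path, and the same harness around the PyTorch call gives a comparable figure. Reporting the median rather than the mean keeps a single slow run from skewing the result.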

Urgency

Could you please suggest how to properly understand what a model's performance problems are?
It's hard for me to find documentation that guides me to a deep understanding of how things work; any links to docs/tutorials are appreciated (bear in mind I'm mainly a C# guy).

Thanks

Platform

Windows

OS Version

Windows 11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

nuget package 1.20.1

ONNX Runtime API

C#

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

