Harden inference error handling, reset semantics, and warmup safety by axeldelafosse · Pull Request #2 · sweetspotsoundsystem/stemgen-rt

axeldelafosse · 2026-02-07T04:29:40Z

Summary

- Return failure from ONNX inference when output tensor extraction fails (instead of returning success with invalid output).
- Harden inference queue epoch/reset behavior to avoid stale-slot blockage after transport resets.
- Make the inference thread drop failed requests instead of publishing them as processed.
- Add a bounded warmup wait in prepareToPlay to prevent indefinite blocking.
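To illustrate the first and third points, here is a minimal sketch of failing on a null output pointer and dropping the request instead of publishing it. It does not use the ONNX Runtime API: `FakeTensor`, `copyOutput`, `Request`, and `handleRequest` are hypothetical names introduced for illustration only, and the flag handling on success is an assumption rather than the PR's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a backend output tensor (not an ONNX Runtime type).
struct FakeTensor {
    const float* data;  // nullptr models a failed output extraction
    std::size_t size;
};

// Copies the tensor into `out`. Returns false and leaves `out` untouched
// when the data pointer is null, instead of reporting success.
bool copyOutput(const FakeTensor& tensor, std::vector<float>& out) {
    if (tensor.data == nullptr) {
        return false;
    }
    out.assign(tensor.data, tensor.data + tensor.size);
    return true;
}

// Hypothetical request slot carrying the two flags named in the changelog.
struct Request {
    bool ready = false;      // submitted and waiting for inference
    bool processed = false;  // result is valid and may be consumed
    std::vector<float> output;
};

// Handles one request. Publishes it on success and drops it on failure.
// Returns true only when the request was published.
bool handleRequest(Request& req, const FakeTensor& tensor) {
    assert(req.ready);  // only submitted requests should reach this point
    if (!copyOutput(tensor, req.output)) {
        req.ready = false;      // dropped: neither pending...
        req.processed = false;  // ...nor published
        return false;
    }
    req.processed = true;  // assumption: success marks the slot as processed
    return true;
}
```

The point of the sketch is that a consumer checking `processed` never sees a request whose output buffer was not filled.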

Validation

- cmake --build build
- ctest --test-dir build --output-on-failure
- Result: 65/65 tests passed

gemini-code-assist · 2026-02-07T04:29:55Z

Summary of Changes

Hello @axeldelafosse, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and reliability of the ONNX inference system by addressing several critical areas. It ensures that inference errors are properly handled and reported, prevents the inference queue from getting stuck due to stale requests after a reset, and adds a safety mechanism to the warmup process to avoid indefinite blocking. These changes collectively enhance the stability and predictability of the plugin's real-time audio processing capabilities.

Highlights

- Hardened ONNX Inference Error Handling: The runInference method now explicitly returns false and logs an error if output tensor data cannot be extracted, preventing the system from proceeding with invalid output.
- Improved Inference Queue Reset Semantics: The inference queue's reset behavior has been enhanced to prevent stale-slot blockage. This includes tracking an epochStartIdx_ to allow the inference thread to jump past old, non-ready requests after a reset, and clearing stale flags for previous epoch slots.
- Inference Thread Drops Failed Requests: The inference thread now explicitly drops requests that fail during inference or become stale due to a reset, rather than publishing them as processed. This ensures only valid results are propagated.
- Bounded Warmup Wait in prepareToPlay: The prepareToPlay function now includes a bounded wait for the ONNX runtime warmup process, with a 2-second timeout. This prevents indefinite blocking if the inference thread stalls during warmup.
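The bounded wait described in the last highlight could look roughly like the sketch below. The helper name `waitForWarmup`, the polling approach, and the 1 ms default poll interval are assumptions made for illustration; the PR only states that the wait is capped at 2 seconds.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Polls `done` until it becomes true or `timeout` elapses.
// Returns true if the flag was observed set, false on timeout.
// Hypothetical helper; names and poll interval are illustrative.
bool waitForWarmup(const std::atomic<bool>& done,
                   std::chrono::milliseconds timeout,
                   std::chrono::milliseconds pollInterval = std::chrono::milliseconds(1)) {
    assert(timeout.count() >= 0);
    // steady_clock is monotonic, so wall-clock adjustments cannot extend the wait.
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!done.load(std::memory_order_acquire)) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // give up instead of blocking indefinitely
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return true;
}
```

A caller would pass `std::chrono::seconds(2)` for the timeout, log whether the wait completed or timed out, and reset the queue in both cases so that a late warmup result is not mistaken for real output.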

Changelog

plugin/include/StemgenRT/InferenceQueue.h
- Added #include <cstdint> for integer types.
- Introduced epochStartIdx_ atomic variable to track the starting write index for new epochs, aiding in reset synchronization.
plugin/source/InferenceQueue.cpp
- Modified getWriteSlot() to reclaim stale processed slots from previous epochs, preventing them from blocking new requests.
- Updated getOutputSlot() to skip stale results from previous epochs and continue scanning for valid outputs.
- Enhanced reset() to capture the current writeIdx_ as epochStartIdx_ and immediately advance consumeIdx_ to this point, effectively discarding old-epoch outputs.
- Modified fullReset() to also reset epochStartIdx_ to 0.
- In inferenceThreadFunc(), the readIdx_ now jumps to epochStartIdx_ on an epoch change, avoiding scanning old holes.
- Added logic in inferenceThreadFunc() to clear ready and processed flags for slots belonging to previous epochs, ensuring proper reuse.
- Implemented error handling in inferenceThreadFunc(): if runInference fails, the request is marked as not ready/processed and the readIdx_ is advanced, effectively dropping the failed request.
- Ensured that if a reset occurs during inference, the processed flag is also cleared for the request.
plugin/source/OnnxRuntime.cpp
- Modified runInference() to return false if GetTensorMutableData fails or returns a null pointer, indicating an inference output extraction error.
- Added debug logging for cases where output tensor data cannot be retrieved.
plugin/source/PluginProcessor.cpp
- Added #include <chrono> for time-related functionalities.
- Updated comments in prepareToPlay to clarify the warmup logic.
- Implemented a 2-second timeout for the ONNX runtime warmup loop in prepareToPlay to prevent indefinite blocking.
- Added debug logging to indicate whether the warmup completed successfully or timed out.
- Ensured inferenceQueue_.reset() is always called after the warmup attempt to invalidate any late warmup results and re-synchronize queue indices.
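
The queue changes listed above can be pictured with a simplified, single-threaded model. This is a sketch under stated assumptions, not the code in InferenceQueue.cpp: the real queue is shared between threads and uses atomics, and `EpochQueueModel`, `Slot`, `submit`, `processNext`, and `consume` are hypothetical names. Only `epochStartIdx_`, `writeIdx_`, `readIdx_`, `consumeIdx_`, and the ready/processed flags come from the changelog.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Slot {
    std::uint32_t epoch = 0;
    bool ready = false;      // submitted, waiting for inference
    bool processed = false;  // result available to the consumer
};

// Single-threaded illustration of epoch-based reset; no atomics, no audio data.
class EpochQueueModel {
public:
    // Producer: claims the next slot. A slot left over from an older epoch
    // is reclaimed even if its flags are still set.
    bool submit() {
        Slot& s = slots_[writeIdx_ % kSlots];
        const bool stale = (s.epoch != epoch_);
        if (!stale && (s.ready || s.processed)) return false;  // genuinely full
        s.epoch = epoch_;
        s.ready = true;
        s.processed = false;
        ++writeIdx_;
        return true;
    }

    // Worker: handles one request. `inferenceOk == false` drops it.
    bool processNext(bool inferenceOk) {
        if (readIdx_ < epochStartIdx_) readIdx_ = epochStartIdx_;  // skip old holes
        if (readIdx_ == writeIdx_) return false;                   // nothing pending
        Slot& s = slots_[readIdx_ % kSlots];
        assert(s.ready);
        ++readIdx_;
        s.ready = false;
        s.processed = inferenceOk;  // failed requests are never published
        return true;
    }

    // Consumer: returns true when a current-epoch result is taken,
    // skipping dropped or stale slots along the way.
    bool consume() {
        while (consumeIdx_ < readIdx_) {
            Slot& s = slots_[consumeIdx_ % kSlots];
            ++consumeIdx_;
            const bool valid = s.processed && s.epoch == epoch_;
            s.processed = false;
            if (valid) return true;
        }
        return false;
    }

    // Transport reset: start a new epoch at the current write position and
    // move the consumer there, discarding all old-epoch outputs.
    void reset() {
        ++epoch_;
        epochStartIdx_ = writeIdx_;
        consumeIdx_ = epochStartIdx_;
    }

private:
    static constexpr std::size_t kSlots = 4;
    std::array<Slot, kSlots> slots_{};
    std::uint64_t writeIdx_ = 0;
    std::uint64_t readIdx_ = 0;
    std::uint64_t consumeIdx_ = 0;
    std::uint64_t epochStartIdx_ = 0;
    std::uint32_t epoch_ = 0;
};
```

In this model, filling the queue, processing part of it, and then calling `reset()` leaves old slots with their flags set. Without the epoch comparison in `submit()`, the first of those slots would block every new request, which is the stale-slot blockage the PR describes.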

Activity

The author has implemented changes to harden error handling, reset semantics, and warmup safety.
The pull request includes validation steps, confirming that cmake --build build and ctest --test-dir build --output-on-failure resulted in 65/65 tests passed.

gemini-code-assist

Code Review

This pull request significantly hardens the inference queue's error handling, reset semantics, and warmup safety, incorporating improvements such as returning failure from ONNX inference on tensor extraction errors, a more robust inference queue reset mechanism, and a timeout for warmup waits. However, two medium-severity issues were identified: a potential division-by-zero when handling silent input, which could inject NaN/Inf into the audio stream, and a logic error in the epoch reset handling that might cause temporary stalls in the real-time inference thread after a transport reset. Addressing these will further enhance the plugin's stability and safety.
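
For the division-by-zero concern, a guard of the following shape is one common way to keep NaN/Inf out of the audio path. The review does not show the affected code, so everything here is an assumption: `safeGain`, its parameters, the epsilon value, and the choice of unity gain as the fallback are all hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: computes targetLevel / measuredLevel, but falls back
// to unity gain when the measured level is effectively silent or not finite.
float safeGain(float targetLevel, float measuredLevel, float epsilon = 1e-8f) {
    assert(epsilon > 0.0f);
    if (!std::isfinite(measuredLevel) || std::fabs(measuredLevel) < epsilon) {
        return 1.0f;  // silent or invalid input: avoid dividing by (near) zero
    }
    const float gain = targetLevel / measuredLevel;
    // A non-finite target could still produce NaN/Inf, so check the result too.
    return std::isfinite(gain) ? gain : 1.0f;
}
```

Whether the fallback should be unity or zero depends on what the gain is applied to; the sketch only shows that the divisor is checked before the division.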

Harden inference queue reset handling and warmup safety

9882226

greptile-apps Bot reviewed Feb 7, 2026

gemini-code-assist Bot reviewed Feb 7, 2026

Comment thread plugin/source/OnnxRuntime.cpp

Comment thread plugin/source/InferenceQueue.cpp Outdated

Comment thread plugin/source/InferenceQueue.cpp Outdated

axeldelafosse added 2 commits February 6, 2026 20:39

Address PR feedback on gain guard and epoch reset flow

5c86c04

Apply review nit on epoch start index read

701c5a8

axeldelafosse merged commit 9d8d641 into main Feb 7, 2026
3 checks passed

axeldelafosse merged 3 commits into main from fix_inference_reset_warmup
