Skip to content

Conversation

@qjia7
Copy link
Contributor

@qjia7 qjia7 commented Jan 26, 2026

No description provided.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds RAII (Resource Acquisition Is Initialization) wrappers for ONNX Runtime Model Editor API types and refactors the graph builder code to use these wrappers for improved resource management. Additionally, it removes mutex protection from the graph session cache based on the assumption that ort-genai execution is single-threaded.

Changes:

  • Added RAII wrapper structs (OrtGraph, OrtModel, OrtValueInfo, OrtNode) in onnxruntime_api.h following the existing OrtOpAttr pattern
  • Removed SessionCache struct and mutex from graph_session_cache_ in generators.h, replacing it with a simple unordered_map
  • Refactored graph_builder.cpp Build() function to use std::unique_ptr for automatic resource cleanup instead of manual try/catch blocks

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/models/onnxruntime_api.h Adds RAII wrapper structs for OrtGraph, OrtModel, OrtValueInfo, and OrtNode with custom delete operators for automatic cleanup
src/generators.h Removes SessionCache struct with mutex, replaces with simple unordered_map based on single-threaded execution assumption
src/models/graph_executor.cpp Removes mutex include and simplifies cache access after SessionCache removal
src/models/graph_builder.cpp Refactors Build() to use RAII patterns with std::unique_ptr instead of manual try/catch exception handling

src/generators.h Outdated
Comment on lines 142 to 143
// Note: No mutex needed since ort-genai execution is single-threaded
std::unordered_map<uint64_t, std::unique_ptr<OrtSession>> graph_session_cache_;
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment stating "No mutex needed since ort-genai execution is single-threaded" may not be accurate. The codebase contains evidence of multi-threaded execution:

  1. The Engine class documentation (src/engine/engine.h:24) states it's "designed to handle multiple requests concurrently"
  2. WorkerThread class exists for asynchronous task execution (src/worker_thread.h)
  3. Other global caches use mutexes for thread safety (e.g., TopkBenchmarkCacheManager in src/cuda/cuda_topk_benchmark_cache.h:70)
  4. The tracing sink uses a mutex for thread-safe logging (src/tracing.cpp:85)

If graph_session_cache_ can be accessed from multiple threads (e.g., through ExecuteCastOp called from different Engine requests), removing the mutex could lead to race conditions during cache lookups and insertions. Consider verifying whether GetOrCreateSession can be called concurrently before removing the mutex protection.

Copilot uses AI. Check for mistakes.
Comment on lines +172 to +175
// Create node attributes
for (const auto& attr : config.attributes) {
node_attributes.push_back(CreateOpAttr(attr));
}
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node_attributes vector contains raw pointers that are manually released at line 218-220. However, if any Ort::ThrowOnError call throws an exception between lines 174 (where attributes are created) and 220 (where they're released), these attributes will leak.

Consider using std::unique_ptr for node_attributes as well, similar to how graph_inputs_owned and graph_outputs_owned are managed. This would ensure automatic cleanup on exception. The raw pointers for passing to CreateNode can be extracted when needed.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateNode does NOT copy the attributes - it stores references/pointers to them. Therefore:

The OrtOpAttr* pointers must remain valid until AFTER the model is fully built
We cannot use RAII (std::unique_ptr) which would destroy them immediately when the vector goes out of scope
Manual cleanup is required, but only AFTER AddGraphToModel completes

@qjia7 qjia7 marked this pull request as ready for review January 27, 2026 08:53
@qjia7
Copy link
Contributor Author

qjia7 commented Jan 27, 2026

@baijumeswani This is the follow-up PR to resolve your remaining comments in #1895. The pipeline failures may not related with my changes. My another PR #1952 meets the similar issue after I rebase the code to latest without any code changes. However, it passed before.

@kunal-vaishnavi
Copy link
Contributor

@baijumeswani This is the follow-up PR to resolve your remaining comments in #1895. The pipeline failures may not related with my changes. My another PR #1952 meets the similar issue after I rebase the code to latest without any code changes. However, it passed before.

The pipeline failures are because Hugging Face released transformers v5 and it has some bugs. I have a fix for the CIs in this PR. You can pin the version like this in the meantime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants