Skip to content

Clarify and improve ownership semantics Model Editor C/C++ API#28800

Open
yuslepukhin wants to merge 16 commits into
mainfrom
yuslepukhin/revert_model_editor_change
Open

Clarify and improve ownership semantics Model Editor C/C++ API#28800
yuslepukhin wants to merge 16 commits into
mainfrom
yuslepukhin/revert_model_editor_change

Conversation

@yuslepukhin
Copy link
Copy Markdown
Member

@yuslepukhin yuslepukhin commented Jun 5, 2026

This pull request improves memory management and exception safety in the ONNX Runtime Model Editor C and C++ APIs, particularly around ownership transfer of graph/model components (inputs, outputs, initializers, nodes, and graphs). The changes ensure that ownership is only transferred on success, preventing double-free and dangling pointer issues, and update documentation and types to reflect the new ownership semantics.

Key changes:

Memory Management and Ownership Semantics

  • Added custom deleters (OrtValueDeleter, OrtValueInfoDeleter, OrtNodeDeleter, OrtGraphDeleter) for all major ONNX Runtime types, ensuring destruction always routes through the correct API release functions and preventing accidental double-free or memory leaks. (onnxruntime/core/graph/model_editor_api_types.h)
  • Updated internal storage in ModelEditorGraph to use unique_ptr with the appropriate custom deleters for inputs, outputs, initializers, and nodes, enforcing correct ownership and destruction. (onnxruntime/core/graph/model_editor_api_types.h)

API and Documentation Improvements

  • Clarified and expanded documentation for ownership transfer and atomicity in the C API (onnxruntime_c_api.h), specifying all-or-nothing behavior: ownership is only transferred on success, and pointers are nulled out to make the transfer explicit. (onnxruntime/core/session/onnxruntime_c_api.h) [1] [2] [3] [4] [5]
  • Updated C++ API comments and signatures to reflect strong exception safety: ownership is only transferred if the operation succeeds, and on failure, input objects remain unchanged and owned by the caller. (onnxruntime/core/session/onnxruntime_cxx_api.h) [1] [2]

Implementation Updates

  • Modified C++ API implementations to transfer ownership only after a successful call, using release() only after the API call succeeds. This pattern is now used for adding initializers, nodes, and graphs. (onnxruntime/core/session/onnxruntime_cxx_inline.h)
  • Updated model editor code to handle new pointer types and ownership semantics, including moving out of unique_ptr when consuming initializers. (onnxruntime/core/graph/graph.cc) [1] [2] [3]

Minor Cleanups

  • Removed unused or redundant owned_ flags from model editor types, as ownership is now tracked via smart pointers. (onnxruntime/core/graph/model_editor_api_types.h) [1] [2]
  • Improved documentation consistency and removed unnecessary OrtApi:: prefixes in comments. (onnxruntime/core/session/onnxruntime_c_api.h) [1] [2] [3]

These changes collectively make the ONNX Runtime Model Editor API safer and more robust, especially in the face of errors or exceptions.

AddInitializerToGraph and AddNodeToGraph wrap the caller-supplied raw pointer in a unique_ptr to take ownership. If a caller passes the same pointer twice, two unique_ptrs end up owning the same allocation, producing a double-free when the graph is destroyed.

Add a pointer-equality check against existing entries before taking ownership and return ORT_INVALID_ARGUMENT on duplicate. This is API misuse rather than an exploitable vulnerability, but a clear error is preferable to a crash.
…orApi

Cover AddInitializerToGraph and AddNodeToGraph rejecting a second add of the same raw pointer. Use the C++ wrappers for RAII setup and drop to the C API only for the deliberately-misused second call, since the C++ AddNode/AddInitializer wrappers always release() on success and cannot reproduce the duplicate-pointer scenario.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Model Editor C/C++ API to make ownership transfer of graph/model components (value infos, nodes, initializers, graphs) explicit, aligning public headers/docs, C++ wrappers, and internal storage with “graph/model takes ownership” semantics.

Changes:

  • Updates C/C++ API signatures and wrapper behavior to reflect ownership transfer (notably AddInitializerToGraph / Graph::AddInitializer).
  • Refactors internal Model Editor graph storage to use std::unique_ptr for initializers and removes prior owned_ flags.
  • Adjusts docs and tests to match the new ownership rules and simplified implementation.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
onnxruntime/test/shared_lib/test_model_builder_api.cc Updates tests to stop releasing objects after ownership transfer; revises/reshapes regression coverage.
onnxruntime/core/session/onnxruntime_c_api.cc Simplifies Release* functions for Model Editor graph IR types by removing ownership guards.
onnxruntime/core/session/model_editor_c_api.cc Implements new ownership-transfer behavior for inputs/outputs/nodes/initializers/model graph attachment.
onnxruntime/core/session/model_editor_api.h Updates Model Editor API declaration to accept non-const initializer tensors.
onnxruntime/core/graph/model_editor_api_types.h Removes owned_ flags and changes initializer containers to unique_ptr<OrtValue>.
onnxruntime/core/graph/graph.cc Updates Model Editor loading path for unique_ptr-held initializers.
include/onnxruntime/core/session/onnxruntime_cxx_inline.h Updates C++ inline wrapper to release initializer after successful ownership transfer.
include/onnxruntime/core/session/onnxruntime_cxx_api.h Updates C++ API surface to accept non-const initializer for ownership transfer.
include/onnxruntime/core/session/onnxruntime_c_api.h Updates public docs/signatures/comments to describe the new ownership semantics.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/core/graph/graph.cc Outdated
Comment thread onnxruntime/core/session/onnxruntime_c_api.cc
- model_editor_c_api.cc: validate ort_graph/inputs/outputs/name/tensor/node/model are non-null in SetGraphInputs, SetGraphOutputs, AddInitializerToGraph, AddNodeToGraph, and AddGraphToModel. ToInternal() dereferences its argument unconditionally, so a null graph/node previously crashed instead of returning ORT_INVALID_ARGUMENT.

- AddInitializerToGraph: reject duplicate initializer names and switch from operator[] assignment to emplace, so a name collision is reported instead of silently destroying the previously-owned tensor.

- AddGraphToModel: reject if the model already owns a graph; a second call would otherwise silently delete the first via unique_ptr assignment.

- graph.cc: comment the intentional move-out-of-const-map pattern in LoadFromModelEditorApiModel; the OrtModel is consumed exactly once by Load and the const unique_ptr does not propagate const to the contained OrtValue.

- onnxruntime_c_api.h: clarify that ownership of OrtValueInfo / OrtValue / OrtNode / OrtGraph transfers only on success; on failure the caller still owns the object. Document that calling Release* after a successful transfer would double-free, and that adding a duplicate raw pointer or a duplicate initializer name is now rejected with ORT_INVALID_ARGUMENT.
Refactor SetGraphInputs, SetGraphOutputs, AddInitializerToGraph and AddNodeToGraph to a two-phase validate-then-commit pattern: every operation that may throw (validation, allocation, reserve, try_emplace) runs before any observable state changes; the commit phase uses only noexcept operations (vector::swap, vector::emplace_back into pre-reserved capacity, unique_ptr::reset, raw-pointer ctors).

As a result, on any failure path the graph state and the caller's input array are unchanged, so ownership is never partially transferred. Previously SetGraphInputs/Outputs cleared the existing list before validating each entry and could transfer some ValueInfo instances before hitting an invalid one mid-loop; AddInitializerToGraph wrapped tensor in a unique_ptr before emplace, so a bad_alloc inside emplace would have destroyed the tensor while the caller still believed it owned it.

Public API docs (SetGraphInputs, SetGraphOutputs, AddInitializerToGraph, AddNodeToGraph, AddGraphToModel) now state the strong exception safety / all-or-nothing guarantee explicitly.
C API headers: replaced 'strong exception safety' wording with 'atomicity / all-or-nothing' since the C interface does not surface exceptions. The semantics are unchanged.

C++ API: documented strong exception safety on Graph::SetInputs/SetOutputs/AddInitializer/AddNode and Model::AddGraph, and fixed AddNode and AddGraph to actually provide it. They previously called node.release() / graph.release() before invoking the C function, so a thrown error from the underlying C call would leak the raw OrtNode/OrtGraph (the wrapper had already given up ownership and the C function did not take it on the failure path). Now the raw pointer is passed via the implicit Base::operator T*() conversion and release() is called only after ThrowOnError returns.
…heck

Replace the magic '< 128' check in AddInitializerToGraph with the existing onnxruntime::utils::kSmallTensorExternalDataThreshold constant from core/framework/tensorprotoutils.h. The constant is defined as 127 with the rationale documented at the declaration; '< 128' is equivalent to '<= 127' so behavior is unchanged.
…oGraph

Replace the separate count-then-emplace pattern with a single try_emplace into the target map plus a count() probe of the other map. The probe of the non-target map is unavoidable because the regular and external initializer maps share a name space, but try_emplace's own inserted==false result handles the in-target-map case directly. Both collision paths now return ORT_INVALID_ARGUMENT, removing the previous ORT_ENFORCE that asserted invariance the function had just established.
…l=true)

Reword the inline comment and error message to reflect that this API is the sole legitimate producer of in-memory external-data references (kTensorProtoNativeEndianMemoryAddressTag): the encoding is internal to ORT and clients cannot construct it themselves, so the size guard exists to prevent *us* from emitting an in-memory reference for tensors small enough to be consumed inline by ONNX shape inferencing. No behavior change.
Cover the validation paths added by recent commits on this PR: duplicate initializer name (within map and across the regular/external boundary), data_is_external=true with a tensor at or below kSmallTensorExternalDataThreshold, second AddGraphToModel on the same model, and null-argument rejection on AddInitializerToGraph / AddNodeToGraph / AddGraphToModel. All tests verify ownership is not transferred on failure (Ort::Value / Ort::Node / Ort::Graph destructors release the still-owned pointers; no leaks, no double-frees).
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread onnxruntime/core/session/model_editor_c_api.cc
Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h Outdated
Comment thread onnxruntime/test/shared_lib/test_model_builder_api.cc
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

Comment thread onnxruntime/core/session/model_editor_c_api.cc
Comment thread onnxruntime/core/session/model_editor_c_api.cc
Comment thread onnxruntime/core/session/model_editor_c_api.cc
Comment thread onnxruntime/core/session/model_editor_c_api.cc
Comment thread onnxruntime/core/session/model_editor_c_api.cc
Comment thread onnxruntime/core/graph/graph.cc
Comment thread onnxruntime/core/session/onnxruntime_c_api.cc
Comment thread onnxruntime/test/shared_lib/test_model_builder_api.cc
Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h
Comment thread onnxruntime/core/graph/model_editor_api_types.h Outdated
Comment thread onnxruntime/core/graph/model_editor_api_types.h
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Comment thread onnxruntime/core/graph/graph.cc Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h Outdated
Comment thread onnxruntime/core/session/model_editor_api.h Outdated
Comment thread onnxruntime/core/graph/graph.cc Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread onnxruntime/core/session/model_editor_c_api.cc Outdated
Comment thread onnxruntime/test/shared_lib/test_model_builder_api.cc Outdated
Address review feedback: the public C-API error returned by AddInitializerToGraph when data_is_external=true is rejected for too-small tensors used to reference the internal symbol name kSmallTensorExternalDataThreshold, which is not actionable for users and does not state the actual minimum size. Build the message with MakeString and include both the concrete byte threshold and the observed tensor size.

Also clarify the MakeCpuFloatTensor test helper's ownership docstring (the returned Ort::Value only owns the OrtValue wrapper; the data buffer lives in the caller-provided storage) and switch the shape parameter from a by-value std::vector<int64_t> to gsl::span<const int64_t>, which avoids the copy without pulling internal headers into a public-API-only test.
…r dims

gsl::span has no constructor that accepts a braced-init-list, so call sites like MakeCpuFloatTensor(storage, 4, {2, 2}) failed to build on MSVC. Take dims by const reference to a std::vector instead, which still avoids a copy and accepts braced initializers.
Copy link
Copy Markdown
Contributor

@tianleiwu tianleiwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Reviewed at head 44bb34f1. This is a clean, well-executed re-implementation of OrtModelEditorApi ownership transfer (reverting #28123's copy semantics): graphs/models own their children via unique_ptr + custom deleters routed through OrtApis::Release*, with strong exception safety on every mutating entry point. All prior review threads are resolved on this head, and I independently confirmed the previously-flagged crashes and within-container double-frees are fixed. Non-blocking (COMMENT).

Verified solid:

  • Strong exception safety: SetGraphInputs/SetGraphOutputs validate every entry and reserve (the last throwing op) before any observable mutation, then commit with only noexcept ops (emplace_back into reserved storage, swap, null the array). AddInitializerToGraph uses try_emplace(name) then a noexcept reset(tensor). Ownership is never partially transferred on failure.
  • Leak-on-error fix: the C++ AddNode/AddGraph wrappers now pass the raw pointer and call release() only after the C call returns OK, fixing the prior leak of the just-released pointer when the status was non-OK.
  • Custom deleters keep every owning slot consistent with Release*; OrtModel reuse is preserved by copying (not moving) OrtValues into ortvalue_initializers_; restored null-arg validation and _Inout_ SAL; kSmallTensorExternalDataThreshold (127) is behavior-preserving (<= 127 ≡ old < 128) with an actionable rejection message.

Optional follow-ups (none blocking):

  1. AddNodeToGraph O(n²) growthgraph->nodes.reserve(graph->nodes.size() + 1) forces exact (non-geometric) growth, so past the initial 64-node CreateGraph reservation every add reallocates and moves all nodes. The reserve-before-emplace_back invariant (needed for the noexcept commit) can be kept and amortized O(1) by reserving geometrically only at capacity:

    if (graph->nodes.size() == graph->nodes.capacity())
      graph->nodes.reserve(std::max(graph->nodes.capacity() * 2, graph->nodes.size() + 1));
  2. Cross-object double-free is now caller-contract-only — the duplicate guards only detect reuse within the same container, so the same OrtNode* → two graphs, OrtGraph* → two models, or OrtValue* → two graphs is unguarded and double-frees on destruction. This is consistent with the intentional "don't reuse after transfer" decision (same rationale as making Release* an unconditional delete), and the C++ wrappers are safe because they release() on success. Since AddGraphToModel's SameGraphTwoModels regression test was removed, a one-line doc note ("an OrtGraph must not be added to more than one OrtModel") would preserve the intent for C-API callers.

  3. AddInitializerToGraph ownership-contract change (copy → take-ownership) has limited in-repo blast radius — only the updated C++ wrapper and tests call it. #28123 (copy semantics) merged 2026-05-06; if it shipped in any released package, the contract change is worth a release-note callout. (The original 1.22 contract was take-ownership, so this restores the documented behavior.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AddInitializerToGraph ownership-semantics change in #28123 breaks C API callers (Chromium WebNN crash)

3 participants