[Feature Request] CreateOpAttr: Change len parameter from int to size_t to support >2GB ep_cache_context string attribute #28806

@mingmingtasd

Describe the feature request

Motivation & Real-World Failure

The WebNN API implementation in Chromium uses ONNX Runtime as a backend. In this architecture, model compilation happens in a sandboxed compiler process using embed mode (embed_mode=1): the compiled blob is embedded directly in the ONNX model's ep_cache_context string attribute, and the serialized EP context is sent back to the GPU process via IPC. This is the preferred approach because the sandboxed process has restricted file system access, making external file-based EP context impractical.

When compiling large models (e.g., Stable Diffusion Turbo UNet with ~1.6GB weights), the EP-generated compiled blob can reach ~3.3GB, which exceeds the 2GB limit imposed by int and causes the export to fail.

Failure Scenario

When an EP creates an EPContext node with a large ep_cache_context string attribute, it must convert the size_t length to int to pass to Ort::OpAttr(name, data, len, ORT_OP_ATTR_STRING). For data exceeding ~2GB, this conversion either:

  • Throws at runtime if using a narrowing check (e.g., gsl::narrow<int>), or
  • Silently overflows and corrupts data if using static_cast<int>().

This is a framework-level limitation in ORT, not specific to any single EP. Any EP using embed mode with large compiled artifacts will hit the same wall.
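Both failure modes can be reproduced without ONNX Runtime. The following is a minimal sketch, assuming a 32-bit int (true on mainstream platforms). checked_narrow_to_int is a hypothetical stand-in for gsl::narrow<int>, not the GSL implementation, and the 3.3GB size is the figure quoted in the motivation above.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for gsl::narrow<int>: throws if n does not fit in int.
inline int checked_narrow_to_int(std::size_t n) {
  if (n > static_cast<std::size_t>(std::numeric_limits<int>::max())) {
    throw std::range_error("length does not fit in int");
  }
  return static_cast<int>(n);
}

// Unchecked path: an out-of-range value is converted without any error.
// The result is implementation-defined before C++20 and wraps since C++20.
inline int unchecked_to_int(std::size_t n) {
  return static_cast<int>(n);
}

// True if the checked conversion rejects n.
inline bool checked_conversion_throws(std::size_t n) {
  try {
    (void)checked_narrow_to_int(n);
    return false;
  } catch (const std::range_error&) {
    return true;
  }
}

// True if converting to int and back does not reproduce n.
inline bool unchecked_conversion_corrupts(std::size_t n) {
  return static_cast<std::size_t>(unchecked_to_int(n)) != n;
}

// Usage sketch: a ~3.3GB blob length triggers both failure modes.
inline void demo_failure_modes() {
  const std::size_t blob_size = static_cast<std::size_t>(3300000000ULL);
  assert(checked_conversion_throws(blob_size));
  assert(unchecked_conversion_corrupts(blob_size));
}
```

Lengths that fit in int pass through both helpers unchanged, so the problem only appears once the compiled blob crosses the 2GB boundary.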

Problem

OrtApi::CreateOpAttr uses int for the len parameter:

ORT_API2_STATUS(CreateOpAttr,
                _In_ const char* name,
                _In_ const void* data,
                _In_ int len,          // max ~2GB for ORT_OP_ATTR_STRING
                _In_ OrtOpAttrType type,
                _Outptr_ OrtOpAttr** op_attr);

For ORT_OP_ATTR_STRING, len represents the byte count of the string data. A 32-bit signed int caps this at 2^31 - 1 ≈ 2.1GB, which is insufficient for modern large model compiled artifacts.

All callers are forced to convert the length with gsl::narrow<int>() or static_cast<int>(). For data larger than 2GB, the former throws and the latter silently truncates the length.

Proposed Solution

Change the len parameter from int to size_t across the full API chain:

| Layer | File | Change |
|---|---|---|
| C API declaration | include/onnxruntime/core/session/onnxruntime_c_api.h | int len → size_t len |
| Internal declaration | onnxruntime/core/session/ort_apis.h | int len → size_t len |
| Implementation | onnxruntime/core/session/standalone_op_invoker.cc | 2 function signatures + loop variables |
| C++ wrapper declaration | include/onnxruntime/core/session/onnxruntime_cxx_api.h | Ort::OpAttr constructor |
| C++ wrapper implementation | include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Ort::OpAttr constructor |
| Minimal build stub | standalone_op_invoker.cc | Stub signature |
| Callers | Test code, EP implementations | Remove static_cast<int> / gsl::narrow<int> |

Breaking Change Considerations

This is an ABI-breaking change to the OrtApi struct. The len parameter in the function pointer signature changes from int to size_t, and the two types differ in width on 64-bit platforms (4 bytes vs 8 bytes). This requires:

  • Bumping ORT_API_VERSION
  • Documenting the change in release notes
  • Recompiling existing plugins that were built against the old API
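To make the ABI concern concrete, the sketch below compares the two function pointer types at compile time. OldCreateFn and NewCreateFn are hypothetical, simplified aliases: the ORT status, attribute, and enum types are replaced with placeholder types, so this is not the real OrtApi member declaration.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for the CreateOpAttr function pointer before and
// after the proposed change. Placeholder types replace the ORT ones.
using OldCreateFn = int (*)(const char* name, const void* data, int len,
                            int type, void** out);
using NewCreateFn = int (*)(const char* name, const void* data, std::size_t len,
                            int type, void** out);

// The signatures are distinct types, so code built against one cannot safely
// call through a pointer to the other.
static_assert(!std::is_same<OldCreateFn, NewCreateFn>::value,
              "int and size_t length parameters give different function types");

// Reports whether the length parameter changes width on the current platform.
constexpr bool len_width_changes() {
  return sizeof(int) != sizeof(std::size_t);
}
```

On typical 64-bit targets len_width_changes() evaluates to true, which is why a plugin compiled against the old header cannot simply be loaded against the new one.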

Alternatives Considered

  1. Add CreateOpAttr2 with size_t — Avoids ABI break by adding a new API alongside the existing one. The old CreateOpAttr would delegate to the new one internally. Downside: adds API surface clutter.

  2. Use int64_t instead of size_t — Fixed-width type, consistent across platforms (avoids 32-bit size_t on 32-bit builds). Some ORT APIs already use int64_t for sizes (e.g., tensor dimensions).
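The delegation described in alternative 1 could look roughly like the sketch below. Everything here is a mock: MockOpAttr, MockCreateOpAttr, and MockCreateOpAttr2 are hypothetical names that only model a string attribute, and a bool return value stands in for ORT's status object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an op attribute holding string data.
struct MockOpAttr {
  std::string name;
  std::string value;
};

// Sketch of the new entry point: the byte count is a size_t.
// Returns true on success (stand-in for a status object).
inline bool MockCreateOpAttr2(const char* name, const void* data,
                              std::size_t len, MockOpAttr* out) {
  if (name == nullptr || data == nullptr || out == nullptr) return false;
  out->name = name;
  out->value.assign(static_cast<const char*>(data), len);
  return true;
}

// Legacy entry point keeps the int signature. It rejects negative lengths,
// which are not valid byte counts, and delegates everything else.
inline bool MockCreateOpAttr(const char* name, const void* data, int len,
                             MockOpAttr* out) {
  if (len < 0) return false;
  return MockCreateOpAttr2(name, data, static_cast<std::size_t>(len), out);
}

// Usage sketch: both entry points produce the same attribute for small data.
inline void demo_delegation() {
  const char payload[] = "blob";
  MockOpAttr via_old;
  MockOpAttr via_new;
  assert(MockCreateOpAttr("ep_cache_context", payload, 4, &via_old));
  assert(MockCreateOpAttr2("ep_cache_context", payload, 4, &via_new));
  assert(via_old.value == via_new.value);
}
```

An int64_t variant (alternative 2) would need the same negative-length check in the new entry point, since a signed type can carry values that are not valid byte counts.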

Affected Callers (known)

  • Any plugin EP using embed mode (embed_mode=1) with ORT_OP_ATTR_STRING for large compiled data (e.g., ep_cache_context)
  • onnxruntime/test/autoep/library/example_plugin_ep/ep.cc: static_cast<int>
  • onnxruntime/test/autoep/library/example_plugin_ep_virt_gpu/ep.cc: static_cast<int>
  • onnxruntime/test/shared_lib/custom_op_utils.cc: static_cast<int>

@adrastogi @huningxin @fdwr @ibelem

Labels: ep:WebNN (WebNN execution provider), feature request (request for unsupported feature or enhancement)
