See the /ort-build, /ort-test, and /ort-lint skills (in .agents/skills/) for detailed instructions.
ONNX Runtime is a cross-platform inference and training engine for ONNX models. The core pipeline is: Load model → Build graph → Optimize graph → Partition across Execution Providers → Execute.
- graph/: ONNX model/graph IR. Model wraps a Graph of Nodes. GraphViewer provides read-only traversal.
- optimizer/: Graph transformations (fusion, elimination, constant folding, layout transforms). Organized by optimization level (Level1–Level4).
- framework/: Execution machinery (OpKernel, Tensor, KernelRegistry, allocators, executors).
- session/: InferenceSession, which runs Load() → Initialize() (optimize + assign kernels) → Run().
- providers/: Execution Provider (EP) implementations. Each EP implements IExecutionProvider. CPU EP is the default fallback. 20+ EPs exist (CUDA, TensorRT, DirectML, CoreML, OpenVINO, WebGPU, QNN, etc.).
- common/: Utilities, status/error types, logging, threading.
- platform/: OS abstraction (file I/O, threading).
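
As a rough illustration of the partitioning step and the CPU fallback, the sketch below assigns each node to the first provider in a priority list that claims its op type, and falls back to a CPU entry otherwise. ToyProvider and AssignNodes are hypothetical names invented for this example, and the real partitioning logic is considerably more involved than a lookup in an op-type set. Standard containers are used only so the sketch compiles outside the ORT tree.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an execution provider: a name plus the op types it claims.
struct ToyProvider {
  std::string name;
  std::unordered_set<std::string> supported_ops;
};

// Assigns each op to the first provider (in priority order) that supports it.
// Ops that no provider claims fall back to "CPU".
std::vector<std::string> AssignNodes(const std::vector<std::string>& node_ops,
                                     const std::vector<ToyProvider>& providers_by_priority) {
  std::vector<std::string> assignment;
  assignment.reserve(node_ops.size());
  for (const std::string& op : node_ops) {
    std::string chosen = "CPU";
    for (const ToyProvider& provider : providers_by_priority) {
      if (provider.supported_ops.count(op) != 0) {
        chosen = provider.name;
        break;
      }
    }
    assignment.push_back(chosen);
  }
  return assignment;
}
```

With a single toy provider that claims Conv and Relu, a graph containing Conv, TopK, and Relu would leave only TopK on the CPU entry.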
Custom operators not in the ONNX standard, organized by EP (cpu/, cuda/, js/, webgpu/). Each EP has its own contrib kernel registration file (e.g., cpu_contrib_kernels.cc, cuda_contrib_kernels.cc, js_contrib_kernels.cc, webgpu_contrib_kernels.cc).
Training-specific code (gradient ops, loss functions, optimizers, TrainingSession) layered on top of the inference framework.
csharp/, java/, js/, objectivec/, rust/ — each wraps the C API (include/onnxruntime/core/session/onnxruntime_c_api.h).
Style: Google C++ Style with modifications. Max line length 120 (aim for 80). See docs/Coding_Conventions_and_Standards.md for full details.
Functions that can fail return onnxruntime::common::Status. Key macros from core/common/common.h:
- ORT_RETURN_IF_ERROR(expr): early-return if expr returns a non-OK Status
- ORT_THROW_IF_ERROR(expr): throw if expr returns a non-OK Status
- ORT_RETURN_IF(cond, ...) / ORT_RETURN_IF_NOT(cond, ...): conditional early-return with message
- ORT_ENFORCE(cond, ...): assert-like; throws OnnxRuntimeException on failure
- ORT_MAKE_STATUS(category, code, ...): construct a Status object
Exceptions may be disabled in a build, in which case the throwing macros call abort() instead.
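
The early-return pattern behind these macros can be sketched without the ORT headers. MiniStatus and MINI_RETURN_IF_ERROR below are simplified, hypothetical stand-ins and not the real onnxruntime::common::Status or ORT_RETURN_IF_ERROR definitions; ValidateRank and PrepareTensor are likewise made up for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for onnxruntime::common::Status.
class MiniStatus {
 public:
  MiniStatus() = default;  // a default-constructed status means OK
  explicit MiniStatus(std::string message) : ok_(false), message_(std::move(message)) {}

  static MiniStatus OK() { return MiniStatus(); }
  bool IsOK() const { return ok_; }
  const std::string& ErrorMessage() const { return message_; }

 private:
  bool ok_ = true;
  std::string message_;
};

// Stand-in for the early-return pattern (not the real ORT macro).
#define MINI_RETURN_IF_ERROR(expr)       \
  do {                                   \
    MiniStatus status_ = (expr);         \
    if (!status_.IsOK()) return status_; \
  } while (0)

MiniStatus ValidateRank(int rank) {
  if (rank < 0) return MiniStatus("rank must be non-negative");
  return MiniStatus::OK();
}

MiniStatus PrepareTensor(int rank, bool* prepared) {
  MINI_RETURN_IF_ERROR(ValidateRank(rank));  // returns the failing status to the caller
  *prepared = true;                          // reached only when validation succeeded
  return MiniStatus::OK();
}
```

The caller never inspects the intermediate status by hand: a failure in ValidateRank propagates out of PrepareTensor before any side effect happens.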
At the C API boundary, use API_IMPL_BEGIN / API_IMPL_END to catch exceptions — C++ exceptions must never cross the C API boundary.
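
A minimal sketch of the idea, assuming a toy API rather than ORT's own: the C-callable function wraps its body in try/catch and converts every exception into a status code. DemoStatusCode, ParsePositive, and DemoParsePositive are hypothetical names, and the actual expansion of API_IMPL_BEGIN / API_IMPL_END is not reproduced here.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical status codes for a toy C-style API (not ORT's OrtStatus).
enum DemoStatusCode { DEMO_OK = 0, DEMO_FAIL = 1 };

// Internal C++ code is free to throw.
static int ParsePositive(const std::string& text) {
  int value = std::stoi(text);  // throws std::invalid_argument on non-numeric input
  if (value <= 0) throw std::out_of_range("value must be positive");
  return value;
}

// C-callable entry point: every exception is converted into a status code.
extern "C" DemoStatusCode DemoParsePositive(const char* text, int* out) noexcept {
  try {
    if (text == nullptr || out == nullptr) return DEMO_FAIL;
    *out = ParsePositive(text);  // *out is written only if ParsePositive returns normally
    return DEMO_OK;
  } catch (const std::exception&) {
    return DEMO_FAIL;
  } catch (...) {
    return DEMO_FAIL;
  }
}
```

The trailing catch (...) matters: callers written in C or another language cannot unwind a C++ exception, so nothing may escape.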
Use these instead of std::vector / std::unordered_map:
- InlinedVector<T>: small-buffer-optimized vector (64 bytes inline)
- InlinedHashSet<T>, InlinedHashMap<K,V>: flat hash containers
- NodeHashSet<T>, NodeHashMap<K,V>: when pointer stability is needed
- TensorShapeVector: for shape dimensions
Use reserve() not resize(). Do not use absl:: directly — use the ORT typedefs.
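
One plausible reason for the reserve() guidance is the classic mistake shown below: resize() creates elements, so a following push_back loop appends after them. The sketch uses std::vector only so that it compiles outside the ORT tree; the same behavior should apply to InlinedVector. FillWithResize and FillWithReserve are hypothetical helper names.

```cpp
#include <cstddef>
#include <vector>

// resize(n) creates n value-initialized elements, so push_back appends after them.
std::vector<int> FillWithResize(std::size_t n) {
  std::vector<int> values;
  values.resize(n);
  for (std::size_t i = 0; i < n; ++i) values.push_back(static_cast<int>(i));
  return values;  // size is 2 * n: n zeros followed by 0..n-1
}

// reserve(n) only allocates capacity; size stays 0 until elements are pushed.
std::vector<int> FillWithReserve(std::size_t n) {
  std::vector<int> values;
  values.reserve(n);
  for (std::size_t i = 0; i < n; ++i) values.push_back(static_cast<int>(i));
  return values;  // size is n: 0..n-1
}
```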
- #pragma once for header guards
- ORT_DISALLOW_COPY_ASSIGNMENT_AND_MOVE for new classes until copy/move is proven necessary
- Prefer gsl::span<const T> over const std::vector<T>& for input parameters
- Prefer std::string_view by value over const std::string&
- SafeInt<size_t> (from core/common/safeint.h) for memory size arithmetic
- Don't use else after return
- Avoid long (ambiguous width): use int64_t for dimensions, size_t for counts
- using namespace allowed in limited scope but never at global scope in headers
- std::make_unique() for heap allocations; prefer std::optional over unique_ptr for optional/delayed construction
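
Several of these conventions can be shown together in a small sketch that needs only the standard library (C++17). HasSuffix and CheckedByteSize are hypothetical helpers; CheckedByteSize hand-rolls the kind of overflow check that SafeInt<size_t> is meant to provide and is not how SafeInt itself is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical helper: std::string_view is taken by value, and there is no
// `else` after the early `return`.
bool HasSuffix(std::string_view name, std::string_view suffix) {
  if (suffix.size() > name.size()) return false;
  return name.substr(name.size() - suffix.size()) == suffix;
}

// Hypothetical overflow-checked byte count: int64_t for dimensions, size_t for sizes.
// Returns std::nullopt for a negative dimension or when the product would not fit.
std::optional<std::size_t> CheckedByteSize(const int64_t* dims, std::size_t rank,
                                           std::size_t element_size) {
  const std::uint64_t max_size = std::numeric_limits<std::size_t>::max();
  std::uint64_t total = element_size;
  for (std::size_t i = 0; i < rank; ++i) {
    if (dims[i] < 0) return std::nullopt;
    const auto dim = static_cast<std::uint64_t>(dims[i]);
    if (dim != 0 && total > max_size / dim) return std::nullopt;  // product would overflow
    total *= dim;
  }
  return static_cast<std::size_t>(total);
}
```

Returning std::optional here also illustrates the last bullet: the "no value" case is expressed without a heap allocation.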
Build and test processes may install Python packages. Create and activate an isolated virtual environment first:
python -m venv .venv # one-time setup
source .venv/bin/activate # Linux/macOS
.\.venv\Scripts\Activate.ps1 # Windows (PowerShell)
If a virtual environment already exists (e.g., .venv/), activate it rather than creating a new one.
- Follow Google Python Style Guide (extension of PEP 8)
- Max line length: 120 characters
- Formatter: ruff (configured in pyproject.toml)
- Static type checking: pyright/pylance
- Test framework: unittest (preferred) with pytest as runner
The main public C API header is include/onnxruntime/core/session/onnxruntime_c_api.h. Other public headers are in include/onnxruntime/core/session/ and orttraining/orttraining/training_api/include/.
- Functions that may fail return OrtStatus* (nullptr on success); release/cleanup functions return void
- Object lifecycle: OrtCreateXxx / OrtReleaseXxx
- All strings are UTF-8 encoded
- Use int64_t for dimensions, size_t for counts and memory sizes
- APIs requiring allocation take an OrtAllocator* parameter
- Failed calls must not modify out-parameters
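
The shape of these rules can be sketched with a toy API. Everything below (DemoStatus, DemoCounter, DemoCreateCounter, DemoReleaseCounter) is hypothetical and only mirrors the conventions listed above; it is not part of the ORT C API. To keep the sketch short, error statuses are static objects, so no status release function is needed here.

```cpp
#include <cstdint>
#include <new>

// Hypothetical toy types shaped like the conventions above (not ORT's).
struct DemoStatus {
  const char* message;  // UTF-8 text
};
struct DemoCounter {
  int64_t value;
};

namespace {
// Statically allocated so that reporting an error can never itself fail.
DemoStatus g_invalid_argument{"invalid argument"};
DemoStatus g_out_of_memory{"out of memory"};
}  // namespace

extern "C" {

// Release functions return void.
void DemoReleaseCounter(DemoCounter* counter) { delete counter; }

// Returns nullptr on success. On failure, *out is left untouched.
DemoStatus* DemoCreateCounter(int64_t initial, DemoCounter** out) {
  if (out == nullptr || initial < 0) return &g_invalid_argument;
  DemoCounter* counter = new (std::nothrow) DemoCounter{initial};
  if (counter == nullptr) return &g_out_of_memory;
  *out = counter;  // written only after every check has passed
  return nullptr;
}

}  // extern "C"
```

Writing the out-parameter as the very last step is a simple way to guarantee that a failed call leaves the caller's pointer unchanged.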
- Keep PRs small (aim for ≤10 files; separate cosmetic changes from functional ones)
- All changes must have unit tests, unless documentation-only or already adequately covered
- Build and test locally on at least one platform before submitting
- PR author is responsible for merging after approval