### Description
Preview:
https://adrianlizarraga.github.io/onnxruntime/docs/execution-providers/EP-Context-Design.html#compile-api
Adds documentation for the ORT compile API. Includes the following
examples:
- Compiling to an output stream with a custom function that allows an
application to specify where each initializer is stored.
- Cross-compiling with plugin EPs
- EPContext weight sharing with plugin EPs
### Motivation and Context
File changed: `docs/execution-providers/EP-Context-Design.md` (318 additions, 1 deletion)
@@ -379,4 +379,321 @@ To use the dumped EPContext models with weight sharing enabled, ONNX Runtime inf
```
session1.run(...);
session2.run(...);
```

## Compile API
ORT 1.22 introduced an explicit [model compilation API](https://github.com/microsoft/onnxruntime/blob/a5ba2ba3998820dd8da111c90c420479aac7a11e/onnxruntime/python/onnxruntime_inference_collection.py#L680-L709) that enables additional compilation options:

- Read input model from a file or a buffer.
- Write output model to a file, a buffer, or an output stream.
- Provide a callback function to specify the location of each ONNX initializer in the output model.
- Set compilation flags: "error if no nodes compiled", "error if output file already exists", etc.

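Compilation flags such as the two named above are commonly combined as a bitmask. The following is a minimal sketch of that pattern only: `CompileFlags` and `describe` are hypothetical names defined here for illustration, and the numeric values are assumptions, not the values used by ONNX Runtime. Consult the linked API for the actual flag enum.

```python
import enum


class CompileFlags(enum.IntFlag):
    """Hypothetical stand-in for compile flags; names and values are illustrative only."""
    NONE = 0
    ERROR_IF_NO_NODES_COMPILED = 1
    ERROR_IF_OUTPUT_FILE_EXISTS = 2


def describe(flags: CompileFlags) -> list[str]:
    """Return the names of the individual (non-zero) flags that are set."""
    return [f.name for f in CompileFlags if f.value and f in flags]


# Combine flags with bitwise OR, as is common for option bitmasks.
flags = CompileFlags.ERROR_IF_NO_NODES_COMPILED | CompileFlags.ERROR_IF_OUTPUT_FILE_EXISTS
print(describe(flags))
```

An application would pass the combined value wherever the compile API accepts flags, so that compilation fails loudly instead of silently producing an unchanged model or overwriting an existing file.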
### Usage example: compiling a model (from file) to an output stream
```python
import onnxruntime as ort

"""
Compile a model (from file) to an output stream using a custom write function.
The custom write function just saves the output model to disk.
A custom initializer handler stores "large" initializers into an external file.
"""
input_model_path = "input_model.onnx"
output_model_path = "output_model.onnx"
output_initializer_file_path = "output_model.bin"

with open(output_model_path, "wb") as output_model_fd, \
     open(output_initializer_file_path, "wb") as output_initializer_fd:

    # Custom function that ORT calls (one or more times) to stream out the model bytes in chunks.
    # This example function simply writes the output model to a file.
    def output_model_write_func(buffer: bytes):
        output_model_fd.write(buffer)

    # Custom function that ORT calls to determine where to store each ONNX initializer in the output model.
    #
    # Note: the `external_info` argument denotes the location of the initializer in the original input model.
    # An implementation may choose to directly return the received `external_info` to use the same external weights.
    # ...
```

The above snippet stores ONNX initializers for the output model into a new external file. To keep initializers in the same external file used in the original model,
return the `external_info` argument from the `output_model_onnx_initializer_handler` function:

```python
# ...
    # The `external_info` argument denotes the location of the initializer in the original input model (if not None).
    # Return it directly to use the same external initializer file.
    return external_info

# ...
```

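To make the roles of the two callbacks concrete without depending on ONNX Runtime, here is a pure-Python simulation. Everything in it is an assumption made for illustration: `fake_compile` stands in for the compiler, `ExternalLocation` stands in for the external initializer info object, and the 64-byte threshold for a "large" initializer is arbitrary. The real callback signatures may differ.

```python
import io
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExternalLocation:
    """Hypothetical record of where an initializer lives in an external file."""
    file_path: str
    offset: int
    byte_size: int


def make_initializer_handler(external_file: io.BytesIO, file_path: str, threshold: int):
    """Build a handler that embeds small initializers and externalizes large ones."""
    def handler(name: str, data: bytes, external_info: Optional[ExternalLocation]):
        # `external_info` (location in the input model) is unused in this sketch.
        if len(data) < threshold:
            return None  # None means: keep the initializer inside the model.
        offset = external_file.tell()
        external_file.write(data)
        return ExternalLocation(file_path, offset, len(data))
    return handler


def fake_compile(model_bytes: bytes, write_func: Callable[[bytes], object], chunk_size: int = 4):
    """Stand-in for a compiler that streams its output through write_func in chunks."""
    for start in range(0, len(model_bytes), chunk_size):
        write_func(model_bytes[start:start + chunk_size])


model_out = io.BytesIO()
weights_out = io.BytesIO()
handler = make_initializer_handler(weights_out, "output_model.bin", threshold=64)

small = handler("bias", b"\x00" * 16, None)
large_a = handler("weight_a", b"\x01" * 128, None)
large_b = handler("weight_b", b"\x02" * 256, None)
fake_compile(b"compiled-model-bytes", model_out.write)
```

In this simulation the small initializer stays embedded, while each large initializer is appended to the external buffer and described by its offset and size. The write function receives the model in several chunks and reassembles it in order.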
#### References
- [Additional Python usage examples in unit tests](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/test/python/onnxruntime_test_python_compile_api.py)
- [C++ API functions](https://github.com/microsoft/onnxruntime/blob/879ec0392ad5128968440a4e5b5a0bb742494ae5/include/onnxruntime/core/session/onnxruntime_cxx_api.h#L1617-L1623)
- [C API functions](https://github.com/microsoft/onnxruntime/blob/879ec0392ad5128968440a4e5b5a0bb742494ae5/include/onnxruntime/core/session/onnxruntime_c_api.h#L7751-L7774)

### Usage example: cross-compilation with a plugin EP
By default, ONNX Runtime only allows the use of [plugin EPs](./plugin-ep-libraries.md) that are compatible with real hardware devices discovered by ONNX Runtime.
To support the creation of compiled models targeted for hardware devices not present on the compiling machine (i.e., cross-compiling), a plugin EP may be allowed
to create virtual hardware devices that an application can use to compile models.

#### Application code
An application grants a plugin EP library permission to create virtual hardware devices by using a library registration name
that ends in the ".virtual" suffix. A virtual hardware device created by an EP will have the metadata entry "is_virtual" set to "1".

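The two conventions described here (the ".virtual" registration-name suffix and the "is_virtual" metadata entry) can be sketched in plain Python. This sketch does not call ONNX Runtime: `FakeDevice`, `allows_virtual_devices`, and `select_virtual` are hypothetical helpers, and the device names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDevice:
    """Hypothetical stand-in for a hardware device reported by an EP."""
    name: str
    metadata: dict = field(default_factory=dict)


def allows_virtual_devices(registration_name: str) -> bool:
    """A registration name ending in ".virtual" signals that virtual devices are allowed."""
    return registration_name.endswith(".virtual")


def select_virtual(devices: list) -> list:
    """Keep only devices whose metadata marks them as virtual ("is_virtual" == "1")."""
    return [d for d in devices if d.metadata.get("is_virtual") == "1"]


devices = [
    FakeDevice("real_npu"),
    FakeDevice("virtual_npu", {"is_virtual": "1"}),
]
print(allows_virtual_devices("contoso_ep.virtual"))
print([d.name for d in select_virtual(devices)])
```

Note that the metadata value is the string "1", not an integer or boolean, so the comparison is done against a string.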
```python
import onnxruntime as ort
import onnxruntime_ep_contoso_ai as contoso_ep

# An application uses a registration name that ends in ".virtual" to signal that virtual devices are allowed.
# ...
```

A plugin EP library determines if the creation of virtual devices is allowed by checking if the "allow_virtual_devices" environment configuration entry
is set to "1". The following snippet from a [reference EP implementation](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/test/autoep/library/example_plugin_ep_virt_gpu/ep_lib_entry.cc) shows how a plugin EP library could check environment configuration entries within the library's

- [Reference example plugin EP with virtual GPU](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/test/autoep/library/example_plugin_ep_virt_gpu)
- [OrtEpApi::GetEnvConfigEntries C API function](https://github.com/microsoft/onnxruntime/blob/990ba5f0c3e0c8735fec8bf89dd11953224a9c03/include/onnxruntime/core/session/onnxruntime_ep_c_api.h#L1431-L1446)
- [Ort::GetEnvConfigEntries C++ API function](https://github.com/microsoft/onnxruntime/blob/990ba5f0c3e0c8735fec8bf89dd11953224a9c03/include/onnxruntime/core/session/onnxruntime_cxx_api.h#L3531-L3532)
- [Plugin EP library documentation](./plugin-ep-libraries.md)
- [Additional Python usage examples in unit tests](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/test/python/onnxruntime_test_python_compile_api.py)

- [Plugin EP library documentation](./plugin-ep-libraries.md)
- [Additional Python usage examples in unit tests](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/test/python/onnxruntime_test_python_compile_api.py)