1 file changed: +6 -6 lines changed

@@ -13,14 +13,14 @@ It implements the generative AI loop for ONNX models, including pre and post processing
See documentation at https://onnxruntime.ai/docs/genai .
- | Support matrix | Supported now | Under development | On the roadmap |
+ | Support matrix| Supported now| Under development| On the roadmap|
| -------------- | ------------- | ----------------- | -------------- |
| Model architectures | Gemma <br /> Llama * <br /> Mistral + <br /> Phi (language + vision) <br /> Qwen <br /> Nemotron <br /> Granite <br /> AMD OLMo | Whisper | Stable diffusion |
- | API | Python <br /> C# <br /> C/C++ <br /> Java ^ | Objective-C | |
- | Platform | Linux <br /> Windows <br /> Mac ^ <br /> Android ^ | | iOS |
- | Architecture | x86 <br /> x64 <br /> Arm64 ~ | | |
- | Hardware Acceleration | CUDA <br /> DirectML <br /> | QNN <br /> OpenVINO <br /> ROCm | |
- | Features | | Interactive decoding <br /> Customization (fine-tuning) | Speculative decoding |
+ | API| Python <br />C# <br />C/C++ <br /> Java ^ | Objective-C| |
+ | Platform| Linux <br /> Windows <br />Mac ^ <br />Android ^ | | iOS |
+ | Architecture| x86 <br /> x64 <br /> Arm64 ~ | | |
+ | Hardware Acceleration| CUDA<br />DirectML<br />| QNN <br /> OpenVINO <br /> ROCm | |
+ | Features| MultiLoRA <br /> Continuous decoding (session continuation)^ | Constrained decoding | Speculative decoding |
\* The Llama model architecture supports similar model families such as CodeLlama, Vicuna, Yi, and more.
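For context, the generative AI loop referenced in the hunk header above can be driven from any of the language bindings listed in the API row. Below is a minimal sketch using the Python binding; `path/to/model` is a placeholder folder, and the exact method names (for example `append_tokens`) differ between onnxruntime-genai releases, so treat this as illustrative rather than definitive.

```python
# Illustrative sketch of the generative AI loop via the onnxruntime-genai
# Python binding. Assumes a recent release; "path/to/model" is a placeholder
# for a folder containing an exported ONNX model plus its genai config.
import onnxruntime_genai as og

model = og.Model("path/to/model")          # load the ONNX model and config
tokenizer = og.Tokenizer(model)            # pre/post processing
stream = tokenizer.create_stream()         # incremental detokenization

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)  # search and sampling options

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("What is ONNX Runtime?"))

# Token-by-token decode loop: logits processing, sampling, and KV cache
# management happen inside generate_next_token().
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```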