Commit 856b110

Add C++ application to snapdragon tutorial

1 parent e1394f5

3 files changed: +103 −8 lines

docs/genai/tutorials/deepseek-python.md

Lines changed: 1 addition & 1 deletion

@@ -4,7 +4,7 @@ description: Learn how to chat with DeepSeek-R1-Distill ONNX models on your devi
 has_children: false
 parent: Tutorials
 grand_parent: Generate API (Preview)
-nav_order: 4
+nav_order: 5
 ---

 # Reasoning in Python with DeepSeek-R1-Distill models

docs/genai/tutorials/snapdragon.md

Lines changed: 63 additions & 1 deletion

@@ -44,7 +44,69 @@ curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/ma
 python .\model-qa.py -e cpu -g -v --system_prompt "You are a helpful assistant. Be brief and concise." --chat_template "<|user|>\n{input} <|end|>\n<|assistant|>" -m ..\..\models\microsoft\phi-3.5-mini-instruct-npu-qnn-2.31-v2
 ```

-## C# Application
+### A look inside the Python script
+
+
+## C++ Application
+
+To run the models on the Snapdragon NPU within a C++ application, use the code in https://github.com/microsoft/onnxruntime-genai/tree/main/examples/c
+
+Building and running this application requires a Windows PC with a Snapdragon NPU, as well as:
+* cmake
+* Visual Studio 2022
+
+
+1. Clone the repo
+
+```powershell
+git clone https://github.com/microsoft/onnxruntime-genai
+cd onnxruntime-genai\examples\c
+```
+
+2. Install onnxruntime
+
+This currently requires the nightly build of onnxruntime, as it contains up-to-the-minute changes to QNN support for language models.
+
+Download the nightly version of the ONNX Runtime QNN binaries from [here](https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly/NuGet/Microsoft.ML.OnnxRuntime.QNN/overview/1.22.0-dev-20250225-0548-e46c0d8)
+
+```powershell
+mkdir onnxruntime-win-arm64-qnn
+move Microsoft.ML.OnnxRuntime.QNN.1.22.0-dev-20250225-0548-e46c0d8.nupkg onnxruntime-win-arm64-qnn
+cd onnxruntime-win-arm64-qnn
+tar xvzf Microsoft.ML.OnnxRuntime.QNN.1.22.0-dev-20250225-0548-e46c0d8.nupkg
+copy runtimes\win-arm64\native\* ..\lib
+cd ..
+```
+
+
+3. Install onnxruntime-genai
+
+```powershell
+curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.6.0/onnxruntime-genai-0.6.0-win-arm64.zip -o onnxruntime-genai-win-arm64.zip
+tar xvf onnxruntime-genai-win-arm64.zip
+cd onnxruntime-genai-0.6.0-win-arm64
+copy include\* ..\include
+copy lib\* ..\lib
+cd ..
+```
+
+4. Build the sample
+
+```powershell
+cmake -A arm64 -S . -B build -DPHI3-QA=ON
+cd build
+cmake --build . --config Release
+```
+
+5. Run the sample
+
+```powershell
+cd Release
+.\phi3_qa.exe <path_to_model>
+```
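For reference, the sample built above boils down to a tokenize/generate/stream loop. Below is a minimal sketch of that core, written against the onnxruntime-genai 0.6.0 C++ API (`ort_genai.h`) as used by examples/c; the prompt string is illustrative only, and the exact method names should be checked against the headers copied into `include` in step 3.

```cpp
// Minimal sketch of the examples/c question-answering loop
// (onnxruntime-genai 0.6.0 C++ API; verify names against ort_genai.h).
#include <cstdint>
#include <iostream>
#include <string>

#include "ort_genai.h"

int main(int argc, char** argv) {
  if (argc != 2) {
    std::cerr << "Usage: phi3_qa <path_to_model>\n";
    return -1;
  }

  // Load the model folder and its tokenizer. The execution provider
  // (e.g. QNN for the Snapdragon NPU) is selected by the configuration
  // inside the model folder, not by this code.
  auto model = OgaModel::Create(argv[1]);
  auto tokenizer = OgaTokenizer::Create(*model);
  auto tokenizer_stream = OgaTokenizerStream::Create(*tokenizer);

  // Illustrative Phi-3-style prompt.
  const std::string prompt = "<|user|>\nWhat is the golden ratio? <|end|>\n<|assistant|>";

  // Encode the prompt into token ids.
  auto sequences = OgaSequences::Create();
  tokenizer->Encode(prompt.c_str(), *sequences);

  // Configure generation and feed the prompt to the generator.
  auto params = OgaGeneratorParams::Create(*model);
  params->SetSearchOption("max_length", 1024);
  auto generator = OgaGenerator::Create(*model, *params);
  generator->AppendTokenSequences(*sequences);

  // Generate one token at a time, streaming the decoded text to stdout.
  while (!generator->IsDone()) {
    generator->GenerateNextToken();
    const size_t num_tokens = generator->GetSequenceCount(0);
    const int32_t new_token = generator->GetSequenceData(0)[num_tokens - 1];
    std::cout << tokenizer_stream->Decode(new_token) << std::flush;
  }
  std::cout << std::endl;
  return 0;
}
```

Nothing in the loop is NPU-specific: which device runs the model is decided by the binaries placed in `lib` and the model folder passed on the command line, so the same code serves CPU and QNN builds.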

docs/install/index.md

Lines changed: 39 additions & 6 deletions

@@ -117,18 +117,16 @@ To build from source on Linux, follow the instructions [here](https://onnxruntim



-## C#/C/C++/WinML Installs
+## C# Installs

-### Install ONNX Runtime
-
-#### Install ONNX Runtime CPU
+### Install ONNX Runtime CPU

 ```bash
 # CPU
 dotnet add package Microsoft.ML.OnnxRuntime
 ```

-#### Install ONNX Runtime GPU (CUDA 12.x)
+### Install ONNX Runtime GPU (CUDA 12.x)

 The default CUDA version for ORT is 12.x

@@ -137,7 +135,7 @@ The default CUDA version for ORT is 12.x
 dotnet add package Microsoft.ML.OnnxRuntime.Gpu
 ```

-#### Install ONNX Runtime GPU (CUDA 11.8)
+### Install ONNX Runtime GPU (CUDA 11.8)

 1. Project Setup

@@ -179,6 +177,41 @@ dotnet add package Microsoft.ML.OnnxRuntime.DirectML
 dotnet add package Microsoft.AI.MachineLearning
 ```

+## C++/C Installs
+
+### CPU
+
+Find your release here: https://github.com/microsoft/onnxruntime/releases
+
+Download and unzip the archive. For example:
+
+```
+curl -LO https://github.com/microsoft/onnxruntime/releases/download/v1.20.0/onnxruntime-win-arm64-1.20.0.zip
+```
+
+On Windows:
+
+```
+tar xvzf onnxruntime-win-arm64-1.20.0.zip
+move onnxruntime-win-arm64-1.20.0\include <your application include folder>
+move onnxruntime-win-arm64-1.20.0\lib <your application lib folder>
+```
+
+### Arm64 QNN
+
+QNN binaries are published as a NuGet package:
+
+```
+curl -L -o microsoft.ml.onnxruntime.qnn.1.20.0.nupkg https://www.nuget.org/api/v2/package/Microsoft.ML.OnnxRuntime.QNN/1.20.0
+tar xvzf microsoft.ml.onnxruntime.qnn.1.20.0.nupkg
+move build\native\include <your application include folder>
+move runtimes\win-arm64\native <your application lib folder>
+```
+

 ## Install on web and mobile

 The pre-built packages have full support for all ONNX opsets and operators.
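As a quick check of the C++/C install flow added above, a version probe through the C API's single exported entry point is usually enough. A minimal sketch, assuming the headers and libraries were moved into folders on your compiler and linker paths:

```cpp
// Sanity check for a C/C++ install: prints the version of the
// ONNX Runtime your application is actually linked against.
#include <cstdio>

#include "onnxruntime_c_api.h"

int main() {
  // OrtGetApiBase is the single exported entry point of the C API.
  const OrtApiBase* api_base = OrtGetApiBase();
  std::printf("ONNX Runtime version: %s\n", api_base->GetVersionString());
  return 0;
}
```

If this compiles, links against onnxruntime.lib (or libonnxruntime.so), and prints the version of the archive you downloaded, the include and lib folders are wired up correctly.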
