Commit afb6113

Author: quic_calvnguy
Address PR comments, add HTP backend in example code
1 parent dbc9d6f commit afb6113

1 file changed: +6 -5 lines changed

docs/execution-providers/QNN-ExecutionProvider.md

Lines changed: 6 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -449,16 +449,17 @@ Profiling data is available with the HTP backend. Enabling QNN profiling will ge
449449
If onnxruntime is compiled with a more recent QAIRT SDK (2.39 or later), then a _qnn.log file will also be generated alongside the .csv file. This .log file is parsable by [qnn-profile-viewer](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/general_tools.html#qnn-profile-viewer), which is provided in the SDK.
450450

451451
### General Usage
452-
To utilize QNN profiling, simply set the EP options profiling_level to basic, detailed, or optrace. Additionally, the EP option profiling_file_path must also be defined to the output .csv filepath you would like write data to:
452+
To utilize QNN profiling, simply set the EP option profiling_level to basic, detailed, or optrace. Additionally, the EP option profiling_file_path must also be set to the output .csv filepath you would like to write data to:
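As a minimal sketch of the sentence above, the two profiling options can be collected into a single provider-options dictionary before the session is created. The helper name `build_qnn_profiling_options` and the file name `qnn_profile.csv` are hypothetical and only for illustration; the option keys and the three profiling levels are the ones named in this doc change.

```python
# Profiling levels named in the documentation text above.
VALID_PROFILING_LEVELS = {"basic", "detailed", "optrace"}


def build_qnn_profiling_options(profiling_level, profiling_file_path,
                                backend_path="path/to/QnnHtp.dll"):
    """Return a QNN EP provider-options dict with profiling enabled.

    Hypothetical helper: it only assembles and sanity-checks the options,
    it does not talk to onnxruntime itself.
    """
    if profiling_level not in VALID_PROFILING_LEVELS:
        raise ValueError(
            f"profiling_level must be one of {sorted(VALID_PROFILING_LEVELS)}, "
            f"got {profiling_level!r}"
        )
    if not profiling_file_path:
        raise ValueError("profiling_file_path must point to the output .csv file")
    return {
        "backend_path": backend_path,  # Use libQnnHtp.so if on Linux
        "profiling_level": profiling_level,
        "profiling_file_path": str(profiling_file_path),
    }


options = build_qnn_profiling_options("detailed", "qnn_profile.csv")

# Assumed usage; needs an onnxruntime build with the QNN EP and a real model:
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "model.onnx",
#     providers=["QNNExecutionProvider"],
#     provider_options=[options],
# )
```

Checking the level up front is a convenience of this sketch: a misspelled level fails with a clear message before any session is built.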
453453
```python
454454
# Python on Windows on Snapdragon device
455455
import onnxruntime as ort
456456
import numpy as np
457457

458458
provider_options = [
459+
"backend_path": "path/to/QnnHtp.dll", # Use libQnnHtp.so if on Linux
459460
"htp_performance_mode": "burst",
460461
"device_id": "0",
461-
"htp_graph_finalization_optimization_mode":"3"
462+
"htp_graph_finalization_optimization_mode":"3",
462463
"soc_model": "60",
463464
"htp_arch": "73",
464465
"vtcm_mv": "8",
@@ -503,9 +504,10 @@ import onnxruntime as ort
503504
import numpy as np
504505

505506
provider_options = [
507+
"backend_path": "path/to/QnnHtp.dll", # Use libQnnHtp.so if on Linux
506508
"htp_performance_mode": "burst",
507509
"device_id": "0",
508-
"htp_graph_finalization_optimization_mode":"3"
510+
"htp_graph_finalization_optimization_mode":"3",
509511
"soc_model": "60",
510512
"htp_arch": "73",
511513
"vtcm_mv": "8",
@@ -516,7 +518,6 @@ provider_options = [
516518
sess_options = ort.SessionOptions()
517519

518520
# Enable context bin generation
519-
sess_options.add_session_config_entry("session.disable_cpu_ep_fallback", "1")
520521
sess_options.add_session_config_entry("ep.context_embed_mode", "0")
521522
sess_options.add_session_config_entry("ep.context_enable", "1")
522523

@@ -531,7 +532,7 @@ session = ort.InferenceSession(
531532
Upon successful session creation, three files will be generated:
532533
- model_ctx.onnx
533534
- model_qnn.bin
534-
- QNNExecutionProvider_QNN__<number>_schematic.bin
535+
- QNNExecutionProvider_QNN_\<number\>_schematic.bin
535536

536537
model_ctx.onnx is an onnx model with a node that points to the model_qnn.bin context binary, which will be used by the HTP backend for execution. The _schematic.bin file will be used by qnn-profile-viewer to generate QHAS data.
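To confirm that session creation produced the three files listed above, a short check along these lines may help. It is a sketch under assumptions: the function name `find_context_artifacts` and the `model_stem` parameter are hypothetical, and it assumes all three files are written to the same directory. The wildcard stands in for the `<number>` part of the schematic file name.

```python
from pathlib import Path


def find_context_artifacts(output_dir, model_stem="model"):
    """Look for the context-binary outputs described above in output_dir.

    Hypothetical helper; returns None for a file that is missing and a
    (possibly empty) sorted list of schematic files.
    """
    output_dir = Path(output_dir)
    ctx_model = output_dir / f"{model_stem}_ctx.onnx"
    ctx_bin = output_dir / f"{model_stem}_qnn.bin"
    # The <number> portion varies, so match it with a wildcard.
    schematics = sorted(output_dir.glob("QNNExecutionProvider_QNN_*_schematic.bin"))
    return {
        "context_model": ctx_model if ctx_model.is_file() else None,
        "context_binary": ctx_bin if ctx_bin.is_file() else None,
        "schematics": schematics,
    }


if __name__ == "__main__":
    found = find_context_artifacts(".")
    for name, value in found.items():
        print(name, "->", value)
```

The schematic file is the one to pass on to qnn-profile-viewer, so an empty `schematics` list is worth checking before attempting to generate QHAS data.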
537538
