Profiling data is available with the HTP backend. Enabling QNN profiling generates a human-readable .csv file containing information from initialization, execution, and de-initialization.

If ONNX Runtime is compiled with QAIRT SDK 2.39 or later, a _qnn.log file is also generated alongside the .csv file. This .log file can be parsed by [qnn-profile-viewer](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/general_tools.html#qnn-profile-viewer), which is provided in the SDK.

## General Usage
To enable QNN profiling, set the EP option profiling_level to basic, detailed, or optrace. The EP option profiling_file_path must also be set to the path of the output .csv file you would like to write data to:
```python
# Python on Windows on Snapdragon device
import onnxruntime as ort
import numpy as np

# provider_options takes one dict per entry in providers
provider_options = [{
    "htp_performance_mode": "burst",
    "device_id": "0",
    "htp_graph_finalization_optimization_mode": "3",
    "soc_model": "60",
    "htp_arch": "73",
    "vtcm_mb": "8",
    "profiling_level": "basic",
    "profiling_file_path": "output.csv"
}]

sess_options = ort.SessionOptions()

session = ort.InferenceSession(
    "model.onnx",
    sess_options=sess_options,
    providers=["QNNExecutionProvider"],
    provider_options=provider_options
)

# The input name and shape must match the model's actual input
input0 = np.ones((1,2,3,4), dtype=np.float32)
result = session.run(None, {"input": input0})
```
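
The two profiling options could also be applied through a small helper that rejects unsupported values before a session is created. This is only a minimal sketch: `with_qnn_profiling` and `VALID_PROFILING_LEVELS` are hypothetical names, not part of ONNX Runtime, and the accepted levels are just the three listed in this section.

```python
# Hypothetical helper, not an ONNX Runtime API.
# The accepted levels are the three named in this section.
VALID_PROFILING_LEVELS = {"basic", "detailed", "optrace"}


def with_qnn_profiling(options, level, csv_path):
    """Return a copy of `options` with profiling_level and profiling_file_path set."""
    if level not in VALID_PROFILING_LEVELS:
        raise ValueError(f"unsupported profiling_level: {level!r}")
    if not csv_path:
        raise ValueError("profiling_file_path must point to the output .csv file")
    merged = dict(options)  # leave the caller's dict untouched
    merged["profiling_level"] = level
    merged["profiling_file_path"] = str(csv_path)
    return merged


base_options = {"htp_performance_mode": "burst"}
profiled_options = with_qnn_profiling(base_options, "detailed", "output.csv")
print(profiled_options)
```

The returned dict can be passed as the single element of the provider_options list when creating the session.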

With the example above, a file named "output.csv" containing the profiling data is generated. If QAIRT SDK 2.39 or later is used, a second file named "output_qnn.log" is generated as well.
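
To check which of these files a run actually produced, the companion log name can be derived from the .csv path. The sketch below assumes the naming rule is "file stem plus _qnn.log", which is inferred only from the file names given in this document. `qnn_log_path` and `report_profiling_outputs` are hypothetical helper names.

```python
from pathlib import Path


def qnn_log_path(profiling_file_path):
    """Derive the companion log path, e.g. output.csv -> output_qnn.log.

    Assumption: the log sits next to the .csv and is named <stem>_qnn.log.
    """
    csv_path = Path(profiling_file_path)
    return csv_path.with_name(csv_path.stem + "_qnn.log")


def report_profiling_outputs(profiling_file_path):
    """Map each expected output file to whether it currently exists on disk."""
    csv_path = Path(profiling_file_path)
    return {str(p): p.exists() for p in (csv_path, qnn_log_path(csv_path))}


print(qnn_log_path("output.csv"))
print(report_profiling_outputs("output.csv"))
```

A missing _qnn.log entry would be expected when ONNX Runtime was built with a QAIRT SDK older than 2.39.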

"output_qnn.log" can then be parsed with the appropriate qnn-profile-viewer binary:
The above outputs basic information, such as the profiling data for the fastest and slowest executions as well as the average case. A .csv file can also be generated this way, though its contents will likely not differ from "output.csv".

If profiling_level is set to "detailed" or "optrace", additional per-network-layer data is shown.
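
Since the column layout of the generated .csv is not described here, a quick way to inspect it is to load it without assuming any schema. The following is a minimal sketch using only the Python standard library. `load_profiling_rows` is a hypothetical helper name, and "output.csv" is the path used in the example above.

```python
import csv
from pathlib import Path


def load_profiling_rows(csv_path):
    """Read a profiling .csv into (column_names, rows) without assuming its schema."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, rows


# Only inspect the file if a profiling run has already produced it.
if Path("output.csv").exists():
    columns, rows = load_profiling_rows("output.csv")
    print("columns:", columns)
    print("row count:", len(rows))
    for row in rows[:5]:
        print(row)
```

Comparing the row counts of a "basic" run and a "detailed" run is one simple way to see the extra per-layer data.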

### Optrace-Level Profiling
[Optrace-level profiling](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/htp_backend.html#qnn-htp-profiling) generates a profiling .log file that contains [Qualcomm Hexagon Tensor Processor Analysis Summary (QHAS)](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/htp_backend.html#qnn-htp-analysis-summary-qhas-) data. This data can be used to generate chrometraces and to visualize the results in a web browser-friendly UI.

**This feature is only available with the QAIRT 2.39 SDK and later.**

### Optrace Setup
To use this feature, a context binary must be generated prior to execution:
```python
# Python on Windows on Snapdragon device
import onnxruntime as ort
import numpy as np

provider_options = [{
    "htp_performance_mode": "burst",
    "device_id": "0",
    "htp_graph_finalization_optimization_mode": "3",
    "soc_model": "60",
    "htp_arch": "73",
    "vtcm_mb": "8",
    "profiling_level": "optrace", # Set profiling_level to optrace
    # ...
}]
# ...
```

model_ctx.onnx is an ONNX model with a node that points to the model_qnn.bin context binary, which will be used by the HTP backend for execution. The _schematic.bin file will be used by qnn-profile-viewer to generate QHAS data.

### Generating QHAS Data
Previously, for general profiling data, a session was created and executed with "model.onnx". Now, however, there is a new _ctx.onnx model that uses the newly generated context binary. As such, a new inference session must be created with the new _ctx.onnx model:
```python
# ...
result = optrace_session.run(None, {"input": input0})
```
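
The two steps described above (generate a context binary, then run a new session on the _ctx.onnx model) might be wired together as in the sketch below. It is written under stated assumptions and may differ from the setup code in this document: the "ep.context_*" keys are ONNX Runtime's session configuration entries for EP context models, the paths "model.onnx" and "model_ctx.onnx" are taken from the text, and `context_session_config`, `make_session`, and `run_optrace_workflow` are hypothetical helper names.

```python
def context_session_config(ctx_model_path="model_ctx.onnx"):
    """Session config entries asking ONNX Runtime to emit an EP context model."""
    return {
        "ep.context_enable": "1",
        # Assumption: "0" keeps the context binary in a separate .bin file
        "ep.context_embed_mode": "0",
        "ep.context_file_path": ctx_model_path,
    }


def make_session(model_path, provider_options, session_config=None):
    """Create a QNN EP session; provider_options is a single dict of EP options."""
    # Imported lazily so the helper above can be used without ONNX Runtime installed
    import onnxruntime as ort

    sess_options = ort.SessionOptions()
    for key, value in (session_config or {}).items():
        sess_options.add_session_config_entry(key, value)
    return ort.InferenceSession(
        model_path,
        sess_options=sess_options,
        providers=["QNNExecutionProvider"],
        provider_options=[provider_options],
    )


def run_optrace_workflow(optrace_options):
    """Step 1 generates the context model; step 2 returns the session to profile."""
    # Step 1: creating this session writes model_ctx.onnx and the context binary
    make_session("model.onnx", optrace_options, context_session_config())
    # Step 2: the session on the generated model is the one whose run is profiled
    return make_session("model_ctx.onnx", optrace_options)
```

Here optrace_options would be a dict of EP options with profiling_level set to "optrace" and profiling_file_path set to the desired .csv path. Calling run() on the returned session is what produces the optrace profiling output.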

As before under "General Usage", a .csv file (optrace.csv) and a _qnn.log file (optrace_qnn.log) are generated. qnn-profile-viewer is used again, but with different parameters:
- config.json: Please refer to the "Post Process (Chrometrace Generation)" section [on this page](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/htp_backend.html#qnn-htp-optrace-profiling).
- QnnHtpOptraceProfilingReader.dll: Provided as part of the QAIRT SDK. The corresponding file for Linux is libQnnHtpOptraceProfilingReader.so.
- QNNExecutionProvider_QNN_12345_schematic.bin: The name will vary. This file must be the same one generated alongside the context binary under "Optrace Setup".

Additionally, the output file is now a .json file containing chrometrace data. This .json file can be opened with either [Perfetto Trace Visualizer](https://ui.perfetto.dev/) or chrome://tracing.

After running qnn-profile-viewer, you should see a handful of .json files generated with the same prefix as the --output filename parameter, along with an .html file. This .html file can be opened in Chrome to view the chrometrace in a more user-friendly GUI.
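
The chrometrace .json can also be summarized from a script instead of a browser. The sketch below assumes the file follows the Chrome Trace Event Format: either a bare list of events or an object with a "traceEvents" list, where complete events carry "ph": "X" and a "dur" field in microseconds. Whether every generated .json file uses this layout is not confirmed here, and `load_trace_events` and `total_duration_by_name` are hypothetical helper names.

```python
import json
from collections import defaultdict


def load_trace_events(json_path):
    """Return the event list from a Chrome Trace Event Format file."""
    with open(json_path) as f:
        data = json.load(f)
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    return data


def total_duration_by_name(events):
    """Sum the durations of complete ("X") events, grouped by event name."""
    totals = defaultdict(float)
    for event in events:
        if isinstance(event, dict) and event.get("ph") == "X" and "dur" in event:
            totals[event.get("name", "<unnamed>")] += event["dur"]
    return dict(totals)
```

For example, passing the result of load_trace_events to total_duration_by_name and sorting by value gives a rough list of the most time-consuming events, which can then be cross-checked in the browser view.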

### Additional References
For more information on how to interpret QHAS data, please refer to [this page](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/htp_backend.html#qnn-htp-analysis-summary-qhas-).
For more information on the data collected with optrace profiling, please refer to [this page](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-10/htp_backend.html#qnn-htp-optrace-profiling).