Open
Labels
ep:QNN (issues related to QNN execution provider)
Description
Describe the issue
onnxruntime_perf_test.exe crashes during inference with an access violation (AV) exception, code 0xC0000005, caused by an out-of-bounds memory access:
.\onnxruntime_perf_test.exe -e qnn -i "backend_path|.\QnnHtp.dll soc_model|60 htp_arch|73 htp_graph_finalization_optimization_mode|3" -C "ep.share_ep_contexts|1" -m times -r 1 -I phi_context_qdq_ctx.onnx -s
onnxruntime cpuid_info warning: Unknown CPU vendor. cpuinfo_vendor value: 0
2026-01-02 10:59:05.7458171 [W:onnxruntime:, session_state.cc:1316 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2026-01-02 10:59:05.7550095 [W:onnxruntime:, session_state.cc:1318 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
Exit code: -1073741819
The crash is caused by negative positions/offsets in the position_ids input array passed to the MLAS library.
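To illustrate how the negative values can arise, here is a minimal standalone sketch of the default position-ID loop as it appears on the removed side of the patch in this issue. The function name `DefaultPosIdsUnpatched` and the use of `std::vector` are assumptions made for illustration and are not part of ONNX Runtime. When `seqlens_k[b] + 1` is smaller than `sequence_length` (which seems possible with randomly generated inputs), `past_seqlen` becomes negative and so do the first position IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone copy of the unpatched loop (illustration only).
// seqlens_k[b] is treated as (total sequence length - 1), as in the kernel code.
std::vector<int64_t> DefaultPosIdsUnpatched(const std::vector<int32_t>& seqlens_k,
                                            int sequence_length) {
  assert(sequence_length > 0);
  const int batch_size = static_cast<int>(seqlens_k.size());
  std::vector<int64_t> pos_ids(static_cast<std::size_t>(batch_size) * sequence_length);
  for (int b = 0; b < batch_size; b++) {
    const int total_seqlen = seqlens_k[b] + 1;
    // Negative whenever seqlens_k[b] + 1 < sequence_length.
    const int past_seqlen = total_seqlen - sequence_length;
    for (int s = 0; s < sequence_length; s++) {
      if (past_seqlen + s < total_seqlen) {
        pos_ids[b * sequence_length + s] = static_cast<int64_t>(past_seqlen) + s;
      } else {
        pos_ids[b * sequence_length + s] = static_cast<int64_t>(1);
      }
    }
  }
  return pos_ids;
}
```

For example, `seqlens_k = {1}` with `sequence_length = 4` gives `past_seqlen = -2`, so the first two generated position IDs are -2 and -1. Since `past_seqlen + s < total_seqlen` is equivalent to `s < sequence_length`, the inner check never filters these values out.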
The call stack screenshots from debugger:
The patch that helps:
---
.../cpu/bert/group_query_attention.cc | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/onnxruntime/contrib_ops/cpu/bert/group_query_attention.cc b/onnxruntime/contrib_ops/cpu/bert/group_query_attention.cc
index eb1560ac8e..2c68641291 100644
--- a/onnxruntime/contrib_ops/cpu/bert/group_query_attention.cc
+++ b/onnxruntime/contrib_ops/cpu/bert/group_query_attention.cc
@@ -154,11 +154,20 @@ Status GroupQueryAttention<T>::Compute(OpKernelContext* context) const {
       for (int b = 0; b < batch_size; b++) {
         const int total_seqlen = seqlens_k->Data<int32_t>()[b] + 1;
         const int past_seqlen = total_seqlen - sequence_length;
-        for (int s = 0; s < sequence_length; s++) {
-          if (past_seqlen + s < total_seqlen) {
-            default_pos_ids[b * sequence_length + s] = static_cast<int64_t>(past_seqlen) + s;
-          } else {
-            default_pos_ids[b * sequence_length + s] = static_cast<int64_t>(1);
+
+        // Handle inconsistent random data in seqlens_k, when past_seqlen becomes negative
+        if (past_seqlen < 0) {
+          // Fallback: generate consecutive position IDs starting from 0
+          for (int s = 0; s < sequence_length; s++) {
+            default_pos_ids[b * sequence_length + s] = static_cast<int64_t>(s);
+          }
+        } else {
+          for (int s = 0; s < sequence_length; s++) {
+            if (past_seqlen + s < total_seqlen) {
+              default_pos_ids[b * sequence_length + s] = static_cast<int64_t>(past_seqlen) + s;
+            } else {
+              default_pos_ids[b * sequence_length + s] = static_cast<int64_t>(1);
+            }
           }
         }
       }
--
2.52.0.windows.1
To reproduce
- Run ep_weight_sharing_ctx_gen.exe to compile the phi_3_6_context_qdq.onnx model and generate the shared weights bin file:
ep_weight_sharing_ctx_gen.exe -e qnn -i "backend_path|./QnnHtp.dll soc_model|60 vtcm_mb|8 htp_arch|73 htp_graph_finalization_optimization_mode|3" ./phi_context_qdq.onnx
phi_context_qdq_ctx.onnx
phi_context_qdq_qnn.bin
- Run onnxruntime_perf_test.exe on the resulting cache:
.\onnxruntime_perf_test.exe -e qnn -i "backend_path|.\QnnHtp.dll soc_model|60 htp_arch|73 htp_graph_finalization_optimization_mode|3" -C "ep.share_ep_contexts|1" -m times -r 1 -I .\phi_context_qdq_ctx.onnx -s
NB: the problem does not look QNN- or phi-specific, and could potentially occur with other models and EPs.
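Because the computation depends only on seqlens_k and sequence_length, the fallback from the patch can be exercised without QNN or the phi model. The sketch below mirrors the patched loop in a standalone function. The name `DefaultPosIdsPatched`, the `std::vector` interface, and the early return for a non-positive `sequence_length` are assumptions for illustration, not ONNX Runtime code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone version of the patched loop (illustration only).
std::vector<int64_t> DefaultPosIdsPatched(const std::vector<int32_t>& seqlens_k,
                                          int sequence_length) {
  if (sequence_length <= 0) {
    return {};  // Assumption: nothing to generate for an empty sequence.
  }
  const int batch_size = static_cast<int>(seqlens_k.size());
  std::vector<int64_t> pos_ids(static_cast<std::size_t>(batch_size) * sequence_length);
  for (int b = 0; b < batch_size; b++) {
    const int total_seqlen = seqlens_k[b] + 1;
    const int past_seqlen = total_seqlen - sequence_length;
    if (past_seqlen < 0) {
      // Fallback from the patch: consecutive position IDs starting from 0.
      for (int s = 0; s < sequence_length; s++) {
        pos_ids[b * sequence_length + s] = static_cast<int64_t>(s);
      }
    } else {
      // Same branch structure as the original code path.
      for (int s = 0; s < sequence_length; s++) {
        if (past_seqlen + s < total_seqlen) {
          pos_ids[b * sequence_length + s] = static_cast<int64_t>(past_seqlen) + s;
        } else {
          pos_ids[b * sequence_length + s] = static_cast<int64_t>(1);
        }
      }
    }
  }
  return pos_ids;
}
```

With `seqlens_k = {1}` and `sequence_length = 4` this yields 0, 1, 2, 3 instead of negative IDs, while consistent inputs such as `seqlens_k = {5}` with `sequence_length = 2` still yield 4, 5. Whether silently falling back is preferable to rejecting inconsistent seqlens_k with an error is a design question this sketch does not settle.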
Urgency
A temporary workaround exists in the form of the patch listed above. However, the AV exception is a severe bug that should be fixed as soon as possible.
Platform
Windows
OS Version
Microsoft Windows 11 Enterprise, 10.0.26220, ARM64
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.23.2, a83fc4d
ONNX Runtime API
C++
Architecture
ARM64
Execution Provider
Other / Unknown
Execution Provider Library Version
QNN 2.39