Commit bbd3850

[QNN EP] Support quantized BatchNorm with per-channel DQ params on QNN HTP (#26959)
## Motivation:
QNN HTP was rejecting quantized BatchNorm models where parameters (scale, mean, var) come through DequantizeLinear nodes with per-channel INT8 quantization. This pattern is common in quantized models from quantization tools.

## Changes:
- Helpers to resolve BatchNorm params through DQ nodes to their underlying initializers
- Support per-channel dequantization for BatchNorm parameters
- Support input datatype of UFIXED_POINT_16
- Add unit test covering this QDQ params configuration
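
For readers unfamiliar with the pattern, the following is a minimal sketch of what per-channel dequantization of a 1-D BatchNorm parameter (such as scale, mean, or var) amounts to. It is not the code added by this commit: the helper name `DequantizePerChannel` and its signature are hypothetical, and the sketch assumes one scale and one zero point per channel, using the standard DequantizeLinear formula `y = (x - zero_point) * scale`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the commit): dequantize a 1-D INT8 tensor
// of shape [C] that carries one scale and one zero point per channel.
std::vector<float> DequantizePerChannel(const std::vector<int8_t>& quantized,
                                        const std::vector<float>& scales,
                                        const std::vector<int8_t>& zero_points) {
  // Per-channel quantization of a [C] tensor needs exactly C scales and C zero points.
  assert(quantized.size() == scales.size());
  assert(quantized.size() == zero_points.size());

  std::vector<float> result(quantized.size());
  for (std::size_t c = 0; c < quantized.size(); ++c) {
    // Widen before subtracting so the difference cannot overflow int8_t.
    const int32_t centered =
        static_cast<int32_t>(quantized[c]) - static_cast<int32_t>(zero_points[c]);
    // DequantizeLinear: y = (x - zero_point) * scale
    result[c] = static_cast<float>(centered) * scales[c];
  }
  return result;
}
```

With per-tensor quantization the same loop would use a single scale and zero point for every element; the per-channel case only differs in indexing them by channel.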
1 parent 70cf577 commit bbd3850

3 files changed: +256 -94 lines changed


onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selectors.cc

Lines changed: 8 additions & 1 deletion
@@ -758,7 +758,14 @@ bool BatchNormalizationNodeGroupSelector::Check(const GraphViewer& graph_viewer,
                                                 const Node* redundant_clip_node,
                                                 const std::vector<const Node*>& dq_nodes,
                                                 const std::vector<const Node*>& q_nodes) const {
-  if (!CheckQDQNodes(graph_viewer, node, redundant_clip_node, dq_nodes, q_nodes, 3)) {
+  // BatchNormalization has 5 inputs: x, scale, bias, mean, var.
+  // Require DQ on x and scale (indices 0,1). mean, var may optionally have DQ.
+  const int num_dq_nodes = gsl::narrow_cast<int>(dq_nodes.size());
+  if (num_dq_nodes < 3 || num_dq_nodes > 5) {
+    return false;
+  }
+
+  if (!CheckQDQNodes(graph_viewer, node, redundant_clip_node, dq_nodes, q_nodes, num_dq_nodes)) {
     return false;
   }
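
The effect of the new guard can be sketched in isolation. The function below is a hypothetical standalone mirror of the range check in this hunk, not code from the repository: it accepts between 3 and 5 DequantizeLinear nodes, where the lower bound matches the value that was previously hard-coded in the `CheckQDQNodes` call and the upper bound matches the five BatchNormalization inputs. The name `IsSupportedBatchNormDqCount` and the constants are assumptions made for illustration.

```cpp
#include <cstddef>

// Hypothetical mirror of the DQ-count range check in the hunk above.
// BatchNormalization has 5 inputs (x, scale, bias, mean, var), so at most
// 5 DQ nodes can feed it; 3 was the previously hard-coded count.
constexpr bool IsSupportedBatchNormDqCount(std::size_t num_dq_nodes) {
  constexpr std::size_t kMinDqNodes = 3;
  constexpr std::size_t kMaxDqNodes = 5;
  return num_dq_nodes >= kMinDqNodes && num_dq_nodes <= kMaxDqNodes;
}

// Compile-time checks of the boundaries.
static_assert(!IsSupportedBatchNormDqCount(2), "fewer than 3 DQ nodes is rejected");
static_assert(IsSupportedBatchNormDqCount(3), "3 DQ nodes is accepted");
static_assert(IsSupportedBatchNormDqCount(5), "5 DQ nodes is accepted");
static_assert(!IsSupportedBatchNormDqCount(6), "more than 5 DQ nodes is rejected");
```

In the actual change the accepted count is then passed to `CheckQDQNodes` in place of the constant `3`, so a node group with DQ nodes on mean and var is no longer rejected for having too many DQ inputs.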
