Commit ff0ab0a
Quantize Weight for Gemm/Conv on Quantized Model (#22969)
Some quantized models have QDQ around Conv/Gemm but the weight and/or
bias are not quantized. This PR adds WeightBiasQuantization optimizer to
quantize float weight and/or bias to INT8 and INT32 tensors
respectively. We only do this for weight and/or bias initializer so that
ConstantFolding will fold the sub-graph to real quantized initializers
during the graph optimization next round.

1 parent c75681a
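The quantization scheme the commit message describes (float weight to INT8, float bias to INT32) can be sketched as follows. This is a hypothetical illustration, not the actual C++ optimizer added by this PR; it assumes the common QDQ convention of symmetric per-tensor weight quantization with `bias_scale = input_scale * weight_scale`, so the function name and signature are invented for the example.

```python
import numpy as np

def quantize_weight_and_bias(weight, bias, input_scale):
    """Hypothetical sketch of the scheme: quantize a float Conv/Gemm
    weight to INT8 and its bias to INT32, assuming symmetric per-tensor
    quantization and bias_scale = input_scale * weight_scale."""
    # Symmetric per-tensor scale for the weight (zero point fixed at 0).
    weight_scale = np.max(np.abs(weight)) / 127.0
    q_weight = np.clip(np.round(weight / weight_scale), -127, 127).astype(np.int8)

    # Bias is quantized to INT32 with scale = input_scale * weight_scale
    # and zero point 0, so it adds directly into the INT32 accumulator.
    bias_scale = input_scale * weight_scale
    q_bias = np.round(bias / bias_scale).astype(np.int32)
    return q_weight, weight_scale, q_bias, bias_scale
```

Because the quantized tensors here come from initializers only, a subsequent ConstantFolding pass can replace the inserted Quantize sub-graph with precomputed integer initializers, which is the behavior the commit message relies on.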
File tree (6 files changed, +404 −183 lines):
- onnxruntime
  - core/optimizer
    - qdq_transformer
  - test/optimizer
Two files were deleted (149 lines and 27 lines, respectively).