Commit 6cf92ae
Fix GatherBlockQuantized node to support symmetric quantized LM_HEAD (#1951)
Today, models created with
python -m onnxruntime_genai.models.builder -p int4 -e webgpu
--extra_options shared_embeddings=true int4_algo_config=rtn_last
int4_is_symmetric=true
have invalid GatherBlockQuantized nodes because the zero point
attribute of the node points to a non-existent tensor,
lm_head.MatMul.weight_zp.
This change fixes builder.py so that we are selective about adding that
attribute to the GatherBlockQuantized node.

1 parent 9c524ed commit 6cf92ae
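
The idea behind the fix can be illustrated with a minimal sketch. This is not the code from builder.py: the function name, the `_Q4` and `_scales` suffixes, and the `available_tensors` set are hypothetical, and only the `lm_head.MatMul.weight_zp` name comes from the commit message. The sketch assumes the zero point is referenced by tensor name and simply omits that reference when no such tensor was produced, as would happen with symmetric quantization.

```python
def gather_block_quantized_inputs(weight_base, indices_name, available_tensors):
    """Build the list of tensor names a GatherBlockQuantized node refers to.

    Hypothetical helper for illustration only; not the builder.py API.
    """
    # Quantized weight and scales are always referenced
    # (the "_Q4" and "_scales" suffixes are assumptions).
    inputs = [f"{weight_base}_Q4", indices_name, f"{weight_base}_scales"]

    # Reference the zero point only if that tensor actually exists.
    # With symmetric quantization no zero-point tensor is produced.
    zero_point_name = f"{weight_base}_zp"
    if zero_point_name in available_tensors:
        inputs.append(zero_point_name)
    return inputs


# Symmetric case: no lm_head.MatMul.weight_zp tensor is available.
symmetric_tensors = {"lm_head.MatMul.weight_Q4", "lm_head.MatMul.weight_scales"}
symmetric_inputs = gather_block_quantized_inputs(
    "lm_head.MatMul.weight", "input_ids", symmetric_tensors
)

# Asymmetric case: the zero-point tensor exists and is referenced.
asymmetric_tensors = symmetric_tensors | {"lm_head.MatMul.weight_zp"}
asymmetric_inputs = gather_block_quantized_inputs(
    "lm_head.MatMul.weight", "input_ids", asymmetric_tensors
)
```

Checking for the tensor's presence, rather than passing the name unconditionally, is what keeps the node from pointing at a tensor that does not exist.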
1 file changed, +4 -1 lines changed

@@ -1324,9 +1324,12 @@
(Diff content not captured. Three lines were added at new lines 1327-1329, and original line 1329 was replaced by new line 1332.)