feat: Enable benchmark-mode module inventory/export across all CausalLM architectures#906
Conversation
Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>
@vbaddi - can we restructure this as below? We could create Attention and MoE/FFN benchmarks and use ONNX symbols to set the fields. Some fields can come from the model card's config.json, such as dm/dh. Maybe I didn't fully understand the table you gave above.
Thanks @anujgupt-github. These are all configurable from the config or model card that is passed; whatever needs to be edited can either be changed in the config or passed as args. The table is basically dummy inputs run on QAic, providing the numbers for those modules (via sess.run()).
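To illustrate the idea in this exchange — deriving per-module benchmark fields such as dm/dh from a model card's config.json — here is a minimal sketch. The helper name `build_module_specs`, the field names, and the config keys are illustrative assumptions, not the PR's actual `get_benchmark_module_specs(...)` API:

```python
def build_module_specs(config: dict) -> dict:
    """Derive per-module benchmark fields from an HF-style config.json dict.

    Hypothetical sketch: field names (dm, dh, ...) follow the discussion
    above, not the PR's actual API.
    """
    dm = config["hidden_size"]                # model dim (dm)
    n_heads = config["num_attention_heads"]
    specs = {
        "attention": {
            "dm": dm,
            "dh": dm // n_heads,              # head dim (dh)
            "n_heads": n_heads,
        },
        "ffn": {
            "dm": dm,
            "d_ff": config["intermediate_size"],
        },
    }
    # MoE models additionally expose expert counts in config.json
    if "num_local_experts" in config:
        specs["moe"] = {
            "n_experts": config["num_local_experts"],
            "top_k": config.get("num_experts_per_tok", 2),
        }
    return specs

# Example with Mixtral-like config values
cfg = {"hidden_size": 4096, "num_attention_heads": 32,
       "intermediate_size": 14336, "num_local_experts": 8,
       "num_experts_per_tok": 2}
print(build_module_specs(cfg)["attention"]["dh"])  # 4096 // 32 = 128
```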
WIP: This PR extends `enable_benchmark=True` support in `QEFFAutoModelForCausalLM` to all CausalLM models.

What changed
- Extended benchmark-mode module inventory/export to the remaining CausalLM architectures (gptj, mistral, mixtral, mpt, phi, phi3, qwen2, starcoder2, granite, olmo2).
- Added `get_benchmark_module_specs(...)` for CausalLM models.
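The per-module numbers discussed in this thread (dummy inputs executed per module via sess.run()) could be gathered with a timing harness along these lines. Everything here — `time_module` and the dummy callables — is an illustrative stand-in, not the PR's code:

```python
import time

def time_module(run_fn, warmup: int = 3, iters: int = 10) -> float:
    """Return mean latency in ms of run_fn over `iters` timed calls,
    after `warmup` untimed calls. Stand-in for timing sess.run()
    on dummy inputs for one exported module."""
    for _ in range(warmup):
        run_fn()
    start = time.perf_counter()
    for _ in range(iters):
        run_fn()
    return (time.perf_counter() - start) / iters * 1e3

# Dummy stand-ins for exported attention / FFN module sessions
report = {name: time_module(fn) for name, fn in
          {"attention": lambda: sum(range(1000)),
           "ffn": lambda: sum(range(2000))}.items()}
for name, ms in report.items():
    print(f"{name}: {ms:.3f} ms")
```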
Example benchmark output (Llama)
Input/Output shape section in report
Validation