Skip to content

Feat: export dq only#110

Merged
mht-sharma merged 7 commits intohuggingface:mainfrom
Giuseppe5:dq_flag
Mar 19, 2024
Merged

Feat: export dq only#110
mht-sharma merged 7 commits intohuggingface:mainfrom
Giuseppe5:dq_flag

Conversation

@Giuseppe5
Copy link
Copy Markdown
Contributor

@Giuseppe5 Giuseppe5 commented Mar 14, 2024

Supersedes #94

The idea is to export only Integer weights + DQ. For this, we need to use PyTorch 2.2+ because of a bug in how constant values are handled at export time.

@Giuseppe5 Giuseppe5 mentioned this pull request Mar 14, 2024
@Giuseppe5 Giuseppe5 requested a review from mht-sharma March 14, 2024 16:49
Copy link
Copy Markdown

@dineshchitlangia dineshchitlangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM.
Minor question pls:
export_manager.change_weight_export(export_weight_q_node=True)
In your change, how are we accounting for this weight export?

@Giuseppe5
Copy link
Copy Markdown
Contributor Author

Last time I discussed with @fxmarty, he mentioned that it would be always preferable to export only Integer Weights -> DQ rather than Float Weights -> Q -> DQ, so this last option has been completely removed in this PR.

This is also related to the changes in #82 , where we assume that the weights have been exported without the Q node.

@dineshchitlangia
Copy link
Copy Markdown

Last time I discussed with @fxmarty, he mentioned that it would be always preferable to export only Integer Weights -> DQ rather than Float Weights -> Q -> DQ, so this last option has been completely removed in this PR.

This is also related to the changes in #82 , where we assume that the weights have been exported without the Q node.

Thanks for the clarity @Giuseppe5

I do not have write privileges to merge your changes so you will have to wait a bit more until someone can merge it.

Copy link
Copy Markdown
Contributor

@mht-sharma mht-sharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. I have left a few comments

from optimum.amd.brevitas.accelerate_utils import calc_cpu_device_map, calc_gpu_device_map, offload_model, remove_hooks
from optimum.amd.brevitas.data_utils import compute_perplexity, get_dataset_for_model
from optimum.exporters.onnx import onnx_export_from_model
from optimum.amd.brevitas.export import export_quantized_model
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from optimum.amd.brevitas.export import export_quantized_model
from optimum.amd.brevitas.export import export_to_onnx

Can we keep the ONNX word in the loop to make it explicit. Other name suggestions, quantized_model_to_onnx or save_quantized_model_as_onnx

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted to keep as similar as possible to the original name so it became:
onnx_export_from_quantized_model

@mht-sharma
Copy link
Copy Markdown
Contributor

mht-sharma commented Mar 18, 2024

Could you also document the same in docs/brevitas

@Giuseppe5 Giuseppe5 requested a review from mht-sharma March 18, 2024 14:14
Copy link
Copy Markdown
Contributor

@mht-sharma mht-sharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>
@mht-sharma mht-sharma merged commit 8771d5c into huggingface:main Mar 19, 2024
ChaoLi-AMD pushed a commit to ChaoLi-AMD/optimum-amd that referenced this pull request Mar 20, 2024
* Feat: export dq only

* fix

* fix

* Code review

* Docs: update documentation

* Formatting

* Apply suggestions from code review

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>

---------

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants