Add sparsity to benchmarking #1917
base: main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1917
Note: links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 unrelated failure.) As of commit f2d24cd with merge base 6b76adb: BROKEN TRUNK - the following job failed but was also present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
thanks for working on this! just a couple questions but otherwise looks good
@@ -44,11 +45,33 @@ def run(config: BenchmarkConfig) -> BenchmarkResult:

     # Use quantize_ to apply each quantization function to the model
     m_copy = deepcopy(base_model).eval().to(config.device)
-    quantization_config = string_to_config(
-        config.quantization, high_precision_dtype=config.high_precision_dtype
+    aoBaseConfig = string_to_config(
probably snake_case is better here?
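For illustration, a minimal sketch of the suggested rename, assuming `string_to_config` is the benchmark helper from this diff and that it returns an `AOBaseConfig` (or `None` for the baseline recipe) that `quantize_` accepts:

```python
from copy import deepcopy

from torchao.quantization import quantize_


def prepare_model(base_model, config, string_to_config):
    """Sketch only: resolve the recipe name and apply it with quantize_.

    string_to_config is this PR's helper, passed in as a parameter so
    the sketch stays self-contained.
    """
    m_copy = deepcopy(base_model).eval().to(config.device)
    # snake_case rename of aoBaseConfig, per the review suggestion
    ao_base_config = string_to_config(
        config.quantization, high_precision_dtype=config.high_precision_dtype
    )
    # Assumed: the baseline recipe resolves to None, so the model is
    # benchmarked unmodified for comparison.
    if ao_base_config is not None:
        quantize_(m_copy, ao_base_config)
    return m_copy
```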
@@ -1,9 +1,13 @@
 # Sample configuration for inference benchmarks
 benchmark_mode: "inference"
 quantization_config_recipe_names:
-  - "baseline"
+  # - "baseline" Will always run a baseline instance
should this be commented out?
We're running the baseline case by default for any benchmarking param. The reason I listed it here as a comment is that I wanted to let users know it will always run. Maybe I can simply add it to the README, and write the comment like:
# Will run a baseline inference for the model by default, without quantization, for comparison
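To make that behavior concrete, here is a minimal sketch (a hypothetical helper, not code from this PR) of a runner that always includes the baseline and no-sparsity cases regardless of what the YAML lists:

```python
def expand_recipes(quantization_names, sparsity_names):
    """Hypothetical helper: build the (quantization, sparsity) grid.

    The baseline (no quantization) and no-sparsity cases always run,
    which is why they are commented out in the sample YAML config.
    """
    quant = ["baseline"] + [q for q in quantization_names if q != "baseline"]
    sparse = [None] + [s for s in sparsity_names if s not in (None, "none")]
    return [(q, s) for q in quant for s in sparse]


# Even with "baseline" omitted from the config, it still runs:
print(expand_recipes(["marlin"], ["semi-sparse"]))
# [('baseline', None), ('baseline', 'semi-sparse'),
#  ('marlin', None), ('marlin', 'semi-sparse')]
```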
- "int4wo-128" | ||
- "marlin" | ||
sparsity_config_recipe_names: | ||
# - "none" Will always run a without sparsity instance |
same here
Same as above
# Mock string_to_config to return valid configs
from torchao.quantization import Int4WeightOnlyConfig
from torchao.sparsity.sparse_api import (
    BlockSparseWeightConfig,
I don't think we need BlockSparseWeightConfig here - should be semi-structured sparsity no?
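If that suggestion is taken, the mock could return the 2:4 config instead. A hedged sketch, assuming `SemiSparseWeightConfig` is the semi-structured config exposed by `torchao.sparsity.sparse_api` and that `string_to_config` accepts the recipe names used in this PR's YAML:

```python
from torchao.quantization import Int4WeightOnlyConfig
from torchao.sparsity.sparse_api import SemiSparseWeightConfig


def fake_string_to_config(quantization, sparsity=None, **kwargs):
    """Hypothetical mock standing in for string_to_config in the test."""
    if sparsity == "semi-sparse":
        # 2:4 semi-structured sparsity instead of block sparsity
        return SemiSparseWeightConfig()
    if quantization == "int4wo-128":
        return Int4WeightOnlyConfig(group_size=128)
    return None
```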
        self.assertIsInstance(result, BenchmarkResult)
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))

        # Test with block sparsity
Oh, I see - can we split this into two tests then, one for int4+2:4 marlin, and one for block sparsity?
Let me try that
Done
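For reference, a minimal sketch of the resulting split, with hypothetical test and helper names (`run` and `BenchmarkResult` are the runner entry points visible in this PR's diff; `make_benchmark_config` is invented for the sketch):

```python
import unittest


class TestBenchmarkSparsity(unittest.TestCase):
    def test_int4_with_semi_sparse_marlin(self):
        # int4 weight-only quantization combined with 2:4 sparsity (marlin)
        config = make_benchmark_config(  # hypothetical helper
            quantization="marlin", sparsity="semi-sparse"
        )
        result = run(config)
        self.assertIsInstance(result, BenchmarkResult)
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))

    def test_block_sparsity(self):
        # block sparsity on its own, without quantization
        config = make_benchmark_config(  # hypothetical helper
            quantization="baseline", sparsity="block"
        )
        result = run(config)
        self.assertIsInstance(result, BenchmarkResult)
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))
```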
Force-pushed from 6e00835 to d71baa3.
This reverts commit d71baa3.
Add sparsity support for benchmarking. The following support has been added:
- `sparsity_config_recipe_names` in the benchmark YAML config (the baseline and no-sparsity cases always run by default)
- semi-structured (2:4) sparsity, including the int4 + 2:4 marlin recipe
- block sparsity
- tests covering the new sparsity paths