Enable fp16+int4 mixed precision path for int4 xpu path with int zero point #2240
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2240
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit fa6ca5d with merge base 2c901b3.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@jerryzh168 can you help to review?
can you add a test? maybe add one under `ao/test/dtypes/test_affine_quantized.py` (line 299 in 4d5f657, `class TestAffineQuantizedBasic(TestCase):`)
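For illustration, a hypothetical sketch of what such a test might look like; the class name, decorator, and torchao import paths are assumptions modeled on existing tests in that file, not the final implementation:

```python
# Hypothetical test sketch for the fp16 + int4 XPU path with int zero point.
import unittest

import torch
from torch.testing._internal.common_utils import TestCase

from torchao.dtypes import Int4XPULayout
from torchao.quantization import int4_weight_only, quantize_
from torchao.quantization.quant_primitives import ZeroPointDomain


class TestInt4XPUFP16(TestCase):
    @unittest.skipIf(not torch.xpu.is_available(), "Need XPU available")
    def test_int4wo_fp16_activation_int_zero_point(self):
        m = torch.nn.Sequential(torch.nn.Linear(128, 128)).to("xpu", torch.float16)
        quantize_(
            m,
            int4_weight_only(
                group_size=32,
                layout=Int4XPULayout(),
                zero_point_domain=ZeroPointDomain.INT,
            ),
        )
        x = torch.randn(1, 128, dtype=torch.float16, device="xpu")
        # Should dispatch to the fp16 A16W4 kernel and run without error.
        m(x)
```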
@pytorchbot label new feature
Didn't find following labels among repository labels: new, feature
@pytorchbot label quantize
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed
Reason: 1 job has failed, first few of them are: PR Label Check / Check PR Labels
Details for Dev Infra team: Raised by workflow job
@pytorchbot label ciflow/xpu
To add these label(s) (ciflow/xpu) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page). This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.
@pytorchbot label ciflow/xpu
Didn't find following labels among repository labels: ciflow/xpu
@pytorchbot label ci
Yes. I checked the tests in test_affine_quantized.py. XPU is not enabled for a lot of UTs (not only for dtype); I will open a new PR to enable these UTs.
Background
On the XPU device, when the user selects an integer zero point, the `torch.ops.aten._weight_int4pack_mm_with_scales_and_zeros` operator is used for the A16W4 (16-bit activation, 4-bit weight) computation. This op supports both FP16 and BF16 activations on XPU, but torchao currently enables only the BF16 activation path. This PR unlocks FP16 activation support.
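For context, a minimal usage sketch of the path this PR enables; the `int4_weight_only`, `Int4XPULayout`, and `ZeroPointDomain.INT` names reflect torchao's current API as I understand it and may differ across versions:

```python
# Minimal sketch: int4 weight-only quantization on XPU with an integer
# zero point and fp16 activations, the combination this PR enables.
import torch

from torchao.dtypes import Int4XPULayout
from torchao.quantization import int4_weight_only, quantize_
from torchao.quantization.quant_primitives import ZeroPointDomain

model = torch.nn.Sequential(torch.nn.Linear(256, 256)).to("xpu", torch.float16)
quantize_(
    model,
    int4_weight_only(
        group_size=32,
        layout=Int4XPULayout(),
        # An integer zero point routes the matmul to the
        # _weight_int4pack_mm_with_scales_and_zeros kernel on XPU.
        zero_point_domain=ZeroPointDomain.INT,
    ),
)
x = torch.randn(1, 256, dtype=torch.float16, device="xpu")
y = model(x)  # previously only bf16 activations were supported here
```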