
Enable fp16+int4 mixed precision path for int4 xpu path with int zero point #2240


Merged: 3 commits into pytorch:main on May 29, 2025

Conversation

liangan1
Contributor

Background

For the XPU device, when the user selects an integer zero point, the torch.ops.aten._weight_int4pack_mm_with_scales_and_zeros operator is used for the A16W4 computation. On XPU this op supports both FP16 and BF16 activations with int4 weights, but torchao currently enables only the BF16 activation path. This PR unlocks FP16 activation support.
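The op itself is XPU-only, but the underlying A16W4 math (group-wise asymmetric int4 quantization with an integer zero point, followed by dequantize-and-matmul against an fp16 activation) can be sketched in plain NumPy. The function names and the group size below are illustrative, not the actual kernel implementation:

```python
# Plain-NumPy illustration of A16W4 with an integer zero point.
# This mirrors the math behind the XPU int4 kernel, not its implementation.
import numpy as np

def quantize_int4_groupwise(w, group_size=32):
    """Asymmetric per-group int4 quantization with an integer zero point.

    w: (out_features, in_features) float weights.
    Returns uint8 codes in [0, 15] plus per-group scales and int zero points.
    """
    out_f, in_f = w.shape
    g = w.reshape(out_f, in_f // group_size, group_size)
    w_min = g.min(axis=-1, keepdims=True)
    w_max = g.max(axis=-1, keepdims=True)
    scale = (w_max - w_min) / 15.0                         # qmax - qmin = 15
    zero_point = np.rint(-w_min / scale).astype(np.int32)  # integer zero point
    q = np.clip(np.rint(g / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale.squeeze(-1), zero_point.squeeze(-1)

def a16w4_mm(x_fp16, q, scale, zero_point):
    """Dequantize int4 weights and matmul with an fp16 activation."""
    out_f = q.shape[0]
    deq = (q.astype(np.float32) - zero_point[..., None]) * scale[..., None]
    w = deq.reshape(out_f, -1).astype(np.float16)
    return x_fp16 @ w.T  # (batch, out_features) in fp16

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 64)).astype(np.float32)
x = rng.standard_normal((2, 64)).astype(np.float16)
q, s, zp = quantize_int4_groupwise(w)
y = a16w4_mm(x, q, s, zp)
```

Because the integer zero point cancels exactly during dequantization, the per-weight reconstruction error is bounded by half the group scale, which is why the A16W4 result stays close to the full-precision matmul.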


pytorch-bot bot commented May 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2240

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fa6ca5d with merge base 2c901b3:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 22, 2025
@liangan1
Contributor Author

@jerryzh168 can you help review?

@liangan1
Contributor Author

@EikanWang

Contributor

@jerryzh168 left a comment


can you add a test? maybe add one under

class TestAffineQuantizedBasic(TestCase):
for now, current tests are not very well structured
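A test along the lines the reviewer suggests might look like the sketch below. The torchao names used here (quantize_, int4_weight_only, ZeroPointDomain) and their signatures are assumptions about torchao's public API at the time and would need adjusting to the repository's actual test harness:

```python
# Hedged sketch of a unit test for the fp16-activation int4 path on XPU.
# API names and signatures are assumptions, not the merged test.
import importlib.util
import unittest

HAS_TORCHAO = importlib.util.find_spec("torchao") is not None

@unittest.skipUnless(HAS_TORCHAO, "requires torchao")
class TestAffineQuantizedBasic(unittest.TestCase):
    def test_int4_fp16_activation_xpu(self):
        import torch
        from torchao.quantization import quantize_, int4_weight_only
        from torchao.quantization.quant_primitives import ZeroPointDomain

        if not torch.xpu.is_available():
            self.skipTest("requires an XPU device")
        linear = torch.nn.Linear(256, 256, dtype=torch.float16, device="xpu")
        x = torch.randn(1, 256, dtype=torch.float16, device="xpu")
        # Integer zero point selects the *_with_scales_and_zeros kernel on XPU.
        quantize_(linear, int4_weight_only(group_size=32,
                                           zero_point_domain=ZeroPointDomain.INT))
        y = linear(x)
        self.assertEqual(y.dtype, torch.float16)
```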

@liangan1
Contributor Author

@pytorchbot label new feature


pytorch-bot bot commented May 29, 2025

Didn't find following labels among repository labels: new,feature

@liangan1
Contributor Author

@pytorchbot label quantize

@pytorch-bot pytorch-bot bot added the quantize label May 29, 2025
@liangan1
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: PR Label Check / Check PR Labels

Details for Dev Infra team (raised by workflow job)

@liangan1
Contributor Author

@pytorchbot label ciflow/xpu


pytorch-bot bot commented May 29, 2025

To add these label(s) (ciflow/xpu) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@liangan1
Contributor Author

@pytorchbot label ciflow/xpu


pytorch-bot bot commented May 29, 2025

Didn't find following labels among repository labels: ciflow/xpu

@liangan1
Contributor Author

@pytorchbot label ci

@pytorch-bot pytorch-bot bot added the ci label May 29, 2025
@Xia-Weiwen Xia-Weiwen added the topic: new feature Use this tag if this PR adds a new feature label May 29, 2025
@Xia-Weiwen Xia-Weiwen merged commit 0aa8dbd into pytorch:main May 29, 2025
20 of 24 checks passed
@liangan1
Contributor Author

can you add a test? maybe add one under

class TestAffineQuantizedBasic(TestCase):

for now, current tests are not very well structured

Yes. I checked the tests in test_affine_quantized.py; XPU is not enabled for a lot of UTs (not only for dtype). I will open a new PR to enable these UTs.
