Feat (brevitas_examples/sdxl): better GPTQ #1250
base: dev
Conversation
    (self.groups, self.columns, self.columns),
    device='cpu',
    dtype=torch.float32,
)
Maybe I should re-introduce this, but I was getting weird CUDA errors. Also, since we are storing on CPU, I'm not sure why we need to use pin_memory.
@i-colbert, maybe you can comment before we merge.
From what I can remember, it was used to improve data transfer speeds from GPU to CPU, which is why it is only enabled if a GPU is available.
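For reference, a minimal sketch of the pattern being discussed (buffer shape and dtype taken from the diff above; the sizes are placeholders), assuming the intent is to request pinned host memory only when a CUDA device is present:

```python
import torch

groups, columns = 1, 4096  # placeholder sizes for illustration

# Page-locked (pinned) host memory enables faster, asynchronous copies
# between device and host, so it only pays off when a GPU is present.
use_pinned = torch.cuda.is_available()

hessian = torch.zeros(
    (groups, columns, columns),
    device='cpu',
    dtype=torch.float32,
    pin_memory=use_pinned,
)
```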
A few comments / questions, but in principle, approved.
@@ -242,13 +242,26 @@ def single_layer_update(self):
        pass

    def get_quant_weights(self, i, i1, permutation_list, with_quant_history=False):
        from brevitas.quant_tensor import _unpack_quant_tensor
Not required - already imported.
@@ -252,16 +251,21 @@ def main(args):
    # Extend seeds based on batch_size
    test_seeds = [TEST_SEED] + [TEST_SEED + i for i in range(1, args.batch_size)]

    is_flux_model = 'flux' in args.model.lower()
    is_flux = 'flux' in args.model.lower()
Is there a possibly more generic name for this flag? I can see this applying to more models than just Flux...
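One possible shape for that suggestion (purely illustrative; the flag name and the backbone list below are hypothetical and not part of the PR, and `args.model` comes from the script's argument parser):

```python
# Hypothetical: key the flag off a family of transformer-based denoisers
# rather than a single model name, so future pipelines are covered too.
TRANSFORMER_DENOISERS = ('flux',)  # extend as new models are supported

model_name = args.model.lower()
is_transformer_denoiser = any(name in model_name for name in TRANSFORMER_DENOISERS)
```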
    create_weight_orig=False,
    use_quant_activations=False,
    return_forward_output=True,
with torch.no_grad(), quant_inference_mode(denoising_network, compile=args.compile_eval):
Do we always want quant_inference_mode to be applied?
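If the answer is "not always", one way to gate it would be something like the sketch below (`quant_inference_mode`, `denoising_network`, and `args.compile_eval` are used as in the diff; the `use_quant_inference` flag and `run_validation` helper are hypothetical):

```python
import contextlib

import torch

# Hypothetical flag: fall back to a no-op context when quantized inference
# mode is not wanted for this evaluation run.
inference_ctx = (
    quant_inference_mode(denoising_network, compile=args.compile_eval)
    if args.use_quant_inference
    else contextlib.nullcontext())

with torch.no_grad(), inference_ctx:
    run_validation()  # placeholder for the existing generation / eval loop
```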
    test_latents=latents,
    guidance_scale=args.guidance_scale,
    is_unet=is_unet)
with torch.no_grad(), quant_inference_mode(denoising_network, compile=args.compile_eval):
Do we always want quant_inference_mode to be applied?
Reason for this PR
Few-shot calibration + GPTQ.
Some details still need refining.
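For context, a rough sketch of the intended flow, based on the usual Brevitas `gptq_mode` usage and the options visible in the diff; the few-shot loop is simplified, and `denoising_network` / `calib_batches` stand in for the script's actual model and pre-computed calibration inputs:

```python
import torch
from brevitas.graph.gptq import gptq_mode

# calib_batches: a small number of pre-computed calibration inputs (few-shot).
with torch.no_grad(), gptq_mode(denoising_network,
                                create_weight_orig=False,
                                use_quant_activations=False,
                                return_forward_output=True) as gptq:
    for _ in range(gptq.num_layers):
        for batch in calib_batches:
            gptq.model(batch)
        gptq.update()
```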
Changes Made in this PR
Testing Summary
Risk Highlight
Checklist
This PR is targeted against the dev branch.