scopophobic commented Jan 27, 2026

Refactor FP8 dequantization and detection using registry pattern

Summary

Refactors the FP8 dequantization and detection logic to use a registry pattern, making FP8 support more explicit, maintainable, and extensible. This is a pure refactoring: no behavior changes, fully backward compatible.

What Changed

  • Added FP8_DEQUANT_REGISTRY with a @register_fp8_layer() decorator (sketched after this list)
  • Registered CompressedLinear and FP8Linear handlers
  • Refactored is_fp8_linear() to use registry-first detection
  • Refactored convert_fp8_layer_to_linear() to use registry dispatch
  • Updated convert_fp8_model_to_16b_model() to support all registered types
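
Below is a minimal, hypothetical sketch of the pattern. The names FP8_DEQUANT_REGISTRY, register_fp8_layer(), is_fp8_linear(), and convert_fp8_layer_to_linear() come from this PR, but the signatures and internals here are assumptions, not the actual implementation in auto_round/utils/model.py:

```python
from typing import Callable, Dict, Type

import torch

# Maps an FP8 layer class to a handler that dequantizes it into a
# plain 16-bit torch.nn.Linear.
FP8_DEQUANT_REGISTRY: Dict[Type[torch.nn.Module], Callable] = {}


def register_fp8_layer(layer_cls: Type[torch.nn.Module]):
    """Decorator factory: register a dequantization handler for an FP8 layer type."""

    def decorator(handler: Callable) -> Callable:
        FP8_DEQUANT_REGISTRY[layer_cls] = handler
        return handler

    return decorator


def is_fp8_linear(module: torch.nn.Module) -> bool:
    # Registry-first detection: a module counts as an FP8 linear only if
    # a handler is registered for its type, so detection and
    # dequantization cannot drift apart.
    return any(isinstance(module, cls) for cls in FP8_DEQUANT_REGISTRY)


def convert_fp8_layer_to_linear(module: torch.nn.Module, dtype=torch.bfloat16):
    # Registry dispatch: hand the module to the handler registered for
    # its type instead of walking a hard-coded isinstance chain.
    for cls, handler in FP8_DEQUANT_REGISTRY.items():
        if isinstance(module, cls):
            return handler(module, dtype=dtype)
    raise TypeError(f"No FP8 handler registered for {type(module).__name__}")
```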

Benefits

  • Explicit: Only registered layer types are supported
  • Consistent: Detection and dequantization stay in sync
  • Extensible: Adding a new FP8 type requires only a handler plus registration (see the example after this list)
  • Maintainable: Clear separation of concerns
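
To illustrate the extensibility point, this is roughly what registering a new type could look like. NewFP8Linear and its weight_scale attribute are hypothetical, used only to show the flow:

```python
@register_fp8_layer(NewFP8Linear)  # NewFP8Linear is a hypothetical FP8 layer class
def _dequant_new_fp8_linear(module, dtype=torch.bfloat16):
    linear = torch.nn.Linear(
        module.in_features,
        module.out_features,
        bias=module.bias is not None,
        dtype=dtype,
    )
    # Upcast the FP8 weight, apply its (assumed per-tensor) scale, then
    # cast down to the target 16-bit dtype.
    weight = (module.weight.to(torch.float32) * module.weight_scale).to(dtype)
    linear.weight.data.copy_(weight)
    if module.bias is not None:
        linear.bias.data.copy_(module.bias.to(dtype))
    return linear
```

With this single registration in place, is_fp8_linear() and convert_fp8_layer_to_linear() pick up the new type automatically, with no further changes.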

Files Changed

  • auto_round/utils/model.py (only file modified)

Type: Refactoring | Breaking: No | Migration: No

Signed-off-by: Adithyan Madhu <adithyanworkmail@gmail.com>
yiliu30 commented Jan 27, 2026

Hi @scopophobic, the CI is currently blocked due to the Transformers v5 upgrade. I’ll get back to you once it’s fixed.
