Fix QuantState and dict conversions#1729
Fix QuantState and dict conversions#1729cyyever wants to merge 8 commits intobitsandbytes-foundation:mainfrom
Conversation
f8ff22b to
5557581
Compare
60a2df6 to
5eb1e03
Compare
5eb1e03 to
398e915
Compare
|
Hi, |
|
@matthewdouglas Added |
398e915 to
4c95352
Compare
Signed-off-by: cyy <cyyever@outlook.com>
Signed-off-by: cyy <cyyever@outlook.com>
Signed-off-by: cyy <cyyever@outlook.com>
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
4c95352 to
5256287
Compare
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
TimDettmers
left a comment
There was a problem hiding this comment.
PR Review: #1729 — Fix QuantState and dict conversions
Bug fix: corrects two bugs in QuantState.as_dict() and one bug in QuantState.from_dict() that prevent the as_dict/from_dict round-trip from working when shape or quant_type are None (as produced by quantize_blockwise).
No blocking issues. The fix is correct, well-scoped, and includes regression tests.
Analysis
There are three distinct bugs being fixed:
-
as_dict()crashes whenself.shape is None—quantize_blockwise()createsQuantStatewithout settingshape(defaults toNone). The old code unconditionally callstuple(self.shape)on line 572, which raisesTypeError. The fix adds aNoneguard:tuple(self.shape) if self.shape is not None else None. -
as_dict(packed=True)crashes whenself.quant_type is None— The old code unconditionally concatenates"bitsandbytes__" + self.quant_typeon line 590, which raisesTypeErrorwhenquant_typeisNone(again fromquantize_blockwise). The fix conditionally appends the quant_type suffix only when it is notNone. -
from_dict()rejects valid unpacked dicts — The old validation logic at lines 523–528 usedif/elifwithnot len(qs_key)andlen(qs_key) != 1checks. When given an unpacked dict (fromas_dict(packed=False)) that hasquant_typeas a plain key but noquant_state.*packed key, theelifbranch fires becauselen(qs_key) == 0 != 1, raising a misleadingValueError. The fix restructures the conditionals so that packed-format validation only runs whenquant_typeis absent from the dict, correctly recognizing unpacked dicts.
All three fixes are narrowly scoped and correct.
Serialization Compatibility
This is the critical concern for QuantState changes. After careful analysis:
-
4-bit checkpoints (the common case):
quantize_4bit()always sets bothshapeandquant_type, so theas_dict(packed=True)path used byLinear4bit._save_to_state_dict()is unaffected. The packed key formatquant_state.bitsandbytes__nf4/quant_state.bitsandbytes__fp4is unchanged. -
Old checkpoints loading with new code: Old checkpoints use the packed format with
quant_typealways set.from_dict()still handles these correctly — the new validation logic is equivalent to the old logic when a packedquant_state.*key is present. -
New checkpoints loading with old code: Not a concern for the packed format since the output is identical for the 4-bit case. For blockwise QuantState (which is never serialized to disk via
_save_to_state_dict), the unpacked format now includesNonevalues forshapeandquant_type, but this dict is only used in-memory. -
Latent note: The
as_dict(packed=True)path withquant_type=Noneproduces a key"quant_state.bitsandbytes__"whichfrom_dictwould still reject (since"bitsandbytes__"is not invalid_qs_type_keys). This is not a practical issue because blockwise QuantState is never serialized via the packed path, but it is worth noting as a theoretical incomplete fix. Not blocking.
Test Coverage
The PR adds as_dict/from_dict round-trip coverage to four existing test functions:
test_dynamic_blockwise_quantization(two loops: default code and custom code)test_fp8_quanttest_4bit_quant
This covers both blockwise quantization (where shape/quant_type are None) and 4-bit quantization (where they are set). The tests use packed=False (the default), which is the path that was broken. The tests verify the round-trip by running dequantization after reconstruction, confirming the reconstructed QuantState is functionally correct.
The test does not cover the packed=True round-trip for blockwise QuantState, but as noted above, that code path is not used in practice.
Minor: Typo fix
The PR also fixes a documentation typo in the dequantize_4bit docstring: "The the absolute" -> "The absolute". This is fine.
- Security: Clear (no new imports, no network access, no filesystem writes, no obfuscation, no invisible Unicode characters)
- Downstream impact: None (4-bit checkpoint serialization format is unchanged; QuantState constructor signature unchanged; vLLM's
QuantState.from_dict()usage is unaffected) - Tests: Adequate (round-trip coverage for both blockwise and 4-bit paths)
- CI: Not triggered (fork PR — a maintainer needs to approve the workflow run before merge)
- Serialization: Compatible (no change to packed checkpoint format for 4-bit; blockwise QuantState is not persisted)
- Cross-PR conflicts: PR #1866 also modifies
bitsandbytes/functional.py(adds__getattr__to QuantState) but touches different lines. No direct conflict, though #1866's__getattr__callsself.as_dict(packed=True)which would be affected by theas_dictchange here. Since #1866 only exercises the 4-bit path (wherequant_typeis always set), there is no semantic conflict. - Commit hygiene: 7 commits including 2 merge commits. Recommend squash merge.
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
429db24 to
e59236a
Compare
Fix the failure of QuantState.as_dict then followed by QuantState.from_dict.
A reproducing example is
Before this PR, it failed with (on bitsandbytes 0.47.0):