
Commit 2e03d34

release 0.43.3
1 parent b64cbe3 commit 2e03d34

File tree: 3 files changed, +11 -2 lines changed

CHANGELOG.md (+9)

@@ -1,3 +1,12 @@
+### 0.43.3
+
+#### Improvements:
+
+- FSDP: Enable loading prequantized weights with bf16/fp16/fp32 quant_storage
+  - Background: This update, linked to [Transformer PR #32276](https://github.com/huggingface/transformers/pull/32276), allows loading prequantized weights with alternative storage formats. Metadata is tracked similarly to `Params4bit.__new__` post PR #970. It supports models exported with non-default `quant_storage`, such as [this NF4 model with BF16 storage](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16).
+  - Special thanks to @winglian and @matthewdouglas for enabling FSDP+QLoRA finetuning of Llama 3.1 405B on a single 8xH100 or 8xA100 node with as little as 256GB system RAM.
+
+
 ### 0.43.2
 
 This release is quite significant as the QLoRA bug fix has big implications for higher `seqlen` and batch sizes.
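As context for the `quant_storage` entry above, the sketch below shows how a non-default 4-bit storage dtype is typically requested through Transformers' `BitsAndBytesConfig`. The model ids are placeholders chosen for illustration, and argument availability depends on having a sufficiently recent transformers release; treat this as a minimal sketch of the workflow, not the exact code path changed by this commit.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 quantization with bf16 quant_storage: the packed 4-bit weights live in a
# bf16-typed buffer, which matches the uniform dtype FSDP expects when it
# flattens parameters for sharding.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,
)

# Placeholder model id; any causal LM checkpoint is handled the same way.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# A checkpoint exported from such a model (e.g. the NF4/BF16 export linked in
# the changelog entry) carries its quantization config, so it can be reloaded
# directly; that reload path is what this release enables on the bitsandbytes side:
# model = AutoModelForCausalLM.from_pretrained(
#     "hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16",
#     torch_dtype=torch.bfloat16,
# )
```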

bitsandbytes/__init__.py (+1 -1)

@@ -21,4 +21,4 @@
     "optim.optimizer.MockArgs": False,
 }
 
-__version__ = "0.43.3.dev"
+__version__ = "0.43.3"
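Following the version bump above, the installed package reports the release number at runtime. A minimal check, assuming 0.43.3 is the version installed:

```python
import bitsandbytes as bnb

# __version__ is the string set in bitsandbytes/__init__.py by this commit.
print(bnb.__version__)  # expected output: 0.43.3
```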

setup.py (+1 -1)

@@ -25,7 +25,7 @@ def has_ext_modules(self):
 
 setup(
     name="bitsandbytes",
-    version="0.43.3.dev",
+    version="0.43.3",
     author="Tim Dettmers",
     author_email="[email protected]",
     description="k-bit optimizers and matrix multiplication routines.",
