Release v0.12.1 · Xilinx/brevitas

Highlights

New / Updated PTQ Algorithms:
- Qronos support #1311
- Fixes / improvements to rotation equalization #1310, #1312
- DDP-like bias correction for SDXL (experimental) #1342
Improved layer support:
- Quantization of SDPA without FX #1299
New export flows:
- Initial SHARK Export support: #1300
- Initial GGUF Export: #1291
Improved examples:
- Qronos examples (paper) #1326
- "Benchmark" experiments for stable diffusion, imagenet #1281
- Post-training model expansion examples (paper) #1355
Allow signed scales #1308
QONNX export with dynamo=True #1234

What's Changed

Fix (setup): solve incompatibility between isort and yapf by @Giuseppe5 in #1293
Feat (graph/hadamard): 152 had support by @Giuseppe5 in #1295
feat (ex/llm): fully parametrise attention quantization by @nickfraser in #1287
Feat (ex/llm): "auto" dtype by @pablomlago in #1301
Setup: update transformers version by @Giuseppe5 in #1304
Feat (brevitas_examples/llm): quant SDPA without FX by @Giuseppe5 in #1299
Fix (brevitas_examples/llm): update README and yaml by @Giuseppe5 in #1305
Setup: temporary pin pytest by @Giuseppe5 in #1307
Feat (brevitas_examples/llm): configurable expansion step by @Giuseppe5 in #1280
Versioning support in documentation by @Giuseppe5 in #1298
Feat (graph/rotate): improve R2 region in SDPA by @Giuseppe5 in #1310
Docs: improve docs build by @Giuseppe5 in #1314
Fix (graph/rotation): rotation on subset of channels for SDPA by @Giuseppe5 in #1312
Feat (brevitas_examples/llm): GGUF export by @Giuseppe5 in #1291
Setup: bump torch version by @Giuseppe5 in #1205
Fix (ex/imagenet): add forward pass in imagenet ptq example by @Giuseppe5 in #1316
Feat (qronos): initial implementation of Qronos by @i-colbert in #1311
Fix (graph/equalize): correct class check during rotation merging by @Giuseppe5 in #1317
docs (core): Typo fix in docstring by @nickfraser in #1318
Fix (llm): removing duplicate set_seed function by @i-colbert in #1319
Fix (ex/common): save scales during optimization by @pablomlago in #1313
Fix (llm): compatibility with non-uniform RMSNorm shapes by @i-colbert in #1324
Fix (graph/gpxq): fix memory leak with weight_orig by @Giuseppe5 in #1325
Shark LLM export by @Giuseppe5 in #1300
Feat (scaling): rescaled min-max scaling and zero point by @i-colbert in #1320
Fix (gguf): derived modify_tensors() returns generator by @i-colbert in #1329
Fix (gpxq): device management of weight_orig for GPxQ by @i-colbert in #1330
Fix (gguf): resolving zero point permutation issue with LlamaModel by @i-colbert in #1332
Feat (examples): refactor imagenet and stable_diffusion entrypoints by @pablomlago in #1281
Feat: skipping rotation optimization with load_checkpoint by @i-colbert in #1331
Feat (core): Remove assumptions on positiveness of scales by @pablomlago in #1308
Fix (export/qonnx): Add export support with dynamo=True by @nickfraser in #1234
Fix (core/ops_ste): preserve dtype during clamp by @Giuseppe5 in #1340
Fix (eq/rotation): find value if passed as kwarg by @nickfraser in #1338
Feat (ex): tests Stable Diffusion and ImageNet by @pablomlago in #1339
Feat (ex/sdxl): DDP-like bias correction for SDXL by @pablomlago in #1342
Feat (graph): Minor refactoring layerwise_layer_handler by @pablomlago in #1335
Fix (torch_utils): remove deprecated functions by @Giuseppe5 in #1344
Fix (setup): test against latest 2.1 torch by @Giuseppe5 in #1348
Rotation fix by @Giuseppe5 in #1334
Docs (qronos): adding docs and configs by @i-colbert in #1326
Feat: Added ONNX export to BNN-PYNQ example by @nickfraser in #916
Fix (deps): set accelerate<1.10 by @nickfraser in #1352
Fix (copyright): Fix some missing copyright headers by @nickfraser in #1353
Fix (ex/llm/benchmark): import error by @nickfraser in #1356
Setup: temporarily pin diffusers version by @Giuseppe5 in #1360
Fix (core/scaling): handle edge cases with signed scale by @Giuseppe5 in #1347
Fix (core/stochastic_round): adjust stochastic round device by @Giuseppe5 in #1359
Feat (papers): expansion paper configs by @Giuseppe5 in #1355
Fix (docs): correct link to docs in initial README by @Giuseppe5 in #1361
Fix (setup.py): Update author and contact in setup.py by @nickfraser in #1365
Docs (README): Update maximum PyTorch version by @nickfraser in #1366
Docs (getting started): Typo fix in example by @nickfraser in #1367
requirements: Updated PyTorch, python versions by @nickfraser in #1370
Feat (core/scaling): Add option to restrict the output of (scale / threshold) by @nickfraser in #1369
deps (ex) update accelerate version by @nickfraser in #1371

Full Changelog: v0.12.0...v0.12.1
