v0.5.0
Highlights
- refine AutoRound format inference: support 2-, 3-, 4-, and 8-bit quantization and the Marlin kernel, and fix several bugs in the auto-round format
- support xpu in tuning and inference by @wenhuach21 in #481
- support for more VLMs by @n1ck-guo in #390
- change quantization method name and make several refinements by @wenhuach21 in #500
- support rtn via iters==0 by @wenhuach21 in #510
- fix bug of mix calib dataset by @n1ck-guo in #492
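With iters==0, the tuner skips its iterative optimization and falls back to plain round-to-nearest (RTN) quantization. The idea can be sketched as per-group asymmetric RTN in pure Python; the helper name and simplified flat-list layout below are illustrative only, not the auto-round API:

```python
def rtn_quantize(weights, bits=4, group_size=128):
    """Round-to-nearest asymmetric quantization with one scale/zero-point
    per group, returning the dequantized values. Illustrative sketch only."""
    qmax = (1 << bits) - 1
    out = []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax or 1.0  # guard against a constant group
        zero = round(-lo / scale)
        # quantize: scale, shift by zero-point, clamp to [0, qmax]
        q = [max(0, min(qmax, round(w / scale) + zero)) for w in group]
        # dequantize back to floats for comparison against the originals
        out.extend((v - zero) * scale for v in q)
    return out
```

Because there is no tuning loop, this path is much faster than the default signed-gradient optimization, at some cost in accuracy.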
What's Changed
- support xpu in tuning and inference by @wenhuach21 in #481
- add light UT, fix typos by @WeiweiZhang1 in #483
- bump version to v0.4.7 by @XuehaoSun in #487
- fix dataset combine bug by @wenhuach21 in #489
- fix llama 8b time cost by @WeiweiZhang1 in #490
- update 2bits acc results by @WeiweiZhang1 in #491
- fix bug of mix calib dataset by @n1ck-guo in #492
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #494
- [GGUF support step 3] patch for double quant by @n1ck-guo in #473
- refine inference backend/code step 1 by @wenhuach21 in #486
- refine inference step 2 by @wenhuach21 in #498
- change quantization method name and make several refinements by @wenhuach21 in #500
- fix bug of awq/gptq modules_to_not_convert by @n1ck-guo in #501
- use --tasks to control evaluation enabling by @wenhuach21 in #505
- fix gguf eval regression bug by @n1ck-guo in #506
- change to new api in readme by @wenhuach21 in #507
- fix setup issue on cuda machine by @wenhuach21 in #511
- support rtn via iters==0 by @wenhuach21 in #510
- fix critical bug of get_multimodal_block_names by @n1ck-guo in #509
- Update requirements-lib.txt by @yiliu30 in #513
- add group_size divisible check in backend by @wenhuach21 in #512
- support for more VLMs by @n1ck-guo in #390
- move gguf-dq test to cuda by @n1ck-guo in #520
- fix bs!=1 for gemma and MiniMax-Text-01 by @wenhuach21 in #515
- add regex support in layer_config setting by @wenhuach21 in #519
- patch for vlm by @n1ck-guo in #518
- rename backend to packing_format in config.json by @wenhuach21 in #521
- fix example's model_dtype by @WeiweiZhang1 in #523
- rm fp16 export in autoround format by @wenhuach21 in #525
- update convert_hf_to_gguf to support more models by @n1ck-guo in #524
- fix light config by @WeiweiZhang1 in #526
- fix typos, add model card link for VLMs by @WeiweiZhang1 in #527
- add backend readme by @wenhuach21 in #528
- update mllm readme by @WeiweiZhang1 in #530
- fix bug of cuda ut by @n1ck-guo in #532
- fix inference issue by @wenhuach21 in #529
- update readme by @wenhuach21 in #531
- refine readme by @WeiweiZhang1 in #536
- fix cuda ut by @n1ck-guo in #537
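The regex support added to layer_config setting (#519) lets one pattern target a whole family of layers instead of listing each name. A hypothetical sketch of the matching idea follows; the function name and config layout are illustrative, not the library's actual API:

```python
import re

def resolve_layer_config(layer_names, layer_config):
    """Map each concrete layer name to the first regex entry that matches it.
    Illustrative sketch only -- not the auto-round implementation."""
    resolved = {}
    for name in layer_names:
        for pattern, cfg in layer_config.items():
            if re.search(pattern, name):
                resolved[name] = cfg
                break  # first matching pattern wins
    return resolved
```

For example, a config such as `{r"mlp\.down_proj": {"bits": 8}, r"^lm_head$": {"bits": 16}}` would keep the head in higher precision while applying 8-bit settings to every down-projection layer.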
Full Changelog: v0.4.7...v0.5.0