SynapseAI v1.23
- Upgrade to SynapseAI v1.23 8178999 @astachowiczhabana
Transformers v4.55
- Upgrade to Transformers v4.55 #2209 @regisss
- [Transformers 4.55] Update typing #2215 @pbielak
- [Transformers 4.55] Use hf auth login #2216 @pbielak
- [Transformers 4.55] Remove UTF-8 coding header #2217 @pbielak
- [Transformers 4.55] Do not use .keys() on dicts #2218 @pbielak
- [Transformers 4.55] Fix typos #2219 @pbielak
- [Transformers 4.55] Use huggingface.co instead of arxiv.org #2220 @pbielak
- Fix Wav2Vec after Transformers upgrade #2248 @astachowiczhabana
- Fix GenerationConfig issue caused by pipeline init changes in the Transformers update #2266 @mengker33
- [mllama] Minor fixes after transformers upgrade #2272 @ugolowic
- Backport GPTBigCodeAttention from v4.53.0 to avoid major refactoring. #2271 @AKloniecki
- [opt] Fix in attention after transformers upgrade #2276 @ugolowic
- Fix a few Transformers 4.55 issues #2282 @regisss
- Fix MLlama model hierarchy #2290 @astachowiczhabana
- Fix next_decoder_cache formula based on use_new_cache after Transformers 4.55 upgrade #2284 @karol-brejna-i
- Skip MiniCPM3-4B model tests, as it doesn't currently work with the new Transformers version, even in HF #2306 @AKloniecki
- Disable image processor loading for THUDM/glm-4v-9b #2315 @pbielak
- Fix SpeechT5 generation pipeline for compatibility with transformers==4.55 #2308 @gplutop7
- Fix for visual-question-answering example #2318 @ugolowic
- Refactor COCO dataset handling and fix GaudiCLIPAttention shape mismatch after migration to Transformers 4.55 and Datasets 4.0 #2317 @gplutop7
- Upgrade Whisper Gaudi integration for Transformers 4.55 and migrate dataset to regisss/common_voice_11_0_hi (datasets 4.0.0) #2324 @gplutop7
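
Several of the v4.55 housekeeping PRs above (#2215, #2218) are mechanical Python modernizations applied across the codebase. A minimal sketch of both patterns, using an illustrative function that is not taken from the repo:

```python
def count_tokens(vocab: dict[str, int]) -> int:
    """Sum the counts in a vocabulary mapping."""
    # Built-in generics like dict[str, int] replace typing.Dict (#2215).
    total = 0
    # Iterating a dict yields its keys directly, so `.keys()` is redundant
    # (#2218): `for token in vocab.keys():` becomes `for token in vocab:`.
    for token in vocab:
        total += vocab[token]
    return total

print(count_tokens({"hello": 2, "world": 3}))  # 5
```

Both rewrites are behavior-preserving; they only drop redundant syntax.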
GPT-OSS
- Enable GPT-OSS #2214 @schoi-habana
CogVideoX1.5
- Enable cogvideox1.5 and image2video pipeline #2149 @wenbinc-Bin
Model optimizations
- Add support for --dynamo_allow_unspec_int_on_nn_module option in text generation tests #2200 @gplutop7
- Mixtral: drop training-branching hack for SFT segfault & add ZeRO-3 leaf utility #2185 @yafshar
- Add flex attention flags and args to Llama on Habana #2246 @AKloniecki
- Enable --dynamo_allow_unspec_int_on_nn_module by default in text-generation example #2293 @gplutop7
- Support FP8 quantization for deepseek_v3/r1 #1907 @skavulya
- Improve perf for torch.compile scenarios #2303 @astachowiczhabana
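
The `--dynamo_allow_unspec_int_on_nn_module` option added in #2200 and enabled by default in #2293 is a command-line flag of the text-generation example. A hedged invocation sketch — the model name and the other arguments are placeholders based on that example's usual CLI, not a recommended configuration:

```shell
# Illustrative only: model and generation arguments are placeholders.
python run_generation.py \
  --model_name_or_path meta-llama/Llama-2-7b-hf \
  --torch_compile \
  --dynamo_allow_unspec_int_on_nn_module \
  --max_new_tokens 128
```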
Other
- Add Qwen3 classification and a classification test #2150 @tianyuan211
- Use LinearAllreduce flag for qwen2_moe and qwen3_moe #2143 @ranzhejiang
- fea(gemma2): clean up old code #2212 @imangohari1
- Respect --dataset_max_samples when using --mlcommons_dataset #2223 @pbielak
- Make mbxp_evaluation setup script non-interactive #2225 @pbielak
- Ensure decoder_only tests are not performed on encoder-only models. #2226 @AKloniecki
- Add system_instruction support for lm_eval in OH #2224 @srajabos
- Add slow test for installs and update example requirements (peft) #2227 @gplutop7
- Add configurable DeepSpeed installation via DEEPSPEED_SPEC env var in Makefile #2229 @gplutop7
- fea(): workaround for run_clm config imports #2232 @imangohari1
- Bump lm_eval to 0.4.9.1 and add support for new args #2228 @12010486
- Synchronize processes when saving model #2230 @pbielak
- Upgrade numba to 0.61.0 #2237 @astachowiczhabana
- Respect link on release branch #2244 @astachowiczhabana
- Fix style after ruff upgrade #2253 @astachowiczhabana
- More accelerate upstream #1979 @IlyasMoutawwakil
- Remove PT_HPU_LAZY_ACC_PAR_MODE #2245 @astachowiczhabana
- Fix: Convert np.float64 to native float in memory logging #2242 @yafshar
- Fix: INC FP8 quantization compatibility #2247 @yafshar
- [image-classification] README example fix #2236 @ugolowic
- Fix CWE-476 (null pointer dereference) in llava, gpt2, falcon, cohere, all_model #2208 @karol-brejna-i
- fea(pytests): Added gemma/gemma2 HF pytests #2213 @imangohari1
- Readme to go along with system instructions for lm_eval #2239 @srajabos
- Disable SDP on BF16 default for generic diffusers HPU support #2251 @dsocek
- Fixed CWE-561 Dead Code for "next_decoder_cache" Pattern #2255 @karol-brejna-i
- Fix CWE-561 Dead Code Vulnerability related to use_new_cache = False #2254 @karol-brejna-i
- Fix for eos in eager mode #2197 @12010486
- Fix Habana profiler crash by making start() idempotent #2222 @gplutop7
- Improve lm_eval static generation #2241 @12010486
- Bump datasets to ≥4.0.0 across examples; keep LM-Eval on <4.0.0 #2250 @gplutop7
- Disable sporadically failing test to unblock CI. #2256 @AKloniecki
- Fix HabanaProfile unit tests to align with idempotent start behavior #2258 @astachowiczhabana
- Check if generation_config has attr 'static_shapes' before accessing value #2264 @AKloniecki
- Add measurements postprocessing script #2260 @AKloniecki
- ASR: Disable torchcodec in datasets and load audio via soundfile (fix FLAC decode errors on HPU) #2261 @gplutop7
- fea(ci): Updated the fixtures based on changes in PR#2246 #2262 @imangohari1
- Revert transformers autocast_smart_context_manager #2265 @IlyasMoutawwakil
- Fix import of check_torch_load_is_safe #2268 @gplutop7
- Revert prepare_model to keep FSDP upcast disablement #2270 @IlyasMoutawwakil
- Fix QLoRA test #2269 @vivekgoe
- Fast DDP in prepare_model #2275 @IlyasMoutawwakil
- Enable FSDP upcast #2280 @IlyasMoutawwakil
- Fixes for Device Mismatch and Configuration Conflict #2283 @astachowiczhabana
- Fix missing AutoModel register for llama #2286 @astachowiczhabana
- Migrate common_language dataset to Datasets v4.0.0 and switch audio decoding to SoundFile #2287 @gplutop7
- Fix text classification for datasets 4.0.0 #2292 @gplutop7
- Patch apply fp8 to avoid issues with deepspeed+fp8 #2289 @IlyasMoutawwakil
- Align evaluate version to 0.4.5 in question-answering requirements #2295 @gplutop7
- Change static to relative links in READMEs and docs. #2263 @AKloniecki
- Change static to relative in mdx and ipynb #2298 @astachowiczhabana
- Add --attn_implementation cmd argument in text-generation/run_generation script. #2299 @AKloniecki
- Remove dead code from modeling_qwen3_moe.py and modeling_qwen2.py #2302 @karol-brejna-i
- Switch Matthijs/cmu-arctic-xvectors to regisss/cmu-arctic-xvectors (Parquet version) #2305 @gplutop7
- [wav2vec2] Bring back gaudi fused sdpa attention #2301 @ugolowic
- Remove repeating flash_attention options. #2312 @ugolowic
- Fix run_clm.py for streaming datasets #2309 @pbielak
- Initial commit of Optimum-Habana Container and Readme #2297 @vdwarakn
- Switch RAFT dataset source from ought/raft to regisss/raft for compatibility with datasets>=4.0.0 #2310 @gplutop7
- Fix Optimum dependency so that optimum[habana] is correctly installed 8f02490 @regisss
- Set missing baseline values for text_generation tests. #2316 @AKloniecki
- Align distributed_runner with openmpi 5.0 #2320 @astachowiczhabana
- Fix Accelerate==1.10.1 #2322 @astachowiczhabana
- Remove flash attention flags from run_clm.py #2314 @pbielak
- Fix doc build in the main branch #2327 @regisss
- Fix device index parsing in accelerator #2328 @pbielak
- Pin optimum version (due to GPTQ deprecation) #2359 @karol-brejna-i
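
Two of the entries above (#2222, #2258) make the Habana profiler's start() idempotent, so a repeated call is a no-op rather than a crash. A minimal sketch of the guard pattern, using a toy class rather than the actual HabanaProfile implementation:

```python
class Profiler:
    """Toy profiler whose start() is safe to call more than once."""

    def __init__(self):
        self.running = False
        self.start_count = 0  # how many times profiling actually began

    def start(self):
        # Idempotent: calling start() while already running is a no-op
        # instead of restarting (or crashing) the underlying profiler.
        if self.running:
            return
        self.running = True
        self.start_count += 1

    def stop(self):
        # Symmetric guard: stop() on a stopped profiler is also a no-op.
        if not self.running:
            return
        self.running = False

p = Profiler()
p.start()
p.start()  # second call is absorbed by the guard
print(p.start_count)  # 1
```

The unit tests updated in #2258 assert exactly this behavior: repeated start() calls leave the profiler in the same state as a single call.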