v1.20.0: SynapseAI v1.23, Transformers v4.55, GPT-OSS, CogVideoX1.5

Released by @regisss on 15 Jan · 136 commits to main since this release

SynapseAI v1.23

This release has been validated with SynapseAI v1.23.

Transformers v4.55

  • Upgrade to Transformers v4.55 #2209 @regisss
  • [Transformers 4.55] Update typing #2215 @pbielak
  • [Transformers 4.55] Use hf auth login #2216 @pbielak (see the login sketch after this list)
  • [Transformers 4.55] Remove UTF-8 coding header #2217 @pbielak
  • [Transformers 4.55] Do not use .keys() on dicts #2218 @pbielak
  • [Transformers 4.55] Fix typos #2219 @pbielak
  • [Transformers 4.55] Use huggingface.co instead of arxiv.org #2220 @pbielak
  • Fix Wav2Vec after Transformers upgrade #2248 @astachowiczhabana
  • Fix GenerationConfig issue caused by pipeline init changes in the Transformers update #2266 @mengker33
  • [mllama] Minor fixes after transformers upgrade #2272 @ugolowic
  • Backport GPTBigCodeAttention from v4.53.0 to avoid major refactoring #2271 @AKloniecki
  • [opt] Fix in attention after transformers upgrade #2276 @ugolowic
  • Fix a few Transformers 4.55 issues #2282 @regisss
  • Fix MLlama model hierarchy #2290 @astachowiczhabana
  • Fix next_decoder_cache formula based on use_new_cache after Transformers 4.55 upgrade #2284 @karol-brejna-i
  • Skip MiniCPM3-4B model tests, as the model doesn't currently work with the new Transformers version, even in plain HF #2306 @AKloniecki
  • Disable image processor loading for THUDM/glm-4v-9b #2315 @pbielak
  • Fix SpeechT5 generation pipeline for compatibility with transformers==4.55 #2308 @gplutop7
  • Fix for visual-question-answering example #2318 @ugolowic
  • Refactor COCO dataset handling and fix GaudiCLIPAttention shape mismatch after migration to Transformers 4.55 and Datasets 4.0 #2317 @gplutop7
  • Upgrade Whisper Gaudi integration for Transformers 4.55 and migrate dataset to regisss/common_voice_11_0_hi (datasets 4.0.0) #2324 @gplutop7
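
PR #2216 above switches the examples from the deprecated `huggingface-cli login` spelling to the newer `hf auth login` CLI. As a minimal sketch, the programmatic equivalent from `huggingface_hub` looks like this:

```python
# Programmatic counterpart of the `hf auth login` CLI used by the examples.
from huggingface_hub import login

login()  # prompts for a token interactively; or login(token="hf_...") non-interactively
```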

GPT-OSS
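
A minimal sketch of loading GPT-OSS through the standard Transformers v4.55 path on Gaudi; it assumes `adapt_transformers_to_gaudi` covers this architecture, and real runs would typically go through the text-generation example script instead:

```python
import habana_frameworks.torch.core as htcore  # noqa: F401  # registers the "hpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

adapt_transformers_to_gaudi()  # swap in Gaudi-optimized model implementations

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", torch_dtype="auto").to("hpu")

inputs = tokenizer("Hello from Gaudi:", return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```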

CogVideoX1.5
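
A hedged sketch of text-to-video generation with CogVideoX1.5; it uses the upstream diffusers `CogVideoXPipeline` moved to the HPU device, since the release notes don't name the Gaudi-side pipeline class here:

```python
import torch
import habana_frameworks.torch.core as htcore  # noqa: F401  # registers the "hpu" device
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX1.5-5B", torch_dtype=torch.bfloat16
).to("hpu")

frames = pipe(
    prompt="A panda playing guitar in a bamboo forest",
    num_frames=81,
    num_inference_steps=50,
).frames[0]
export_to_video(frames, "output.mp4", fps=16)
```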

Model optimizations

  • Add support for --dynamo_allow_unspec_int_on_nn_module option in text generation tests #2200 @gplutop7 (see the Dynamo config sketch after this list)
  • Mixtral: drop training-branching hack for SFT segfault and add ZeRO-3 leaf utility #2185 @yafshar (see the ZeRO-3 sketch after this list)
  • Add flex attention flags and args to Llama on Habana #2246 @AKloniecki
  • Enable --dynamo_allow_unspec_int_on_nn_module by default in text-generation example #2293 @gplutop7
  • Support FP8 quantization for deepseek_v3/r1 #1907 @skavulya
  • Improve perf for torch.compile scenarios #2303 @astachowiczhabana
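
The `--dynamo_allow_unspec_int_on_nn_module` option in #2200/#2293 presumably maps onto the `torch._dynamo` config knob of the same name. A minimal sketch of what that knob does, under that assumption: it keeps plain integer attributes on `nn.Module`s unspecialized, so mutating them doesn't invalidate the compiled graph:

```python
import torch
import torch._dynamo

torch._dynamo.config.allow_unspec_int_on_nn_module = True

class Counter(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.step = 0  # plain int attribute; specialized as a constant by default

    def forward(self, x):
        self.step += 1
        return x + self.step

model = torch.compile(Counter())
for _ in range(3):
    model(torch.ones(2))  # step changes every call; no recompilation with the flag set
```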
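
The ZeRO-3 leaf utility in #2185 refers to DeepSpeed's `set_z3_leaf_modules` API; the model class used below is a plausible target, not a quote from the PR. Marking Mixtral's sparse-MoE block as a ZeRO-3 leaf stops DeepSpeed from partitioning parameters inside it, which sidesteps gather problems when only a subset of experts runs in a step:

```python
from deepspeed.utils import set_z3_leaf_modules
from transformers import AutoModelForCausalLM
from transformers.models.mixtral.modeling_mixtral import MixtralSparseMoeBlock

model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B-v0.1")
# Treat each sparse-MoE block as a single ZeRO-3 unit (no partitioning inside it)
set_z3_leaf_modules(model, [MixtralSparseMoeBlock])
```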

Other