Release v0.2.0 · huggingface/optimum-neuron

What's Changed

Cache granite and phi4 models by @dacorvo in #809
Refactor hub neuronx cache by @dacorvo in #829
Add Whisper for the task "automatic-speech-recognition" without KV cache by @JingyaHuang in #789
Add support for Modern BERT by @JingyaHuang in #818
Set task to none for multi models cache entry by @dacorvo in #832
ci: add cv2 to workaround transformers spurious import by @dacorvo in #834
Refactor decoder modeling by @dacorvo in #835
Refactor decoder export by @dacorvo in #837
Add decoder custom modeling for inference based on NxD by @dacorvo in #840
Activate continuous batching for Llama on NxD by @dacorvo in #848
TGI integration by @dacorvo in #855
Avoid loading weights when exporting an NxD model using the CLI by @dacorvo in #860
test(speculation): do not load weights during export by @dacorvo in #861

Training remove gpt neo models support by @tengomucho in #807
chore(test): add test comparing Linear and RowParallelLinear outputs by @tengomucho in #814
More training tests updates by @tengomucho in #808
test(training): add flash attention test by @tengomucho in #824
Granite modeling for training by @tengomucho in #830
Cache Hub API Changes by @tengomucho in #836
Custom modeling for training by @michaelbenayoun in #801
🪨 Granite Training by @tengomucho in #845
Training granite warning flash attention by @michaelbenayoun in #849
Add Qwen3 modeling for training by @tengomucho in #850

latest available tgi dlc uri by @pagezyhf in #812
Add guidelines on EC2 creation with the DLAMI by @pagezyhf in #795
Add per service section in tutorials and a first example for tutorial > inference > SageMaker by @pagezyhf in #796
Mixtral SageMaker Inference tutorial by @pagezyhf in #820
spelling nit in pipelines.mdx by @jimburtoft in #823
Initial PR for the documentation refactoring by @JingyaHuang in #791
training dlc doc by @pagezyhf in #844
Adding environment options explanation by @jimburtoft in #798
Update the list of supported LLM models by @dacorvo in #859
Update Llama benchmarks by @dacorvo in #858
feat: Add continuous pre-training example for SageMaker HyperPod by @Captainia in #842
Fix typos by @omahs in #846

Fix broken cache for traced models & fix runtime error of diffusion models when batch_size > 1 by @JingyaHuang in #811
Fix doc ci by @JingyaHuang in #838

Full Changelog: v0.1.0...v0.2.0