v0.67.0-dev20260210
Pre-release
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not the versions on the main branch. There may be differences between the latest main and the previous release.
The changelog follows, showing the changes since the last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/21846858892
📦 Uncategorized
- [skip ci] Disable pytest timeout for Stable Diffusion device perf tests
- PR: #37301
- Implement KV store-and-forward chain optimization for non-causal SDPA
- PR: #37285
- [GPT-OSS] Add high throughput model to vLLM nightly
- PR: #37192
- Update ResNet50 batch_size=32 performance target for Blackhole
- PR: #37373
- Improve tracing tooling to provide the whole inputs for ttnn operations
- PR: #35924
- SDXL Relax encoder2 perf targets
- PR: #37375
- chore: update LLK submodule to 7e7cf4f
- PR: #37358
- Set medgemma's max_prefill_chunk_size the same as gemma-3
- PR: #37305
- Fix setuptools pkg_resources issue
- PR: #37417
- Update SDXL VAE device perf targets after SDPA KV chain forwarding optimization
- PR: #37382
- In post sdpa op, mcast to 13x10 grid
- PR: #37427
- Removed program cache when no_dispatch
- PR: #36772
- Fix FP32 precision loss in untilize for wide tensors
- PR: #37333
- [skip ci] Remove Fabric Sanity Benchmark from BH post-commit tests
- PR: #37430
- DeepSeek Blitz MOE routed expert
- PR: #37294
- Bump blackhole deepseek blitz op tests timeout
- PR: #37369
- Add fix for Qsr packet_tag breaking compilation
- PR: #37428
- Use up-to-date main() declaration in all kernels & docs
- PR: #37147
- [skip ci] #0: remove mamba from perf models yaml
- PR: #37387
- Increase coverage of unpack reconfig
- PR: #36987
- Add 4 chunks scatter_write and extra ring optimization to all_to_all_async_generic
- PR: #37339
- Optimize decode for Llama3-70B for TG
- PR: #37359
- [skip ci] Optimize pkg-resources patch
- PR: #37438
- Add the accuracy_tips tech report
- PR: #37146