v0.67.0-dev20260210

Pre-release

github-actions released this 11 Feb 00:55
· 192 commits to main since this release
Immutable release. Only release title and notes can be modified.
807ee3d

Note

If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not the versions on the main branch. There may be differences between the latest main and the previous release.

The changelog follows, showing the changes since the last release.

This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/21846858892

📦 Uncategorized

  • [skip ci] Disable pytest timeout for Stable Diffusion device perf tests
  • Implement KV store-and-forward chain optimization for non-causal SDPA
  • [GPT-OSS] Add high throughput model to vLLM nightly
  • Update ResNet50 batch_size=32 performance target for Blackhole
  • Improve tracing tooling to provide the whole inputs for ttnn operations
  • SDXL Relax encoder2 perf targets
  • chore: update LLK submodule to 7e7cf4f
  • Set medgemma's max_prefill_chunk_size the same as gemma-3
  • Fix setuptools pkg_resources issue
  • Update SDXL VAE device perf targets after SDPA KV chain forwarding optimization
  • In post sdpa op, mcast to 13x10 grid
  • Removed program cache when no_dispatch
  • Fix FP32 precision loss in untilize for wide tensors
  • [skip ci] Remove Fabric Sanity Benchmark from BH post-commit tests
  • DeepSeek Blitz MOE routed expert
  • Bump blackhole deepseek blitz op tests timeout
  • Add fix for Qsr packet_tag breaking compilation
  • Use up-to-date main() declaration in all kernels & docs
  • [skip ci] #0: remove mamba from perf models yaml
  • Increase coverage of unpack reconfig
  • Add 4 chunks scatter_write and extra ring optimization to all_to_all_async_generic
  • Optimize decode for Llama3-70B for TG
  • [skip ci] Optimize pkg-resources patch
  • Add the accuracy_tips tech report