v0.36.1
Metal
Wormhole Bringup
- Added some APIs to query device ethernet connectivity.
- Added the first phase of ethernet data movement support; basic unit tests pass on N300.
API Changes
Notes not available.
Tools - Profiler
- Added device-only and host-only profiling options to the profile_this.py script
- Added examples for profiling device programs under fast dispatch
Tools - Watcher
- Added kernel names/paths to watcher log file
Extra features
Notes not available.
Eager/ttNN
Infrastructure
- Added initial implementation of TTNN APIs
- Added functions to interface with torch: from_torch, to_torch
- Added functions to move tensor to/from device: to_device, from_device
- Added functions to change the layout of the tensor: to_layout
- Added matmul, add, sub, mul, reshape, permute and softmax operations
- Implemented Multi-Head-Attention using TTNN APIs
- Added 3 tutorials to showcase TTNN
- Updated Documentation to describe TTNN and its APIs
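To illustrate how the operations listed above (matmul, mul, reshape, permute, softmax) compose into multi-head attention, here is a minimal reference sketch in NumPy. It is not the TTNN implementation and does not call any TTNN API; the function names, argument names, and the omission of biases and masking are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Reference multi-head attention without biases or masking (illustrative)."""
    batch, seq, hidden = x.shape
    head_dim = hidden // num_heads

    def split_heads(t):
        # reshape + permute: (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        return np.transpose(t.reshape(batch, seq, num_heads, head_dim), (0, 2, 1, 3))

    # matmul: project the input into queries, keys, and values.
    q = split_heads(np.matmul(x, w_q))
    k = split_heads(np.matmul(x, w_k))
    v = split_heads(np.matmul(x, w_v))

    # matmul + mul: scaled dot-product scores, then softmax over the key axis.
    scores = np.matmul(q, np.transpose(k, (0, 1, 3, 2))) * (1.0 / np.sqrt(head_dim))
    probs = softmax(scores, axis=-1)

    # matmul, then permute + reshape to merge the heads back together.
    context = np.matmul(probs, v)
    merged = np.transpose(context, (0, 2, 1, 3)).reshape(batch, seq, hidden)
    return np.matmul(merged, w_o)


# Hypothetical sizes: batch of 2, sequence length 4, hidden size 8, 2 heads.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 8))
w_q, w_k, w_v, w_o = (rng.standard_normal((8, 8)) for _ in range(4))
out = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=2)
```

The output has the same shape as the input, so the block can be stacked like any other transformer layer.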
Operations
The following on-device operators were added to the tt_lib.tensor module:
- interleave repeat
- triu
- tril
- rmsnorm
- groupnorm
- silu (updated to be a first-class unary operator)
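For readers unfamiliar with some of these operators, the sketch below gives plain-Python reference semantics for triu, tril, rmsnorm, and silu. It assumes the on-device operators follow the usual PyTorch-style definitions, which these notes do not confirm; the `diagonal` and `eps` parameters are illustrative, and the rmsnorm shown omits any learned scale. None of this is the tt_lib.tensor API itself.

```python
import math


def triu(matrix, diagonal=0):
    """Keep entries on or above the given diagonal; zero out the rest."""
    return [[v if c - r >= diagonal else 0 for c, v in enumerate(row)]
            for r, row in enumerate(matrix)]


def tril(matrix, diagonal=0):
    """Keep entries on or below the given diagonal; zero out the rest."""
    return [[v if c - r <= diagonal else 0 for c, v in enumerate(row)]
            for r, row in enumerate(matrix)]


def rmsnorm(row, eps=1e-6):
    """Divide each element by the root mean square of the row."""
    mean_square = sum(v * v for v in row) / len(row)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [v * scale for v in row]


def silu(x):
    """x * sigmoid(x), written to avoid overflow for large negative x."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
upper = triu(m)   # [[1, 2, 3], [0, 5, 6], [0, 0, 9]]
lower = tril(m)   # [[1, 0, 0], [4, 5, 0], [7, 8, 9]]
normed = rmsnorm([3.0, 4.0], eps=0.0)
```

After rmsnorm, the root mean square of the row is 1 (up to the effect of `eps`), which is the property the normalization is meant to enforce.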
Models
- For the BERT demo, added loading of cached, pre-processed weights (stored as TT tensors) to avoid converting Torch tensors to TT tensors.
- Added a ResNet demo that executes on TT hardware. The demo takes images from ImageNet and processes them in batches of 8.