v0.40.0

github-actions released this 09 Jan 20:01
· 18456 commits to main since this release

📦 Uncategorized

  • Opt LN_sharded and SMX_sharded
  • #1919: Turn existing allocator tests into gtests
  • Agrebenisan/fd perf opt
  • #3932: Rename unary op args which were input_a -> input, binary ops from input, other -> input_a, input_b
  • #3971: Fix TSLICE printing truncation when hitting MAX_COUNT
  • #0: Fix undefined variable error when running with watcher
  • #4141: Add GetPreferredNOCForDRAMRead, GetPreferredNOCForDRAMWrite and update all ops to use these apis
  • #3420: fix eth core init L1 bug
  • #0: Add ttnn founding engineers as CODEOWNERS of functional models
  • #0: Commonize logic between E2E and device perf functions/scripts. Enable assertions for device perf scripts/ci
  • #4073: Fix host-side hang when an invalid DPRINT WAIT command is running on the device
  • #0: Add tt-rkim as CODEOWNERS for setup_hugepages.py
  • #4003: implemented functional t5 model
  • #3003: commonized variable names across ttnn tests. Removed ttnn.experimental. Added ttnn.unary and commonized the import of ttl unary ops
  • #0: Delete extra text in first docs page about being added to repo
  • write watcher log to built/ folder rather than kernel subfolder
  • Add Batch>1 fix for matmul blocking API
  • #4231: improve unary add, sub, mul and div implementation in SFPU. Add complex polar operator
  • #3493: sharded tensor support
  • REVERT #4231: Fine-tune the unary ops to improve performance
  • #0: Move setup_hugepages.py to release assets
  • #0: (MINOR) Update VERSION to 0.40.0
  • #4301: Fix link to announcements in README
  • #4301: Replace some more instances of Metal w/ Metalium in docs
  • Llk refactor uplift
  • #0: Fix TT-Metalium docs link in get_performance.rst
  • #0: uplift in device code
  • #4176: uplift umd plus tt_metal changes
  • init fw once
  • Merge v2 of untilize_with_halo, maxpool, and conv ops for Resnet-50
  • Backward ops for Metalium - part-2
  • #4211: Assert that the number of hugepages is greater than or equal to the required number, rather than exactly equal to it
  • Update resnet readme
  • Add Run Instructions for BERT_large sharded in readme
  • Add batch 20 for resnet-50
  • #4376: Support mixed precision for eltwise binary with prescaling
  • Increase timeout of slow dispatch unit tests and switch to Y_M_D format for ops logs
  • #0: point umd to main, cosmetic change
  • New tilize and straightforward vec gen in matmul kernel examples
  • #4216: Enable DPrint slow dispatch testing
  • #4376: Call llk reconfig functions in compute kernel apis for WH
  • #4336: #4386: Fix interleaved_to_sharded writer waiting on incorrect amount of data for uneven shards
  • #1433: removed Device* and MemoryConfig from DeviceStorage
  • #0: Increase fast dispatch post commit timeout and shorten full regressions because we no longer need that much time
  • #4003: added ttnn.mean, ttnn.rsqrt and ttnn.pow and removed ttl use from ttnn_functional_t5. Updated ttnn.Tensor to store shape as ttnn.Shape
  • Aliu/load base erisc
  • #4399: add spell checker script for docs spellchecking
  • #2134: Uplift UMD
  • #0: fix memory leaks found in test_sfpu via valgrind
  • Revert "#4399: add spell checker script spellcheck.sh should be read…
  • #0: update llk.rst for minor ReST syntax
  • #2934: Make one CommandQueue and one HW CommandQueue (SysmemWriter) per device
  • #4003: convert ttl.tensor.Shape to tuple when using it in torch functions
  • #4211: Fix HP targeting issues in main from cq-per-device changes