v0.37.0
Metal
API Changes
- Top-level API to create a Program:

  ```cpp
  Program CreateProgram();
  ```
- `GetRuntimeArgs` now returns a reference to the underlying runtime args to allow for in-place updates. This results in noticeably better performance for host-bound workloads:

  ```cpp
  std::vector<uint32_t>& GetRuntimeArgs(const Program &program, KernelID kernel_id, const CoreCoord &logical_core);
  ```
- Two other variants for updating runtime arguments that give better host-side performance in certain situations:

  ```cpp
  void UpdateRuntimeArg(const Program &program, KernelID kernel, const std::variant<CoreCoord, CoreRange, CoreRangeSet> &core_spec, size_t offset, uint32_t value);
  void SetRuntimeArgs(const Program &program, KernelID kernel, const std::vector<CoreCoord> &core_spec, const std::vector<std::vector<uint32_t>> &runtime_args);
  ```

  (NOTE: `UpdateRuntimeArg` will be removed in the next release, as its use has been superseded by the other functions)
- `GetCircularBufferConfig` now returns a const reference:

  ```cpp
  const CircularBufferConfig &GetCircularBufferConfig(Program &program, CircularBufferID cb_handle);
  ```
- Circular buffer config parameters are now updated through three separate functions:

  ```cpp
  void UpdateCircularBufferTotalSize(Program &program, CircularBufferID cb_handle, uint32_t total_size);
  void UpdateCircularBufferPageSize(Program &program, CircularBufferID cb_handle, uint8_t buffer_index, uint32_t page_size);
  void UpdateDynamicCircularBufferAddress(Program &program, CircularBufferID cb_handle, const Buffer &buffer);
  ```
- Moved slow/host dispatch APIs to the `detail` namespace:

  ```cpp
  void LaunchProgram(Device *device, Program &program);
  void ReadFromBuffer(const Buffer &buffer, std::vector<uint32_t> &host_buffer);
  void WriteToBuffer(const Buffer &buffer, const std::vector<uint32_t> &host_buffer);
  ```
Tools - Profiler
- Updated the path for all profiler artifacts to be under the `generated/profiler` folder
ttNN
Infrastructure
- Introduced `ttnn.embedding` to facilitate word embeddings
- Added `preprocess_parameters` for generic conversion of torch parameters with caching
- Added `ttnn.experimental.gelu`
- Added `ttnn.experimental.layer_norm`
- Updated program hash to be `std::size_t` and significantly sped up its computation
Operations
- Splitting a tensor of shape [W, Z, Y, X] into two now supports the Y dimension in addition to the existing X dimension
- The trunc function has fallback support equivalent to torch.trunc
- Support for the power function with a non-integral exponent: `tt_lib.tensor.power_fp()`
- Support for the reshape operator on host for `ROW_MAJOR` layout
Models
Notes not available.