v0.35.0
Metal
Wormhole Bringup
- Extended gtests to run on all available devices in Wormhole systems.
- Single device tests passing on remote chips.
API Changes
- These 2 functions:
  uint32_t CreateSemaphore(Program &program, const CoreRange &core_range, uint32_t initial_value)
  uint32_t CreateSemaphore(Program &program, const CoreRangeSet &core_range_set, uint32_t initial_value)
  have been replaced by:
  uint32_t CreateSemaphore(Program &program, const std::variant<CoreRange, CoreRangeSet> &core_spec, uint32_t initial_value)
- These 3 functions:
  void SetRuntimeArgs(const Program &program, KernelID kernel, const CoreCoord &logical_core, const std::vector<uint32_t> &runtime_args)
  void SetRuntimeArgs(const Program &program, KernelID kernel, const CoreRange &core_range, const std::vector<uint32_t> &runtime_args)
  void SetRuntimeArgs(const Program &program, KernelID kernel, const CoreRangeSet &core_range_set, const std::vector<uint32_t> &runtime_args)
  have been replaced by:
  void SetRuntimeArgs(const Program &program, KernelID kernel, const std::variant<CoreCoord, CoreRange, CoreRangeSet> &core_spec, const std::vector<uint32_t> &runtime_args)
- These 2 functions:
  KernelID CreateDataMovementKernel(Program &program, const std::string &file_name, const std::variant<CoreCoord, CoreRange, CoreRangeSet> &core_spec, const std::optional<DataMovementConfig> &config = {})
  KernelID CreateComputeKernel(Program &program, const std::string &file_name, const std::variant<CoreCoord, CoreRange, CoreRangeSet> &core_spec, const std::optional<ComputeConfig> &config = {})
  have been replaced by:
  KernelID CreateKernel(Program &program, const std::string &file_name, const std::variant<CoreCoord, CoreRange, CoreRangeSet> &core_spec, const std::variant<DataMovementConfig, ComputeConfig> &config)
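The consolidation above relies on std::variant: one parameter accepts any of the core-specification types, and the callee dispatches on whichever one it holds. The following is a minimal, self-contained sketch of that pattern only. CoreCoord, CoreRange and CoreRangeSet here are simplified stand-ins invented for illustration, not the real tt_metal definitions, and num_cores is a hypothetical helper that does not exist in the API.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the tt_metal types; only their shape matters here.
struct CoreCoord { std::size_t x, y; };
struct CoreRange { CoreCoord start, end; };             // inclusive rectangle of cores
struct CoreRangeSet { std::vector<CoreRange> ranges; }; // assumed non-overlapping

using CoreSpec = std::variant<CoreCoord, CoreRange, CoreRangeSet>;

// Number of cores in one inclusive rectangle.
inline std::size_t count_cores(const CoreRange &r) {
    assert(r.end.x >= r.start.x && r.end.y >= r.start.y);
    return (r.end.x - r.start.x + 1) * (r.end.y - r.start.y + 1);
}

// A single entry point replaces three overloads: std::visit selects the
// branch that matches the alternative currently stored in core_spec.
inline std::size_t num_cores(const CoreSpec &core_spec) {
    return std::visit([](const auto &spec) -> std::size_t {
        using T = std::decay_t<decltype(spec)>;
        if constexpr (std::is_same_v<T, CoreCoord>) {
            return 1;
        } else if constexpr (std::is_same_v<T, CoreRange>) {
            return count_cores(spec);
        } else {
            std::size_t total = 0;
            for (const auto &r : spec.ranges) total += count_cores(r);
            return total;
        }
    }, core_spec);
}
```

Because std::variant converts implicitly from each of its alternatives, call sites that passed a CoreCoord, CoreRange or CoreRangeSet to the old overloads should keep compiling against a variant-based signature in most cases, for example num_cores(CoreRange{{0, 0}, {1, 3}}) in this sketch.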
Tools - Profiler
- Improved the profile_this.py log management strategy to avoid conservative log folder checks from profiling
Extra features
- Runtime Compute Args: Arguments can be sent to Compute Kernels at runtime in the same way as to DataMovement Kernels. The kernel uses the same get_arg_val<type>(<index>) call to retrieve them. The host uses the same tt_metal::SetRuntimeArgs(Program program, KernelID kernel, const std::variant<CoreCoord, CoreRange, CoreRangeSet> &core_spec, const std::vector<uint32_t> &runtime_args) call that it already uses to communicate with DataMovement Kernels.
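The runtime-argument flow described above can be pictured with a host-only model: the host writes a vector of 32-bit words, and the kernel reads each word back by index, reinterpreted as the requested type. This is a rough sketch under loud assumptions: g_runtime_args, set_runtime_args_mock, mock_get_arg_val and mock_compute_kernel are all hypothetical names made up for illustration, and none of this reflects how tt_metal actually stores or delivers arguments on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of one core's runtime-argument buffer (not the real mechanism).
static std::vector<uint32_t> g_runtime_args;

// Host-side stand-in for SetRuntimeArgs: stores the 32-bit argument words.
inline void set_runtime_args_mock(const std::vector<uint32_t> &runtime_args) {
    g_runtime_args = runtime_args;
}

// Kernel-side stand-in for get_arg_val<type>(<index>): returns the word at
// the given index, reinterpreted as T. Restricted to 32-bit types here.
template <typename T>
T mock_get_arg_val(std::size_t index) {
    static_assert(sizeof(T) == sizeof(uint32_t), "sketch handles 32-bit types only");
    assert(index < g_runtime_args.size());
    T value;
    std::memcpy(&value, &g_runtime_args[index], sizeof(T));
    return value;
}

// A pretend compute kernel that reads two runtime arguments by index.
inline uint32_t mock_compute_kernel() {
    const uint32_t num_tiles = mock_get_arg_val<uint32_t>(0);
    const uint32_t scale = mock_get_arg_val<uint32_t>(1);
    return num_tiles * scale;
}
```

The point of the model is that argument order is the whole contract: the index the kernel passes must match the position the host used in the runtime_args vector.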
Eager (Ops)
There have been no notable changes to communicate in this release.
Models
- Moved code that implements and tests models from tests/models to top level models folder. In the models folder, models are separated into demos (working models with end2end demo code) and experimental (models that are under development).
- Added implementation of Falcon7B for GS and PyTorch demos for nanoGPT and T5
- Added BERT Large end2end demo on GS (set up for question answering)