We are excited to share an update on the dim order feature in the ExecuTorch stack, which represents a significant step forward in how we handle tensor memory format at the IR level. This post consolidates all previous updates and highlights the progress made so far.
What is a Dim Order?
A dim order is a tensor-level memory format representation that describes the layout of a dense tensor in memory. It serves as the source of truth for understanding the memory layout of input tensors across various components in ExecuTorch, particularly delegates and operators. It aims to replace torch.memory_format in the ExecuTorch stack for underlying memory representation. Its correctness is crucial for ensuring the accurate execution of tensor operations. For more details, refer to the PyTorch core documentation and ExecuTorch documentation.
Supporting Dim Order
Adding torch.tensor.dim_order() in PyTorch
We've added a new API to PyTorch, dim_order(), which returns the dim order of a tensor in memory and provides functionality to detect ambiguity. With this API, you can confidently determine the dim order of your tensors and optimize your model lowering and performance. Read our original post to learn more about how to use dim_order and its benefits for your deep learning projects.
ET Export Flow: torch.memory_format in, Dim Order out
Support for dim order in the Edge dialect export flow has matured significantly. We have integrated dim order into the tensor IR and enabled its export from eager models to ExecuTorch models. Furthermore, we developed passes to replace operators requiring memory format inputs (e.g., to_copy) with our own performant functions taking dim order as input. Additionally, we implemented a verification mechanism to ensure the graph legally supports dim order. These updates enable support for multiple dim orders within a model graph, which is now the default behavior of the ExecuTorch export flow.
Dim Order Portable Operators and Runtime Support
At runtime, the dim order serves as the foundation for determining the memory format of input tensors. The memory format information for each runtime tensor, including strides, is derived from or relies on the dimension order. To ensure compatibility, all portable operators incorporate sanity checks to verify that the input tensor's dimension order aligns with their expectations. Furthermore, select operators now provide specialized support for contiguous and channels_last dimension orders. Additionally, every portable operator that accepts memory format as input has a corresponding variant based on dimension order. Serializing dim order for every ET managed tensor is supported. Tensor utility functions rely on dim order for calculating strides.
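To make the stride derivation concrete, here is a minimal Python sketch of how strides can be computed from a tensor's sizes and dim order. This is illustrative only, not the actual ExecuTorch runtime implementation (which is in C++):

```python
def strides_from_dim_order(sizes, dim_order):
    """Derive element strides from sizes and a dim order.

    The last entry of dim_order is the innermost (fastest-varying)
    dimension and gets stride 1; each earlier entry's stride is the
    product of the sizes of the dimensions that come after it.
    """
    strides = [0] * len(sizes)
    acc = 1
    for d in reversed(dim_order):
        strides[d] = acc
        acc *= sizes[d]
    return tuple(strides)

# Contiguous (NCHW) layout:
print(strides_from_dim_order((2, 3, 4, 5), (0, 1, 2, 3)))  # (60, 20, 5, 1)
# channels_last (NHWC) layout:
print(strides_from_dim_order((2, 3, 4, 5), (0, 2, 3, 1)))  # (60, 1, 15, 3)
```

The channels_last result matches what PyTorch reports via Tensor.stride() for a tensor converted with memory_format=torch.channels_last.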
Delegate Support for Dim Order
Delegate support is critical for dim order functionality. We are thrilled to announce that several major delegates are now dim order compatible, both AoT and at runtime. This includes XNNPACK, CoreML, Arm, QNN, Vulkan, MPS and MTK. This widespread support ensures compatibility and extends the functionality of ExecuTorch with dim order representation and operations. Delegates now have enough building blocks available in ET AoT and runtime to implement dim-order-related graph optimizations.
Example
Enabling/Disabling Dim Order
In ExecuTorch, using dim order is now the default behavior, so no specific configuration is required to enable it. For more information on exporting your model to ExecuTorch, please refer to our example.
If you need to temporarily disable dim order in your graph, you can set _skip_dim_order to True in the EdgeCompileConfig when exporting your model:
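A sketch of what that looks like is below. The toy module and input are placeholders, and the import paths follow the executorch.exir module; verify them against the version of ExecuTorch you have installed:

```python
import torch
from executorch.exir import to_edge, EdgeCompileConfig

class AddOne(torch.nn.Module):  # placeholder model for illustration
    def forward(self, x):
        return x + 1

ep = torch.export.export(AddOne(), (torch.randn(2, 3),))

# _skip_dim_order=True falls back to memory_format-based IR,
# disabling dim order in the exported graph.
edge = to_edge(ep, compile_config=EdgeCompileConfig(_skip_dim_order=True))
```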
Manipulating Dim Order AoT in the Graph
You can easily add custom export passes to modify the dim order of specific parts of your model. For an example of how to do this, see here.
Delegate Support
If you're a delegate owner looking to make your delegate implementation dim order compatible, or trying to avoid permute nodes from your delegate graph, you may find this post helpful.
Next Steps
The overarching goal is to enhance the ExecuTorch (ET) ahead-of-time (AoT) and runtime experience with dim orders. This involves further refining and enforcing Edge IR dim order guarantees, including those for delegates, and ensuring a seamless experience comparable to PyTorch.
To achieve this, first, we will ensure that all portable operators support tensor dim orders, at a minimum, those mapped directly to PyTorch-defined memory formats. Relevant tests will be implemented to validate this functionality.
Additionally, we will provide support to delegate authors in leveraging dim order, particularly in optimizing the graph locally or globally to minimize tensor permutations and copies.
Conclusion
The successful implementation of dim order in ExecuTorch represents a significant milestone in our journey to provide a robust and flexible framework for tensor memory layout representation. This achievement would not have been possible without the collective efforts of the PyTorch and ExecuTorch communities, and we are grateful for their dedication and support. If you have any questions, feedback, or suggestions for improvement, please leave your comment here, or open a discussion on ExecuTorch GitHub.
This discussion was converted from issue #8037 on January 31, 2025 18:15.
Great thanks @digantdesai and @larryliu0820 for continued support and discussion!