Replies: 4 comments
-
Hi @csn1800, thank you for your interest in the Vulkan delegate! As you've pointed out, the Vulkan delegate currently focuses on edge/mobile SoCs. However, it should also be possible to build and run the Vulkan delegate on a Linux machine. In fact, we use Linux as a development environment when working in Meta's internal repository, so I can confirm that it works. I have also been able to build and run tests for the Vulkan delegate from the open source repository.
To my knowledge, there are no known issues at the moment. However, our compute shaders, especially those for bottleneck operators such as matrix multiplication and convolution, are not optimized for server environments, so performance will be poor. We are currently focused on optimizing for mobile GPUs, but I eventually want to add compute shaders optimized for NVIDIA GPUs that take advantage of vendor-specific Vulkan extensions.
Here are the steps I use to build and test the Vulkan delegate on Linux:

cd ~/executorch
# Install ExecuTorch with the Vulkan delegate enabled
(rm -rf cmake-out && \
cmake . \
-DCMAKE_INSTALL_PREFIX=cmake-out \
-DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
-DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
-DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
-DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
-DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
-DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
-DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
-DEXECUTORCH_BUILD_TESTS=ON \
-DEXECUTORCH_BUILD_VULKAN=ON \
-Bcmake-out && \
cmake --build cmake-out -j64 --target install)
# Build the Vulkan delegate test binary
(rm -rf cmake-out/backends/vulkan/test && \
cmake backends/vulkan/test \
-DCMAKE_INSTALL_PREFIX=cmake-out \
-DPYTHON_EXECUTABLE=python \
-Bcmake-out/backends/vulkan/test && \
cmake --build cmake-out/backends/vulkan/test -j16)
# Run the test binary
cmake-out/backends/vulkan/test/vulkan_compute_api_test --gtest_filter="*print*"

Please try these steps out and let me know if they work for you. Regarding your last two questions:
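As a side note, before running the build it can help to confirm that a Vulkan loader and shader toolchain are visible on the machine. A minimal check (package names are assumptions and vary by distro; on Debian/Ubuntu the relevant packages are typically vulkan-tools and libvulkan-dev):

```shell
# Check for common Vulkan tooling on the PATH. This only reports presence;
# run `vulkaninfo --summary` afterwards to confirm a working driver/ICD.
for tool in vulkaninfo glslc; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```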
Are you asking if there's an example of another application linking against the ExecuTorch runtime with the Vulkan SDK? If so, the llama runner example binary may be a good reference. Specifically, here is where I believe it links against the Vulkan backend. Please let me know if you have any more questions!
-
By the way, a similar issue I assisted with recently was #7343. You can find more information in the discussion on that issue as well, though the steps there are the same as the ones in my earlier comment.
-
Thank you so much for the detailed response and the build steps. I am currently evaluating them in my environment and will update you with my observations as soon as possible.
My work involves ARM platforms with mobile GPUs such as the Broadcom VideoCore IV, and my question pertains to models for instance segmentation, specifically Detectron models. Additionally, I am interested in understanding the support for speech recognition and text models requiring:
Given the focus on mobile GPUs, it would be helpful to know if there are any ongoing or planned optimizations for these types of models on mobile platforms. Insights into any performance benchmarks or best practices for deploying such models would also be valuable.
Thank you for sharing the example and the ticket. The scenario I am focused on requires a C++ application to be linked against the ExecuTorch libraries and to use the C++/C APIs directly, without any Python abstraction; apologies for not being clear in my earlier post. If there are any specific examples or documentation that detail the process of linking C++ applications with ExecuTorch libraries, that would be extremely helpful, as understanding how to integrate with the Vulkan backend in this setup is crucial for my project.
-
The Vulkan Delegate was started last year, and the focus in the first year was building the core components of the platform and adding initial implementations of several operators. The focus for this year will be optimization, both for latency and memory consumption. In particular, my specific focus this year will be to optimize 4-bit weight quantized matrix multiplication to improve performance on Transformer models.
We are currently working on optimizing weight-quantized operators, but that may be different from what you mean here. In these quantized shaders, a quantized weight value is converted back into a floating point value and the computation is performed in floating point. We currently have no plans to work on operators that perform only integer compute. However, if you are interested in adding compute shaders to the delegate to support your use case, I can help guide you through that process.
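To illustrate what "dequantize, then compute in floating point" means, here is a hedged pure-Python sketch of weight-only affine quantization. The function names and the per-tensor scale/zero-point scheme are illustrative only, not the delegate's actual shader code or an ExecuTorch API:

```python
# Illustrative sketch of weight-only quantization: the weight is stored as
# low-precision integers plus a scale and zero point, dequantized to float
# at compute time, and the matmul itself runs entirely in floating point.
# All names here are hypothetical.

def dequantize(q_weight, scale, zero_point):
    """Affine dequantization: w_fp = scale * (q - zero_point)."""
    return [[scale * (q - zero_point) for q in row] for row in q_weight]

def matmul(a, b):
    """Plain float matrix multiply (a: m x k, b: k x n)."""
    k, n = len(b), len(b[0])
    return [[sum(row[i] * b[i][j] for i in range(k)) for j in range(n)]
            for row in a]

# 2x2 example: 4-bit-style integer weights in [0, 15] with zero point 8.
q_weight = [[10, 6], [8, 12]]
scale, zero_point = 0.5, 8
w_fp = dequantize(q_weight, scale, zero_point)  # [[1.0, -1.0], [0.0, 2.0]]
y = matmul([[2.0, 3.0]], w_fp)
print(y)  # [[2.0, 4.0]]
```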
This will be a lot tougher. We currently do not support dynamic graphs with loops and conditions. The ideal scenario for Vulkan delegate compute is a static graph that can be compiled into a single command buffer that doesn't need to be rebuilt across inferences. However, I'm not opposed to supporting dynamism in the future. What are some examples of the type of dynamism that you would need?
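For reference, a typical example of dynamism that cannot be baked into a single pre-recorded command buffer is data-dependent control flow, where the number of iterations depends on runtime values. A toy sketch (hypothetical, not ExecuTorch code):

```python
# Toy example of data-dependent control flow: the loop trip count depends
# on the input value, so the sequence of operations differs between
# inferences and cannot be pre-recorded as one fixed command buffer.
def iterative_halve(x, threshold=1.0):
    steps = 0
    while x > threshold:  # condition evaluated on runtime data
        x = x / 2.0
        steps += 1
    return x, steps

print(iterative_halve(10.0))  # (0.625, 4)
print(iterative_halve(0.5))   # (0.5, 0)
```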
I see. The llama runner example binary that I called out in my previous comment should be a good reference. Although it is within our repository, to my knowledge it is a C++ binary that treats ExecuTorch as an external dependency that is installed on the system.
Unfortunately, I'm not super familiar with examples of an external C++ application linking to ExecuTorch libraries. Tagging some folks who might be able to provide a pointer: @mergennachin @kirklandsign @larryliu0820
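Since the question was about linking a standalone C++ application, here is a rough sketch of what an external CMakeLists.txt might look like. The package and target names (`executorch`, `vulkan_backend`) are assumptions based on what the install step above produces; verify them against the CMake config files under your install prefix:

```cmake
# Hypothetical external project linking against an installed ExecuTorch
# with the Vulkan backend enabled. Names are assumptions; check the CMake
# package files under your CMAKE_INSTALL_PREFIX before relying on them.
cmake_minimum_required(VERSION 3.19)
project(my_app CXX)

# Point CMAKE_PREFIX_PATH at the ExecuTorch install (e.g. .../cmake-out).
find_package(executorch REQUIRED)

add_executable(my_app main.cpp)

# Backends register themselves via static initializers, so static backend
# libraries typically need whole-archive linking so the linker keeps them.
target_link_libraries(my_app PRIVATE executorch
    "-Wl,--whole-archive" vulkan_backend "-Wl,--no-whole-archive")
```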
-
🚀 The feature, motivation and pitch
Hi PyTorch team,
I'm interested in using the ExecuTorch Vulkan backend on Linux. While the documentation mentions the Vulkan delegate being cross-platform, the current guides primarily focus on Android and iOS.
I haven't been able to find any specific instructions or examples for building and running ExecuTorch with the Vulkan backend on a Linux platform. Could you please provide any additional information or guidance on this?
Specifically, I'd be grateful if you could address the following:
Any help or pointers you can provide would be extremely helpful. Thank you for your time and consideration.
cc @mergennachin @byjlw @SS-JIA @manuelcandales