Hexagon‑MLIR is an open‑source AI compiler stack that lets you easily compile and run Triton kernels and PyTorch models on Qualcomm Hexagon Neural Processing Units (NPUs).
This initiative complements our commercial toolchains by exploring an open‑source MLIR‑based compilation stack, giving developers a path to advance AI compilation capabilities through a more flexible and transparent approach.
- Triton Kernel Compilation & Execution: Compile Triton kernels and execute on Hexagon NPU targets
- PyTorch Model Compilation & Execution: Compile PyTorch models and execute on Hexagon NPU targets
- Performance Optimization: Leverage Hexagon-specific features for maximum performance, including:
  - Multi-threading: Hexagon-optimized parallel execution of operations
  - Vector Processing: Optimized code generation targeting the Hexagon Vector eXtensions (HVX) units
  - TCM Utilization: Use of Tightly Coupled Memory (TCM) for reduced memory latency
  - DMA Optimization: Efficient DMA transfers between the DDR and TCM memory spaces
  - Matrix Processing (experimental): Matrix multiplication via Qualcomm's Hexagon Kernel Library
- IR Inspection: Inspect and analyze IR lowering passes, helping you understand how your code is optimized
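As a point of reference, the kernels this stack consumes are written with the stock Triton API; the sketch below is a standard vector-add kernel, not Hexagon-specific code. How such a kernel is compiled for and launched on a Hexagon NPU target is covered in the User Guide.

```python
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide chunk of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    # Mask off out-of-range lanes so the last block is safe.
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)
```

On Hexagon targets, block-level operations like these are what the compiler lowers onto HVX vector units and TCM-resident buffers, as described in the feature list above.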
- 📖 User Guide - Instructions for downloading and setting up our compiler and running Triton kernels or PyTorch models on Hexagon NPUs
- 🎓 Tutorials - A set of tutorials on Triton kernels and PyTorch models
- ❓ FAQ - Frequently asked questions
- 🏗️ Developer Guide - How to develop, debug, and profile Triton kernels and PyTorch models with our compiler toolchain
Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.