
🎓 Hexagon-MLIR Tutorials

Welcome to the Hexagon-MLIR tutorials! These hands-on examples guide you through writing, compiling, and executing Triton kernels and PyTorch models on Qualcomm Hexagon NPUs.

🏃‍♂️ Quick Start

📖 Start with Triton Tutorials

📖 Start with PyTorch Tutorials

🚀 What You'll Learn

These tutorials demonstrate how to leverage Qualcomm's Hexagon NPU targets for AI workloads. You'll discover how to:

Triton Kernels

  • Write Triton Kernels: Create kernels that run efficiently on Qualcomm Hexagon NPUs
  • Understand the Compilation Pipeline: Follow your code from Python through multiple IR transformations to optimized machine code
  • Optimize Performance: Leverage hardware features such as multi-threading, vector processing, and memory-hierarchy optimization
  • Debug and Profile: Use built-in tools to analyze and improve your kernel performance
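As a taste of what the Triton tutorials cover, here is a minimal block-wise vector-add kernel in standard Triton. This is a generic sketch, not Hexagon-specific code: the Hexagon launch configuration, tuning, and compilation details are exactly what the tutorials walk through.

```python
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance (identified by pid) handles one contiguous block.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    # Mask guards the ragged tail when n_elements is not a multiple of BLOCK_SIZE.
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)
```

The same block/mask structure carries over when targeting Hexagon; the tutorials show how the grid size and `BLOCK_SIZE` interact with the NPU's threads and vector units.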

PyTorch Models

  • Use the PyTorch Flow: Compile and execute existing PyTorch models through the Hexagon-MLIR flow
  • Understand the Compilation Pipeline: Follow your code from Python through multiple IR transformations to optimized machine code
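The PyTorch flow builds on the standard `torch.compile` entry point. The sketch below is illustrative only: the model is a toy, and the Hexagon-MLIR backend selection and device setup are left to the tutorials.

```python
import torch

# A toy model; the tutorials show how a model like this is compiled and
# executed in the Hexagon-MLIR flow (backend/device names are covered there).
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

model = TinyModel().eval()
example_input = torch.randn(1, 16)
with torch.no_grad():
    reference = model(example_input)  # eager reference output, shape (1, 4)

# torch.compile wraps the model; actual compilation is triggered on the first
# call, with the target backend selected as described in the tutorials.
compiled_model = torch.compile(model)
```

Keeping an eager reference output like this is a handy habit when bringing up a new backend, since it gives you a baseline to compare compiled results against.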

🛠️ Prerequisites

Before diving into the tutorials, make sure you have:

  • ✅ Hexagon-MLIR framework installed (Installation Guide)
  • ✅ Python environment with required dependencies
  • ✅ Access to Hexagon hardware or simulator
  • ✅ Basic understanding of Python and tensor operations