Target Architecture: rv64gcv (QEMU User-mode)
This project implements a high-performance Canny edge detection pipeline, focusing on the migration from a scalar C++ baseline to a hand-optimized implementation using RISC-V Vector (RVV) intrinsics.
- Gaussian Blur: 5x5 Kernel (2D Convolution)
- Sobel Operator: Gx and Gy gradients (3x3 Kernels)
- Magnitude & Direction: L1/L2 Norms and 4-way direction quantization.
- NMS & Hysteresis: Non-Maximum Suppression and Double Thresholding.
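
The Sobel and magnitude/direction stages listed above can be sketched in scalar C++ roughly as follows. This is a minimal illustration, not the project's code: the `Gradient` struct, the function names, the zeroed one-pixel border, the choice of the L1 norm, and the integer approximations of tan(22.5°) and tan(67.5°) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical result type (not taken from the project's headers).
struct Gradient {
    std::vector<int16_t> gx, gy;  // signed Sobel responses
    std::vector<uint16_t> mag;    // L1 norm: |gx| + |gy|
    std::vector<uint8_t> dir;     // 0: 0 deg, 1: 45 deg, 2: 90 deg, 3: 135 deg
};

// 4-way direction quantization without atan2.
// Assumed thresholds: tan(22.5 deg) ~ 106/256, tan(67.5 deg) ~ 618/256.
uint8_t quantize_direction(int gx, int gy) {
    const int ax = std::abs(gx);
    const int ay = std::abs(gy);
    if (ay * 256 <= ax * 106) return 0;     // gradient is mostly horizontal
    if (ay * 256 >= ax * 618) return 2;     // gradient is mostly vertical
    return ((gx > 0) == (gy > 0)) ? 1 : 3;  // one of the two diagonals
}

// 3x3 Sobel over an 8-bit grayscale image; border pixels are left at zero.
Gradient sobel(const std::vector<uint8_t>& src, int width, int height) {
    assert(width >= 3 && height >= 3);
    assert(src.size() == static_cast<size_t>(width) * static_cast<size_t>(height));
    const size_t n = src.size();
    Gradient g;
    g.gx.assign(n, 0);
    g.gy.assign(n, 0);
    g.mag.assign(n, 0);
    g.dir.assign(n, 0);
    auto px = [&](int x, int y) {
        return static_cast<int>(src[static_cast<size_t>(y) * width + x]);
    };
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            // Right column minus left column, centre row weighted by 2.
            const int gx = (px(x + 1, y - 1) + 2 * px(x + 1, y) + px(x + 1, y + 1))
                         - (px(x - 1, y - 1) + 2 * px(x - 1, y) + px(x - 1, y + 1));
            // Bottom row minus top row, centre column weighted by 2.
            const int gy = (px(x - 1, y + 1) + 2 * px(x, y + 1) + px(x + 1, y + 1))
                         - (px(x - 1, y - 1) + 2 * px(x, y - 1) + px(x + 1, y - 1));
            const size_t i = static_cast<size_t>(y) * width + x;
            g.gx[i] = static_cast<int16_t>(gx);  // |gx| <= 1020, fits in int16_t
            g.gy[i] = static_cast<int16_t>(gy);
            g.mag[i] = static_cast<uint16_t>(std::abs(gx) + std::abs(gy));
            g.dir[i] = quantize_direction(gx, gy);
        }
    }
    return g;
}
```

Keeping every intermediate in integer arithmetic is a deliberate choice in this sketch: it makes a later bit-exact comparison against a vectorized version straightforward, which floating-point `atan2` and `sqrt` would complicate.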
The development process follows a standard embedded systems optimization lifecycle:
- Baseline: Develop a clean, portable scalar C++ implementation for functional reference.
- Measurement: Profile execution and analyze compiler output (from `-O0` to `-Ofast`) to identify bottlenecks.
- Optimization: Rewrite performance-critical kernels using RVV intrinsics.
- Verification: Ensure bit-exact equivalence between scalar and vector outputs across various Vector Lengths (VLEN).
- Environment: WSL2 (Ubuntu 24.04) or Arch Linux.
- Toolchain: `riscv64-linux-gnu-g++` (configured with `--with-arch=rv64gcv`).
- Emulation: QEMU with RVV 1.0 support.
- Testing: GoogleTest for host-side logic and assert-based testing for target emulation.
- Documentation: Doxygen.
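
The bit-exact verification step and the assert-based target testing mentioned above could look something like the helper below. It is only a sketch: `first_mismatch` and `require_bit_exact` are hypothetical names, and the real tests may compare buffers differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first differing byte, or -1 if both buffers are
// bit-identical. A length mismatch is reported at the shorter buffer's size.
long first_mismatch(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    const size_t n = a.size() < b.size() ? a.size() : b.size();
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) return static_cast<long>(i);
    }
    return a.size() == b.size() ? -1 : static_cast<long>(n);
}

// Assert-style check suitable for a target test binary run under QEMU.
void require_bit_exact(const std::vector<uint8_t>& scalar_out,
                       const std::vector<uint8_t>& vector_out) {
    assert(first_mismatch(scalar_out, vector_out) == -1);
}
```

Returning the mismatch index rather than a plain boolean makes a failing run easier to diagnose, since the index maps directly to a pixel position (`index / width`, `index % width`).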
While the project runs in emulation, the RVV optimizations interact with simulated digital hardware structures:
- Vector Datapath (VPUs): Intrinsics like `vadd` utilize multiple ALUs in parallel, processing data batches rather than single pixels.
- Vector Register File (VRF): Testing across different `VLEN` values simulates various hardware tiers.
- Length Multiplier (LMUL): Grouping registers allows the hardware to handle larger contiguous data chunks.
- Vector Length Agnostic (VLA) Programming: `vsetvli` sets the `vl` Control and Status Register (CSR), allowing the logic to adapt to the physical hardware width dynamically.
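
A vector-length-agnostic loop of the kind described above can be sketched as a strip-mined element-wise add. The example assumes a toolchain that ships the `__riscv_`-prefixed RVV intrinsics (older compilers used unprefixed names); `add_u8` and `add_buffers` are hypothetical names, and the scalar branch exists only so the same file also builds on a non-RISC-V host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

// Element-wise wrapping add: dst[i] = (a[i] + b[i]) mod 256.
void add_u8(const uint8_t* a, const uint8_t* b, uint8_t* dst, size_t n) {
#if defined(__riscv_vector)
    while (n > 0) {
        // vsetvl picks how many 8-bit elements this iteration handles,
        // based on the remaining count and the hardware VLEN (LMUL = 1).
        const size_t vl = __riscv_vsetvl_e8m1(n);
        const vuint8m1_t va = __riscv_vle8_v_u8m1(a, vl);
        const vuint8m1_t vb = __riscv_vle8_v_u8m1(b, vl);
        const vuint8m1_t vs = __riscv_vadd_vv_u8m1(va, vb, vl);
        __riscv_vse8_v_u8m1(dst, vs, vl);
        a += vl;
        b += vl;
        dst += vl;
        n -= vl;
    }
#else
    // Scalar fallback with the same wrapping semantics.
    for (size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<uint8_t>(a[i] + b[i]);
    }
#endif
}

// Convenience wrapper over std::vector (hypothetical helper).
std::vector<uint8_t> add_buffers(const std::vector<uint8_t>& a,
                                 const std::vector<uint8_t>& b) {
    assert(a.size() == b.size());
    std::vector<uint8_t> out(a.size());
    add_u8(a.data(), b.data(), out.data(), a.size());
    return out;
}
```

Because the loop never hard-codes a vector width, the same binary should behave identically at `VLEN=128`, `256`, or `512`; only the number of iterations changes. Switching the `m1` suffixes to `m4` would group four registers per operation, which is the LMUL trade-off noted in the list.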
.
├── assets/ # Input/Output raw images (width*height bytes)
├── build/ # Build artifacts (host/ for x86, target/ for RISC-V)
├── include/ # Header files (.hpp)
├── src/ # Implementation files (.cpp)
├── scripts/ # Automated scripts
├── tests/ # GoogleTest (host) and RVV equivalence tests
├── tools/ # Custom profiling or visualization tools
├── Makefile # Dual-target build system
└── Doxyfile # Doxygen configuration
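
Since the images in `assets/` are headerless raw files of `width*height` bytes, the dimensions have to be supplied by the caller. A minimal loader and writer might look like this; `load_raw` and `save_raw` are hypothetical names, and a single 8-bit grayscale channel is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Reads exactly width*height bytes; returns an empty vector if the file
// cannot be opened or is shorter than expected.
std::vector<uint8_t> load_raw(const std::string& path, size_t width, size_t height) {
    assert(width > 0 && height > 0);
    std::vector<uint8_t> pixels(width * height);
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    in.read(reinterpret_cast<char*>(pixels.data()),
            static_cast<std::streamsize>(pixels.size()));
    if (static_cast<size_t>(in.gcount()) != pixels.size()) return {};
    return pixels;
}

// Writes the pixel buffer as-is, with no header.
bool save_raw(const std::string& path, const std::vector<uint8_t>& pixels) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(pixels.data()),
              static_cast<std::streamsize>(pixels.size()));
    out.flush();
    return out.good();
}
```

Note that this sketch accepts a file longer than `width*height` bytes without complaint; a stricter loader could also compare the file size against the expected pixel count.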
The setup script automates the installation of the RISC-V GNU toolchain, QEMU, GoogleTest, and Python packages. Windows users should use WSL2 with Ubuntu 24.04.
Run in terminal:
chmod +x scripts/phase_one_setup.sh
./scripts/setup.sh
source ~/.bashrc
The Makefile supports cross-compilation and automated testing.
| Command | Action | Purpose |
|---|---|---|
| `make all` | Full Build | Compiles the main pipeline (RISC-V) and runs Host tests. |
| `make canny_rv` | Target Build | Compiles the main Canny pipeline into a RISC-V `.elf`. |
| `make test` | Host Test | Compiles and runs GoogleTest natively for logic verification. |
| `make run` | Emulate | Executes the main pipeline on QEMU with `vlen=512`. |
| `make run-test NAME=x` | Unit Emulate | Compiles and runs `tests/x.cpp` on QEMU (e.g., `make run-test NAME=sobel`). |
| `make list-tests` | Discovery | Scans the `tests/` folder and lists all testable `.cpp` files. |
| `make docs` | Documentation | Generates browsable HTML API documentation via Doxygen. |
| `make clean` | Cleanup | Wipes the `build/` directory and resets the environment. |
Configuration Overrides: You can override hardware parameters directly from the command line:
make run VLEN=128 # Simulate lower-end hardware
make run VLEN=256 # Simulate mid-range hardware
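
To confirm that a `VLEN` override actually reaches the emulated hart, a target test could query the vector length at run time. The sketch below is an assumption about how such a check might be written, not part of the project: `detected_vlen_bits`, `is_plausible_vlen`, and `check_vlen` are hypothetical names, and the code returns 0 when built without RVV so it still compiles on the host.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

// VLEN in bits as seen by the running program, or 0 on a non-RVV build.
// With SEW = 8 and LMUL = 1, VLMAX equals VLEN / 8.
size_t detected_vlen_bits() {
#if defined(__riscv_vector)
    return __riscv_vsetvlmax_e8m1() * 8;
#else
    return 0;
#endif
}

// The V extension requires VLEN to be a power of two of at least 128 bits.
bool is_plausible_vlen(size_t bits) {
    return bits >= 128 && (bits & (bits - 1)) == 0;
}

// Asserts that the emulated VLEN matches the value requested on the
// command line; skipped entirely on host builds.
void check_vlen(size_t expected_bits) {
    const size_t actual = detected_vlen_bits();
    if (actual == 0) return;  // host build: nothing to verify
    assert(is_plausible_vlen(actual));
    assert(actual == expected_bits);
}
```

For example, a test run with `make run VLEN=256` could call `check_vlen(256)` before comparing scalar and vector outputs, so a misconfigured emulator fails early instead of silently testing the wrong width.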