v1.1: Vision 42/42 + Enhanced Vision 19/19 across openvx-mark, opencv-mark, and rustVX #76
| name: CI | |
| on: | |
| push: | |
| branches: [main] | |
| # Run CI on pull requests targeting any base branch, not just main. | |
| # This keeps stacked PR workflows covered (a PR's base may be another | |
| # feature branch, e.g. an umbrella branch or a previous PR in a stack). | |
| pull_request: | |
| # Auto-cancel superseded runs on the same ref so a rapid push series | |
| # (e.g. force-push during PR review) doesn't queue 3+ stale runs and | |
| # starve the GitHub Actions runner pool. main pushes are exempt — we | |
| # always want a clean signal on main. | |
| concurrency: | |
| group: ${{ github.workflow }}-${{ github.ref }} | |
| cancel-in-progress: ${{ github.ref != 'refs/heads/main' }} | |
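To make the concurrency rule above concrete, here is a minimal shell sketch of how the two expressions resolve for a given ref. The helper names `concurrency_group` and `cancel_in_progress` are hypothetical and exist only for illustration; the real evaluation is done by the GitHub Actions expression engine, and the PR number 123 is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that mirror the workflow's concurrency expressions.
set -euo pipefail

# Mirrors: group: ${{ github.workflow }}-${{ github.ref }}
concurrency_group() {
  local workflow="$1" ref="$2"
  printf '%s-%s\n' "$workflow" "$ref"
}

# Mirrors: cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
cancel_in_progress() {
  local ref="$1"
  if [ "$ref" != "refs/heads/main" ]; then
    echo true
  else
    echo false
  fi
}

# A pull_request run uses a refs/pull/<n>/merge ref, so it is cancellable.
concurrency_group CI refs/pull/123/merge
cancel_in_progress refs/pull/123/merge
# A push to main is never cancelled by a newer run.
cancel_in_progress refs/heads/main
```

Two runs share a group only when both the workflow name and the ref match, so pushes to different PRs never cancel each other.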
| # ============================================================================ | |
| # Architecture | |
| # | |
| # Phase 1 (parallel) — four independent build jobs: | |
| # * Three OpenVX-impl jobs (MIVisionX, Khronos sample, rustVX). Each: | |
| # 1. Builds the implementation from source. | |
| # 2. Stages a self-contained artifact: <impl>-stage/lib + <impl>-stage/include. | |
| # 3. Builds openvx-mark against the just-built impl. | |
| # 4. Runs a quick smoke benchmark as a "local unit test" — catches | |
| # build-link breakage and missing-symbol issues immediately, | |
| # scoped to the specific impl, without waiting for the slower | |
| # comparison job downstream. | |
| # 5. Uploads the staged artifact for the comparison job to consume. | |
| # * One OpenCV-baseline job (opencv-mark companion binary). Differs from | |
| # the OpenVX jobs because OpenCV is apt-installable and opencv-mark has | |
| # no OpenVX dependency — see the build-opencv job below for the shape. | |
| # Stages its smoke JSON directly (no impl tarball needed). | |
| # | |
| # Per-impl feature-set policy | |
| # --------------------------- | |
| # Not every impl ships the full OpenVX 1.3.1 conformance surface, so each | |
| # bench is scoped to the feature sets that impl actually implements: | |
| # | |
| # * MIVisionX — `vision,framework`. AMD's runtime exports the 42 | |
| # Vision Conformance kernels but **does NOT export | |
| # most of the 19 Enhanced Vision APIs** | |
| # (BilateralFilter, HOG*, Tensor*, Select, | |
| # ScalarOperation, etc.). With `enhanced_vision` | |
| # enabled, the per-benchmark dlsym shim in | |
| # openvx_optional_apis.h would report up to 19 | |
| # SKIPPED rows on every run, which is accurate but | |
| # uninformative noise, so we omit it. | |
| # * Khronos sample — `vision,enhanced_vision,framework`. CTS-conformant | |
| # reference impl; ships both profiles. | |
| # * rustVX — `vision,enhanced_vision,framework`. CTS-conformant | |
| # for Vision (5923/5923) and Enhanced Vision | |
| # (1235/1235) per the rustVX README. | |
| # * opencv-mark — `vision,enhanced_vision` (no `framework`; cv:: has | |
| # no graph runtime to measure). All 42 + 19 = 61 | |
| # OpenCV-side benchmarks run. | |
| # | |
| # Phase 2 (single job, depends on all four Phase-1 jobs) — comparison. | |
| # 1. Downloads all three OpenVX impl artifacts onto a single runner; | |
| # apt-installs OpenCV on that same runner. | |
| # 2. Builds openvx-mark × 3 (one per OpenVX impl) so all binaries link | |
| # against the same openvx-mark source tree at the same commit. | |
| # Builds opencv-mark from the same source tree. | |
| # 3. Runs the full benchmark against each impl using that impl's | |
| # feature-set policy (above). Same hardware = fair cross-vendor | |
| # comparison. `compare_reports.py` joins by (name, mode, resolution) | |
| # and silently drops rows not on both sides, so enhanced_vision | |
| # rows naturally appear in pairs where both impls produced them | |
| # (Khronos↔OpenCV, rustVX↔OpenCV, Khronos↔rustVX) and are absent | |
| # from MIVisionX↔* pairs. | |
| # 4. Generates six pairwise comparison reports: | |
| # OpenVX-vs-OpenVX: | |
| # * MIVisionX vs Khronos sample | |
| # * MIVisionX vs rustVX | |
| # * Khronos sample vs rustVX | |
| # OpenVX-vs-OpenCV (the "does adopting OpenVX pay off?" trio): | |
| # * MIVisionX vs OpenCV | |
| # * Khronos sample vs OpenCV | |
| # * rustVX vs OpenCV | |
| # 5. Posts each report to the job summary and uploads as an artifact. | |
| # | |
| # Inspired by the layered build/perf-gate design in rustVX's conformance CI: | |
| # https://github.com/kiritigowda/rustVX/blob/main/.github/workflows/conformance.yml | |
| # ============================================================================ | |
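The join-and-drop behaviour described for `compare_reports.py` can be sketched as an inner join on the (name, mode, resolution) key. This is not the script itself: the tab-separated layout, the `join_reports` helper, the benchmark names, the `graph` mode label, and the timing values are all placeholders chosen for illustration, not measurements.

```shell
#!/usr/bin/env bash
set -euo pipefail

# join_reports A.tsv B.tsv
# Assumed input columns (tab-separated): name, mode, resolution, time_ms
# Output: name, mode, resolution, time_a, time_b for keys present in BOTH files.
join_reports() {
  awk -F'\t' -v OFS='\t' '
    { key = $1 FS $2 FS $3 }
    NR == FNR { first[key] = $4; next }
    key in first { print $1, $2, $3, first[key], $4 }
  ' "$1" "$2"
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Placeholder rows: the first report has no enhanced_vision entry.
printf 'Box3x3\tgraph\tFHD\t1.20\n' > "$workdir/mivisionx.tsv"
printf 'Box3x3\tgraph\tFHD\t0.90\nBilateralFilter\tgraph\tFHD\t4.00\n' \
  > "$workdir/opencv.tsv"

joined=$(join_reports "$workdir/mivisionx.tsv" "$workdir/opencv.tsv")
printf '%s\n' "$joined"
```

Only the `Box3x3` row survives, because `BilateralFilter` exists on one side only. That is the same reason enhanced_vision rows are absent from every pair that includes MIVisionX.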
| jobs: | |
| # -------------------------------------------------------------------------- | |
| # Phase 1 — MIVisionX (AMD OpenVX, CPU backend) | |
| # -------------------------------------------------------------------------- | |
| build-mivisionx: | |
| name: Build MIVisionX (CPU) + smoke test | |
| runs-on: ubuntu-22.04 | |
| steps: | |
| - name: Checkout openvx-mark | |
| uses: actions/checkout@v4 | |
| - name: Install dependencies | |
| run: | | |
| sudo apt-get update | |
| sudo apt-get install -y build-essential cmake git python3 | |
| # Why -DCMAKE_CXX_FLAGS_RELEASE override (the "optimized kernels" knob): | |
| # | |
| # MIVisionX's amd_openvx/openvx/ago/ago_haf_cpu_*.cpp files contain | |
| # hand-written AVX2 intrinsics (_mm256_*) for the CPU-side "Hardware | |
| # Acceleration Functions" — these are the OPTIMIZED kernel paths. | |
| # However, MIVisionX's own top-level CMakeLists.txt appends ONLY | |
| # `-msse4.2` to CMAKE_CXX_FLAGS, with no -mavx2/-mfma and no | |
| # __attribute__((target("avx2"))) on any function. With just -msse4.2 | |
| # the compiler can still emit the AVX2 intrinsics in those specific | |
| # call sites, but it CANNOT auto-vectorise the surrounding scalar / | |
| # loop code beyond SSE4.2, can't use FMA, can't use BMI/BMI2 — so | |
| # the per-kernel dispatch glue, address arithmetic, and any kernel | |
| # code that's not hand-vectorised stays at SSE4.2 throughput. That's | |
| # the "base kernel" path the umbrella PR description points at. | |
| # | |
| # By overriding CMAKE_CXX_FLAGS_RELEASE we get -O3 -DNDEBUG plus | |
| # x86-64-v3 (= SSE4.2 + AVX + AVX2 + BMI + BMI2 + FMA + LZCNT + POPCNT), | |
| # a portable x86-64 microarchitecture level that GCC accepts as a | |
| # -march target since GCC 11. GitHub Actions Ubuntu 22.04 runners use | |
| # Intel Xeon or AMD EPYC CPUs, which all support x86-64-v3. | |
| # | |
| # MIVisionX still appends `-msse4.2` to CMAKE_CXX_FLAGS (we don't | |
| # override CMAKE_CXX_FLAGS, only the per-config Release variant), | |
| # so the final compile line is "-O3 -DNDEBUG -march=x86-64-v3 | |
| # -msse4.2". -march wins for code-gen ceiling; the dup -msse4.2 is | |
| # redundant but harmless. | |
| - name: Build MIVisionX (CPU backend, optimized) | |
| run: | | |
| set -euo pipefail | |
| git clone --depth 1 --branch develop \ | |
| https://github.com/ROCm/MIVisionX.git /tmp/mivisionx-src | |
| mkdir -p /tmp/mivisionx-src/build | |
| cd /tmp/mivisionx-src/build | |
| cmake \ | |
| -DBACKEND=CPU \ | |
| -DNEURAL_NET=OFF \ | |
| -DLOOM=OFF \ | |
| -DMIGRAPHX=OFF \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DCMAKE_CXX_FLAGS_RELEASE="-O3 -DNDEBUG -march=x86-64-v3" \ | |
| -DCMAKE_INSTALL_PREFIX=/tmp/mivisionx-install \ | |
| .. | |
| make -j$(nproc) | |
| make install | |
| # Sanity-print the compile flags the generated make rules use, so | |
| # the CI log lets a reviewer confirm AVX2 made it into the build | |
| # (look for `-march=x86-64-v3` in the printed CXX_FLAGS line). | |
| grep -h 'CXX_FLAGS' CMakeFiles/openvx.dir/flags.make 2>/dev/null \ | |
| | head -2 || true | |
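Because the build targets x86-64-v3, a binary run on a CPU without those extensions would die with an illegal-instruction fault. The sketch below is one way to pre-check a host, assuming Linux `/proc/cpuinfo` flag names. The helper `has_v3_subset` is hypothetical, and it tests only a subset of the flags that make up the level, so treat it as a coarse proxy rather than a full conformance check.

```shell
#!/usr/bin/env bash
set -euo pipefail

# has_v3_subset "<space-separated cpu flags>"
# Returns 0 when every flag in the checked subset is present.
# The subset is a partial proxy for x86-64-v3, not the full level definition.
has_v3_subset() {
  local flags=" $1 " need
  for need in avx avx2 bmi1 bmi2 fma; do
    case "$flags" in
      *" $need "*) ;;
      *) return 1 ;;
    esac
  done
  return 0
}

# On Linux x86, read the flags of the first CPU entry; otherwise stay empty.
host_flags=""
if [ -r /proc/cpuinfo ]; then
  host_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 || true)
fi

if has_v3_subset "$host_flags"; then
  echo "host advertises the checked x86-64-v3 subset"
else
  echo "host lacks at least one checked flag; x86-64-v3 binaries may not run"
fi
```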
| - name: Stage MIVisionX artifact | |
| id: stage | |
| run: | | |
| set -euo pipefail | |
| mkdir -p mivisionx-stage/lib mivisionx-stage/include | |
| LIB_SRC=$(dirname "$(find /tmp/mivisionx-install -name 'libopenvx.so' | head -1)") | |
| echo "MIVisionX libraries discovered in: $LIB_SRC" | |
| # Copy ALL libopenvx* / libvxu* entries (libopenvx.so symlink, | |
| # libopenvx.so.1 SONAME symlink, libopenvx.so.X.Y.Z real file) | |
| # preserving symlinks (-P) so ld.so can follow the SONAME chain. | |
| # Without versioned files the linker reports | |
| # "libopenvx.so.1: cannot open shared object file". | |
| find "$LIB_SRC" -maxdepth 1 -name 'libopenvx*' -exec cp -P {} mivisionx-stage/lib/ \; | |
| find "$LIB_SRC" -maxdepth 1 -name 'libvxu*' -exec cp -P {} mivisionx-stage/lib/ \; | |
| cp -r /tmp/mivisionx-install/include/mivisionx/. mivisionx-stage/include/ | |
| echo "--- staged lib ---" | |
| ls -la mivisionx-stage/lib | |
| echo "--- staged include (top-level) ---" | |
| ls -la mivisionx-stage/include | |
| { | |
| echo "lib_dir=$(pwd)/mivisionx-stage/lib" | |
| echo "include_dir=$(pwd)/mivisionx-stage/include" | |
| } >> "$GITHUB_OUTPUT" | |
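The effect of `cp -P` in the staging step can be reproduced on a throwaway directory. This sketch builds a fake install tree with the usual three-entry shared-library layout and copies it the same way. The version number `1.3.0` and the file contents are placeholders, not MIVisionX's real library.

```shell
#!/usr/bin/env bash
set -euo pipefail

src=$(mktemp -d)
stage=$(mktemp -d)
trap 'rm -rf "$src" "$stage"' EXIT

# Fake install tree: one real file plus the two conventional symlinks.
echo "not a real ELF" > "$src/libopenvx.so.1.3.0"
ln -s libopenvx.so.1.3.0 "$src/libopenvx.so.1"
ln -s libopenvx.so.1     "$src/libopenvx.so"

# Same copy pattern as the staging step: -P copies symlinks as symlinks.
find "$src" -maxdepth 1 -name 'libopenvx*' -exec cp -P {} "$stage/" \;

ls -l "$stage"
readlink "$stage/libopenvx.so"
readlink "$stage/libopenvx.so.1"
```

Without `-P`, `cp` would dereference each link and stage three independent regular files. With it, the relative link chain is kept intact inside the staged directory.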
| - name: Build openvx-mark (smoke) | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-smoke | |
| cd build-smoke | |
| cmake \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DOPENVX_INCLUDES=${{ steps.stage.outputs.include_dir }} \ | |
| -DOPENVX_LIB_DIR=${{ steps.stage.outputs.lib_dir }} \ | |
| .. | |
| cmake --build . -j$(nproc) | |
| # Smoke covers the `vision` + `framework` feature sets only. | |
| # MIVisionX's runtime exports the 42 Vision Conformance kernels | |
| # but does NOT export most of the 19 Enhanced Vision APIs | |
| # (BilateralFilter, HOG*, Tensor*, Select, ScalarOperation, etc.). | |
| # With `enhanced_vision` enabled, the per-benchmark dlsym shim in | |
| # openvx_optional_apis.h would report up to 19 SKIPPED rows | |
| # on every run: accurate but uninformative noise. The Khronos | |
| # sample, rustVX, and opencv-mark smoke jobs DO exercise | |
| # `enhanced_vision` because those impls actually ship it. | |
| - name: Run smoke benchmark (vision + framework, VGA × 5 iters, single-threaded) | |
| # Smoke is advisory — if a specific impl crashes inside a | |
| # specific kernel the artifact upload (which the compare job | |
| # depends on) must still happen so vendor-vs-vendor signal | |
| # isn't lost. | |
| continue-on-error: true | |
| run: | | |
| set -eo pipefail | |
| cd build-smoke | |
| export LD_LIBRARY_PATH=${{ steps.stage.outputs.lib_dir }}:${LD_LIBRARY_PATH:-} | |
| # Timer self-test up front so a sloppy runner clock fails | |
| # loudly before we trust a smoke timing number. | |
| ./openvx-mark --validate-timing | |
| # `--threads 1` matches the Phase 2 compare config — same | |
| # apples-to-apples threading policy on smoke and full bench | |
| # so smoke timings are interpretable as a coarse preview. | |
| ./openvx-mark --feature-set vision,framework \ | |
| --resolution VGA --iterations 5 --warmup 1 --threads 1 \ | |
| --output-dir smoke-results | |
| - name: Upload MIVisionX artifact | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: impl-mivisionx | |
| path: mivisionx-stage/ | |
| retention-days: 1 | |
| - name: Upload MIVisionX smoke results | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: smoke-results-mivisionx | |
| path: build-smoke/smoke-results/ | |
| if-no-files-found: ignore | |
| # -------------------------------------------------------------------------- | |
| # Phase 1 — Khronos OpenVX sample implementation | |
| # -------------------------------------------------------------------------- | |
| build-khronos-sample: | |
| name: Build Khronos sample + smoke test | |
| runs-on: ubuntu-22.04 | |
| steps: | |
| - name: Checkout openvx-mark | |
| uses: actions/checkout@v4 | |
| - name: Install dependencies | |
| run: | | |
| sudo apt-get update | |
| sudo apt-get install -y build-essential cmake git python3 | |
| # Khronos sample is a reference impl (no SIMD intrinsics), so | |
| # most of the perf budget rides on whatever compiler auto-vec the | |
| # build picks up. Build.py honours CFLAGS / CXXFLAGS from the | |
| # environment, so we use those to upgrade the compile baseline | |
| # to x86-64-v3 (= AVX2 + FMA + BMI2 + LZCNT + POPCNT), matching | |
| # what the MIVisionX build above gets. This is not a claim that | |
| # the sample becomes "competitive" (it is a reference impl), only | |
| # that it is measured at the SAME compile baseline as MIVisionX, | |
| # so the cross-impl comparison isn't contaminated by one side | |
| # getting better auto-vec than the other. | |
| - name: Build Khronos OpenVX sample (Release, x86-64-v3) | |
| run: | | |
| set -euo pipefail | |
| git clone --recursive --depth 1 \ | |
| https://github.com/KhronosGroup/OpenVX-sample-impl.git /tmp/khronos-src | |
| cd /tmp/khronos-src | |
| export CFLAGS="-O3 -march=x86-64-v3 ${CFLAGS:-}" | |
| export CXXFLAGS="-O3 -march=x86-64-v3 ${CXXFLAGS:-}" | |
| echo "CFLAGS = ${CFLAGS}" | |
| echo "CXXFLAGS= ${CXXFLAGS}" | |
| python3 Build.py --os=Linux --arch=64 --conf=Release | |
| - name: Stage Khronos sample artifact | |
| id: stage | |
| run: | | |
| set -euo pipefail | |
| mkdir -p khronos-stage/lib khronos-stage/include | |
| LIB_SRC=$(dirname "$(find /tmp/khronos-src -name 'libopenvx.so' -not -path '*/build/*' | head -1)") | |
| echo "Khronos libraries discovered in: $LIB_SRC" | |
| # Same approach as MIVisionX: copy all libopenvx* / libvxu* entries | |
| # preserving symlinks so ld.so can follow the SONAME chain. | |
| find "$LIB_SRC" -maxdepth 1 -name 'libopenvx*' -exec cp -P {} khronos-stage/lib/ \; | |
| find "$LIB_SRC" -maxdepth 1 -name 'libvxu*' -exec cp -P {} khronos-stage/lib/ \; | |
| cp -r /tmp/khronos-src/api-docs/include/. khronos-stage/include/ | |
| echo "--- staged lib ---" | |
| ls -la khronos-stage/lib | |
| echo "--- staged include (top-level) ---" | |
| ls -la khronos-stage/include | |
| { | |
| echo "lib_dir=$(pwd)/khronos-stage/lib" | |
| echo "include_dir=$(pwd)/khronos-stage/include" | |
| } >> "$GITHUB_OUTPUT" | |
| - name: Build openvx-mark (smoke) | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-smoke | |
| cd build-smoke | |
| cmake \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DOPENVX_INCLUDES=${{ steps.stage.outputs.include_dir }} \ | |
| -DOPENVX_LIB_DIR=${{ steps.stage.outputs.lib_dir }} \ | |
| .. | |
| cmake --build . -j$(nproc) | |
| # Khronos sample is a CTS-conformant reference impl that ships | |
| # both the Vision (42 kernels) and Enhanced Vision (19 kernels) | |
| # profiles, so we exercise `vision,enhanced_vision,framework` at | |
| # smoke time. `continue-on-error: true` keeps the artifact upload | |
| # alive if any specific kernel crashes mid-run; the comparison job | |
| # downstream handles whichever JSON files actually got produced. | |
| - name: Run smoke benchmark (vision + enhanced_vision + framework, VGA × 5 iters) | |
| continue-on-error: true | |
| run: | | |
| set -eo pipefail | |
| cd build-smoke | |
| export LD_LIBRARY_PATH=${{ steps.stage.outputs.lib_dir }}:${LD_LIBRARY_PATH:-} | |
| ./openvx-mark --validate-timing | |
| ./openvx-mark --feature-set vision,enhanced_vision,framework \ | |
| --resolution VGA --iterations 5 --warmup 1 --threads 1 \ | |
| --output-dir smoke-results | |
| - name: Upload Khronos sample artifact | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: impl-khronos-sample | |
| path: khronos-stage/ | |
| retention-days: 1 | |
| - name: Upload Khronos sample smoke results | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: smoke-results-khronos-sample | |
| path: build-smoke/smoke-results/ | |
| if-no-files-found: ignore | |
| # -------------------------------------------------------------------------- | |
| # Phase 1 — rustVX (Rust OpenVX implementation) | |
| # | |
| # rustVX ships a single libopenvx_ffi.so that exports the full vx*/vxu* | |
| # symbol set. openvx-mark's CMake uses find_library(NAMES openvx) and | |
| # find_library(NAMES vxu) — so we symlink the two classic Khronos lib | |
| # names to the FFI .so during staging, without modifying rustVX's own | |
| # build output. | |
| # | |
| # SIMD config: AVX2 + `-C target-cpu=x86-64-v3`, matching what rustVX's | |
| # own CI ships. We deliberately skip the alignment-pad RUSTFLAGS used in | |
| # rustVX's PR-vs-main perf gate — those exist to make rustVX-vs-rustVX | |
| # bench numbers invariant to .text shifts, which is irrelevant for the | |
| # vendor-vs-vendor comparison this workflow runs. | |
| # -------------------------------------------------------------------------- | |
| build-rustvx: | |
| name: Build rustVX + smoke test | |
| runs-on: ubuntu-22.04 | |
| steps: | |
| - name: Checkout openvx-mark | |
| uses: actions/checkout@v4 | |
| - name: Install dependencies | |
| run: | | |
| sudo apt-get update | |
| sudo apt-get install -y build-essential cmake git | |
| - name: Install Rust toolchain | |
| run: | | |
| set -euo pipefail | |
| curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \ | |
| | sh -s -- -y --default-toolchain stable | |
| source "$HOME/.cargo/env" | |
| rustc --version | |
| cargo --version | |
| - name: Build rustVX (release, AVX2) | |
| run: | | |
| set -euo pipefail | |
| source "$HOME/.cargo/env" | |
| git clone --depth 1 \ | |
| https://github.com/kiritigowda/rustVX.git /tmp/rustvx-src | |
| cd /tmp/rustvx-src | |
| case "$(uname -m)" in | |
| x86_64|amd64) | |
| FEATURES="openvx-core/sse2 openvx-core/avx2 openvx-vision/sse2 openvx-vision/avx2" | |
| export RUSTFLAGS="-C target-cpu=x86-64-v3" | |
| ;; | |
| aarch64|arm64) | |
| FEATURES="openvx-core/neon openvx-vision/neon" | |
| export RUSTFLAGS="" | |
| ;; | |
| *) | |
| FEATURES="" | |
| export RUSTFLAGS="" | |
| ;; | |
| esac | |
| echo "Architecture : $(uname -m)" | |
| echo "Cargo features: ${FEATURES:-<none>}" | |
| echo "RUSTFLAGS : ${RUSTFLAGS:-<none>}" | |
| if [ -n "$FEATURES" ]; then | |
| cargo build --release -p openvx-ffi --features "$FEATURES" | |
| else | |
| cargo build --release -p openvx-ffi | |
| fi | |
| - name: Stage rustVX artifact (with libopenvx / libvxu symlinks) | |
| id: stage | |
| run: | | |
| set -euo pipefail | |
| mkdir -p rustvx-stage/lib rustvx-stage/include | |
| cp /tmp/rustvx-src/target/release/libopenvx_ffi.so rustvx-stage/lib/ | |
| # Classic Khronos library names so openvx-mark's find_library picks | |
| # them up. upload-artifact@v4 packs a zip and follows symlinks, so | |
| # the comparison job sees libopenvx.so / libvxu.so as regular copies. | |
| ( | |
| cd rustvx-stage/lib | |
| ln -sf libopenvx_ffi.so libopenvx.so | |
| ln -sf libopenvx_ffi.so libvxu.so | |
| ) | |
| cp -r /tmp/rustvx-src/include/. rustvx-stage/include/ | |
| echo "--- staged lib ---" | |
| ls -la rustvx-stage/lib | |
| echo "--- staged include (top-level) ---" | |
| ls -la rustvx-stage/include | |
| { | |
| echo "lib_dir=$(pwd)/rustvx-stage/lib" | |
| echo "include_dir=$(pwd)/rustvx-stage/include" | |
| } >> "$GITHUB_OUTPUT" | |
| - name: Build openvx-mark (smoke) | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-smoke | |
| cd build-smoke | |
| cmake \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DOPENVX_INCLUDES=${{ steps.stage.outputs.include_dir }} \ | |
| -DOPENVX_LIB_DIR=${{ steps.stage.outputs.lib_dir }} \ | |
| .. | |
| cmake --build . -j$(nproc) | |
| # rustVX is CTS-conformant for both Vision (5923/5923) and | |
| # Enhanced Vision (1235/1235), so we exercise the full | |
| # `vision,enhanced_vision,framework` surface at smoke time. This | |
| # is the impl that gives the headline "all 19 enhanced_vision | |
| # kernels produce real measurements" cell in the comparison | |
| # table — every other OpenVX backend either omits the profile | |
| # (MIVisionX) or has known per-kernel quirks. | |
| - name: Run smoke benchmark (vision + enhanced_vision + framework, VGA × 5 iters) | |
| continue-on-error: true | |
| run: | | |
| set -eo pipefail | |
| cd build-smoke | |
| export LD_LIBRARY_PATH=${{ steps.stage.outputs.lib_dir }}:${LD_LIBRARY_PATH:-} | |
| ./openvx-mark --validate-timing | |
| ./openvx-mark --feature-set vision,enhanced_vision,framework \ | |
| --resolution VGA --iterations 5 --warmup 1 --threads 1 \ | |
| --output-dir smoke-results | |
| - name: Upload rustVX artifact | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: impl-rustvx | |
| path: rustvx-stage/ | |
| retention-days: 1 | |
| - name: Upload rustVX smoke results | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: smoke-results-rustvx | |
| path: build-smoke/smoke-results/ | |
| if-no-files-found: ignore | |
| # -------------------------------------------------------------------------- | |
| # Phase 1 — OpenCV baseline (companion binary `opencv-mark`) | |
| # | |
| # OpenCV is the de facto vision baseline. This job exists so we can answer | |
| # "does adopting OpenVX actually pay off vs the cv:: code I already have?" | |
| # at the per-kernel level, on the same CI hardware as every OpenVX impl. | |
| # | |
| # Differs from the OpenVX impl jobs in two ways: | |
| # 1. OpenCV is apt-installable (no from-source build), so this job is | |
| # much shorter — install, configure parent CMake, build, smoke. | |
| # 2. There is no impl-tarball staging step. opencv-mark IS the binary | |
| # that runs the OpenCV-side measurements; there is no separate | |
| # "link openvx-mark against this libopenvx.so" rebuild downstream. | |
| # The Phase 2 comparison job re-runs opencv-mark itself (after a | |
| # fresh apt-install of OpenCV) for strict same-runner fairness vs | |
| # the per-impl benches — see compare job's `Build & bench | |
| # opencv-mark` step. | |
| # | |
| # The smoke run here is fast feedback only (catches build/link breakage | |
| # in <1 min on every PR); the comparison-grade FHD × 20 iter benchmark | |
| # lives in Phase 2 alongside the OpenVX impl benches. | |
| # -------------------------------------------------------------------------- | |
| build-opencv: | |
| name: Build opencv-mark (OpenCV baseline) + smoke test | |
| runs-on: ubuntu-22.04 | |
| steps: | |
| - name: Checkout openvx-mark | |
| uses: actions/checkout@v4 | |
| - name: Install dependencies (OpenCV 4 from apt) | |
| run: | | |
| sudo apt-get update | |
| sudo apt-get install -y build-essential cmake git python3 \ | |
| libopencv-dev | |
| # Sanity-print the OpenCV version that pkg-config sees so | |
| # comparison reports later can be cross-referenced against | |
| # exactly this version string. | |
| pkg-config --modversion opencv4 || true | |
| - name: Configure & build opencv-mark | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-opencv | |
| cd build-opencv | |
| # Parent CMake auto-includes opencv-mark/ when OpenCV is found. | |
| # No OPENVX_* flags needed — opencv-mark has no OpenVX dep. | |
| cmake -DCMAKE_BUILD_TYPE=Release .. | |
| cmake --build . --target opencv-mark -j$(nproc) | |
| # Fail loudly if the binary somehow didn't get produced (e.g. | |
| # OpenCV detection silently no-op'd). This is the exact failure | |
| # mode that PR #1's first CI run failed to catch. | |
| test -x opencv-mark/opencv-mark \ | |
| || { echo "ERROR: opencv-mark binary not built — OpenCV likely not detected by CMake"; exit 1; } | |
| # `--help` doubles as a version probe: it prints the opencv-mark | |
| # version line and the linked OpenCV version up top. PR #1's CLI | |
| # does not implement a dedicated `--version` flag yet. | |
| ./opencv-mark/opencv-mark --help | head -3 | |
| # Same shape as the OpenVX-impl smokes (VGA × 5 iters, 1 warmup) | |
| # so timing noise stays comparable. Not continue-on-error — | |
| # opencv-mark has no impl-side quirks to tolerate; if a kernel | |
| # breaks here it's our bug. | |
| # | |
| # Feature-set: `vision,enhanced_vision`. opencv-mark has 1:1 | |
| # coverage of both profiles (42 vision + 19 enhanced = 61 | |
| # kernels) — that's the entire OpenCV-side surface this CI | |
| # exercises. `framework` is intentionally omitted (OpenCV has | |
| # no graph runtime to measure; the framework benches that | |
| # depend on `vxProcessGraph` semantics are OpenVX-only). | |
| - name: Run smoke benchmark (vision + enhanced_vision, VGA × 5 iters) | |
| run: | | |
| set -eo pipefail | |
| cd build-opencv | |
| # Timer self-test up front — same gate that runs in the | |
| # Phase 2 compare job. Catches a borked runner clock at | |
| # smoke time so we don't waste a full FHD bench cycle. | |
| ./opencv-mark/opencv-mark --validate-timing | |
| # `--threads 1` for symmetry with the smokes that run | |
| # against single-threaded OpenVX impls — keeps the smoke | |
| # comparable in shape to the cross-impl ones, even though | |
| # the smoke itself is just a "did it build & did it run?" | |
| # check, not a perf claim. | |
| ./opencv-mark/opencv-mark --feature-set vision,enhanced_vision \ | |
| --resolution VGA --iterations 5 --warmup 1 --threads 1 \ | |
| --output-dir smoke-results | |
| - name: Upload opencv-mark smoke results | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: smoke-results-opencv | |
| path: build-opencv/smoke-results/ | |
| if-no-files-found: ignore | |
| # -------------------------------------------------------------------------- | |
| # Phase 2 — Pairwise comparison | |
| # | |
| # Pulls all three OpenVX implementation artifacts onto the same runner, | |
| # plus apt-installs OpenCV, so every benchmark is exercised on identical | |
| # hardware. Builds openvx-mark once per OpenVX impl (against this commit's | |
| # source tree, not pre-built artifacts — keeps the comparison binary | |
| # identical apart from the linked OpenVX lib), builds opencv-mark from | |
| # the same source tree, runs the full feature-set bench against each, | |
| # and emits six pairwise comparison reports: | |
| # | |
| # OpenVX-vs-OpenVX (3): | |
| # * MIVisionX over Khronos sample — AMD over reference | |
| # * MIVisionX over rustVX — AMD over Rust impl | |
| # * rustVX over Khronos sample — Rust impl over reference | |
| # | |
| # OpenVX-vs-OpenCV (3) — "does adopting OpenVX pay off?": | |
| # * MIVisionX over OpenCV — best-tuned OpenVX vs cv:: | |
| # * Khronos sample over OpenCV — reference OpenVX vs cv:: | |
| # * rustVX over OpenCV — Rust OpenVX vs cv:: | |
| # | |
| # The job combines `if: always()`, per-download `continue-on-error`, | |
| # and per-bench `if: always() && steps.detect...` so that a single | |
| # failed build still surfaces the comparison signal for whichever | |
| # other impls are available, instead of losing all visibility. | |
| # -------------------------------------------------------------------------- | |
| compare: | |
| name: Pairwise comparison (MIVisionX, Khronos, rustVX, OpenCV) | |
| runs-on: ubuntu-22.04 | |
| needs: | |
| - build-mivisionx | |
| - build-khronos-sample | |
| - build-rustvx | |
| - build-opencv | |
| if: always() | |
| steps: | |
| - name: Checkout openvx-mark | |
| uses: actions/checkout@v4 | |
| - name: Install dependencies | |
| run: | | |
| sudo apt-get update | |
| # libopencv-dev is needed so the Phase 2 `Build & bench | |
| # opencv-mark` step can re-link opencv-mark on this runner. | |
| # Strictly same-hardware fairness vs the per-impl benches. | |
| sudo apt-get install -y build-essential cmake git python3 \ | |
| libopencv-dev | |
| pkg-config --modversion opencv4 || true | |
| - name: Download MIVisionX artifact | |
| uses: actions/download-artifact@v4 | |
| with: | |
| name: impl-mivisionx | |
| path: ${{ github.workspace }}/impl/mivisionx | |
| continue-on-error: true | |
| - name: Download Khronos sample artifact | |
| uses: actions/download-artifact@v4 | |
| with: | |
| name: impl-khronos-sample | |
| path: ${{ github.workspace }}/impl/khronos | |
| continue-on-error: true | |
| - name: Download rustVX artifact | |
| uses: actions/download-artifact@v4 | |
| with: | |
| name: impl-rustvx | |
| path: ${{ github.workspace }}/impl/rustvx | |
| continue-on-error: true | |
| - name: Detect available implementations | |
| id: detect | |
| run: | | |
| set -euo pipefail | |
| for impl in mivisionx khronos rustvx; do | |
| lib="${{ github.workspace }}/impl/$impl/lib/libopenvx.so" | |
| if [ -e "$lib" ]; then | |
| echo "$impl: AVAILABLE ($lib)" | |
| chmod -R u+rwX "${{ github.workspace }}/impl/$impl/lib" | |
| echo "${impl}=true" >> "$GITHUB_OUTPUT" | |
| else | |
| echo "$impl: MISSING (artifact download failed or build job did not produce it)" | |
| echo "${impl}=false" >> "$GITHUB_OUTPUT" | |
| fi | |
| done | |
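The detection loop can be exercised locally by pointing it at temporary stand-ins for the workspace and for `$GITHUB_OUTPUT`. In this sketch the variables `root` and `out` are hypothetical substitutes, and only the rustVX artifact is pretended to have downloaded.

```shell
#!/usr/bin/env bash
set -euo pipefail

root=$(mktemp -d)   # stands in for the GitHub workspace directory
out=$(mktemp)       # stands in for the file named by $GITHUB_OUTPUT
trap 'rm -rf "$root" "$out"' EXIT

# Pretend only the rustVX artifact downloaded successfully.
mkdir -p "$root/impl/rustvx/lib"
: > "$root/impl/rustvx/lib/libopenvx.so"

# Same presence test as the detect step, one key=value line per impl.
for impl in mivisionx khronos rustvx; do
  if [ -e "$root/impl/$impl/lib/libopenvx.so" ]; then
    echo "${impl}=true" >> "$out"
  else
    echo "${impl}=false" >> "$out"
  fi
done

cat "$out"
```

Each `key=value` line becomes a step output, which is what later conditions such as `steps.detect.outputs.rustvx == 'true'` read.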
| # ----- Per-impl build + benchmark (FHD, 20 iter, 5 warmup) ----- | |
| # | |
| # Each per-impl bench uses `if: always() && steps.detect...` because | |
| # GitHub Actions implicitly applies `success()` to any `if:` that | |
| # contains no status-check function, meaning a crash in the MIVisionX | |
| # bench would skip the Khronos / rustVX bench steps entirely and we'd | |
| # lose all comparison signal. With `always()` the three benches stay | |
| # independent and the comparison steps downstream handle whichever | |
| # JSON files actually got produced. | |
| # | |
| # `--threads 1` is passed EXPLICITLY (it's also the default — but | |
| # we want the CI compare config to be self-documenting). Rationale: | |
| # | |
| # * MIVisionX CPU backend, Khronos sample, and rustVX are all | |
| # fundamentally single-threaded per kernel — none of them have | |
| # an internal thread pool on the CPU path. | |
| # * OpenCV, by contrast, will happily spawn nproc threads via | |
| # TBB/OpenMP if left at its default. Without the `--threads 1` | |
| # pin, the OpenCV side would get an unfair (nproc)x parallelism | |
| # boost just from defaults — the comparison would no longer be | |
| # "OpenVX kernel vs OpenCV kernel" but "1-thread OpenVX vs | |
| # n-thread OpenCV". `--threads 1` calls cv::setNumThreads(1) | |
| # for opencv-mark and sets OMP_NUM_THREADS=1 in the env for | |
| # anything OpenMP-using downstream. | |
| # | |
| # Feature set is per-impl (see the architecture comment block | |
| # at the top of this file for the full policy): | |
| # * MIVisionX — `vision,framework` (no enhanced_vision; | |
| # AMD's runtime doesn't export the APIs) | |
| # * Khronos sample — `vision,enhanced_vision,framework` | |
| # * rustVX — `vision,enhanced_vision,framework` | |
| # * opencv-mark — `vision,enhanced_vision` (no framework; | |
| # OpenCV has no graph runtime to measure) | |
| # `compare_reports.py` joins by (name, mode, resolution) and | |
| # silently drops rows not on both sides, so enhanced_vision | |
| # rows naturally appear in pairs where both impls produced them | |
| # (Khronos↔OpenCV, rustVX↔OpenCV, Khronos↔rustVX) and are absent | |
| # from MIVisionX↔* pairs. | |
| - name: Build & bench against MIVisionX (single-threaded, FHD × 20) | |
| if: always() && steps.detect.outputs.mivisionx == 'true' | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-mivisionx | |
| cd build-mivisionx | |
| cmake \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DOPENVX_INCLUDES=${{ github.workspace }}/impl/mivisionx/include \ | |
| -DOPENVX_LIB_DIR=${{ github.workspace }}/impl/mivisionx/lib \ | |
| .. | |
| cmake --build . -j$(nproc) | |
| export LD_LIBRARY_PATH=${{ github.workspace }}/impl/mivisionx/lib:${LD_LIBRARY_PATH:-} | |
| # Timer self-test first — gates the rest of the bench. If the | |
| # runner clock is sloppy, our timing numbers are meaningless | |
| # and we'd rather know about it now than ship bad data. | |
| ./openvx-mark --validate-timing | |
| ./openvx-mark --feature-set vision,framework \ | |
| --resolution FHD --iterations 20 --warmup 5 --threads 1 \ | |
| --output-dir results | |
| # Sentinel-set dump for cross-impl numerical verification — | |
| # see scripts/cross_verify_outputs.py. Runs the kernel set | |
| # ONCE (no timing, no warmup) so it's cheap, then the | |
| # downstream verify step compares this dump against the | |
| # OpenCV dump for correctness. | |
| ./openvx-mark --dump-outputs dump-mivisionx --seed 42 | |
| - name: Build & bench against Khronos sample (single-threaded, FHD × 20) | |
| if: always() && steps.detect.outputs.khronos == 'true' | |
| # `continue-on-error: true` so a crash inside a single | |
| # enhanced_vision kernel (the reference impl has known per- | |
| # kernel quirks under heavy use) doesn't take out the | |
| # comparison signal for whichever kernels did complete. | |
| # `openvx-mark` only writes its JSON at end-of-run, but the | |
| # surrounding job steps still upload artifacts as long as we | |
| # reach them. | |
| continue-on-error: true | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-khronos | |
| cd build-khronos | |
| cmake \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DOPENVX_INCLUDES=${{ github.workspace }}/impl/khronos/include \ | |
| -DOPENVX_LIB_DIR=${{ github.workspace }}/impl/khronos/lib \ | |
| .. | |
| cmake --build . -j$(nproc) | |
| export LD_LIBRARY_PATH=${{ github.workspace }}/impl/khronos/lib:${LD_LIBRARY_PATH:-} | |
| ./openvx-mark --validate-timing | |
| ./openvx-mark --feature-set vision,enhanced_vision,framework \ | |
| --resolution FHD --iterations 20 --warmup 5 --threads 1 \ | |
| --output-dir results | |
| ./openvx-mark --dump-outputs dump-khronos --seed 42 || true | |
| - name: Build & bench against rustVX (single-threaded, FHD × 20) | |
| if: always() && steps.detect.outputs.rustvx == 'true' | |
| # rustVX is CTS-conformant for both Vision (5923/5923) and | |
| # Enhanced Vision (1235/1235), so all 42 + 19 kernels should | |
| # actually produce real measurements here. This row is the | |
| # headline cell for "what does a fully-conformant OpenVX impl | |
| # look like vs OpenCV on the same hardware?". | |
| # `continue-on-error: true` is a belt-and-suspenders safety | |
| # in case any one kernel surfaces a regression mid-bench — | |
| # the artifact upload (which downstream comparisons depend | |
| # on) must still happen. | |
| continue-on-error: true | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-rustvx | |
| cd build-rustvx | |
| cmake \ | |
| -DCMAKE_BUILD_TYPE=Release \ | |
| -DOPENVX_INCLUDES=${{ github.workspace }}/impl/rustvx/include \ | |
| -DOPENVX_LIB_DIR=${{ github.workspace }}/impl/rustvx/lib \ | |
| .. | |
| cmake --build . -j$(nproc) | |
| export LD_LIBRARY_PATH=${{ github.workspace }}/impl/rustvx/lib:${LD_LIBRARY_PATH:-} | |
| ./openvx-mark --validate-timing | |
| ./openvx-mark --feature-set vision,enhanced_vision,framework \ | |
| --resolution FHD --iterations 20 --warmup 5 --threads 1 \ | |
| --output-dir results | |
| ./openvx-mark --dump-outputs dump-rustvx --seed 42 || true | |
| # opencv-mark has no OpenVX dependency, so no OPENVX_* flags and no | |
| # detect-step gate — it only needs `libopencv-dev` (already installed | |
| # above). Same FHD × 20 iter × 5 warmup × --threads 1 shape as the | |
| # OpenVX benches so per-kernel speedups are directly comparable. | |
| # | |
| # Feature-set is `vision,enhanced_vision` — opencv-mark has 1:1 | |
| # coverage of both profiles (79 + 19 = 98 OpenCV-side benchmarks | |
| # total). `framework` is intentionally omitted because OpenCV has | |
| # no graph runtime to measure (the framework benches that depend | |
| # on `vxProcessGraph` / virtual-image fusion semantics are | |
| # OpenVX-only by design). `compare_reports.py` ignores rows that | |
| # only exist on one side, so framework rows naturally don't | |
| # appear in OpenCV pairwise tables. | |
| - name: Build & bench opencv-mark (single-threaded, FHD × 20) | |
| if: always() | |
| id: bench_opencv | |
| run: | | |
| set -euo pipefail | |
| mkdir -p build-opencv-bench | |
| cd build-opencv-bench | |
| cmake -DCMAKE_BUILD_TYPE=Release .. | |
| cmake --build . --target opencv-mark -j$(nproc) | |
| test -x opencv-mark/opencv-mark \ | |
| || { echo "ERROR: opencv-mark not built — OpenCV detection failed in compare job"; exit 1; } | |
| ./opencv-mark/opencv-mark --validate-timing | |
| ./opencv-mark/opencv-mark --feature-set vision,enhanced_vision \ | |
| --resolution FHD --iterations 20 --warmup 5 --threads 1 \ | |
| --output-dir results | |
| ./opencv-mark/opencv-mark --dump-outputs dump-opencv --seed 42 | |
| # ----- Cross-impl numerical verification ----- | |
| # | |
| # We have one dump-* directory per impl that produced a build. | |
| # Run scripts/cross_verify_outputs.py for each (opencv, openvx) | |
| # pair so a reviewer can see at a glance whether MIVisionX, | |
| # Khronos sample, and rustVX agree with OpenCV at the pixel | |
| # level — proves the timing comparison rows below are honest | |
| # apples-to-apples and not "OpenCV is faster because it's | |
| # silently computing the wrong thing". | |
| # | |
| # The verifier exits non-zero on any kernel exceeding its | |
| # per-kernel tolerance; we collect all three reports into the | |
| # step summary first, then emit a `::warning::` annotation at | |
| # the end if any report failed (the step itself stays green). | |
| # That way a single divergence on one impl doesn't hide the | |
| # other two impls' results. | |
| - name: Cross-impl output verification (OpenCV ↔ each OpenVX impl) | |
| if: always() | |
| run: | | |
| set -euo pipefail | |
| # numpy is the only Python dep — used by the verifier for | |
| # array compare + PSNR. apt's python3-numpy on ubuntu-22.04 | |
| # is fine and avoids a pip wheel download. | |
| sudo apt-get install -y python3-numpy | |
| mkdir -p comparisons | |
| OPENCV_DUMP=build-opencv-bench/dump-opencv | |
| { | |
| echo "" | |
| echo "---" | |
| echo "" | |
| echo "## Cross-impl numerical verification" | |
| echo "" | |
| echo "Sentinel kernel suite (VGA × 1 run, no timing) dumped by" | |
| echo "\`--dump-outputs\` on each binary; \`scripts/cross_verify_outputs.py\`" | |
| echo "loads both dumps and computes max-abs-diff + PSNR + exact-%" | |
| echo "per kernel. Tolerances are tuned per kernel (see \`RULES\` in" | |
| echo "the script). The \`_input_u8\` row checks that inputs are" | |
| echo "byte-identical; the other rows check per-kernel agreement." | |
| echo "" | |
| } >> "$GITHUB_STEP_SUMMARY" | |
| OVERALL=0 | |
| for impl in mivisionx khronos rustvx; do | |
| VX_DUMP="build-${impl}/dump-${impl}" | |
| if [ ! -d "$OPENCV_DUMP" ] || [ ! -d "$VX_DUMP" ]; then | |
| echo "skipping verify for $impl: missing dump dir ($VX_DUMP or $OPENCV_DUMP)" | |
| echo "_Skipped \`$impl\` verify — dump directory missing._" >> "$GITHUB_STEP_SUMMARY" | |
| continue | |
| fi | |
| set +e | |
| python3 scripts/cross_verify_outputs.py \ | |
| "$OPENCV_DUMP" "$VX_DUMP" \ | |
| --left-label "OpenCV" --right-label "${impl}" \ | |
| --json comparisons/cross-verify-${impl}.json \ | |
| >> "$GITHUB_STEP_SUMMARY" | |
| rc=$? | |
| set -e | |
| if [ "$rc" -ne 0 ]; then OVERALL=1; fi | |
| echo "" >> "$GITHUB_STEP_SUMMARY" | |
| done | |
| # Surface OVERALL as a step-level warning annotation: the job | |
| # stays green on a divergence (so reviewers still see the | |
| # timing comparison), and the per-impl cross-verify JSON in | |
| # comparisons/ is uploaded with the comparisons artifact below. | |
| if [ "$OVERALL" -ne 0 ]; then | |
| echo "::warning::Cross-impl verification flagged ≥1 divergence — see job summary" | |
| fi | |
| # ----- Pairwise comparisons ----- | |
| # | |
| # Each comparison is oriented as "<candidate> over <baseline>" so | |
| # the speedup column reads as `candidate / baseline` (>1.00x = | |
| # candidate is faster). The orientation is deliberate: | |
| # | |
| # OpenVX-vs-OpenVX trio — "how much faster is the more-tuned | |
| # impl than the reference": | |
| # * MIVisionX over Khronos sample (AMD over reference) | |
| # * MIVisionX over rustVX (AMD over Rust impl) | |
| # * rustVX over Khronos sample (Rust impl over reference) | |
| # | |
| # OpenVX-vs-OpenCV trio — "does adopting OpenVX pay off vs cv::": | |
| # * MIVisionX over OpenCV | |
| # * Khronos sample over OpenCV | |
| # * rustVX over OpenCV | |
| # | |
| # Mechanically, `scripts/compare_reports.py` computes | |
| # speedup = throughput(arg2) / throughput(arg1) | |
| # so the candidate is passed as the SECOND positional arg. | |
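| # | |
| # Worked example (illustrative numbers, not measured data): for | |
| # the pair "rustvx khronos" the call is | |
| # compare_reports.py <khronos report> <rustvx report> | |
| # so a kernel with throughput 100 in the Khronos report and 150 | |
| # in the rustVX report gets speedup = 150 / 100 = 1.50x, i.e. | |
| # rustVX (the candidate) is faster. | |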
| # | |
| # The step does two things: | |
| # 1. Runs `compare_reports.py` once per pair to produce a | |
| # per-kernel detail .md in comparisons/. These also become | |
| # the `benchmark-comparisons` artifact for downstream tools. | |
| # 2. Invokes `scripts/ci_pairwise_summary.py` once to render | |
| # an organized GitHub Step Summary — TL;DR speedup matrix | |
| # at top, two grouped headline tables, and the per-kernel | |
| # detail tables collapsed inside <details> blocks. See the | |
| # script docstring for the config schema; this used to be a | |
| # ~115-line bash + inline-Python block and rendered ~600 | |
| # lines into the summary. | |
| - name: Pairwise comparisons | |
| if: always() | |
| run: | | |
| set -euo pipefail | |
| mkdir -p comparisons | |
| # Per-impl JSON report paths (parallel arrays keyed by impl id). | |
| IDS=(mivisionx khronos rustvx opencv) | |
| PATHS=( | |
| "build-mivisionx/results/benchmark_results.json" | |
| "build-khronos/results/benchmark_results.json" | |
| "build-rustvx/results/benchmark_results.json" | |
| "build-opencv-bench/results/benchmark_results.json" | |
| ) | |
| LABELS=( | |
| "MIVisionX (AMD OpenVX)" | |
| "Khronos sample" | |
| "rustVX" | |
| "OpenCV" | |
| ) | |
| # The 6 pairs, "<candidate> <baseline>". Order matches the | |
| # rendered summary table order: OpenVX-vs-OpenCV (headline | |
| # question) first, then OpenVX-vs-OpenVX. | |
| PAIRS=( | |
| "mivisionx opencv" | |
| "khronos opencv" | |
| "rustvx opencv" | |
| "mivisionx khronos" | |
| "mivisionx rustvx" | |
| "rustvx khronos" | |
| ) | |
| # Phase 1 — per-kernel detail .md per pair where both inputs | |
| # exist. Pairs with a missing input are skipped here with a | |
| # log line; the summary script renders a friendly | |
| # "_Detail missing_" note for them inside the collapsed | |
| # <details> block. | |
| path_of() { | |
| for i in "${!IDS[@]}"; do | |
| if [ "${IDS[$i]}" = "$1" ]; then echo "${PATHS[$i]}"; return; fi | |
| done | |
| } | |
| for pair in "${PAIRS[@]}"; do | |
| read -r CAND BASE <<< "$pair" | |
| CAND_PATH=$(path_of "$CAND") | |
| BASE_PATH=$(path_of "$BASE") | |
| OUT="comparisons/${CAND}-over-${BASE}" | |
| if [ -f "$CAND_PATH" ] && [ -f "$BASE_PATH" ]; then | |
| python3 scripts/compare_reports.py "$BASE_PATH" "$CAND_PATH" --output "$OUT" | |
| else | |
| echo "Skipping detail for ${CAND}-over-${BASE}: missing ${CAND_PATH} or ${BASE_PATH}" | |
| fi | |
| done | |
| # Phase 2 — render the organized step summary. The config | |
| # below is the only place pair-grouping & intent text lives; | |
| # the helper handles matrix rendering, headline tables, and | |
| # the collapsed <details> blocks. | |
| cat > /tmp/pairwise-config.json <<'JSON' | |
| { | |
| "reports": { | |
| "mivisionx": {"label": "MIVisionX (AMD OpenVX)", "path": "build-mivisionx/results/benchmark_results.json"}, | |
| "khronos": {"label": "Khronos sample", "path": "build-khronos/results/benchmark_results.json"}, | |
| "rustvx": {"label": "rustVX", "path": "build-rustvx/results/benchmark_results.json"}, | |
| "opencv": {"label": "OpenCV", "path": "build-opencv-bench/results/benchmark_results.json"} | |
| }, | |
| "groups": [ | |
| { | |
| "title": "OpenVX-vs-OpenCV — does adopting OpenVX pay off vs cv::?", | |
| "intent": "Speedup reads as `<OpenVX impl> / OpenCV`. Values >1.00x mean adopting that OpenVX impl pays off vs writing the equivalent directly in OpenCV — the headline question this comparison phase exists to answer. Ordered most-tuned (MIVisionX) → reference (Khronos sample) → Rust impl (rustVX) so the table walks the realistic best→worst range of the trade-off.", | |
| "pairs": [["mivisionx", "opencv"], ["khronos", "opencv"], ["rustvx", "opencv"]] | |
| }, | |
| { | |
| "title": "OpenVX-vs-OpenVX — cross-implementation", | |
| "intent": "Speedup reads as `<candidate> / <baseline>`. MIVisionX (AMD, most-tuned) is compared against both other OpenVX impls (Khronos sample and rustVX), then rustVX against the Khronos sample (Rust impl over reference).", | |
| "pairs": [["mivisionx", "khronos"], ["mivisionx", "rustvx"], ["rustvx", "khronos"]] | |
| } | |
| ], | |
| "detail_dir": "comparisons" | |
| } | |
| JSON | |
| python3 scripts/ci_pairwise_summary.py --config /tmp/pairwise-config.json \ | |
| >> "$GITHUB_STEP_SUMMARY" | |
| echo "--- comparison artifacts ---" | |
| ls -la comparisons/ || true | |
| - name: Upload per-impl benchmark results | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: benchmark-results | |
| path: | | |
| build-mivisionx/results/ | |
| build-khronos/results/ | |
| build-rustvx/results/ | |
| build-opencv-bench/results/ | |
| if-no-files-found: ignore | |
| - name: Upload pairwise comparisons | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: benchmark-comparisons | |
| path: comparisons/ | |
| if-no-files-found: ignore | |
| # Sentinel kernel dumps — uploaded so a reviewer can re-run | |
| # `scripts/cross_verify_outputs.py` locally against any pair | |
| # without re-running the whole CI build, and so the raw .bin | |
| # files are inspectable after the fact for any divergence the | |
| # verifier flagged. | |
| - name: Upload sentinel output dumps | |
| if: always() | |
| uses: actions/upload-artifact@v4 | |
| with: | |
| name: cross-verify-dumps | |
| path: | | |
| build-mivisionx/dump-mivisionx/ | |
| build-khronos/dump-khronos/ | |
| build-rustvx/dump-rustvx/ | |
| build-opencv-bench/dump-opencv/ | |
| if-no-files-found: ignore |