Commit 38fdfb1

Framework Mark v1.0 — graph framework benchmarks, OpenVX Framework Score (#13)
First major release of openvx-mark. Adds a vendor-neutral suite of *framework benchmarks* that measure what only the OpenVX graph runtime can do (graph-vs-immediate dividend, virtual-image fusion, scheduling parallelism, async dispatch overhead and concurrency, verify cost vs depth, per-node VX_NODE_PERFORMANCE attribution), plus a single-number OpenVX Framework Score that summarises the framework dividend across scenarios.

Five scenarios:
- GraphDividend_Box3x3_x4 / GraphDividend_MixedFilters
- VerifyChain_Box3x3 (sweeps --framework-chain-depths)
- ParallelBranches_Box3x3 (K=4 independent branches)
- Async_Single_Box3x3_x4 (vxScheduleGraph overhead)
- Async_Concurrent_Box3x3_x2 (graph overlap)

Plus per-node VX_NODE_PERFORMANCE attribution on graph_dividend chains (node_count, node_sum_ms, graph_perf_ms, fusion_ratio).

OpenVX Framework Score: equal-weight geomean of graph_speedup, virtual_dividend, parallelism_efficiency, concurrency_speedup. >1.0 means the OpenVX graph framework adds aggregate value over a kernel-only baseline.

Cross-vendor comparison: both C++ --compare and Python compare_reports.py add a Framework Score row plus a direction-aware Framework Metrics Comparison table where ratio >1.00 always means the second implementation is better.

Backward compatible: kernel results emit an empty framework_metrics array; existing JSON consumers continue working unchanged. Default ./openvx-mark runs are byte-identical to before; framework benchmarks are opt-in via --feature-set framework or --feature-set everything.

CI runs framework benchmarks for every vendor in a dedicated step and posts the headline metrics to the GitHub Actions job summary.

See CHANGELOG.md for the full release notes and v2 backlog.

Constituent PRs: #5 plumbing, #6 graph_dividend, #7 verify_chain, #8 parallel_branches, #9 async_streaming, #10 Framework Score, #11 per-node attribution, #12 rollup, #13 doc finalization (this PR).
1 parent d678161 commit 38fdfb1
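
The Framework Score described in the commit message can be sketched in a few lines. This is a minimal illustration of an equal-weight geometric mean over the four named metrics, not the project's actual C++ or `compare_reports.py` code. The function name `framework_score`, the `(name, value)` input shape, the choice to skip non-positive values, and the sample numbers are all assumptions made for the example.

```python
import math

# Metric names the commit message lists as Framework Score inputs.
SCORE_METRICS = {
    "graph_speedup",
    "virtual_dividend",
    "parallelism_efficiency",
    "concurrency_speedup",
}


def framework_score(metrics):
    """Equal-weight geometric mean of the score-eligible metrics.

    `metrics` is an iterable of (name, value) pairs. Returns (score, count),
    or (None, 0) when no eligible value is present. Skipping non-positive
    values is an assumption of this sketch: a geomean is undefined for them.
    """
    values = [v for name, v in metrics if name in SCORE_METRICS and v > 0]
    if not values:
        return None, 0
    # Averaging logarithms avoids overflow from multiplying many ratios.
    log_mean = sum(math.log(v) for v in values) / len(values)
    return math.exp(log_mean), len(values)


# Hypothetical values for illustration only, not measured results.
sample = [
    ("graph_speedup", 2.0),
    ("virtual_dividend", 1.0),
    ("parallelism_efficiency", 0.5),
    ("concurrency_speedup", 1.0),
    ("async_overhead_ratio", 3.0),  # lower-is-better, so not part of the score
]
score, count = framework_score(sample)
print(f"{score:.2f}x (geomean of {count} framework metrics)")
```

Because the mean is geometric, a 2.0x gain in one metric and a 0.5x loss in another cancel exactly, which is why the sample above lands on 1.00x.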

15 files changed

Lines changed: 1985 additions & 20 deletions

.github/workflows/ci.yml

Lines changed: 65 additions & 1 deletion
@@ -3,8 +3,12 @@ name: CI
 on:
   push:
     branches: [main]
+  # Run CI on pull requests targeting any base branch, not just main.
+  # This is necessary for stacked PR workflows where a PR's base is
+  # another feature branch (e.g. an umbrella branch or a previous PR
+  # in a stack); without this, only the final umbrella -> main PR
+  # would get CI coverage.
   pull_request:
-    branches: [main]

 jobs:
   benchmark-khronos-mivisionx:
@@ -46,6 +50,15 @@ jobs:
           export LD_LIBRARY_PATH=${{ steps.khronos.outputs.lib_dir }}:$LD_LIBRARY_PATH
           ./openvx-mark --resolution VGA --iterations 10 --warmup 3

+      - name: Run framework benchmarks (Khronos)
+        if: always()
+        run: |
+          cd build-khronos
+          export LD_LIBRARY_PATH=${{ steps.khronos.outputs.lib_dir }}:$LD_LIBRARY_PATH
+          ./openvx-mark --feature-set framework --resolution VGA \
+            --iterations 10 --warmup 3 --quiet \
+            --output-dir framework_results 2>&1 | tee ../khronos-framework.log
+
       # --- MIVisionX (AMD OpenVX) ---
       - name: Build MIVisionX (CPU backend)
         run: |
@@ -77,6 +90,15 @@ jobs:
           export LD_LIBRARY_PATH=${{ steps.mivisionx.outputs.lib_dir }}:$LD_LIBRARY_PATH
           ./openvx-mark --resolution VGA --iterations 10 --warmup 3

+      - name: Run framework benchmarks (MIVisionX)
+        if: always()
+        run: |
+          cd build-mivisionx
+          export LD_LIBRARY_PATH=${{ steps.mivisionx.outputs.lib_dir }}:$LD_LIBRARY_PATH
+          ./openvx-mark --feature-set framework --resolution VGA \
+            --iterations 10 --warmup 3 --quiet \
+            --output-dir framework_results 2>&1 | tee ../mivisionx-framework.log
+
       # --- Compare Results ---
       - name: Compare benchmark results
         run: |
@@ -98,6 +120,32 @@ jobs:
             cat comparison.md >> "$GITHUB_STEP_SUMMARY"
           fi

+      - name: Post framework benchmarks to job summary
+        if: always()
+        run: |
+          {
+            echo "## Framework Benchmarks"
+            echo ""
+            for vendor in khronos mivisionx; do
+              log="${vendor}-framework.log"
+              if [ -f "$log" ] && grep -q "Framework Benchmarks" "$log"; then
+                echo "### ${vendor}"
+                echo ""
+                echo '```'
+                # Print from the "Framework Benchmarks" header line through
+                # the closing ===== separator that the binary always emits.
+                awk '/Framework Benchmarks \([0-9]+\):/{flag=1} flag' "$log"
+                echo '```'
+                echo ""
+              else
+                echo "### ${vendor}"
+                echo ""
+                echo "_No framework benchmarks registered (or all skipped) at this commit._"
+                echo ""
+              fi
+            done
+          } >> "$GITHUB_STEP_SUMMARY"
+
       - name: Upload Khronos results
         if: always()
         uses: actions/upload-artifact@v4
@@ -114,6 +162,22 @@ jobs:
           path: build-mivisionx/benchmark_results/
           if-no-files-found: ignore

+      - name: Upload Khronos framework results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: framework-results-khronos-sample
+          path: build-khronos/framework_results/
+          if-no-files-found: ignore
+
+      - name: Upload MIVisionX framework results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: framework-results-mivisionx-cpu
+          path: build-mivisionx/framework_results/
+          if-no-files-found: ignore
+
       - name: Upload comparison report
         if: always()
         uses: actions/upload-artifact@v4
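
The log-extraction logic in the "Post framework benchmarks to job summary" step can be restated as a small Python sketch to make its behaviour easier to reason about. The header regex is taken from the `awk` command in the workflow; the function name `extract_framework_section` and the sample log lines are hypothetical and do not reproduce real `openvx-mark` output.

```python
import re

# Same pattern the workflow's awk command uses to find the section header.
HEADER = re.compile(r"Framework Benchmarks \([0-9]+\):")


def extract_framework_section(log_text):
    """Return every line from the first header match through end of log.

    Mirrors `awk '/.../{flag=1} flag'`: once the flag is set it is never
    cleared, so output runs to the last line of the file.
    """
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if HEADER.search(line):
            return lines[index:]
    return []


# Hypothetical log text for illustration; the real format may differ.
sample_log = "\n".join([
    "warming up",
    "Framework Benchmarks (2):",
    "  ExampleScenario  example_metric  1.00",
    "=====",
])
for line in extract_framework_section(sample_log):
    print(line)
```

Two properties follow from this reading. The extraction ends at the closing separator only if that separator is the final line of the log, since nothing resets the flag. The `grep -q "Framework Benchmarks"` guard is also looser than the `awk` pattern, so a log containing the phrase without the `(N):` suffix would pass the guard and produce an empty fenced block.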

CHANGELOG.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# Changelog
+
+All notable changes to **openvx-mark** are documented here.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project follows semantic versioning where the major version tracks backward compatibility of the JSON report schema.
+
+## [Unreleased]
+
+## [1.0.0] — Framework Mark v1
+
+The first major openvx-mark release that benchmarks the OpenVX **graph framework** itself, not just individual kernels. Adds a new family of *framework benchmarks* — scenarios that exercise the OpenVX graph runtime (verification, virtual-image fusion, parallel scheduling, async dispatch, per-node attribution) and that **no per-kernel benchmark can surface** — alongside the existing 60-kernel suite, which is unchanged.
+
+### Added — Framework benchmarks (opt-in)
+
+Run with `--feature-set framework` (only framework scenarios) or `--feature-set everything` (kernels + framework). Default `./openvx-mark` runs are unchanged.
+
+- **`GraphDividend_Box3x3_x4`** and **`GraphDividend_MixedFilters`** — time the same N-node chain three ways (sum of immediate `vxu*` calls, graph with real intermediates, graph with virtual intermediates) and emit `sum_immediate_ms`, `graph_real_ms`, `graph_virtual_ms`, `graph_speedup`, `virtual_dividend`. The headline `graph_speedup > 1.0` is the framework dividend.
+- **`VerifyChain_Box3x3`** — sweeps chain depths (configurable via `--framework-chain-depths`, default `1,4,16,64`) and reports per-N create / verify / first-process / steady-process timings, plus regression-derived `verify_per_node_ms`, `verify_intercept_ms`, and `first_process_overhead_ms`.
+- **`ParallelBranches_Box3x3`** — K = 4 independent Box3x3 nodes sharing one input image, compared against K back-to-back `vxuBox3x3` immediate calls. Reports `parallelism_speedup` and `parallelism_efficiency` (where 1.0 = perfect K-way parallelism).
+- **`Async_Single_Box3x3_x4`** — quantifies the per-call cost of `vxScheduleGraph` + `vxWaitGraph` vs `vxProcessGraph` on the same graph. Reports `async_overhead_ratio` (lower is better).
+- **`Async_Concurrent_Box3x3_x2`** — schedules two independent graphs concurrently and reports `concurrency_speedup` — direct evidence of whether the runtime overlaps independent work.
+- **Per-node `VX_NODE_PERFORMANCE` attribution** on both `GraphDividend_*` chains: emits `node_count`, `node_sum_ms`, `graph_perf_ms`, and `fusion_ratio` (`node_sum_ms / graph_perf_ms`). `≈ 1.0` = strict back-to-back, `> 1.0` = fusion / overlap detected, `≈ node_count` = the runtime reports graph time per node and isn't attributing per-node performance.
+
+### Added — OpenVX Framework Score
+
+A new dimensionless headline number, computed as the **equal-weight geometric mean** of every `graph_speedup`, `virtual_dividend`, `parallelism_efficiency`, and `concurrency_speedup` value produced by framework benchmarks. **`framework_score > 1.0` means the OpenVX graph framework adds aggregate value over a kernel-only baseline.** Lower-is-better metrics and the scenario-specific `fusion_ratio` are intentionally excluded so the score has a single monotonic interpretation. Only emitted when framework benchmarks are run.
+
+Surfaced everywhere the Vision Score appears:
+
+- Terminal summary: `OpenVX Framework Score: <x>x (geomean of <N> framework metrics)`.
+- JSON `scores.framework_score` and `scores.framework_metric_count`.
+- Markdown report's Composite Scores table plus a new dedicated **Framework Benchmarks** section listing every metric per scenario with its unit and direction.
+- Both the C++ `--compare` path and `scripts/compare_reports.py` add a Framework Score row to **Conformance & Scores** and a new **Framework Metrics Comparison** table whose ratio column is direction-aware (so `> 1.00` always means the second implementation is better).
+
+### Added — Plumbing
+
+- New `FrameworkMetric` struct: `{name, value, unit, higher_is_better}`. `BenchmarkResult` gains a `framework_metrics` vector (empty for kernel results — backward-compatible).
+- New `BenchmarkCase::framework_run` callback: framework benchmarks own their entire timing loop and return a populated `BenchmarkResult`. Existing 60-kernel codepath is untouched.
+- New CLI flag `--framework-chain-depths` for `verify_chain` depth sweeps.
+- New `--feature-set` values: `framework` (only) and `everything` (kernels + framework).
+- CI workflow runs framework benchmarks for every vendor (Khronos sample-impl, MIVisionX) in a dedicated step and posts the headline metrics to the GitHub Actions job summary.
+
+### Changed
+
+- `BenchmarkRunner::runAll` dispatches to `framework_run` when set, with a pre-check for required kernels (so framework cases skip cleanly on implementations missing Box3x3 etc.).
+- README adds a Framework Benchmarks section, glossary entries for every framework metric, and a Framework Score entry. Example terminal summary updated.
+- JSON schema adds the `scores.framework_score`, `scores.framework_metric_count`, and per-result `framework_metrics` array. Existing kernel results emit an empty `framework_metrics` array. **No breaking change** for tools that consumed the previous schema.
+
+### Notes for implementers
+
+- `fusion_ratio` is implementation-quality-dependent: a value `≈ node_count` (e.g. `4.0` on a 4-node chain) usually means the runtime is reporting whole-graph time on every node. Useful cross-vendor signal in its own right; intentionally excluded from the Framework Score because not every conformant runtime populates `VX_NODE_PERFORMANCE` cleanly.
+- `concurrency_speedup < 1.0` at small resolutions is expected and meaningful — it means async dispatch overhead exceeds concurrency gain at that work size.
+- Pipelined streaming via the optional `vx_khr_pipelining` extension is intentionally out of scope for v1; only standard OpenVX APIs are used.
+
+### v2 backlog (separate future PRs)
+
+- `vxMapImagePatch` / `vxUnmapImagePatch` round-trip cost (host ↔ device tax).
+- User-kernel dispatch tax via `vxAddUserKernel` no-op.
+- Context lifecycle stress (`vxCreateContext` / `vxReleaseContext` × N).
+- Determinism under load (single-graph CV% while K other graphs are scheduled).
+- NN / extension-gated benchmarks.
+
+See [`docs/framework-mark-plan.md`](docs/framework-mark-plan.md) for the full v1 design rationale.
+
+---
+
+## Pre-1.0
+
+Earlier work — the kernel-only suite, output verification, MIVisionX CI, and version-independent build — landed in PRs #1–#4 on `main`. There is no formal changelog entry for those releases; see git history.

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -112,6 +112,7 @@ set(BENCHMARK_SOURCES
     src/benchmarks/immediate_benchmarks.cpp
     src/benchmarks/pipeline_vision.cpp
     src/benchmarks/pipeline_feature.cpp
+    src/benchmarks/framework_benchmarks.cpp
 )

 add_executable(openvx-mark ${BENCHMARK_SOURCES})
