Commit 08c28b6
feat(enhanced-vision): implement vxMin/vxMax (Phase 1) (#11)
* feat(enhanced-vision): implement vxMin/vxMax (Phase 1)
Adds full graph-mode and immediate-mode support for the OpenVX 1.3
Enhanced Vision pixel-wise minimum and maximum kernels:
- New `min_image` / `max_image` core routines and `vxu_min_impl` /
`vxu_max_impl` immediate-mode dispatchers in `openvx-core::vxu_impl`,
covering both `VX_DF_IMAGE_U8` and `VX_DF_IMAGE_S16` formats with
matching-format/dimension validation.
- `vxMinNode`, `vxMaxNode`, `vxuMin`, `vxuMax` exports in
`openvx-core::unified_c_api`, wired into the graph kernel dispatcher
via the new `org.khronos.openvx.min` / `.max` cases.
- Kernel signature entries in `openvx-core::c_api::standard_kernels`
and `openvx-vision::kernel_enums::VISION_KERNELS`, plus
`VxKernel::Min` / `VxKernel::Max` enum variants.
- `MinKernel` / `MaxKernel` registered in
`openvx-vision::register_all_kernels`.
- Rust-side unit tests for `min_image` / `max_image` (basic,
dim-mismatch).
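
As a rough illustration of the pixel-wise min/max with dimension validation described above, here is a minimal sketch. `Plane`, `PixelOpError`, `min_plane` and `max_plane` are hypothetical names invented for this example, not the crate's actual `min_image` / `max_image` signatures. The matching-format rule is modelled by the shared type parameter, with `u8` and `i16` standing in for `VX_DF_IMAGE_U8` and `VX_DF_IMAGE_S16`.

```rust
use std::cmp;

/// Hypothetical error type for this sketch; the real code reports vx_status values.
#[derive(Debug, PartialEq)]
pub enum PixelOpError {
    DimensionMismatch,
}

/// Minimal single-plane image: row-major pixels, no stride padding (an assumption).
#[derive(Debug)]
pub struct Plane<T> {
    pub width: usize,
    pub height: usize,
    pub data: Vec<T>,
}

/// Shared helper: validate dimensions once, then combine the pixels pairwise.
fn pixelwise<T: Copy>(
    a: &Plane<T>,
    b: &Plane<T>,
    op: impl Fn(T, T) -> T,
) -> Result<Plane<T>, PixelOpError> {
    let expected = a.width * a.height;
    if a.width != b.width
        || a.height != b.height
        || a.data.len() != expected
        || b.data.len() != expected
    {
        return Err(PixelOpError::DimensionMismatch);
    }
    let data = a.data.iter().zip(&b.data).map(|(&x, &y)| op(x, y)).collect();
    Ok(Plane { width: a.width, height: a.height, data })
}

/// Pixel-wise minimum; both inputs must share one pixel type and one size.
pub fn min_plane<T: Copy + Ord>(a: &Plane<T>, b: &Plane<T>) -> Result<Plane<T>, PixelOpError> {
    pixelwise(a, b, |x, y| cmp::min(x, y))
}

/// Pixel-wise maximum; same validation as `min_plane`.
pub fn max_plane<T: Copy + Ord>(a: &Plane<T>, b: &Plane<T>) -> Result<Plane<T>, PixelOpError> {
    pixelwise(a, b, |x, y| cmp::max(x, y))
}
```

Because both arguments share the type parameter `T`, mixing a `u8` plane with an `i16` plane is rejected at compile time in this sketch; the C API has to check formats at run time instead.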
CTS: builds with `OPENVX_USE_ENHANCED_VISION=ON` and passes 8/8
filtered tests (`Min.*:Max.*` — Immediate U8, Graph U8, Immediate S16,
Graph S16 each); the Khronos report now records this as a partial
Enhanced Vision profile pass.
Link stubs for the rest of the Enhanced Vision feature set
(`Bilateral`, `LBP`, `MatchTemplate`, `NonMaxSuppression`, `HOG*`,
`ScalarOperation`, `Select`, `vxuCopy`, `vxuHoughLinesP`, `Tensor*`
kernels and tensor-handle helpers) are added so the CTS binary links
under `-DOPENVX_USE_ENHANCED_VISION=ON`. They return `NULL` /
`VX_ERROR_NOT_IMPLEMENTED` and will be replaced by real
implementations in subsequent phases. The Phase-1 CI filter
(`Min.*:Max.*`) does not exercise them.
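
A link stub of the kind described above might look roughly like the following sketch. The function names, the `VxHandle` alias and the numeric value of `VX_ERROR_NOT_IMPLEMENTED` are assumptions made for illustration; the real symbols and constants come from the OpenVX headers and the crate's own type definitions.

```rust
use std::ffi::c_void;
use std::ptr;

/// Assumed status alias and value for this sketch; verify against the OpenVX headers.
pub type VxStatus = i32;
pub const VX_ERROR_NOT_IMPLEMENTED: VxStatus = -2;

/// Hypothetical opaque handle; real OpenVX handles are distinct typed pointers.
pub type VxHandle = *mut c_void;

/// Immediate-mode stub: the symbol exists so the CTS binary links, but every
/// call reports that the kernel is not implemented yet.
/// A real export would also carry a no-mangle attribute, omitted here.
pub extern "C" fn vxu_example_stub(
    _context: VxHandle,
    _input: VxHandle,
    _output: VxHandle,
) -> VxStatus {
    VX_ERROR_NOT_IMPLEMENTED
}

/// Node-creation stub: returns a null node handle instead of a real node.
pub extern "C" fn vx_example_node_stub(
    _graph: VxHandle,
    _input: VxHandle,
    _output: VxHandle,
) -> VxHandle {
    ptr::null_mut()
}
```

Stubs like these only satisfy the linker; any test that actually calls one would see the error status or null handle and fail, which is why the CI filter matters.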
CI:
- New `enhanced-vision` job filtered to `Min.*:Max.*`.
- Existing CTS build now passes `-DOPENVX_USE_ENHANCED_VISION=ON`
explicitly.
README:
- Conformance status now lists Enhanced Vision (8/8) alongside
baseline and Vision profile counts.
- Adds the new `enhanced-vision` job badge to the per-job status
table.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: untrack target/ build artifacts
The whole `target/` tree (510 files, including a stale Linux
`libopenvx_ffi.so` that pre-dated the rust workspace move) was
committed before `.gitignore` listed it. `.gitignore` now has
`target/`, so untracking is a one-shot cleanup: `cargo build` will
recreate the directory locally and the gitignore rule will keep it
out from now on. No source / CI changes; CI builds rustVX from
scratch and uploads its own `target/release/libopenvx_ffi.so` as a
workflow artifact, so removing the stale checked-in copy is safe.
Co-authored-by: Cursor <cursoragent@cursor.com>
* ci: fix invalid YAML in enhanced-vision job name
`name: enhanced-vision (Phase 1: Min/Max)` is malformed YAML — the
unquoted `: ` inside the value parses as a key indicator and GitHub
Actions rejects the workflow with `Invalid workflow file
.github/workflows/conformance.yml#L337`. Quote the string and
replace the inner colon with an em-dash.
Co-authored-by: Cursor <cursoragent@cursor.com>
* ci(benchmark): show speedup of rustVX over Khronos sample
The `compare_reports.py` script computes
`Speedup = throughput(report_b) / throughput(report_a)`
and labels the column ">1.00 means report_b is faster". The CI was
passing rustVX as `report_a` and Khronos as `report_b`, so the
Speedup column was actually showing how much faster the *Khronos
sample* was than rustVX — the inverse of the inline comment claim.
Swap the argument order (Khronos = baseline / report_a, rustVX =
candidate / report_b) so the column now reads as
"rustVX over Khronos" with >1.00x meaning rustVX wins.
Also prepend a headline summary to the GitHub Actions job summary
that aggregates per-benchmark speedups into:
- geomean and median speedup of rustVX over Khronos
- count of benchmarks compared
- rustVX-faster vs Khronos-sample-faster counts
- best and worst per-benchmark speedup (with kernel/mode/resolution)
- a one-line "rustVX is N.NNx faster" / "N.NNx slower" verdict
The summary is followed by the existing detailed comparison table from
`compare_reports.py`. Validated locally on synthetic JSON.
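
The aggregation behind such a headline can be sketched as follows. This is not the CI script itself: the function names are hypothetical, and reporting the "slower" figure as the reciprocal of the geomean is an assumption about how the verdict line is derived.

```rust
/// Speedup as defined above: throughput(report_b) / throughput(report_a).
/// With Khronos as report_a and rustVX as report_b, >1.0 means rustVX is faster.
pub fn speedup(throughput_a: f64, throughput_b: f64) -> f64 {
    throughput_b / throughput_a
}

/// Geometric mean via the mean of logarithms.
/// Returns None for an empty slice or any non-positive (or NaN) value.
pub fn geomean(values: &[f64]) -> Option<f64> {
    if values.is_empty() || values.iter().any(|&v| !(v > 0.0)) {
        return None;
    }
    let log_sum: f64 = values.iter().map(|v| v.ln()).sum();
    Some((log_sum / values.len() as f64).exp())
}

/// Median of the values; the two middle values are averaged for even counts.
pub fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// One-line verdict; the reciprocal for the "slower" case is an assumption.
pub fn verdict(geomean_speedup: f64) -> String {
    if geomean_speedup >= 1.0 {
        format!("rustVX is {:.2}x faster", geomean_speedup)
    } else {
        format!("rustVX is {:.2}x slower", 1.0 / geomean_speedup)
    }
}
```

A geometric mean is the usual choice for averaging ratios, since a 2x win and a 2x loss cancel to 1.0 rather than averaging to 1.25.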
README: tweak the benchmark callout to mention the new headline.
Co-authored-by: Cursor <cursoragent@cursor.com>
* perf(integral_image): native u32 stores in inner loop
The Phase-1 PR's openvx-mark CI run flagged `IntegralImage` as 4.13x
slower than the Khronos sample (rustVX 1.44ms vs Khronos 0.35ms at
VGA, CV 0.4% — a stable, real gap, not noise).
Root cause was the inner loop in `vxu_impl::integral_image`:
- Every pixel read of the row-above value did 4 byte loads from
`dst.data_mut()` and reassembled a `u32` via `from_le_bytes`,
each guarded by an `if offset + 4 <= len` bounds check.
- Every write decomposed the result with `to_le_bytes()` and
stored 4 individual bytes through `dst_data[offset+i] = b[i]`,
again behind a bounds check.
- Source pixels went through `Image::get_pixel(x, y)` which itself
bounds-checks and `unwrap_or(&0)`s on every call.
That defeated the optimiser's ability to emit native aligned 32-bit
loads/stores and added two redundant bounds checks per pixel.
Fix:
- Validate buffer sizes once up front (returning
`VX_ERROR_INVALID_DIMENSION` if undersized rather than silently
skipping pixels), then reinterpret the destination byte buffer
as `&mut [u32]` via `from_raw_parts_mut`. rustVX only ships on
little-endian hosts (x86_64 / aarch64) so the in-memory byte layout is
preserved; a `debug_assert!(cfg!(target_endian = "little"))`
keeps that contract honest if a big-endian target is ever added.
- Split the dst into "previous row" / "current row" slices via
`split_at_mut` so the borrow checker sees disjoint ranges; the
optimiser then emits a tight scalar loop with native u32 ops.
- Hoist `src.data()` to a `&[u8]` slice and index it directly,
eliminating the per-pixel `get_pixel` bounds check.
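
The shape of the fixed loop can be sketched as below. This is a simplified illustration rather than the code in `vxu_impl::integral_image`: it takes the destination as `&mut [u32]` directly and leaves out the unsafe byte-buffer reinterpretation, assumes tightly packed rows with no stride padding, and uses a hypothetical `InvalidDimension` error in place of `VX_ERROR_INVALID_DIMENSION`.

```rust
/// Hypothetical error for this sketch (stands in for VX_ERROR_INVALID_DIMENSION).
#[derive(Debug, PartialEq)]
pub struct InvalidDimension;

/// Integral image over tightly packed rows: dst[y][x] is the sum of all
/// src pixels in the rectangle from (0, 0) to (x, y) inclusive.
pub fn integral_image(
    src: &[u8],
    dst: &mut [u32],
    width: usize,
    height: usize,
) -> Result<(), InvalidDimension> {
    // Validate buffer sizes once up front instead of per pixel.
    let len = width.checked_mul(height).ok_or(InvalidDimension)?;
    if src.len() < len || dst.len() < len {
        return Err(InvalidDimension);
    }
    if len == 0 {
        return Ok(());
    }
    let (src, dst) = (&src[..len], &mut dst[..len]);

    // First row: plain running sum, there is no row above.
    let mut row_sum = 0u32;
    for (d, &s) in dst[..width].iter_mut().zip(&src[..width]) {
        row_sum = row_sum.wrapping_add(u32::from(s));
        *d = row_sum;
    }

    for y in 1..height {
        // Split so the previous row (read) and current row (write) are
        // disjoint slices as far as the borrow checker is concerned.
        let (above, current) = dst.split_at_mut(y * width);
        let prev = &above[(y - 1) * width..];
        let cur = &mut current[..width];
        let src_row = &src[y * width..(y + 1) * width];

        let mut row_sum = 0u32;
        for ((c, &p), &s) in cur.iter_mut().zip(prev).zip(src_row) {
            row_sum = row_sum.wrapping_add(u32::from(s));
            // wrapping_add keeps very large images from panicking in debug builds.
            *c = p.wrapping_add(row_sum);
        }
    }
    Ok(())
}
```

Iterating with `zip` over pre-sliced rows lets the compiler drop per-element bounds checks, which is the same goal the commit reaches by hoisting the slices out of the inner loop.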
Local microbench (clang -O2 calling vxuIntegralImage 200x at VGA):
before: 1.4425 ms/call, 213 MP/s (CI value)
after: 0.2428 ms/call, 1265 MP/s
≈6x faster locally; should beat the Khronos sample (0.35ms / 880
MP/s on the same GHA hardware) once CI re-runs.
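
The throughput figures above follow from the per-call times, assuming VGA means 640x480 pixels. A small sketch of that arithmetic, with a hypothetical helper name:

```rust
/// Megapixels per second for one call that processes a width x height image.
pub fn megapixels_per_second(width: u32, height: u32, ms_per_call: f64) -> f64 {
    let pixels = f64::from(width) * f64::from(height);
    // Convert milliseconds to seconds, then pixels to megapixels.
    pixels / (ms_per_call * 1e-3) / 1e6
}
```

With the numbers from the microbenchmark, 1.4425 ms/call works out to about 213 MP/s and 0.2428 ms/call to about 1265 MP/s, a ratio of roughly 5.9.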
Conformance preserved: 9/9 IntegralImage CTS tests pass with the
new code (filter `Integral.*`); 17/17 with `Integral.*:Min.*:Max.*`.
The two other "losses" the headline reported on the previous CI run
are not real:
- `LaplacianPyramid` reported Khronos at 0.0016ms/call = 1.6µs at
VGA, which is physically impossible for a multi-level pyramid
build — that's a Khronos sample no-op / lazy evaluation, not a
rustVX deficit.
- `Magnitude` was 1.01x slower (2.65ms vs 2.62ms, CV 0.3%/0.8%) —
well within measurement noise.
Both are tracked separately for follow-up; this commit fixes the
only verified clean gap.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent 8291c16 commit 08c28b6
520 files changed
Lines changed: 1015 additions & 2002 deletions
File tree
- .github/workflows
- openvx-core/src
- openvx-vision
  - src
  - tests
- target
  - debug: .fingerprint, build, deps, incremental
  - release: .fingerprint, build, deps