feat(tensor): implement all TensorOp kernels (214/214 passing) #40
Merged
added 2 commits
May 25, 2026 15:06
…nsorHandle, vxCreateTensorFromHandle, vxCreateVirtualTensor

- Fix tensor attribute constants (0x81500 series instead of 0x00)
- Implement vxMapTensorPatch with ROI offset and stride calculation
- Implement vxUnmapTensorPatch returning VX_SUCCESS
- Implement vxSwapTensorHandle copying data from new pointer
- Implement vxCreateTensorFromHandle copying external data
- Fix vxCreateVirtualTensor to extract context from graph properly

Tensor category: 12/12 passing
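The "ROI offset and stride calculation" mentioned for vxMapTensorPatch can be sketched roughly as follows. This is not the PR's code: the helper names `packed_strides` and `roi_byte_offset` are hypothetical, and the sketch assumes a densely packed tensor in which `dims[0]` is the fastest-varying dimension.

```rust
/// Byte strides for a densely packed tensor (assumption: dims[0] varies fastest).
/// Hypothetical helper, not taken from the PR.
fn packed_strides(dims: &[usize], elem_size: usize) -> Vec<usize> {
    let mut strides = Vec::with_capacity(dims.len());
    let mut step = elem_size;
    for &d in dims {
        // The stride of a dimension is the byte size of everything "below" it.
        strides.push(step);
        step *= d;
    }
    strides
}

/// Byte offset of the first element of a region of interest,
/// given the ROI start coordinate in each dimension.
/// Hypothetical helper, not taken from the PR.
fn roi_byte_offset(view_start: &[usize], strides: &[usize]) -> usize {
    view_start
        .iter()
        .zip(strides)
        .map(|(start, stride)| start * stride)
        .sum()
}
```

For a 4x3x2 tensor of 2-byte elements, `packed_strides(&[4, 3, 2], 2)` gives `[2, 8, 24]`, and an ROI starting at `[1, 2, 1]` begins 42 bytes into the buffer. A mapped patch would then be that base offset plus the same strides, so the caller can walk the ROI without copying it.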
Implement 7 tensor operation kernels in vxu_impl.rs:

- vxu_tensor_add_impl: elementwise add with Q78/U8/S8, wrap vs saturate
- vxu_tensor_subtract_impl: elementwise subtract
- vxu_tensor_multiply_impl: elementwise multiply with Q78 fixed-point, scale factor, rounding policy
- vxu_tensor_convert_depth_impl: depth conversion with norm/offset scalars
- vxu_tensor_table_lookup_impl: LUT lookup using ARRAYS registry
- vxu_tensor_transpose_impl: generic N-D transpose swapping two dimensions
- vxu_tensor_matrix_multiply_impl: 2D matrix multiply with transpose flags for A/B/C and optional C addition

Fix critical bugs:

- vxCopyTensorPatch write path: use mutable HashMap access instead of immutable reference cast to avoid UB and silent write failures
- Q78/S8 wrap behavior: use bit-level truncation (value & mask) instead of Rust's saturating as i16/i8
- Tensor attribute constants: correct VX_TENSOR_* values to follow (VX_TYPE_TENSOR << 8) | local_index pattern
- Matrix multiply dimension validation: use CTS column-major layout where dims[0] is inner/fast dimension, dims[1] is outer/slow dimension
- Graph dispatch for tensor_matrix_multiply: allow null C tensor (optional)
- Graph execution: add exception for tensor_matrix_multiply param 2 being optional (param_id == 0 should not be an error)

Also add kernel registrations in c_api.rs (enums 0x41–0x47) and wire up all tensor op dispatch in unified_c_api.rs.

CI: add tensor-ops job (Tensor + TensorOp categories) to conformance.yml.
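To illustrate the wrap-versus-saturate distinction behind the Q78/S8 fix, here is a minimal sketch that uses a wide integer intermediate. The `Overflow` enum and the `add_q78` / `add_s8` helpers are hypothetical names, not the PR's functions. One hedged note on the Rust semantics involved: float-to-integer `as` casts saturate (for example `300.0_f32 as i8` is 127), while integer-to-integer `as` casts truncate, so the sketch assumes the original bug came from a float intermediate.

```rust
/// Overflow policy, loosely mirroring the OpenVX wrap and saturate
/// conversion policies. Hypothetical type, not taken from the PR.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Overflow {
    Wrap,
    Saturate,
}

/// Add two Q7.8 values (stored as i16) using a 32-bit intermediate.
fn add_q78(a: i16, b: i16, policy: Overflow) -> i16 {
    let sum = a as i32 + b as i32;
    match policy {
        // Bit-level truncation: keep the low 16 bits, reinterpret as signed.
        Overflow::Wrap => (sum & 0xFFFF) as u16 as i16,
        // Clamp into the representable range before narrowing.
        Overflow::Saturate => sum.clamp(i16::MIN as i32, i16::MAX as i32) as i16,
    }
}

/// Same idea for signed 8-bit tensors.
fn add_s8(a: i8, b: i8, policy: Overflow) -> i8 {
    let sum = a as i32 + b as i32;
    match policy {
        Overflow::Wrap => (sum & 0xFF) as u8 as i8,
        Overflow::Saturate => sum.clamp(i8::MIN as i32, i8::MAX as i32) as i8,
    }
}
```

With these helpers, `add_q78(i16::MAX, 1, Overflow::Wrap)` yields `i16::MIN`, whereas the same call with `Overflow::Saturate` stays at `i16::MAX`. Keeping the policy as an explicit parameter makes it harder for a cast to pick the overflow behavior implicitly.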
kiritigowda pushed a commit that referenced this pull request on May 28, 2026:
Update the OpenVX 1.3.1 coverage plan to current state:

- Mark P2 (Base API + UDO, 10 funcs) as COMPLETE (#16, #18, #23, #24)
- Mark P3 (Enhanced Vision non-tensor, 14 funcs) as COMPLETE (#35, #36, #39)
- Mark P4 (Tensor kernels, 14 funcs) as COMPLETE (#40)
- Mark P5a (Control-flow nodes, 2 funcs) as COMPLETE (#41)
- Update conformance tally: 6,786 / 6,786 tests passing (100%)
- Add open issues status review (#38 stale, #20–#22 should close)
- Update coverage trajectory: ~300/361 (~83%) implemented
- Refresh risks and tracking labels

Co-authored-by: Kiriti <kiriti@example.com>
This PR implements the full TensorOp category for OpenVX Enhanced Vision conformance.
What's implemented
Tensor category (12/12 passing):
TensorOp category (214/214 passing):
Key bug fixes
Test results
CI changes
Added tensor-ops job to conformance.yml testing Tensor.* and TensorOp.* filters.
Note: remaining gaps are ControlFlow (186 tests), GraphEnhanced (14 tests), and GraphDelayTensor (3 tests).