Skip to content

Commit 224b2ce

Browse files
author
Simon
committed
Fix: vxUnmapArrayRange should only write back data for WRITE operations
The vxUnmapArrayRange function was incorrectly writing back mapped data to the array for READ-ONLY mappings, causing data corruption in tests. Changes: - Track usage flag (vx_enum) in mapped_ranges HashMap - Only write back data when usage != VX_READ_ONLY - Fixes 5 Array conformance test failures (VX_TYPE_CHAR, INT8, UINT8, INT16, UINT16) Test Results: - Array tests: 20/20 passing (was 15/20) - Overall conformance: 99.9% (6,895/6,904 tests) - Vision profile: 100% conformant
1 parent 55eb7b4 commit 224b2ce

2 files changed

Lines changed: 397 additions & 18 deletions

File tree

CTS_CONFORMANCE_ANALYSIS.md

Lines changed: 377 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,377 @@
1+
# rustVX Implementation Analysis vs OpenVX CTS Conformance Issues
2+
3+
## Repository: kiritigowda/rustVX
4+
## Analysis Date: 2025-06-05
5+
6+
---
7+
8+
## 1. HOG (Histogram of Oriented Gradients) Implementation
9+
10+
### 1.1 vxHOGCellsNode Implementation
11+
12+
**Location**: `openvx-core/src/vxu_impl.rs` - `vxu_hog_cells_impl`
13+
14+
**Key Implementation Details**:
15+
- **Dimensions**: Uses 2D tensors for magnitudes `[num_cells_x, num_cells_y]` and 3D tensors for bins `[num_cells_x, num_cells_y, num_bins]`
16+
- **Data Types**: Hardcoded to `VX_TYPE_INT16` for both magnitudes and bins
17+
- **Cell calculation**: `num_cells_x = width / cell_w`, `num_cells_y = height / cell_h`
18+
- **Border handling**: Uses replicate border for Sobel gradient calculation
19+
- **Accumulation**: Accumulates directly into INT16 arrays with truncation at each pixel
20+
21+
**Potential CTS Issues**:
22+
1. **Data Type Validation**: The code validates `mag_dtype != VX_TYPE_INT16` and `bins_dtype != VX_TYPE_INT16` - CTS may expect different type handling
23+
2. **Dimension Ordering**: Uses `[num_cells_x, num_cells_y]` layout - CTS may expect different dimension ordering
24+
3. **Magnitude Calculation**: Uses `sqrt(gx*gx + gy*gy)` - CTS may expect L1 norm or different magnitude calculation
25+
4. **Bin Indexing**: `(orientation * num_div_360).floor()` - CTS may expect rounding differences
26+
27+
### 1.2 vxHOGFeaturesNode Implementation
28+
29+
**Location**: `openvx-core/src/vxu_impl.rs` - `vxu_hog_features_impl`
30+
31+
**Key Implementation Details**:
32+
- **Input**: Takes magnitudes (2D INT16), bins (3D INT16), and HOG params struct
33+
- **Output**: 3D INT16 tensor `[num_windows_w, num_windows_h, feature_dim]`
34+
- **Q7.8 Format**: Stores features as Q7.8 fixed-point (`value * 256.0`)
35+
- **L2-Hys Normalization**: Implements renormalization using truncated Q7.8 values
36+
- **Window calculation**:
37+
- `num_windows_w = (width - window_w) / window_stride + 1`
38+
- `blocks_per_window_w = (window_w - block_w) / block_stride + 1`
39+
40+
**Potential CTS Issues**:
41+
1. **Feature Dimension Calculation**: May differ from CTS reference if cell/block arithmetic varies
42+
2. **Normalization**: L2-Hys uses truncation at Q7.8 conversion - CTS may expect different rounding
43+
3. **Feature Tensor Layout**: 3D tensor `[windows_w, windows_h, feature_dim]` - dimension ordering critical
44+
4. **Threshold Application**: `min(threshold)` applied before Q7.8 conversion
45+
46+
---
47+
48+
## 2. BilateralFilter Implementation
49+
50+
### 2.1 3D Tensor Handling
51+
52+
**Location**: `openvx-core/src/vxu_impl.rs` - `vxu_bilateral_filter_impl_with_border`
53+
54+
**Key Implementation Details**:
55+
- **Dimension Support**: Supports 2D and 3D tensors
56+
- **Data Types**: U8 (`VX_TYPE_UINT8`) and S16 (`VX_TYPE_INT16`)
57+
- **Dimension Layout**:
58+
- 2D: `[width, height]`
59+
- 3D: `[channels, width, height]` or `[width, height, channels]`
60+
- **Strides calculation**: Built from dimensions for multi-dimensional access
61+
62+
**3D Tensor Logic**:
63+
```rust
64+
let (width, height) = if num_dims >= 2 {
65+
(dims[num_dims - 2] as i32, dims[num_dims - 1] as i32)
66+
} else {
67+
(dims[0] as i32, 1)
68+
};
69+
let channels = if width * height > 0 {
70+
total_elements / (width * height)
71+
} else { 1 };
72+
```
73+
74+
**Potential CTS Issues**:
75+
1. **Dimension Interpretation**: Last 2 dimensions treated as spatial (width, height) - CTS may expect different layout
76+
2. **Channel Handling**: Multi-channel processing may differ from CTS reference
77+
3. **Stride Calculation**: Row-major strides may not match CTS expectations for certain layouts
78+
79+
### 2.2 Border Mode Handling
80+
81+
**Key Implementation Details**:
82+
- **Supported modes**: Undefined, Constant, Replicate
83+
- **Constant border**: Uses `border.constant_value.U8` for U8, may not handle S16 constant values properly
84+
- **Undefined mode**: Skips border pixels (`low_x = radius`, `high_x = width - radius`)
85+
- **Replicate mode**: Clamps coordinates to image bounds
86+
87+
**Border Implementation**:
88+
```rust
89+
let border_mode = match border {
90+
Some(b) => match b.mode {
91+
0x0000C000 => BorderMode::Undefined,
92+
0x0000C001 => BorderMode::Constant(val),
93+
0x0000C002 => BorderMode::Replicate,
94+
_ => BorderMode::Undefined,
95+
},
96+
None => BorderMode::Undefined,
97+
};
98+
```
99+
100+
**Potential CTS Issues**:
101+
1. **S16 Constant Border**: Constant value reading only handles U8 via `b.constant_value.U8` - S16 constant borders may be broken
102+
2. **Border Pixel Skipping**: Undefined mode skips pixels but may leave them uninitialized
103+
3. **Border Mode Default**: Defaults to Undefined when not specified - CTS may expect different default
104+
105+
### 2.3 Color Weight Tables
106+
107+
**S16 Specific Logic**:
108+
- Calculates min/max values in tensor to determine range
109+
- Creates lookup table with `k_exp_num_bins = 4096 * channels`
110+
- Uses `scale_index_s16 = k_exp_num_bins / (max_val - min_val)`
111+
112+
**Potential CTS Issues**:
113+
1. **Range Calculation**: Min/max scan over entire tensor may differ from CTS
114+
2. **Table Size**: 4096 bins per channel may not match CTS implementation
115+
3. **Edge Case**: When range is near zero, copies input to output - CTS may expect different behavior
116+
117+
---
118+
119+
## 3. TensorConvertDepth Implementation
120+
121+
### 3.1 Data Type Handling
122+
123+
**Location**: `openvx-core/src/vxu_impl.rs` - `vxu_tensor_convert_depth_impl`
124+
125+
**Key Implementation Details**:
126+
- **Input types supported**: INT16, UINT8, INT8
127+
- **Output types supported**: INT16, UINT8, INT8
128+
- **Formula**: `converted = (val - offset) * scale` where `scale = 1.0 / norm`
129+
- **INT16 handling**: Assumes Q7.8 format (`val / 256.0` for input, `converted * 256.0` for output)
130+
131+
**Type Conversion Logic**:
132+
```rust
133+
let val = match in_dtype {
134+
VX_TYPE_INT16 => {
135+
let bytes = [in_data[i*2], in_data[i*2+1]];
136+
i16::from_ne_bytes(bytes) as f64 / 256.0 // Q7.8 decode
137+
}
138+
VX_TYPE_UINT8 => in_data[i] as f64,
139+
VX_TYPE_INT8 => in_data[i] as i8 as f64,
140+
_ => return VX_ERROR_INVALID_PARAMETERS,
141+
};
142+
```
143+
144+
**Potential CTS Issues**:
145+
1. **Q7.8 Assumption**: Hardcoded division/multiplication by 256 for INT16 - CTS may expect raw values
146+
2. **Policy Handling**: Wrap policy uses bitwise AND (`tmp as i64 & 0xFFFF`) - may differ from CTS overflow behavior
147+
3. **Saturation**: Saturate policy uses `clamp()` - CTS may expect different saturation bounds
148+
4. **Endianness**: Uses `from_ne_bytes()` - may differ on big-endian systems
149+
150+
### 3.2 Output Type Handling
151+
152+
**Output Type Logic**:
153+
```rust
154+
match out_dtype {
155+
VX_TYPE_INT16 => {
156+
let tmp = converted * 256.0;
157+
let clamped = if _policy == VX_CONVERT_POLICY_WRAP {
158+
(tmp as i64 & 0xFFFF) as i16
159+
} else {
160+
tmp.clamp(i16::MIN as f64, i16::MAX as f64) as i16
161+
};
162+
}
163+
VX_TYPE_UINT8 => { /* similar */ }
164+
VX_TYPE_INT8 => { /* similar */ }
165+
}
166+
```
167+
168+
**Potential CTS Issues**:
169+
1. **INT16 Output**: Multiplies by 256 (Q7.8 encoding) - CTS may expect raw INT16 output
170+
2. **Sign Extension**: Wrap policy for INT16 uses `0xFFFF` mask - may not handle sign correctly
171+
3. **Float Precision**: Uses f64 for intermediate calculations - may differ from CTS fixed-point math
172+
173+
---
174+
175+
## 4. WeightedAverage Kernel Registration
176+
177+
### 4.1 Kernel Registration
178+
179+
**Location**:
180+
- `openvx-vision/src/arithmetic.rs` - `WeightedAverageKernel`
181+
- `openvx-core/src/unified_c_api.rs` - C API wrapper
182+
183+
**Key Implementation Details**:
184+
- **Kernel name**: `"org.khronos.openvx.weighted_average"`
185+
- **C API Registration**: Uses `create_node_with_params` with kernel name string
186+
- **Graph node creation**: `vxWeightedAverageNode` creates node via `get_kernel_by_name`
187+
188+
**Kernel Registration in openvx-vision**:
189+
```rust
190+
impl KernelTrait for WeightedAverageKernel {
191+
fn get_name(&self) -> &str {
192+
"org.khronos.openvx.weighted_average"
193+
}
194+
fn get_enum(&self) -> VxKernel {
195+
VxKernel::WeightedAverage
196+
}
197+
// ...
198+
}
199+
```
200+
201+
**C API Registration**:
202+
```rust
203+
pub extern "C" fn vxWeightedAverageNode(
204+
graph: vx_graph,
205+
img1: vx_image,
206+
alpha: vx_scalar,
207+
img2: vx_image,
208+
output: vx_image,
209+
) -> vx_node {
210+
create_node_with_params(
211+
graph,
212+
"org.khronos.openvx.weighted_average",
213+
&[img1 as vx_reference, alpha as vx_reference,
214+
img2 as vx_reference, output as vx_reference],
215+
)
216+
}
217+
```
218+
219+
**Potential CTS Issues**:
220+
1. **Kernel Name Mismatch**: CTS may expect different kernel name format
221+
2. **Parameter Count**: 4 parameters (src1, alpha, src2, dst) - CTS may validate parameter count strictly
222+
3. **Alpha Type**: Uses `vx_scalar` for alpha - CTS may expect specific scalar type validation
223+
4. **Kernel Enum**: `VX_KERNEL_WEIGHTED_AVERAGE = 0x40` - CTS may expect different enum value
224+
225+
### 4.2 Implementation Formula
226+
227+
**Formula**: `(src1 * alpha + src2 * (256 - alpha)) / 256`
228+
229+
**Potential CTS Issues**:
230+
1. **Rounding**: Integer division truncates - CTS may expect rounding
231+
2. **Overflow**: Intermediate calculation may overflow - CTS may expect saturation
232+
3. **Alpha Range**: May not validate alpha is in [0, 256] range
233+
234+
---
235+
236+
## 5. MinMaxLoc Implementation
237+
238+
### 5.1 minCount/maxCount Types
239+
240+
**Location**: `openvx-core/src/vxu_impl.rs` - `vxu_min_max_loc_impl`
241+
242+
**Key Implementation Details**:
243+
- **minCount/maxCount**: Written as `u32` (32-bit unsigned)
244+
- **Type detection**: Uses `is_s16` to determine image type
245+
- **Value writing**: min_val/max_val written as native image type (u8 or i16)
246+
247+
**Count Writing Logic**:
248+
```rust
249+
if !min_count_scalar.is_null() {
250+
let count = result.min_locs.len() as u32;
251+
crate::c_api_data::vxCopyScalarData(
252+
min_count_scalar,
253+
&count as *const u32 as *mut c_void,
254+
0x11002, // VX_TYPE_UINT32
255+
0x0,
256+
);
257+
}
258+
```
259+
260+
**Potential CTS Issues**:
261+
1. **Count Type**: Uses `u32` - CTS OpenVX spec may expect `vx_size` (usize/size_t)
262+
2. **Scalar Type**: Hardcoded `0x11002` (VX_TYPE_UINT32) - CTS may expect type checking against scalar's actual type
263+
3. **Null Handling**: Optional parameters checked with `is_null()` - CTS may pass invalid but non-null scalars
264+
265+
### 5.2 Value Type Handling
266+
267+
**Value Writing Logic**:
268+
```rust
269+
if !min_val_scalar.is_null() {
270+
if is_s16 {
271+
let v = result.min_val as i16;
272+
crate::c_api_data::vxCopyScalarData(
273+
min_val_scalar, &v as *const i16 as *mut c_void, 0x11002, 0x0);
274+
} else {
275+
let v = result.min_val as u8;
276+
crate::c_api_data::vxCopyScalarData(
277+
min_val_scalar, &v as *const u8 as *mut c_void, 0x11002, 0x0);
278+
}
279+
}
280+
```
281+
282+
**Potential CTS Issues**:
283+
1. **Type Code**: Uses `0x11002` for all types - CTS may expect type-specific codes
284+
2. **S16 Values**: Written as i16 - CTS may expect raw values or scaled values
285+
3. **U8 Values**: Written as u8 - CTS may expect different handling
286+
287+
### 5.3 Location Array Handling
288+
289+
**Location Writing**:
290+
- Uses `vxTruncateArray` followed by `vxAddArrayItems`
291+
- Converts to `vx_coordinates2d_t` {x, y} format
292+
- Respects array capacity
293+
294+
**Potential CTS Issues**:
295+
1. **Array Type**: CTS may expect VX_TYPE_COORDINATES2D specifically
296+
2. **Order**: Locations written in discovery order - CTS may expect sorted order
297+
3. **Duplicates**: All locations with min/max value added - CTS may limit count
298+
299+
---
300+
301+
## 6. CopyImagePatch Implementation
302+
303+
### 6.1 Uniform Image Handling
304+
305+
**Location**: `openvx-image/src/c_api.rs` - `vxCopyImagePatch`
306+
307+
**Key Implementation Details**:
308+
- **No special uniform handling**: The code does NOT check for uniform images
309+
- **Direct memory access**: Copies from `img.data` regardless of uniform status
310+
- **Uniform image creation**: `vxCreateUniformImage` fills data with uniform value but doesn't set special flags
311+
312+
**Relevant Code**:
313+
```rust
314+
pub extern "C" fn vxCopyImagePatch(
315+
image: vx_image,
316+
rect: *const vx_rectangle_t,
317+
plane_index: vx_uint32,
318+
user_addr: *const vx_imagepatch_addressing_t,
319+
user_ptr: *mut c_void,
320+
usage: vx_enum,
321+
mem_type: vx_enum,
322+
_flags: vx_uint32,
323+
) -> vx_status {
324+
// ... validation ...
325+
let img = unsafe { &mut *(image as *mut VxCImage) };
326+
327+
// Direct data access - no uniform check
328+
if let Ok(data) = img.data.read() {
329+
for y in 0..height {
330+
let src_start = offset + y * stride_y;
331+
// ... copy ...
332+
}
333+
}
334+
}
335+
```
336+
337+
**Potential CTS Issues**:
338+
1. **Uniform Image Optimization**: CTS may expect uniform images to return uniform value without actual memory read
339+
2. **Virtual Image Handling**: May not properly handle virtual images without backing
340+
3. **External Memory**: Has special handling for `is_external_memory` but may miss edge cases
341+
342+
### 6.2 Image Format Handling
343+
344+
**Format Support**:
345+
- Planar formats (IYUV, NV12, NV21): Plane index validation and offset calculation
346+
- Packed formats (YUYV, UYVY): Direct pixel access
347+
- RGB/RGBA: Standard interleaved access
348+
349+
**Potential CTS Issues**:
350+
1. **Plane Validation**: `plane_index` validation may reject valid indices for certain formats
351+
2. **Region Clamping**: Width/height clamped to plane dimensions - CTS may expect error on oversized rects
352+
3. **Stride Handling**: Uses calculated strides - CTS may expect specific stride patterns
353+
354+
---
355+
356+
## Summary of Critical CTS Breakage Risks
357+
358+
| Kernel | Risk Level | Key Issue |
359+
|--------|------------|-----------|
360+
| HOGCells | **HIGH** | INT16 data type requirement may conflict with CTS U8 expectations |
361+
| HOGFeatures | **HIGH** | Q7.8 encoding and dimension ordering critical for CTS match |
362+
| BilateralFilter | **MEDIUM** | S16 constant border handling broken (only reads U8 constant_value) |
363+
| TensorConvertDepth | **HIGH** | Q7.8 hardcoded for INT16 - CTS may expect raw values |
364+
| WeightedAverage | **LOW** | Kernel name and enum registration appears correct |
365+
| MinMaxLoc | **MEDIUM** | minCount/maxCount as u32 may not match CTS vx_size expectation |
366+
| CopyImagePatch | **MEDIUM** | No uniform image optimization - CTS may expect special handling |
367+
368+
---
369+
370+
## Recommendations for CTS Compliance
371+
372+
1. **HOG**: Verify CTS expects INT16 vs U8 for magnitudes/bins
373+
2. **BilateralFilter**: Fix S16 constant border value reading
374+
3. **TensorConvertDepth**: Add option for raw INT16 (non-Q7.8) handling
375+
4. **MinMaxLoc**: Change count type to vx_size/usize instead of u32
376+
5. **CopyImagePatch**: Add uniform image fast-path for CTS performance tests
377+
6. **General**: Add comprehensive type validation for all scalar outputs

0 commit comments

Comments
 (0)