Skip to content

Conversation

@r-abishek
Copy link
Member

@r-abishek r-abishek commented Nov 14, 2025

This PR adds nearest neighbors padding for gaussian filter augmentation.
This is similar to the fixes in box/median already in ToT.
Current gaussian filter ToT version had border pixels fading out without NN padding, the PR attempts to fix the same.

Performance profiling to ensure no significant changes in performance due to addition of computation on nearest neighbor padding.
(Significant performance boost in the F32 PKD3 HIP variant observed)

  • Add updated QA support for testing the HOST and HIP Backends for various layout types - F32 test support introduced, U8 test support updated.
  • Update the compute functions for gaussian filter to use fmadd on HOST Backend.
  • Template the load filter functions significantly on HOST Backend - 36 helper functions reduced to 3 templated functions.
  • Introduce enums to refer to a particular image border-edge (left, right, bottom and top edges)
  • Separate the compute functions of all bit depths on HIP Backend (Avoids additional bit depth conversions)
  • HIP Backend upgrades also help in performance improvements of specific F32 variants compared to existing ToT version : Specifically 37% - 82% for kernel Sizes 5,7 and 9 for PKD3 variants.
image

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds nearest-neighbor padding support for the Gaussian filter augmentation to prevent border pixels from fading out. The changes include API updates, kernel optimizations for both HIP and HOST backends, and enhanced QA test support.

Key Changes

  • Added borderType parameter to gaussian filter API functions with REPLICATE (nearest-neighbor) support
  • Introduced RpptImageBorderEdge enum for identifying image border edges
  • Templated helper functions on HIP backend (reduced from 36 functions to 3 templated functions)
  • Updated QA test suite to enable gaussian filter testing for F32 and U8 formats
  • Performance optimizations showing 37-82% improvement for specific F32 PKD3 variants on kernel sizes 5, 7, and 9

Reviewed Changes

Copilot reviewed 9 out of 22 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
api/rppt_tensor_filter_augmentations.h Added borderType parameter to gaussian filter API signatures
api/rppdefs.h Introduced RpptImageBorderEdge enum for edge identification
utilities/test_suite/rpp_test_suite_image.h Removed GAUSSIAN_FILTER from nonQACases to enable QA testing
utilities/test_suite/HOST/Tensor_image_host.cpp Added borderType validation and parameter passing
utilities/test_suite/HIP/Tensor_image_hip.cpp Added borderType validation and parameter passing
src/modules/tensor/rppt_tensor_filter_augmentations.cpp Added borderType parameter validation
src/modules/tensor/hip/kernel/gaussian_filter.cpp Major refactoring with templated functions and NN padding logic
src/modules/tensor/cpu/kernel/gaussian_filter.cpp Enhanced with templated load functions and padding support
src/include/common/cpu/rpp_cpu_simd_load_store.hpp Added store24 functions for various data types
docs/data/doxygenOutputs/*.png Updated documentation images

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

{
rpp_hip_load24_pkd3_to_uchar8_pln3(srcPtr + srcIdx, src_smem_channel);
}
if ((id_x_i > roiBeginX) && ((id_x_i + 7 + padLength) < roiWidth) && (id_y_i > roiBeginY) && (id_y_i < roiHeight))
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition id_x_i > roiBeginX uses strict inequality, but should use >= for proper boundary checking. When id_x_i == roiBeginX, this is a valid position that doesn't need padding, but the current condition forces it into the padding path.

Suggested change
if ((id_x_i > roiBeginX) && ((id_x_i + 7 + padLength) < roiWidth) && (id_y_i > roiBeginY) && (id_y_i < roiHeight))
if ((id_x_i >= roiBeginX) && ((id_x_i + 7 + padLength) < roiWidth) && (id_y_i > roiBeginY) && (id_y_i < roiHeight))

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

@codecov
Copy link

codecov bot commented Nov 14, 2025

Codecov Report

❌ Patch coverage is 96.64804% with 12 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/modules/tensor/cpu/kernel/gaussian_filter.cpp 90.98% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #639      +/-   ##
===========================================
- Coverage    88.33%   88.13%   -0.19%     
===========================================
  Files          195      195              
  Lines        82723    82479     -244     
===========================================
- Hits         73068    72692     -376     
- Misses        9655     9787     +132     
Files with missing lines Coverage Δ
api/rppdefs.h 71.72% <ø> (ø)
src/include/common/cpu/rpp_cpu_filter.hpp 100.00% <100.00%> (ø)
src/include/common/cpu/rpp_cpu_simd_load_store.hpp 93.04% <100.00%> (-0.67%) ⬇️
src/modules/tensor/hip/kernel/gaussian_filter.cpp 99.71% <ø> (ø)
...odules/tensor/rppt_tensor_filter_augmentations.cpp 97.95% <100.00%> (+0.01%) ⬆️
src/modules/tensor/cpu/kernel/gaussian_filter.cpp 85.49% <90.98%> (+0.17%) ⬆️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kiritigowda kiritigowda self-assigned this Nov 14, 2025
@kiritigowda
Copy link
Collaborator

@r-abishek please resolve conflicts

Copy link
Contributor

@rrawther rrawther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SundarRajan28 : Would like to review this PR with you before merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci:precheckin enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants