refactor: avoid to allocate device memory for mean and std in every loop #58

Merged
ktro2828 merged 1 commit into main from refactor/detector/allocate-mean-std on Oct 2, 2025

ktro2828 (Owner) commented on Oct 2, 2025

Description

This pull request refactors the configuration and preprocessing logic for all 2D detector and segmenter classes to improve efficiency and clarity. The main change is moving the image normalization parameters (mean and std) to device memory once during configuration construction, rather than reloading and copying them on every preprocessing call. Additionally, constructors now use move semantics for configuration objects, ensuring better resource management.

Configuration and Device Memory Improvements:

  • Refactored all config structs (Detector2dConfig, InstanceSegmenter2dConfig, PanopticSegmenter2dConfig, SemanticSegmenter2dConfig) to upload mean and std arrays to device memory in their constructors, replacing host-side vectors with device pointers (CudaUniquePtr<float[]>). This avoids repeated device memory allocations and copies during preprocessing.
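The upload-once pattern described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `FakeDevicePtr` is a host-side stand-in for `CudaUniquePtr<float[]>` so that the sketch compiles without CUDA, and `upload`, `SketchConfig`, `normalize`, and the allocation counter are hypothetical names introduced here. In real code the helper would perform a device allocation followed by a host-to-device copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// ASSUMPTION: host-side stand-in for the repository's CudaUniquePtr<float[]>.
using FakeDevicePtr = std::unique_ptr<float[]>;

// Counts simulated "device" allocations so the effect of the pattern is observable.
static std::size_t g_num_allocations = 0;

// Hypothetical helper: converts double -> float and copies into a fresh buffer.
// Real code would allocate device memory and copy host -> device here.
inline FakeDevicePtr upload(const std::vector<double> & values)
{
  ++g_num_allocations;
  auto buffer = std::make_unique<float[]>(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    buffer[i] = static_cast<float>(values[i]);
  }
  return buffer;
}

// Simplified config: mean and std are uploaded exactly once, in the constructor.
struct SketchConfig
{
  SketchConfig(const std::vector<double> & _mean, const std::vector<double> & _std)
  : mean_d(upload(_mean)), std_d(upload(_std)), num_channels(_mean.size())
  {
  }

  FakeDevicePtr mean_d;      // "device-side" mean, one value per channel
  FakeDevicePtr std_d;       // "device-side" std, one value per channel
  std::size_t num_channels;  // number of channels in mean_d / std_d
};

// Per-call preprocessing reuses the stored buffers and allocates nothing.
inline float normalize(const SketchConfig & config, float pixel, std::size_t channel)
{
  return (pixel - config.mean_d[channel]) / config.std_d[channel];
}
```

With this layout, the number of allocations depends only on how many config objects are constructed, not on how many frames are preprocessed.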

Constructor and Resource Management Updates:

  • Updated all detector and segmenter class constructors to accept configuration objects via move semantics (&&), and store them using std::move, ensuring efficient resource transfer and ownership.
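A minimal sketch of why a config that owns a unique buffer is taken by rvalue reference and stored with std::move. The names `MoveOnlyConfig` and `SketchDetector` are hypothetical, and `std::unique_ptr<float[]>` stands in for `CudaUniquePtr<float[]>`; the actual class definitions in this PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical config that owns a buffer through a unique pointer.
struct MoveOnlyConfig
{
  explicit MoveOnlyConfig(std::size_t n) : mean_d(std::make_unique<float[]>(n)), num_channels(n)
  {
  }

  std::unique_ptr<float[]> mean_d;  // stand-in for CudaUniquePtr<float[]>
  std::size_t num_channels;
};

// Owning a unique pointer makes the config move-only: copying is not available.
static_assert(
  !std::is_copy_constructible<MoveOnlyConfig>::value, "config cannot be copied");

// Hypothetical detector: takes the config by rvalue reference and moves it into
// a member, so buffer ownership is transferred instead of duplicated.
class SketchDetector
{
public:
  explicit SketchDetector(MoveOnlyConfig && config) : config_(std::move(config)) {}

  const MoveOnlyConfig & config() const { return config_; }

private:
  MoveOnlyConfig config_;
};
```

A caller would write `SketchDetector detector(std::move(config));`. After that call, the buffer is owned by the detector and the original config's pointer is null.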

Preprocessing Efficiency:

  • Simplified the preprocess methods in all detector and segmenter classes to use the device-side mean and std arrays directly, removing redundant host-to-device memory operations and related temporary allocations.

How was this PR tested?

  • Confirmed that the build passed
  • Confirmed that several projects still work, including yolox, deimv2, pidnet, and eomt

Notes for reviewers

None.

Effects on system behavior

None.

Signed-off-by: ktro2828 <kotaro.uetake@tier4.jp>
Copilot AI review requested due to automatic review settings October 2, 2025 20:47
@ktro2828 ktro2828 linked an issue Oct 2, 2025 that may be closed by this pull request

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR refactors configuration and preprocessing logic for 2D detector and segmenter classes to improve performance by avoiding repeated device memory allocations. The main optimization moves image normalization parameters (mean and std) to device memory once during configuration construction instead of reallocating them on every preprocessing call.

  • Refactored all config structs to upload mean and std arrays to device memory in constructors
  • Updated constructors to use move semantics for better resource management
  • Simplified preprocessing methods to use pre-allocated device memory directly

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| mmros/include/mmros/detector/detector2d.hpp | Added constructor to Detector2dConfig for device memory allocation and updated class constructor signature |
| mmros/include/mmros/detector/instance_segmenter2d.hpp | Added constructor to InstanceSegmenter2dConfig for device memory allocation and updated class constructor signature |
| mmros/include/mmros/detector/panoptic_segmenter2d.hpp | Added constructor to PanopticSegmenter2dConfig for device memory allocation and updated class constructor signature |
| mmros/include/mmros/detector/semantic_segmenter2d.hpp | Added constructor to SemanticSegmenter2dConfig for device memory allocation and updated class constructor signature |
| mmros/src/detector/detector2d.cpp | Updated constructor to use move semantics and simplified preprocessing to use device-allocated arrays |
| mmros/src/detector/instance_segmeter2d.cpp | Updated constructor to use move semantics and simplified preprocessing to use device-allocated arrays |
| mmros/src/detector/panoptic_segmenter2d.cpp | Updated constructor to use move semantics and simplified preprocessing to use device-allocated arrays |
| mmros/src/detector/semantic_segmenter2d.cpp | Updated constructor to use move semantics and simplified preprocessing to use device-allocated arrays |

Comment on lines +41 to +44
Detector2dConfig(
const std::vector<double> & _mean, const std::vector<double> & _std,
archetype::BoxFormat2D _box_format, double _score_threshold)
: box_format(_box_format), score_threshold(_score_threshold)

Copilot AI Oct 2, 2025

The constructor should validate that the mean and std vectors have the same size and are not empty before allocating device memory. Consider adding size validation to prevent runtime errors.
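One way the suggested check could look is sketched below. It is not taken from this PR; the function name `validate_mean_std` and the choice of `std::invalid_argument` are assumptions made for illustration. A config constructor could call such a helper before allocating any device memory.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: rejects empty or mismatched normalization parameters
// before any device memory is allocated.
inline void validate_mean_std(
  const std::vector<double> & mean, const std::vector<double> & std_dev)
{
  if (mean.empty() || std_dev.empty()) {
    throw std::invalid_argument("mean and std must not be empty");
  }
  if (mean.size() != std_dev.size()) {
    throw std::invalid_argument("mean and std must have the same size");
  }
}
```

Throwing from the constructor keeps an invalid config from ever reaching the preprocessing step, at the cost of requiring callers to handle the exception.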

Comment on lines +37 to +40
InstanceSegmenter2dConfig(
const std::vector<double> & _mean, const std::vector<double> & _std,
archetype::BoxFormat2D _box_format, double _score_threshold)
: box_format(_box_format), score_threshold(_score_threshold)

Copilot AI Oct 2, 2025

The constructor should validate that the mean and std vectors have the same size and are not empty before allocating device memory. Consider adding size validation to prevent runtime errors.

Comment on lines +39 to +42
PanopticSegmenter2dConfig(
const std::vector<double> & _mean, const std::vector<double> & _std,
archetype::BoxFormat2D _box_format, double _score_threshold)
: box_format(_box_format), score_threshold(_score_threshold)

Copilot AI Oct 2, 2025

The constructor should validate that the mean and std vectors have the same size and are not empty before allocating device memory. Consider adding size validation to prevent runtime errors.

{
std::vector<double> mean; //!< Image mean.
std::vector<double> std; //!< Image std.
SemanticSegmenter2dConfig(const std::vector<double> & _mean, const std::vector<double> & _std)

Copilot AI Oct 2, 2025

The constructor should validate that the mean and std vectors have the same size and are not empty before allocating device memory. Consider adding size validation to prevent runtime errors.

@ktro2828 ktro2828 merged commit 1998b95 into main Oct 2, 2025
1 check failed
@ktro2828 ktro2828 deleted the refactor/detector/allocate-mean-std branch October 2, 2025 21:03

Development

Successfully merging this pull request may close these issues.

[PERF] Inefficient memory allocation in preprocessing loop

2 participants