Commit 85d08ea

brianzheng206, lucasreljic, and Edwardius authored
Brian/deep object detection (#26)
* Initial implementation of deep object detection
* switched to onnx inferencing
* Added onnxruntime gpu to cmake
* pre-commit fixes
* Added ort_backend integration, Fixed detection parsing issues, fixed config issues and removed other launch files TODO: Allow dynamic batch sizes. Optimize!
* Initial deep_ort_gpu plugin with internal library issues
* Functioning onnxruntime with CUDA as EP
* Initial deep_ort_gpu plugin with internal library issues
* Functioning onnxruntime with CUDA as execution provider
* Removed requirement for cuda toolkit
* batch size and other stuff?
* hopefully fixed segfault
* working cuda inference
* working tensorrt ep
* new stuff
* cleaning up cmake
* fixed cuda
* added libnvinfer, trt ep should work
* caching
* should be working
* fixed batch size
* optional engine caching
* some cleanup
* tensorrt version
* precommit and cleanup
* added lifecycle node
* loading backend at runtime
* working detection
* dynamic tensor output and configurable activation
* addressing pr comments
* node improvements
* some changes
* docstrings and remapping
* Fixed yolov8 layout detection bug
* Added ImageMarker publisher for foxglove
* added tests
* precommit
* opencv build fix
* start the deep clean, get rid of batching
* get rid of main, ort backend io bind
* sorted out the params situation
* sorted out the param situation
* nuke backend manager, clean up params
* deep node base inheritance, multi raw image support, cleanup
* fixed msg naming
* removed multi output, clean up readme
* update readme
* fix tests
* why was this here
* fix logging?
* last changes, and i switched the yolo dimensions xd
* fix build
* pls fix build error
* test
* cv bridge?

---------

Co-authored-by: lucasreljic <lucas.reljic@gmail.com>
Co-authored-by: Eddy Zhou <edsteredward@gmail.com>
1 parent 2c676a4 commit 85d08ea

34 files changed, +3822 -76 lines changed

.devcontainer/Dockerfile

Lines changed: 3 additions & 4 deletions
@@ -34,7 +34,6 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 && rm -rf /var/lib/apt/lists/*
 ENV DEBIAN_FRONTEND=interactive

-# Install Tensorrt Runtime
 RUN curl -fsSL -o cuda-keyring_1.1-1_all.deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb \
 && dpkg -i cuda-keyring_1.1-1_all.deb \
 && apt-get update && apt-get install -y --no-install-recommends \
@@ -59,9 +58,6 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 curl \
 && rm -rf /var/lib/apt/lists/*

-# Install pre-commit
-RUN python3 -m pip install --no-cache-dir pre-commit==3.8.0
-
 # Set working dir (matches VSCode workspace)
 WORKDIR /deep_ros_ws

@@ -117,6 +113,9 @@ RUN groupadd --gid ${USER_GID} ${USERNAME} \
 # Set the default user. Omit if you want to keep the default as root.
 USER $USERNAME

+# Install pre-commit for the user
+RUN python3 -m pip install --no-cache-dir --user pre-commit==3.8.0
+
 # Install Claude Code natively
 SHELL ["/bin/bash", "-o", "pipefail", "-c"]
 RUN curl -fsSL https://claude.ai/install.sh | bash

DEVELOPING.md

Lines changed: 69 additions & 0 deletions
@@ -26,6 +26,75 @@ This project includes VS Code dev container configurations for easy ROS2 develop
 - **Build tools**: Includes `colcon` and `rosdep` for ROS development
 - **Extensions**: C++, CMake, Python, and XML support pre-installed

+### Stopping Containers
+
+After using "Rebuild and Reopen in Container", you can stop containers using:
+
+**Method 1: VS Code Command (Recommended)**
+- Press `Ctrl+Shift+P` (or `Cmd+Shift+P` on Mac)
+- Type: `Dev Containers: Reopen Folder Locally`
+- This closes the container and returns to local mode
+
+**Method 2: Stop Container**
+- Press `Ctrl+Shift+P`
+- Type: `Dev Containers: Stop Container`
+- This stops the container but keeps it available for later use
+
+**Method 3: Using Docker Commands**
+From a terminal (outside the container):
+
+```bash
+# List running containers
+docker ps
+
+# Stop a specific container by name
+docker stop <container_name>
+
+# Stop all dev containers
+docker ps -q --filter "name=vsc-deep_ros" | xargs -r docker stop
+
+# Stop and remove containers
+docker stop <container_name>
+docker rm <container_name>
+```
+
+**Method 4: Close VS Code Window**
+- Simply closing the VS Code window will stop the container when you exit
+- The container will remain stopped until you reopen the folder in container mode
+
+### Restarting Containers
+
+After stopping a container, you can restart it using:
+
+**Method 1: VS Code Command (Recommended)**
+- Press `Ctrl+Shift+P` (or `Cmd+Shift+P` on Mac)
+- Type: `Dev Containers: Reopen in Container`
+- This will start the existing container or create a new one if needed
+
+**Method 2: Rebuild and Reopen**
+- Press `Ctrl+Shift+P`
+- Type: `Dev Containers: Rebuild and Reopen in Container`
+- Use this if you want to rebuild the container from scratch (e.g., after Dockerfile changes)
+
+**Method 3: Using Docker Commands**
+From a terminal (outside the container):
+
+```bash
+# List all containers (including stopped)
+docker ps -a
+
+# Start a stopped container by name
+docker start <container_name>
+
+# Start and attach to container
+docker start <container_name>
+docker attach <container_name>
+```
+
+**Method 4: Reopen Folder in VS Code**
+- If VS Code detects a devcontainer configuration, it will prompt you to "Reopen in Container"
+- Click the notification or use the command palette option
+
 ### Common Commands

 Inside the container, you can do ros2 commands, colcon commands, rosdep, etc.

deep_msgs/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ find_package(rosidl_default_generators REQUIRED)

 rosidl_generate_interfaces(${PROJECT_NAME}
 "msg/MultiImage.msg"
-"msg/MultiImageRaw.msg"
+"msg/MultiImageCompressed.msg"
 DEPENDENCIES std_msgs sensor_msgs
 )

deep_msgs/msg/MultiImage.msg

Lines changed: 1 addition & 3 deletions
@@ -1,5 +1,3 @@
 # MultiImage.msg
-# A message that carries multiple compressed images together
-
 std_msgs/Header header
-sensor_msgs/CompressedImage[] images
+sensor_msgs/Image[] images
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# MultiImageCompressed.msg
+# A message that carries multiple compressed images together
+
+std_msgs/Header header
+sensor_msgs/CompressedImage[] images

deep_msgs/msg/MultiImageRaw.msg

Lines changed: 0 additions & 3 deletions
This file was deleted.
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
+# Copyright (c) 2025-present WATonomous. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+cmake_minimum_required(VERSION 3.22)
+project(deep_object_detection)
+
+if(NOT CMAKE_CXX_STANDARD)
+set(CMAKE_CXX_STANDARD 17)
+endif()
+
+if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_CXX_COMPILER_ID MATCHES "Clang")
+add_compile_options(-Wall -Wextra -Wpedantic)
+endif()
+
+find_package(ament_cmake REQUIRED)
+find_package(rclcpp REQUIRED)
+find_package(rclcpp_components REQUIRED)
+find_package(rclcpp_lifecycle REQUIRED)
+find_package(sensor_msgs REQUIRED)
+find_package(cv_bridge REQUIRED)
+find_package(vision_msgs REQUIRED)
+find_package(visualization_msgs REQUIRED)
+find_package(deep_core REQUIRED)
+find_package(deep_msgs REQUIRED)
+find_package(pluginlib REQUIRED)
+find_package(rcl_interfaces REQUIRED)
+find_package(OpenCV REQUIRED COMPONENTS core imgproc)
+
+add_library(deep_object_detection_lib SHARED
+src/generic_postprocessor.cpp
+src/image_preprocessor.cpp
+src/deep_object_detection_node.cpp
+)
+
+target_include_directories(deep_object_detection_lib PUBLIC
+$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
+$<INSTALL_INTERFACE:include>
+)
+
+target_link_libraries(deep_object_detection_lib
+PUBLIC
+${rclcpp_TARGETS}
+${rclcpp_lifecycle_TARGETS}
+${deep_core_TARGETS}
+${pluginlib_TARGETS}
+${vision_msgs_TARGETS}
+${deep_msgs_TARGETS}
+${visualization_msgs_TARGETS}
+deep_core::deep_core_lib
+PRIVATE
+${rclcpp_components_TARGETS}
+${sensor_msgs_TARGETS}
+${cv_bridge_TARGETS}
+${rcl_interfaces_TARGETS}
+${OpenCV_LIBRARIES}
+)
+
+target_include_directories(deep_object_detection_lib SYSTEM PUBLIC ${OpenCV_INCLUDE_DIRS})
+
+rclcpp_components_register_nodes(deep_object_detection_lib "deep_object_detection::DeepObjectDetectionNode")
+
+install(TARGETS deep_object_detection_lib
+ARCHIVE DESTINATION lib
+LIBRARY DESTINATION lib
+RUNTIME DESTINATION bin
+)
+
+install(DIRECTORY include/
+DESTINATION include
+)
+
+install(DIRECTORY config launch
+DESTINATION share/${PROJECT_NAME}
+)
+
+if(BUILD_TESTING)
+find_package(deep_test REQUIRED)
+
+# Unit tests
+add_deep_test(test_deep_object_detection_node test/test_deep_object_detection_node.cpp
+LIBRARIES
+deep_object_detection_lib
+deep_core::deep_core_lib
+)
+# Set explicit timeout for unit test
+set_tests_properties(test_deep_object_detection_node PROPERTIES TIMEOUT 10)
+
+# Launch tests disabled by default
+# Set ENABLE_LAUNCH_TESTS=1 to enable them
+if(DEFINED ENV{ENABLE_LAUNCH_TESTS} AND "$ENV{ENABLE_LAUNCH_TESTS}" STREQUAL "1")
+find_package(launch_testing_ament_cmake REQUIRED)
+message(STATUS "Launch tests enabled (CPU/GPU/TensorRT)")
+add_deep_launch_test(test/launch_tests/test_deep_object_detection_cpu_backend.py TIMEOUT 60)
+add_deep_launch_test(test/launch_tests/test_deep_object_detection_gpu_backend.py TIMEOUT 60)
+add_deep_launch_test(test/launch_tests/test_deep_object_detection_tensorrt_backend.py TIMEOUT 60)
+else()
+message(STATUS "Launch tests disabled by default (set ENABLE_LAUNCH_TESTS=1 to enable)")
+endif()
+endif()
+
+ament_package()

deep_object_detection/README.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+# Deep Object Detection Node
+
+ROS 2 node for deep learning object detection using ONNX-compatible models.
+
+This node provides model-agnostic object detection for ROS 2, supporting any ONNX-compatible detection model with configurable preprocessing, postprocessing, and multi-camera batch processing. All processing is lifecycle-managed for clean startup/shutdown.
+
+## Architecture
+
+```
+┌──────────────────────────────────────┐        ┌──────────────────────────────────────┐
+│  MultiImage / MultiImageCompressed   │ ─────► │  DeepObjectDetectionNode             │
+│  (multi-camera)                      │        │  (lifecycle node)                    │
+└──────────────────────────────────────┘        │  • Decode & preprocess               │
+                                                │  • Batch inference                   │
+                                                │  • Postprocess & NMS                 │
+                                                └───────────────┬──────────────────────┘
+                                                                │ publishes
+
+┌──────────────────────────────────────┐        ┌──────────────────────────────────────┐
+│  Backend Plugins                     │ ◄───── │  Detection2DArray                    │
+│  • onnxruntime_cpu                   │  uses  │  ImageMarker (optional)              │
+│  • onnxruntime_gpu (CUDA/TRT)        │        └──────────────────────────────────────┘
+└───────────────┬──────────────────────┘
+                │ loads
+
+┌──────────────────────────────────────┐
+│  ONNX Model + YAML Config            │
+└──────────────────────────────────────┘
+
+
+```
+
+## Running
+
+```bash
+ros2 launch deep_object_detection deep_object_detection.launch.yaml \
+config_file:=/path/to/config.yaml
+```
+
+Or run directly:
+
+```bash
+ros2 run deep_object_detection deep_object_detection_node \
+--ros-args \
+--params-file /path/to/config.yaml
+```
+
+## Parameters
+
+### Required
+
+- **`model_path`** (string): Absolute path to ONNX model file (e.g., `/workspaces/deep_ros/yolov8m.onnx`)
+- **`input_topic`** (string): MultiImage/MultiImageCompressed topic name to subscribe to
+
+### Model Configuration
+
+- **`Model.num_classes`** (int, default: 80): Number of detection classes
+- **`Model.bbox_format`** (string, default: "cxcywh"): Bounding box format (`cxcywh`, `xyxy`, or `xywh`)
+- **`Model.output_shape`** (array, optional): Expected model output shape `[batch, detections, features]` (e.g., `[1, 8400, 84]`)
+- **`class_names_path`** (string, optional): Absolute path to text file with class names, one per line (e.g., `/workspaces/deep_ros/deep_object_detection/config/coco_classes.txt`)
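
The three `Model.bbox_format` values differ only in how the four box numbers are read. The sketch below assumes the conventional meanings (`cxcywh` is center plus size, `xywh` is top-left corner plus size, `xyxy` is two corners); the commit does not spell these out, and `to_xyxy` is a hypothetical helper name, not a function from this package.

```python
def to_xyxy(box, bbox_format):
    """Convert a 4-number box to (x_min, y_min, x_max, y_max).

    Assumed conventions (not confirmed by the commit):
      cxcywh: center x, center y, width, height
      xywh:   top-left x, top-left y, width, height
      xyxy:   already corner form
    """
    a, b, c, d = box
    if bbox_format == "cxcywh":
        return (a - c / 2.0, b - d / 2.0, a + c / 2.0, b + d / 2.0)
    if bbox_format == "xywh":
        return (a, b, a + c, b + d)
    if bbox_format == "xyxy":
        return (a, b, c, d)
    raise ValueError(f"unknown bbox_format: {bbox_format}")
```

Normalizing every format to corner form early keeps later steps such as IoU computation independent of the model's native layout.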
+
+### Preprocessing
+
+- **`Preprocessing.input_width`** (int, default: 640): Model input image width
+- **`Preprocessing.input_height`** (int, default: 640): Model input image height
+- **`Preprocessing.normalization_type`** (string, default: "scale_0_1"): Normalization method (`scale_0_1`, `imagenet`, `custom`, `none`)
+- **`Preprocessing.resize_method`** (string, default: "letterbox"): Image resizing method (`letterbox`, `resize`, `crop`, `pad`)
+- **`Preprocessing.color_format`** (string, default: "rgb"): Color format (`rgb` or `bgr`)
+- **`Preprocessing.mean`** (array, default: [0.0, 0.0, 0.0]): Mean values for custom normalization
+- **`Preprocessing.std`** (array, default: [1.0, 1.0, 1.0]): Standard deviation values for custom normalization
+- **`Preprocessing.pad_value`** (int, default: 114): Padding value for letterbox resizing
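
For readers unfamiliar with the `letterbox` resize method, the geometry can be sketched as follows: scale the image uniformly so it fits inside the model input, then pad the remainder (with `pad_value`) to reach the full size. This is a sketch of the general technique only. Centering the padding is an assumption, since some implementations pad bottom and right instead, and both function names are hypothetical.

```python
def letterbox_geometry(src_w, src_h, dst_w=640, dst_h=640):
    """Return (scale, new_w, new_h, pad_left, pad_top) for an aspect-preserving fit."""
    scale = min(dst_w / src_w, dst_h / src_h)  # largest uniform scale that still fits
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_left = (dst_w - new_w) // 2  # assumption: padding is split evenly
    pad_top = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def unletterbox_point(x, y, scale, pad_left, pad_top):
    """Map a point from model-input coordinates back to original image coordinates."""
    return ((x - pad_left) / scale, (y - pad_top) / scale)
```

For a 1280x720 frame and a 640x640 input, the image is scaled by 0.5 to 640x360 and padded by 140 rows above and below. Detections must be mapped back through the inverse transform before they are published in original-image coordinates.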
+
+### Postprocessing
+
+- **`Postprocessing.score_threshold`** (float, default: 0.65): Minimum confidence score
+- **`Postprocessing.nms_iou_threshold`** (float, default: 0.45): IoU threshold for NMS
+- **`Postprocessing.score_activation`** (string, default: "sigmoid"): Score activation (`sigmoid`, `softmax`, `none`)
+- **`Postprocessing.class_score_mode`** (string, default: "all_classes"): Class score mode (`all_classes` or `single_confidence`)
+- **`Postprocessing.enable_nms`** (bool, default: true): Enable non-maximum suppression
+- **`Postprocessing.class_score_start_idx`** (int, default: -1): Start index for class scores (-1 for auto)
+- **`Postprocessing.class_score_count`** (int, default: -1): Number of class scores (-1 for auto)
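
How `score_threshold` and `nms_iou_threshold` interact can be illustrated with a small greedy NMS sketch. It is not the package's `generic_postprocessor.cpp`: it assumes class-agnostic suppression over boxes already in `xyxy` form, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation, as used when score_activation is 'sigmoid'."""
    return 1.0 / (1.0 + math.exp(-x))


def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, score_threshold=0.65, iou_threshold=0.45):
    """Greedy NMS over a list of (box_xyxy, score) pairs.

    Drops detections below score_threshold, then keeps boxes in descending
    score order, skipping any box that overlaps an already kept box by more
    than iou_threshold.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        if all(iou(det[0], k[0]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

Filtering by score before sorting keeps the quadratic overlap check small, which matters when a model emits thousands of candidate rows per image.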
+
+#### Postprocessing Layout
+
+- **`Postprocessing.layout.batch_dim`** (int, default: 0): Batch dimension index
+- **`Postprocessing.layout.detection_dim`** (int, default: 1): Detection dimension index
+- **`Postprocessing.layout.feature_dim`** (int, default: 2): Feature dimension index
+- **`Postprocessing.layout.bbox_start_idx`** (int, default: 0): Bounding box start index
+- **`Postprocessing.layout.bbox_count`** (int, default: 4): Number of bbox coordinates
+- **`Postprocessing.layout.score_idx`** (int, default: 4): Score index
+- **`Postprocessing.layout.class_idx`** (int, default: 5): Class index
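
The layout parameters describe where each value sits in the model's output tensor. One plausible reading, sketched below, is that the three `*_dim` values select which tensor axis is batch, detection, and feature, while the `*_idx` values index along the feature axis. This interpretation is an assumption, as is treating a row as box, single score, and class id; the helper names are hypothetical.

```python
def make_accessor(flat, shape, batch_dim=0, detection_dim=1, feature_dim=2):
    """Return at(batch, detection, feature) over a row-major flat buffer."""
    # Row-major strides: the last axis is contiguous.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    def at(batch, detection, feature):
        index = (
            batch * strides[batch_dim]
            + detection * strides[detection_dim]
            + feature * strides[feature_dim]
        )
        return flat[index]

    return at


def read_detection(at, batch, detection,
                   bbox_start_idx=0, bbox_count=4, score_idx=4, class_idx=5):
    """Read (box, score, class_value) for one detection using the layout indices."""
    box = tuple(at(batch, detection, bbox_start_idx + i) for i in range(bbox_count))
    return box, at(batch, detection, score_idx), at(batch, detection, class_idx)
```

With this scheme, a tensor shaped `[batch, features, detections]` can be read without transposing by passing `detection_dim=2, feature_dim=1`.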
+
+### Backend
+
+- **`Backend.plugin`** (string, required): Backend plugin name (`onnxruntime_cpu` or `onnxruntime_gpu`)
+- **`Backend.execution_provider`** (string, default: "tensorrt"): Execution provider for GPU plugin (`cuda` or `tensorrt`)
+- **`Backend.device_id`** (int, default: 0): GPU device ID (for CUDA/TensorRT)
+- **`Backend.trt_engine_cache_enable`** (bool, default: true): Enable TensorRT engine caching
+- **`Backend.trt_engine_cache_path`** (string, default: "/tmp/deep_ros_ort_trt_cache"): TensorRT engine cache directory
+
+### Input/Output
+
+- **`use_compressed_images`** (bool, default: true): Use compressed images (MultiImageCompressed) vs uncompressed (MultiImage)
+- **`output_detections_topic`** (string, default: "/detections"): Output detections topic name
+
+## Topics
+
+### Key Topics
+
+| Topic | Type | Description |
+|-------|------|-------------|
+| `input_topic` | `deep_msgs/MultiImage` or `deep_msgs/MultiImageCompressed` | Synchronized multi-camera input (compressed or uncompressed) |
+| `output_detections_topic` | `vision_msgs/Detection2DArray` | Detection results (one per image in batch, default: `/detections`) |
+| `/image_annotations` | `visualization_msgs/ImageMarker` | Visualization annotations with bounding boxes (optional) |
+
+**Note:** The node only supports MultiImage/MultiImageCompressed messages. Individual camera topics are not supported.
+
+### Key Services
+
+| Service | Type | Description |
+|---------|------|-------------|
+| `/<node_name>/configure` | `lifecycle_msgs/srv/ChangeState` | Configure the lifecycle node |
+| `/<node_name>/activate` | `lifecycle_msgs/srv/ChangeState` | Activate the lifecycle node |
+| `/<node_name>/deactivate` | `lifecycle_msgs/srv/ChangeState` | Deactivate the lifecycle node |
+| `/<node_name>/cleanup` | `lifecycle_msgs/srv/ChangeState` | Cleanup the lifecycle node |
+| `/<node_name>/shutdown` | `lifecycle_msgs/srv/ChangeState` | Shutdown the lifecycle node |
+| `/<node_name>/get_state` | `lifecycle_msgs/srv/GetState` | Get current lifecycle state |
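
The transitions named in the services table follow the standard ROS 2 managed-node state machine. The sketch below models only the four primary states and omits the intermediate transition states and error handling; `TRANSITIONS` and `apply_transitions` are hypothetical names used for illustration, not part of this package or of `rclcpp_lifecycle`.

```python
# (current primary state, requested transition) -> next primary state
TRANSITIONS = {
    ("unconfigured", "configure"): "inactive",
    ("inactive", "activate"): "active",
    ("active", "deactivate"): "inactive",
    ("inactive", "cleanup"): "unconfigured",
    ("unconfigured", "shutdown"): "finalized",
    ("inactive", "shutdown"): "finalized",
    ("active", "shutdown"): "finalized",
}


def apply_transitions(state, requests):
    """Apply a sequence of transition requests, rejecting any invalid one."""
    for request in requests:
        key = (state, request)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {request} from state {state}")
        state = TRANSITIONS[key]
    return state
```

A node therefore has to be configured before it can be activated: `apply_transitions("unconfigured", ["configure", "activate"])` ends in `active`, while requesting `activate` directly from `unconfigured` is rejected.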
