# Deep Object Detection Node

ROS 2 node for deep learning object detection using ONNX-compatible models.

This node provides model-agnostic object detection for ROS 2, supporting any ONNX-compatible detection model with configurable preprocessing, postprocessing, and multi-camera batch processing. The node is implemented as a managed lifecycle node for clean startup and shutdown.

## Architecture

```
┌──────────────────────────────────────┐        ┌──────────────────────────────────────┐
│ MultiImage / MultiImageCompressed    │ ─────► │ DeepObjectDetectionNode              │
│ (multi-camera)                       │        │ (lifecycle node)                     │
└──────────────────────────────────────┘        │ • Decode & preprocess                │
                                                │ • Batch inference                    │
                                                │ • Postprocess & NMS                  │
                                                └───────────────┬──────────────────────┘
                                                                │ publishes
                                                                ▼
┌──────────────────────────────────────┐        ┌──────────────────────────────────────┐
│ Backend Plugins                      │ ◄───── │ Detection2DArray                     │
│ • onnxruntime_cpu                    │  uses  │ ImageMarker (optional)               │
│ • onnxruntime_gpu (CUDA/TRT)         │        └──────────────────────────────────────┘
└───────────────┬──────────────────────┘
                │ loads
                ▼
┌──────────────────────────────────────┐
│ ONNX Model + YAML Config             │
└──────────────────────────────────────┘
```

## Running

```bash
ros2 launch deep_object_detection deep_object_detection.launch.yaml \
  config_file:=/path/to/config.yaml
```

Or run directly:

```bash
ros2 run deep_object_detection deep_object_detection_node \
  --ros-args \
  --params-file /path/to/config.yaml
```
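
Because this is a lifecycle node, when run directly it starts in the unconfigured state and must be configured and activated before it begins processing (a launch file may perform these transitions automatically). A minimal sketch, assuming the node is named `deep_object_detection_node`:

```bash
# Drive the standard lifecycle transitions, then verify the resulting state
ros2 lifecycle set /deep_object_detection_node configure
ros2 lifecycle set /deep_object_detection_node activate
ros2 lifecycle get /deep_object_detection_node
```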

## Parameters

### Required

- **`model_path`** (string): Absolute path to ONNX model file (e.g., `/workspaces/deep_ros/yolov8m.onnx`)
- **`input_topic`** (string): MultiImage/MultiImageCompressed topic name to subscribe to
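
For example, a minimal `--params-file` containing only these two parameters might look like the sketch below. The `/**` wildcard matches any node name, and the topic value is just a placeholder; note that `Backend.plugin` (see the Backend section) is also marked required.

```yaml
/**:
  ros__parameters:
    model_path: /workspaces/deep_ros/yolov8m.onnx    # example path from above
    input_topic: /cameras/multi_image_compressed     # placeholder topic name
```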

### Model Configuration

- **`Model.num_classes`** (int, default: 80): Number of detection classes
- **`Model.bbox_format`** (string, default: "cxcywh"): Bounding box format (`cxcywh`, `xyxy`, or `xywh`)
- **`Model.output_shape`** (array, optional): Expected model output shape `[batch, detections, features]` (e.g., `[1, 8400, 84]`)
- **`class_names_path`** (string, optional): Absolute path to text file with class names, one per line (e.g., `/workspaces/deep_ros/deep_object_detection/config/coco_classes.txt`)
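
In a params file, the dotted names are written as nested YAML maps, which the ROS 2 parameter loader flattens back into dot-separated names (so the block below sets `Model.num_classes`, and so on). The values shown are the examples and defaults listed above:

```yaml
/**:
  ros__parameters:
    Model:
      num_classes: 80
      bbox_format: cxcywh
      output_shape: [1, 8400, 84]
    class_names_path: /workspaces/deep_ros/deep_object_detection/config/coco_classes.txt
```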

### Preprocessing

- **`Preprocessing.input_width`** (int, default: 640): Model input image width
- **`Preprocessing.input_height`** (int, default: 640): Model input image height
- **`Preprocessing.normalization_type`** (string, default: "scale_0_1"): Normalization method (`scale_0_1`, `imagenet`, `custom`, `none`)
- **`Preprocessing.resize_method`** (string, default: "letterbox"): Image resizing method (`letterbox`, `resize`, `crop`, `pad`)
- **`Preprocessing.color_format`** (string, default: "rgb"): Color format (`rgb` or `bgr`)
- **`Preprocessing.mean`** (array, default: [0.0, 0.0, 0.0]): Mean values for custom normalization
- **`Preprocessing.std`** (array, default: [1.0, 1.0, 1.0]): Standard deviation values for custom normalization
- **`Preprocessing.pad_value`** (int, default: 114): Padding value for letterbox resizing
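
The documented defaults correspond to a 640×640 letterbox pipeline; expressed as YAML (shown mainly to illustrate the nesting — `mean` and `std` are only relevant when `normalization_type` is `custom`):

```yaml
/**:
  ros__parameters:
    Preprocessing:
      input_width: 640
      input_height: 640
      resize_method: letterbox
      normalization_type: scale_0_1
      color_format: rgb
      pad_value: 114
```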

### Postprocessing

- **`Postprocessing.score_threshold`** (float, default: 0.65): Minimum confidence score
- **`Postprocessing.nms_iou_threshold`** (float, default: 0.45): IoU threshold for NMS
- **`Postprocessing.score_activation`** (string, default: "sigmoid"): Score activation (`sigmoid`, `softmax`, `none`)
- **`Postprocessing.class_score_mode`** (string, default: "all_classes"): Class score mode (`all_classes` or `single_confidence`)
- **`Postprocessing.enable_nms`** (bool, default: true): Enable non-maximum suppression
- **`Postprocessing.class_score_start_idx`** (int, default: -1): Start index for class scores (-1 for auto)
- **`Postprocessing.class_score_count`** (int, default: -1): Number of class scores (-1 for auto)
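
For example, to loosen the confidence cutoff while keeping NMS enabled (illustrative values; fields not listed keep their defaults):

```yaml
/**:
  ros__parameters:
    Postprocessing:
      score_threshold: 0.5      # accept detections scoring at least 0.5
      nms_iou_threshold: 0.45
      enable_nms: true
```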

#### Postprocessing Layout

- **`Postprocessing.layout.batch_dim`** (int, default: 0): Batch dimension index
- **`Postprocessing.layout.detection_dim`** (int, default: 1): Detection dimension index
- **`Postprocessing.layout.feature_dim`** (int, default: 2): Feature dimension index
- **`Postprocessing.layout.bbox_start_idx`** (int, default: 0): Bounding box start index
- **`Postprocessing.layout.bbox_count`** (int, default: 4): Number of bbox coordinates
- **`Postprocessing.layout.score_idx`** (int, default: 4): Score index
- **`Postprocessing.layout.class_idx`** (int, default: 5): Class index
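
Read together, the defaults describe an output tensor indexed as `[batch, detection, feature]` whose feature axis holds four box values, then a confidence score, then a class index. The same defaults written as YAML, to illustrate the nesting under `Postprocessing.layout`:

```yaml
/**:
  ros__parameters:
    Postprocessing:
      layout:
        batch_dim: 0        # output[batch][detection][feature]
        detection_dim: 1
        feature_dim: 2
        bbox_start_idx: 0   # features 0-3: bounding box
        bbox_count: 4
        score_idx: 4        # feature 4: confidence score
        class_idx: 5        # feature 5: class index
```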

### Backend

- **`Backend.plugin`** (string, required): Backend plugin name (`onnxruntime_cpu` or `onnxruntime_gpu`)
- **`Backend.execution_provider`** (string, default: "tensorrt"): Execution provider for the GPU plugin (`cuda` or `tensorrt`)
- **`Backend.device_id`** (int, default: 0): GPU device ID (for CUDA/TensorRT)
- **`Backend.trt_engine_cache_enable`** (bool, default: true): Enable TensorRT engine caching
- **`Backend.trt_engine_cache_path`** (string, default: "/tmp/deep_ros_ort_trt_cache"): TensorRT engine cache directory
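
A GPU configuration built from the documented defaults might look like the sketch below (engine caching left enabled so TensorRT does not have to rebuild its engine on every startup):

```yaml
/**:
  ros__parameters:
    Backend:
      plugin: onnxruntime_gpu
      execution_provider: tensorrt
      device_id: 0
      trt_engine_cache_enable: true
      trt_engine_cache_path: /tmp/deep_ros_ort_trt_cache
```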

### Input/Output

- **`use_compressed_images`** (bool, default: true): If true, subscribe to compressed images (`MultiImageCompressed`); if false, uncompressed (`MultiImage`)
- **`output_detections_topic`** (string, default: "/detections"): Output detections topic name
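
For example, to consume uncompressed images and publish on a custom topic (the topic name is only a placeholder):

```yaml
/**:
  ros__parameters:
    use_compressed_images: false
    output_detections_topic: /front_detections   # placeholder topic name
```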

## Topics

### Key Topics

| Topic | Type | Description |
|-------|------|-------------|
| `input_topic` | `deep_msgs/MultiImage` or `deep_msgs/MultiImageCompressed` | Synchronized multi-camera input (compressed or uncompressed) |
| `output_detections_topic` | `vision_msgs/Detection2DArray` | Detection results (one per image in the batch, default: `/detections`) |
| `/image_annotations` | `visualization_msgs/ImageMarker` | Visualization annotations with bounding boxes (optional) |

**Note:** The node subscribes only to MultiImage/MultiImageCompressed messages; individual per-camera image topics are not supported.
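
To inspect the output from the command line (assuming the default `/detections` topic):

```bash
# Print incoming detection arrays; the explicit message type is optional
ros2 topic echo /detections vision_msgs/msg/Detection2DArray
```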

### Key Services

| Service | Type | Description |
|---------|------|-------------|
| `/<node_name>/configure` | `lifecycle_msgs/srv/ChangeState` | Configure the lifecycle node |
| `/<node_name>/activate` | `lifecycle_msgs/srv/ChangeState` | Activate the lifecycle node |
| `/<node_name>/deactivate` | `lifecycle_msgs/srv/ChangeState` | Deactivate the lifecycle node |
| `/<node_name>/cleanup` | `lifecycle_msgs/srv/ChangeState` | Clean up the lifecycle node |
| `/<node_name>/shutdown` | `lifecycle_msgs/srv/ChangeState` | Shut down the lifecycle node |
| `/<node_name>/get_state` | `lifecycle_msgs/srv/GetState` | Get the current lifecycle state |
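
For example, to query the current state over the `get_state` service (assuming the node is named `deep_object_detection_node`):

```bash
# Returns the current lifecycle state (unconfigured, inactive, active, ...)
ros2 service call /deep_object_detection_node/get_state lifecycle_msgs/srv/GetState
```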