10 changes: 10 additions & 0 deletions README.md
@@ -123,20 +123,30 @@ The camera stream is configured once when the node starts via the following stat
| `role` | `string` | configures the camera with a `StreamRole` (possible choices: `raw`, `still`, `video`, `viewfinder`) [default: `viewfinder`] |
| `format` | `string` | a `PixelFormat` that is supported by the camera [default: auto] |
| `width`, `height` | `integer` | desired image resolution [default: auto] |
| `sensor_mode` | `string` | desired raw sensor format resolution (format: `width:height`) [default: auto] |
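
These static parameters can also be set programmatically when the node is composed. The following sketch is illustrative only: it builds just the `rclcpp::NodeOptions`, and the parameter values are arbitrary examples rather than defaults:

```cpp
#include <rclcpp/node_options.hpp>
#include <rclcpp/parameter.hpp>

// Hypothetical parameter values; if a parameter is left unset, the node
// prints the options that are available for the connected camera.
rclcpp::NodeOptions
make_camera_options()
{
  rclcpp::NodeOptions options;
  options.parameter_overrides({
    rclcpp::Parameter("role", "video"),
    rclcpp::Parameter("format", "YUYV"),
    rclcpp::Parameter("width", 1280),
    rclcpp::Parameter("height", 720),
    rclcpp::Parameter("sensor_mode", "1920:1080"),
  });
  return options;
}
```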


The configuration is done in the following order:
1. select camera via `camera`
2. configure camera stream via `role`
3. set the pixel format for the stream via `format`
4. set the image resolution for the stream via `width` and `height`
5. set the sensor mode resolution for the raw feed from the camera to the GPU via `sensor_mode`

Each stream role supports only a discrete set of stream configurations, i.e. combinations of image resolution and pixel format. The selected stream configuration is validated at the end of this sequence and adjusted to the closest valid configuration.
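
This follows libcamera's usual configuration flow; a minimal sketch of that sequence outside the node could look as follows (the pixel format and size are placeholder examples, error handling is omitted):

```cpp
#include <libcamera/libcamera.h>

#include <iostream>
#include <memory>
#include <vector>

// Minimal libcamera configuration sequence: generate a configuration for a
// role, request format/size, then let validate() adjust the request.
void
configure_example(const std::shared_ptr<libcamera::Camera> &camera)
{
  camera->acquire();

  const std::vector<libcamera::StreamRole> roles {libcamera::StreamRole::Viewfinder};
  std::unique_ptr<libcamera::CameraConfiguration> cfg = camera->generateConfiguration(roles);
  libcamera::StreamConfiguration &scfg = cfg->at(0);

  // request a (possibly unsupported) pixel format and resolution
  scfg.pixelFormat = libcamera::formats::YUYV;
  scfg.size = {1280, 720};

  // validate() snaps the request to the closest supported stream configuration
  if (cfg->validate() == libcamera::CameraConfiguration::Adjusted)
    std::cout << "stream configuration adjusted to " << scfg.toString() << std::endl;

  camera->configure(cfg.get());
}
```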

By default, the node selects the first available camera and configures it with the default pixel format and image resolution. If a parameter has not been set, the node will print the available options and inform the user about the default choice.

The node avoids memory copies of the image data by mapping camera pixel formats directly to ROS image formats, with the exception of converting between "raw" and "compressed" image formats when requested by the user. As a consequence, not every pixel format that is supported by the camera can be represented by the ROS image message. Hence, the options for `format` are limited to pixel formats that are supported by both the camera and the raw ROS image message.
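
To illustrate this restriction, the usable formats are effectively the intersection of the formats offered by the camera and those that map one-to-one onto a raw ROS encoding. The sketch below is illustrative only; the two mapping entries are examples and not the node's actual conversion table:

```cpp
#include <libcamera/formats.h>
#include <libcamera/pixel_format.h>
#include <libcamera/stream.h>
#include <sensor_msgs/image_encodings.hpp>

#include <map>
#include <string>
#include <vector>

// Example entries only: a real table covers many more formats. Note that
// libcamera names packed RGB formats in little-endian bit order, while ROS
// encodings describe the byte order in memory.
const std::map<libcamera::PixelFormat, std::string> map_format_raw = {
  {libcamera::formats::R8, sensor_msgs::image_encodings::MONO8},
  {libcamera::formats::RGB888, sensor_msgs::image_encodings::BGR8},
};

// keep only the camera formats that have a direct raw ROS counterpart
std::vector<libcamera::PixelFormat>
supported_formats(const libcamera::StreamConfiguration &scfg)
{
  std::vector<libcamera::PixelFormat> supported;
  for (const libcamera::PixelFormat &fmt : scfg.formats().pixelformats())
    if (map_format_raw.count(fmt))
      supported.push_back(fmt);
  return supported;
}
```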

### Sensor Modes and Cropping

Most camera modules provide several sensor modes. Some deliver a lower-resolution version of the full picture, while others have a limited field of view, e.g. selecting a 3:2 image from a 16:9 sensor, or cropping the picture for a more native 'digital zoom'. The benefit is that compute-intensive scaling and cropping happen at the source (the sensor), rather than downstream on the GPU or in application code. Unless a sensor mode is specified explicitly, libcamera selects one automatically. Unfortunately, the selected mode is not guaranteed to cover the full sensor area, so the picture may be cropped.

Example: When configuring a 640x480 output stream on a Raspberry Pi Camera Module v2, libcamera will automatically choose a heavily cropped 640x480 sensor mode, leading to a digital zoom of ~2.5x.

See the [PiCamera2 Documentation, section 4.2.2.3](https://datasheets.raspberrypi.com/camera/picamera2-manual.pdf) for a detailed explanation. The [PiCamera Documentation](https://picamera.readthedocs.io/en/release-1.13/fov.html#sensor-modes) has a good visualization of the issue. Read the documentation of your camera to see what modes are available, and which (if any) raw formats are cropped.
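
The available modes can also be queried programmatically by inspecting the frame sizes that libcamera reports for the raw stream. A minimal sketch with plain libcamera, outside the node (error handling omitted):

```cpp
#include <libcamera/libcamera.h>

#include <iostream>
#include <memory>
#include <vector>

// Print the frame sizes the sensor offers for its raw formats. These are the
// resolutions that can be requested via the "sensor_mode" parameter.
void
list_sensor_modes(const std::shared_ptr<libcamera::Camera> &camera)
{
  const std::vector<libcamera::StreamRole> roles {libcamera::StreamRole::Raw};
  const std::unique_ptr<libcamera::CameraConfiguration> cfg = camera->generateConfiguration(roles);

  const libcamera::StreamFormats &formats = cfg->at(0).formats();
  for (const libcamera::PixelFormat &fmt : formats.pixelformats())
    for (const libcamera::Size &size : formats.sizes(fmt))
      std::cout << fmt.toString() << " " << size.toString() << std::endl;
}
```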

### Dynamic Frame Configuration (controls)

The dynamic parameters are created at runtime by inspecting the [controls](https://libcamera.org/api-html/namespacelibcamera_1_1controls.html) that are exposed by the camera. The set of exposed controls is camera-dependent. The ROS parameters are generated directly from the exposed controls, hence the parameter names match the control names.
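
For reference, this control set can be listed directly with libcamera. A small sketch of enumerating it (each entry becomes one dynamic ROS parameter of the same name):

```cpp
#include <libcamera/libcamera.h>

#include <iostream>
#include <memory>

// List the controls and their valid value ranges exposed by a camera.
void
list_controls(const std::shared_ptr<libcamera::Camera> &camera)
{
  for (const auto &[id, info] : camera->controls())
    std::cout << id->name() << ": " << info.toString() << std::endl;
}
```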
59 changes: 55 additions & 4 deletions src/CameraNode.cpp
@@ -47,6 +47,7 @@
#include <rclcpp/publisher.hpp>
#include <rclcpp/time.hpp>
#include <rclcpp_components/register_node_macro.hpp>
#include <regex>
#include <sensor_msgs/image_encodings.hpp>
#include <sensor_msgs/msg/camera_info.hpp>
#include <sensor_msgs/msg/compressed_image.hpp>
@@ -150,6 +151,35 @@ get_role(const std::string &role)
}
}

libcamera::Size
get_sensor_format(const std::string &format_str)
{
  if (format_str.empty()) {
    return {};
  }

  const std::regex pattern(R"((\d+):(\d+))");
  std::smatch match;

  if (std::regex_match(format_str, match, pattern)) {
    try {
      const int width = std::stoi(match[1].str());
      const int height = std::stoi(match[2].str());

      return libcamera::Size {static_cast<unsigned int>(width), static_cast<unsigned int>(height)};
    }
    catch (const std::out_of_range &) {
      throw std::runtime_error("Invalid sensor_mode. Width or height exceeds maximum value of " +
                               std::to_string(std::numeric_limits<int>::max()));
    }
    catch (const std::invalid_argument &) {
      // Unexpected - throw exception below
    }
  }

  // Throw exception as it was not possible to parse the format string
  throw std::runtime_error("Invalid sensor_mode. Expected [width]:[height] but got " + format_str);
}

// The following function "compressImageMsg" is adapted from "CvImage::toCompressedImageMsg"
// (https://github.com/ros-perception/vision_opencv/blob/066793a23e5d06d76c78ca3d69824a501c3554fd/cv_bridge/src/cv_bridge.cpp#L512-L535)
@@ -221,6 +251,13 @@ CameraNode::CameraNode(const rclcpp::NodeOptions &options)
const uint32_t h = declare_parameter<int64_t>("height", {}, param_descr_ro);
const libcamera::Size size {w, h};

// Raw format dimensions
rcl_interfaces::msg::ParameterDescriptor param_descr_sensor_mode;
param_descr_sensor_mode.description = "raw mode of the sensor";
param_descr_sensor_mode.additional_constraints = "string in format [width]:[height]";
param_descr_sensor_mode.read_only = true;
const libcamera::Size sensor_size = get_sensor_format(declare_parameter<std::string>("sensor_mode", {}, param_descr_sensor_mode));

// camera info file url
rcl_interfaces::msg::ParameterDescriptor param_descr_camera_info_url;
param_descr_camera_info_url.description = "camera calibration info file url";
@@ -299,14 +336,20 @@ CameraNode::CameraNode(const rclcpp::NodeOptions &options)
if (camera->acquire())
throw std::runtime_error("failed to acquire camera");

std::vector<libcamera::StreamRole> roles {role};

// Add the RAW role if the sensor_size is defined
if (!sensor_size.isNull() && role != libcamera::StreamRole::Raw) {
roles.push_back(libcamera::StreamRole::Raw);
}

// configure camera stream
std::unique_ptr<libcamera::CameraConfiguration> cfg =
camera->generateConfiguration({role});
camera->generateConfiguration(roles);

if (!cfg)
throw std::runtime_error("failed to generate configuration");
if (!cfg || cfg->size() != roles.size())
throw std::runtime_error("failed to generate configuration for all roles");

assert(cfg->size() == 1);
libcamera::StreamConfiguration &scfg = cfg->at(0);

// list all camera formats, including those not supported by the ROS message
@@ -357,6 +400,12 @@ CameraNode::CameraNode(const rclcpp::NodeOptions &options)
scfg.size = size;
}

if (!sensor_size.isNull() && role != libcamera::StreamRole::Raw) {
libcamera::StreamConfiguration &modecfg = cfg->at(1);
modecfg.size = sensor_size;
RCLCPP_INFO_STREAM(get_logger(), "Sensor mode configuration: " << modecfg.toString());
}

// store selected stream configuration
const libcamera::StreamConfiguration selected_scfg = scfg;

@@ -392,6 +441,8 @@ CameraNode::CameraNode(const rclcpp::NodeOptions &options)
const std::optional<std::string> model = props.get(libcamera::properties::Model);
if (model)
cname = model.value() + '_' + cname;
if (!sensor_size.isNull() && role != libcamera::StreamRole::Raw)
cname = cname + '_' + cfg->at(1).toString();

// clean camera name of non-alphanumeric characters
cname.erase(