Commit 9da163f

Merge pull request #302 from JdeRobot/issue-301

Update pre-processing pipeline for image segmentation

2 parents a3e72ac + ff522ae

File tree: 8 files changed, +183 −102 lines

CONTRIBUTING.md (5 additions & 8 deletions)

@@ -28,22 +28,19 @@ Any JdeRobot project follows the same workflow when contributing.
 
 * **Find a problem or possible improvement for the project:** First of all, check that the feature/bug is not already listed in the current open issues.
 
-* **Create an issue:** Create an issue in the project with the problem/improvement you will
-address. In this issue, explain carefully what you will be changing and how these changes will impact the project. Provide any complementary information to explain it (code samples, screenshots...).
+* **Create an issue:** Create an issue in the project with the problem/improvement you will address. In this issue, explain carefully what you will be changing and how these changes will impact the project. Provide any complementary information to explain it (code samples, screenshots...).
 
 The two following points differ depending on the permissions you have in the repo.
 
 * **[If you have write permission] Always work in a separate branch:** Create a new branch with a descriptive name (you can use the issue number as the branch name, e.g. "issue_xxx"). Create your commits in that branch, making the appropriate changes. Please use descriptive commit messages so that everyone can easily understand the changes you made.
 
 * **[If you only have read permission] Fork the project:** Fork the project and work on that copy of the repo, making the desired changes. Please use descriptive commit messages so that everyone can easily understand the changes you made.
 
-* **Open a pull request:** A pull request is compulsory any time a new change is to be added to the core of the project. After solving the issue, create a pull request with your branch. In this pull request, include all the commits made,
-write a good description of the changes and refer to the issue solved to make things easier for the maintainers. Include any additional resources that would be interesting (references, screenshots...). Link the PR with the issue
+* **Review and format your code:** Before submitting your PR, make sure that all the changes are properly formatted. All functions must have their docstring in [Sphinx format](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html); that way, the corresponding documentation will be generated automatically. Regarding code formatting, we use [Black](https://github.com/psf/black).
+
+* **Open a pull request:** A pull request is compulsory any time a new change is to be added to the core of the project. After solving the issue, create a pull request with your branch. In this pull request, include all the commits made, write a good description of the changes, and refer to the issue solved to make things easier for the maintainers. Include any additional resources that would be interesting (references, screenshots...). Link the PR with the issue.
 
 * **Testing and merging pull requests**
-Your pull request will be automatically tested by Travis CI. If any jobs have failed, you should fix them.
-To rerun the automatic builds, just push changes to your branch on GitHub. There is no need to close that pull request and open a new one!
-Once all the builders are "green", one of DetectionMetrics's developers will review your code. The reviewer may ask you to modify your pull request.
-Please provide a timely response to reviewers (within weeks, not months); otherwise, your submission may be postponed or even rejected.
+One of DetectionMetrics's developers will review your code. The reviewer may ask you to modify your pull request. Please provide a timely response to reviewers (within weeks, not months); otherwise, your submission may be postponed or even rejected.
 
 * **[If you have write permission] Don't accept your own pull requests:** Wait for a project maintainer to accept the changes you made. They will probably comment on the pull request with some feedback and consider whether it can be merged into the master branch.
 Be proactive and kind!
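
To illustrate the new formatting requirement, here is a minimal sketch of a Black-formatted function with a Sphinx-style docstring, using the same :param:/:type:/:return:/:rtype: fields that appear throughout the code diffs below. The function itself is hypothetical and purely illustrative:

from typing import Tuple


def scale_size(size: Tuple[int, int], factor: float) -> Tuple[int, int]:
    """Scale a (width, height) size by a constant factor.

    :param size: Input size as (width, height) in pixels
    :type size: Tuple[int, int]
    :param factor: Multiplicative scale factor
    :type factor: float
    :return: Scaled size as (width, height) in pixels
    :rtype: Tuple[int, int]
    """
    # Black enforces details like double quotes and line length automatically
    return int(size[0] * factor), int(size[1] * factor)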

README.md (5 additions & 2 deletions)

@@ -35,7 +35,7 @@ Now, we're excited to introduce ***DetectionMetrics v2***! While retaining the f
 <tr>
 <td rowspan="2">Segmentation</td>
 <td>Image</td>
-<td>Rellis3D, GOOSE, custom GAIA format</td>
+<td>Rellis3D, GOOSE, RUGD, WildScenes, custom GAIA format</td>
 <td>PyTorch, Tensorflow</td>
 </tr>
 <tr>
@@ -97,7 +97,10 @@ If you are using LiDAR, Open3D currently requires `torch==2.2*`.
 As of now, *DetectionMetrics* can either be used as a Python library or as a command-line application.
 
 ### Library
-You can check the `examples` directory for inspiration. If you are using *poetry*, you can run the scripts provided either by activating the created environment using `poetry shell` or directly running `poetry run python examples/<some_python_script.py>`.
+
+🧑‍🏫️ [Image Segmentation Tutorial](https://github.com/JdeRobot/DetectionMetrics/blob/master/examples/tutorial_image_segmentation.ipynb)
+
+You can check the `examples` directory for further inspiration. If you are using *poetry*, you can run the scripts provided either by activating the created environment using `poetry shell` or directly running `poetry run python examples/<some_python_script.py>`.
 
 ### Command-line interface
 DetectionMetrics currently provides a CLI with two commands, `dm_evaluate` and `dm_batch`. Thanks to the configuration in the `pyproject.toml` file, we can simply run `poetry install` from the root directory and use them without explicitly invoking the Python files. More details are provided in the [DetectionMetrics website](https://jderobot.github.io/DetectionMetrics/v2/usage/#command-line-interface).

detectionmetrics/models/tensorflow.py (94 additions & 36 deletions)

@@ -85,41 +85,72 @@ def get_computational_cost(
 
 def resize_image(
     image: tf.Tensor,
-    target_size: Tuple[int, int],
     method: str,
-    keep_aspect: bool = False,
+    width: Optional[int] = None,
+    height: Optional[int] = None,
+    closest_divisor: int = 16,
 ) -> tf.Tensor:
-    """Resize tensorflow image to target size
+    """Resize tensorflow image to target size. If only one dimension is provided, the
+    aspect ratio is preserved.
 
     :param image: Input image tensor
     :type image: tf.Tensor
-    :param target_size: Target size for the image
-    :type target_size: Tuple[int, int]
     :param method: Resizing method (e.g. bilinear, nearest)
     :type method: str
-    :param keep_aspect: Whether to keep aspect ratio when resizing images. If true, resize to match smaller side's size and crop center. Defaults to False
-    :type keep_aspect: bool, optional
+    :param width: Target width for resizing
+    :type width: Optional[int], optional
+    :param height: Target height for resizing
+    :type height: Optional[int], optional
+    :param closest_divisor: Closest divisor for the target size, defaults to 16. Only applies to the dimension not provided.
+    :type closest_divisor: int, optional
    :return: Resized image tensor
    :rtype: tf.Tensor
    """
-    # If keep_aspect is True, resize to match smaller side
-    if keep_aspect:
-        original_size = tf.cast(tf.shape(image)[:2], tf.float32)
-        resize_size = tf.cast(tf.convert_to_tensor(target_size), tf.float32)
-        scale_factor = tf.reduce_max(resize_size / original_size)
-        resize_size = tf.cast(tf.round(original_size * scale_factor), tf.int32)
-    else:
-        resize_size = target_size
+    old_size = tf.cast(tf.shape(image)[:2], tf.float32)
+    old_h = old_size[0]
+    old_w = old_size[1]
+
+    h, w = (old_h, old_w)
+    if width is None:
+        w = int((height / old_h) * old_w)
+        h = height
+    if height is None:
+        h = int((width / old_w) * old_h)
+        w = width
+
+    h = (h / closest_divisor) * closest_divisor
+    w = (w / closest_divisor) * closest_divisor
+    new_size = [int(h), int(w)]
+
+    image = tf_image.resize(
+        images=image, size=tf.cast(new_size, tf.int32), method=method
+    )
 
-    image = tf_image.resize(images=image, size=resize_size, method=method)
+    return image
 
-    # If keep_aspect is True, crop center to match target size
-    if keep_aspect:
-        y0 = (resize_size[0] - target_size[0]) // 2
-        x0 = (resize_size[1] - target_size[1]) // 2
-        image = tf_image.crop_to_bounding_box(
-            image, y0, x0, target_size[1], target_size[0]
-        )
+
+def crop_center(image: tf.Tensor, width: int, height: int) -> tf.Tensor:
+    """Crop tensorflow image center to target size
+
+    :param image: Input image tensor
+    :type image: tf.Tensor
+    :param width: Target width for cropping
+    :type width: int
+    :param height: Target height for cropping
+    :type height: int
+    :return: Cropped image tensor
+    :rtype: tf.Tensor
+    """
+    old_size = tf.cast(tf.shape(image)[:2], tf.float32)
+    old_h = old_size[0]
+    old_w = old_size[1]
+
+    offset_height = int((old_h - height) // 2)
+    offset_width = int((old_w - width) // 2)
+
+    image = tf.image.crop_to_bounding_box(
+        image, offset_height, offset_width, height, width
+    )
 
     return image
 
@@ -129,8 +160,10 @@ class ImageSegmentationTensorflowDataset:
 
    :param dataset: Image segmentation dataset
    :type dataset: ImageSegmentationDataset
-    :param image_size: Image size in pixels (width, height)
-    :type image_size: Tuple[int, int]
+    :param resize: Target size for resizing images, defaults to None
+    :type resize: Optional[Tuple[int, int]], optional
+    :param crop: Target size for center cropping images, defaults to None
+    :type crop: Optional[Tuple[int, int]], optional
    :param batch_size: Batch size, defaults to 1
    :type batch_size: int, optional
    :param splits: Splits to be used from the dataset, defaults to ["test"]
@@ -146,14 +179,16 @@ class ImageSegmentationTensorflowDataset:
     def __init__(
         self,
         dataset: ImageSegmentationDataset,
-        image_size: Tuple[int, int],
+        resize: Optional[Tuple[int, int]] = None,
+        crop: Optional[Tuple[int, int]] = None,
         batch_size: int = 1,
         splits: List[str] = ["test"],
         lut_ontology: Optional[dict] = None,
         normalization: Optional[dict] = None,
         keep_aspect: bool = False,
     ):
-        self.image_size = image_size
+        self.resize = resize
+        self.crop = crop
         self.normalization = None
         if normalization is not None:
             mean = tf.constant(normalization["mean"], dtype=tf.float32)
@@ -211,8 +246,20 @@ def read_image(self, fname: str, label=False) -> tf.Tensor:
         )
 
         # Resize (use NN to avoid interpolation when dealing with labels)
-        method = "nearest" if label else "bilinear"
-        image = resize_image(image, self.image_size, method, self.keep_aspect)
+        if self.resize is not None:
+            method = "nearest" if label else "bilinear"
+            image = resize_image(
+                image,
+                method=method,
+                width=self.resize.get("width", None),
+                height=self.resize.get("height", None),
+            )
+        if self.crop is not None:
+            image = crop_center(
+                image,
+                width=self.crop.get("width", None),
+                height=self.crop.get("height", None),
+            )
 
         # If label, round values to avoid interpolation artifacts
         if label:
@@ -283,18 +330,28 @@ def __init__(
        # Init transformation for input images
        def t_in(image):
            tensor = tf.convert_to_tensor(image)
-            tensor = resize_image(
-                tensor,
-                target_size=self.model_cfg["image_size"],
-                method="bilinear",
-                keep_aspect=self.model_cfg.get("keep_aspect", False),
-            )
+
+            if "resize" in self.model_cfg:
+                tensor = resize_image(
+                    tensor,
+                    method="bilinear",
+                    width=self.model_cfg["resize"].get("width", None),
+                    height=self.model_cfg["resize"].get("height", None),
+                )
+
+            if "crop" in self.model_cfg:
+                tensor = crop_center(
+                    tensor,
+                    width=self.model_cfg["crop"].get("width", None),
+                    height=self.model_cfg["crop"].get("height", None),
+                )
+
            tensor = tf.expand_dims(tensor, axis=0)
            if "normalization" in self.model_cfg:
                mean = tf.constant(self.model_cfg["normalization"]["mean"])
                std = tf.constant(self.model_cfg["normalization"]["std"])
                tensor = tf.cast(tensor, tf.float32) / 255.0
                tensor = (tensor - mean) / std
+
            return tensor
 
        self.t_in = t_in
@@ -366,7 +423,8 @@ def eval(
        # Get Tensorflow dataset
        dataset = ImageSegmentationTensorflowDataset(
            dataset,
-            image_size=self.model_cfg["image_size"],
+            resize=self.model_cfg.get("resize", None),
+            crop=self.model_cfg.get("crop", None),
            batch_size=self.model_cfg.get("batch_size", 1),
            splits=[split] if isinstance(split, str) else split,
            lut_ontology=lut_ontology,
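
The resize_image refactor above replaces the old target_size/keep_aspect pair: callers now pass width and/or height, and when only one is given the other is derived from the aspect ratio and snapped toward a multiple of closest_divisor (useful when a backbone requires input sides divisible by its stride). Below is a minimal standalone sketch of that arithmetic with no TensorFlow dependency; the helper name new_size is ours, not part of the commit, and the rounding follows the PyTorch CustomResize in the next file (the TensorFlow version divides and multiplies without rounding):

from typing import Optional, Tuple


def new_size(
    old_h: int,
    old_w: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
    closest_divisor: int = 16,
) -> Tuple[int, int]:
    # Derive the missing dimension from the aspect ratio
    h, w = old_h, old_w
    if width is None:
        w = (height / old_h) * old_w
        h = height
    if height is None:
        h = (width / old_w) * old_h
        w = width
    # Snap both sides to the nearest multiple of closest_divisor
    h = round(h / closest_divisor) * closest_divisor
    w = round(w / closest_divisor) * closest_divisor
    return int(h), int(w)


# A 1920x1200 image resized to height=512 keeps the 1.6 aspect ratio:
# the raw width 819.2 snaps to 816 (51 * 16).
print(new_size(old_h=1200, old_w=1920, height=512))  # -> (512, 816)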

detectionmetrics/models/torch.py (42 additions & 24 deletions)

@@ -163,39 +163,49 @@ def get_computational_cost(
 
 
 class CustomResize(torch.nn.Module):
-    """Custom rescale transformation for PyTorch
+    """Custom rescale transformation for PyTorch. If only one dimension is provided,
+    the aspect ratio is preserved.
 
-    :param target_size: Target size for the image
-    :type target_size: Tuple[int, int]
-    :param keep_aspect: Flag to keep aspect ratio
-    :type keep_aspect: bool, defaults to False
+    :param width: Target width for resizing
+    :type width: Optional[int], optional
+    :param height: Target height for resizing
+    :type height: Optional[int], optional
    :param interpolation: Interpolation mode for resizing (e.g. NEAREST, BILINEAR)
    :type interpolation: F.InterpolationMode, defaults to F.InterpolationMode.BILINEAR
+    :param closest_divisor: Closest divisor for the target size, defaults to 16. Only applies to the dimension not provided.
+    :type closest_divisor: int, optional
    """
 
    def __init__(
        self,
-        target_size: Tuple[int, int],
-        keep_aspect: bool = False,
+        width: Optional[int] = None,
+        height: Optional[int] = None,
        interpolation: F.InterpolationMode = F.InterpolationMode.BILINEAR,
+        closest_divisor: int = 16,
    ):
        super().__init__()
-        self.target_size = target_size
-        self.keep_aspect = keep_aspect
+        self.width = width
+        self.height = height
        self.interpolation = interpolation
+        self.closest_divisor = closest_divisor
 
    def forward(self, image: Image.Image) -> Image.Image:
-        new_size = self.target_size
-        if self.keep_aspect:
-            h, w = image.size
-            resize_factor = max((self.target_size[0] / h, self.target_size[1] / w))
-            new_size = int(h * resize_factor), int(w * resize_factor)
+        w, h = image.size
+        old_size = (h, w)
 
-        if new_size != image.size:
-            image = F.resize(image, new_size, self.interpolation)
+        if self.width is None:
+            w = int((self.height / image.size[1]) * image.size[0])
+            h = self.height
+        if self.height is None:
+            h = int((self.width / image.size[0]) * image.size[1])
+            w = self.width
+
+        h = round(h / self.closest_divisor) * self.closest_divisor
+        w = round(w / self.closest_divisor) * self.closest_divisor
+        new_size = (h, w)
 
-        if self.keep_aspect:
-            image = F.center_crop(image, self.target_size)
+        if new_size != old_size:
+            image = F.resize(image, new_size, self.interpolation)
 
        return image
 
@@ -376,22 +386,30 @@ def __init__(
        self.transform_input = []
        self.transform_label = []
 
-        if "image_size" in self.model_cfg:
+        if "resize" in self.model_cfg:
            self.transform_input += [
                CustomResize(
-                    tuple(self.model_cfg["image_size"]),
-                    keep_aspect=self.model_cfg.get("keep_aspect", False),
+                    width=self.model_cfg["resize"].get("width", None),
+                    height=self.model_cfg["resize"].get("height", None),
                    interpolation=F.InterpolationMode.BILINEAR,
                )
            ]
            self.transform_label += [
                CustomResize(
-                    tuple(self.model_cfg["image_size"]),
-                    keep_aspect=self.model_cfg.get("keep_aspect", False),
+                    width=self.model_cfg["resize"].get("width", None),
+                    height=self.model_cfg["resize"].get("height", None),
                    interpolation=F.InterpolationMode.NEAREST,
                )
            ]
 
+        if "crop" in self.model_cfg:
+            crop_size = (
+                self.model_cfg["crop"]["height"],
+                self.model_cfg["crop"]["width"],
+            )
+            self.transform_input += [transforms.CenterCrop(crop_size)]
+            self.transform_label += [transforms.CenterCrop(crop_size)]
+
        try:
            self.transform_input += [
                transforms.ToImage(),
@@ -447,7 +465,7 @@ def inference(self, image: Image.Image) -> Image.Image:
                dict(
                    ori_shape=tensor.shape[2:],
                    img_shape=tensor.shape[2:],
-                    pad_shape=image.shape[2:],
+                    pad_shape=tensor.shape[2:],
                    padding_size=[0, 0, 0, 0],
                )
            ]
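
With these changes, both backends read the same pre-processing keys from model_cfg: the old image_size/keep_aspect pair is replaced by independent resize and crop entries with explicit width/height fields. Here is a hedged sketch of a configuration exercising the new keys; only resize, crop, normalization, and batch_size are grounded in this diff, and the concrete values (including the common ImageNet mean/std) are illustrative:

# Illustrative model_cfg for the updated pipeline (values are assumptions).
model_cfg = {
    # Give one dimension to preserve aspect ratio (the other side is derived
    # and snapped to a multiple of 16), or give both explicitly.
    "resize": {"height": 512},
    # Optional center crop applied after resizing.
    "crop": {"width": 504, "height": 504},
    # Per-channel statistics; these are the common ImageNet values.
    "normalization": {
        "mean": [0.485, 0.456, 0.406],
        "std": [0.229, 0.224, 0.225],
    },
    "batch_size": 1,
}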
