Commit 9da163f

Merge pull request #302 from JdeRobot/issue-301

Update pre-processing pipeline for image segmentation

2 parents a3e72ac + ff522ae

File tree: 8 files changed, +183 −102 lines

CONTRIBUTING.md (5 additions & 8 deletions)

@@ -28,22 +28,19 @@ Any JdeRobot project follows the same workflow when contributing.
 
 * **Find a problem or possible improvement for the project:** First of all, check that the feature/bug is not already listed in the current open issues.
 
-* **Create an issue:** Create an issue in the project with the problem/improvement you will
-address. In this issue, explain carefully what you will be changing and how these changes will impact the project. Provide any complementary information to explain it (code samples, screenshots...).
+* **Create an issue:** Create an issue in the project with the problem/improvement you will address. In this issue, explain carefully what you will be changing and how these changes will impact the project. Provide any complementary information to explain it (code samples, screenshots...).
 
 The two following points differ depending on the permissions you have in the repo.
 
 * **[If you have write permission] Always work in a separate branch:** Create a new branch with a descriptive name (you can use the issue number as the branch name, e.g. "issue_xxx"). Create your commits in that branch, making the appropriate changes. Please use descriptive commit messages so that everyone can easily understand the changes you made.
 
 * **[If you only have read permission] Fork the project:** Fork the project and work on that copy of the repo, making the desired changes. Please use descriptive commit messages so that everyone can easily understand the changes you made.
 
-* **Open a pull request:** A pull request is compulsory any time a new change is to be added to the core of the project. After solving the issue, create a pull request with your branch. In this pull request, include all the commits made,
-write a good description of the changes and refer to the issue solved to make things easier for the maintainers. Include any additional resources that would be interesting (references, screenshots...). Link the PR with the issue
+* **Review and format your code:** Before submitting your PR, make sure that all the changes are properly formatted. All functions must have their docstring in [Sphinx format](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html); that way, the corresponding documentation will be generated automatically. Regarding code formatting, we use [Black](https://github.com/psf/black).
+
+* **Open a pull request:** A pull request is compulsory any time a new change is to be added to the core of the project. After solving the issue, create a pull request with your branch. In this pull request, include all the commits made, write a good description of the changes, and refer to the issue solved to make things easier for the maintainers. Include any additional resources that would be interesting (references, screenshots...). Link the PR with the issue.
 
 * **Testing and merging pull requests**
-Your pull request will be automatically tested by Travis CI. If any jobs have failed, you should fix them.
-To rerun the automatic builds, just push changes to your branch on GitHub. There is no need to close that pull request and open a new one!
-Once all the builders are "green", one of DetectionMetrics's developers will review your code. The reviewer may ask you to modify your pull request.
-Please provide a timely response to reviewers (within weeks, not months); otherwise, your submission may be postponed or even rejected.
+One of DetectionMetrics's developers will review your code. The reviewer may ask you to modify your pull request. Please provide a timely response to reviewers (within weeks, not months); otherwise, your submission may be postponed or even rejected.
 
 * **[If you have write permission] Don't accept your own pull requests:** Wait for a project maintainer to accept the changes you made. They will probably comment on the pull request with some feedback and consider whether it can be merged into the master branch.
 Be proactive and kind!
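
To illustrate the new formatting requirement, here is a minimal sketch of a Black-formatted function with a Sphinx-style docstring, using the same :param:/:type:/:return:/:rtype: fields that appear throughout the code diffs below. The function itself is hypothetical and purely illustrative:

from typing import Tuple


def scale_size(size: Tuple[int, int], factor: float) -> Tuple[int, int]:
    """Scale a (width, height) size by a constant factor.

    :param size: Input size as (width, height) in pixels
    :type size: Tuple[int, int]
    :param factor: Multiplicative scale factor
    :type factor: float
    :return: Scaled size as (width, height) in pixels
    :rtype: Tuple[int, int]
    """
    # Black enforces details like double quotes and line length automatically
    return int(size[0] * factor), int(size[1] * factor)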

README.md (5 additions & 2 deletions)

@@ -35,7 +35,7 @@ Now, we're excited to introduce ***DetectionMetrics v2***! While retaining the f
 <tr>
 <td rowspan="2">Segmentation</td>
 <td>Image</td>
-<td>Rellis3D, GOOSE, custom GAIA format</td>
+<td>Rellis3D, GOOSE, RUGD, WildScenes, custom GAIA format</td>
 <td>PyTorch, Tensorflow</td>
 </tr>
 <tr>
@@ -97,7 +97,10 @@ If you are using LiDAR, Open3D currently requires `torch==2.2*`.
 As of now, *DetectionMetrics* can either be used as a Python library or as a command-line application.
 
 ### Library
-You can check the `examples` directory for inspiration. If you are using *poetry*, you can run the scripts provided either by activating the created environment using `poetry shell` or directly running `poetry run python examples/<some_python_script.py>`.
+
+🧑‍🏫️ [Image Segmentation Tutorial](https://github.com/JdeRobot/DetectionMetrics/blob/master/examples/tutorial_image_segmentation.ipynb)
+
+You can check the `examples` directory for further inspiration. If you are using *poetry*, you can run the scripts provided either by activating the created environment using `poetry shell` or directly running `poetry run python examples/<some_python_script.py>`.
 
 ### Command-line interface
 DetectionMetrics currently provides a CLI with two commands, `dm_evaluate` and `dm_batch`. Thanks to the configuration in the `pyproject.toml` file, we can simply run `poetry install` from the root directory and use them without explicitly invoking the Python files. More details are provided in the [DetectionMetrics website](https://jderobot.github.io/DetectionMetrics/v2/usage/#command-line-interface).

detectionmetrics/models/tensorflow.py (94 additions & 36 deletions)

@@ -85,41 +85,72 @@ def get_computational_cost(
 
 def resize_image(
     image: tf.Tensor,
-    target_size: Tuple[int, int],
     method: str,
-    keep_aspect: bool = False,
+    width: Optional[int] = None,
+    height: Optional[int] = None,
+    closest_divisor: int = 16,
 ) -> tf.Tensor:
-    """Resize tensorflow image to target size
+    """Resize tensorflow image to target size. If only one dimension is provided, the
+    aspect ratio is preserved.
 
     :param image: Input image tensor
     :type image: tf.Tensor
-    :param target_size: Target size for the image
-    :type target_size: Tuple[int, int]
     :param method: Resizing method (e.g. bilinear, nearest)
     :type method: str
-    :param keep_aspect: Whether to keep aspect ratio when resizing images. If true, resize to match smaller side's size and crop center. Defaults to False
-    :type keep_aspect: bool, optional
+    :param width: Target width for resizing
+    :type width: Optional[int], optional
+    :param height: Target height for resizing
+    :type height: Optional[int], optional
+    :param closest_divisor: Closest divisor for the target size, defaults to 16. Only applies to the dimension not provided.
+    :type closest_divisor: int, optional
    :return: Resized image tensor
    :rtype: tf.Tensor
    """
-    # If keep_aspect is True, resize to match smaller side
-    if keep_aspect:
-        original_size = tf.cast(tf.shape(image)[:2], tf.float32)
-        resize_size = tf.cast(tf.convert_to_tensor(target_size), tf.float32)
-        scale_factor = tf.reduce_max(resize_size / original_size)
-        resize_size = tf.cast(tf.round(original_size * scale_factor), tf.int32)
-    else:
-        resize_size = target_size
+    old_size = tf.cast(tf.shape(image)[:2], tf.float32)
+    old_h = old_size[0]
+    old_w = old_size[1]
+
+    h, w = (old_h, old_w)
+    if width is None:
+        w = int((height / old_h) * old_w)
+        h = height
+    if height is None:
+        h = int((width / old_w) * old_h)
+        w = width
+
+    h = (h / closest_divisor) * closest_divisor
+    w = (w / closest_divisor) * closest_divisor
+    new_size = [int(h), int(w)]
+
+    image = tf_image.resize(
+        images=image, size=tf.cast(new_size, tf.int32), method=method
+    )
 
-    image = tf_image.resize(images=image, size=resize_size, method=method)
+    return image
 
-    # If keep_aspect is True, crop center to match target size
-    if keep_aspect:
-        y0 = (resize_size[0] - target_size[0]) // 2
-        x0 = (resize_size[1] - target_size[1]) // 2
-        image = tf_image.crop_to_bounding_box(
-            image, y0, x0, target_size[1], target_size[0]
-        )
+
+def crop_center(image: tf.Tensor, width: int, height: int) -> tf.Tensor:
+    """Crop tensorflow image center to target size
+
+    :param image: Input image tensor
+    :type image: tf.Tensor
+    :param width: Target width for cropping
+    :type width: int
+    :param height: Target height for cropping
+    :type height: int
+    :return: Cropped image tensor
+    :rtype: tf.Tensor
+    """
+    old_size = tf.cast(tf.shape(image)[:2], tf.float32)
+    old_h = old_size[0]
+    old_w = old_size[1]
+
+    offset_height = int((old_h - height) // 2)
+    offset_width = int((old_w - width) // 2)
+
+    image = tf.image.crop_to_bounding_box(
+        image, offset_height, offset_width, height, width
+    )
 
     return image
 
@@ -129,8 +160,10 @@ class ImageSegmentationTensorflowDataset:
 
    :param dataset: Image segmentation dataset
    :type dataset: ImageSegmentationDataset
-    :param image_size: Image size in pixels (width, height)
-    :type image_size: Tuple[int, int]
+    :param resize: Target size for resizing images, defaults to None
+    :type resize: Optional[Tuple[int, int]], optional
+    :param crop: Target size for center cropping images, defaults to None
+    :type crop: Optional[Tuple[int, int]], optional
    :param batch_size: Batch size, defaults to 1
    :type batch_size: int, optional
    :param splits: Splits to be used from the dataset, defaults to ["test"]
@@ -146,14 +179,16 @@ class ImageSegmentationTensorflowDataset:
     def __init__(
         self,
         dataset: ImageSegmentationDataset,
-        image_size: Tuple[int, int],
+        resize: Optional[Tuple[int, int]] = None,
+        crop: Optional[Tuple[int, int]] = None,
         batch_size: int = 1,
         splits: List[str] = ["test"],
         lut_ontology: Optional[dict] = None,
         normalization: Optional[dict] = None,
         keep_aspect: bool = False,
     ):
-        self.image_size = image_size
+        self.resize = resize
+        self.crop = crop
         self.normalization = None
         if normalization is not None:
             mean = tf.constant(normalization["mean"], dtype=tf.float32)
@@ -211,8 +246,20 @@ def read_image(self, fname: str, label=False) -> tf.Tensor:
         )
 
         # Resize (use NN to avoid interpolation when dealing with labels)
-        method = "nearest" if label else "bilinear"
-        image = resize_image(image, self.image_size, method, self.keep_aspect)
+        if self.resize is not None:
+            method = "nearest" if label else "bilinear"
+            image = resize_image(
+                image,
+                method=method,
+                width=self.resize.get("width", None),
+                height=self.resize.get("height", None),
+            )
+        if self.crop is not None:
+            image = crop_center(
+                image,
+                width=self.crop.get("width", None),
+                height=self.crop.get("height", None),
+            )
 
         # If label, round values to avoid interpolation artifacts
         if label:
@@ -283,18 +330,28 @@ def __init__(
        # Init transformation for input images
        def t_in(image):
            tensor = tf.convert_to_tensor(image)
-            tensor = resize_image(
-                tensor,
-                target_size=self.model_cfg["image_size"],
-                method="bilinear",
-                keep_aspect=self.model_cfg.get("keep_aspect", False),
-            )
+
+            if "resize" in self.model_cfg:
+                tensor = resize_image(
+                    tensor,
+                    method="bilinear",
+                    width=self.model_cfg["resize"].get("width", None),
+                    height=self.model_cfg["resize"].get("height", None),
+                )
+
+            if "crop" in self.model_cfg:
+                tensor = crop_center(
+                    tensor,
+                    width=self.model_cfg["crop"].get("width", None),
+                    height=self.model_cfg["crop"].get("height", None),
+                )
+
            tensor = tf.expand_dims(tensor, axis=0)
            if "normalization" in self.model_cfg:
                mean = tf.constant(self.model_cfg["normalization"]["mean"])
                std = tf.constant(self.model_cfg["normalization"]["std"])
                tensor = tf.cast(tensor, tf.float32) / 255.0
                tensor = (tensor - mean) / std
+
            return tensor
 
        self.t_in = t_in
@@ -366,7 +423,8 @@ def eval(
        # Get Tensorflow dataset
        dataset = ImageSegmentationTensorflowDataset(
            dataset,
-            image_size=self.model_cfg["image_size"],
+            resize=self.model_cfg.get("resize", None),
+            crop=self.model_cfg.get("crop", None),
            batch_size=self.model_cfg.get("batch_size", 1),
            splits=[split] if isinstance(split, str) else split,
            lut_ontology=lut_ontology,
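
The resize_image refactor above replaces the old target_size/keep_aspect pair: callers now pass width and/or height, and when only one is given the other is derived from the aspect ratio and snapped toward a multiple of closest_divisor (useful when a backbone requires input sides divisible by its stride). Below is a minimal standalone sketch of that arithmetic with no TensorFlow dependency; the helper name new_size is ours, not part of the commit, and the rounding follows the PyTorch CustomResize in the next file (the TensorFlow version divides and multiplies without rounding):

from typing import Optional, Tuple


def new_size(
    old_h: int,
    old_w: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
    closest_divisor: int = 16,
) -> Tuple[int, int]:
    # Derive the missing dimension from the aspect ratio
    h, w = old_h, old_w
    if width is None:
        w = (height / old_h) * old_w
        h = height
    if height is None:
        h = (width / old_w) * old_h
        w = width
    # Snap both sides to the nearest multiple of closest_divisor
    h = round(h / closest_divisor) * closest_divisor
    w = round(w / closest_divisor) * closest_divisor
    return int(h), int(w)


# A 1920x1200 image resized to height=512 keeps the 1.6 aspect ratio:
# the raw width 819.2 snaps to 816 (51 * 16).
print(new_size(old_h=1200, old_w=1920, height=512))  # -> (512, 816)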

detectionmetrics/models/torch.py (42 additions & 24 deletions)

@@ -163,39 +163,49 @@ def get_computational_cost(
 
 
 class CustomResize(torch.nn.Module):
-    """Custom rescale transformation for PyTorch
+    """Custom rescale transformation for PyTorch. If only one dimension is provided,
+    the aspect ratio is preserved.
 
-    :param target_size: Target size for the image
-    :type target_size: Tuple[int, int]
-    :param keep_aspect: Flag to keep aspect ratio
-    :type keep_aspect: bool, defaults to False
+    :param width: Target width for resizing
+    :type width: Optional[int], optional
+    :param height: Target height for resizing
+    :type height: Optional[int], optional
    :param interpolation: Interpolation mode for resizing (e.g. NEAREST, BILINEAR)
    :type interpolation: F.InterpolationMode, defaults to F.InterpolationMode.BILINEAR
+    :param closest_divisor: Closest divisor for the target size, defaults to 16. Only applies to the dimension not provided.
+    :type closest_divisor: int, optional
    """
 
    def __init__(
        self,
-        target_size: Tuple[int, int],
-        keep_aspect: bool = False,
+        width: Optional[int] = None,
+        height: Optional[int] = None,
        interpolation: F.InterpolationMode = F.InterpolationMode.BILINEAR,
+        closest_divisor: int = 16,
    ):
        super().__init__()
-        self.target_size = target_size
-        self.keep_aspect = keep_aspect
+        self.width = width
+        self.height = height
        self.interpolation = interpolation
+        self.closest_divisor = closest_divisor
 
    def forward(self, image: Image.Image) -> Image.Image:
-        new_size = self.target_size
-        if self.keep_aspect:
-            h, w = image.size
-            resize_factor = max((self.target_size[0] / h, self.target_size[1] / w))
-            new_size = int(h * resize_factor), int(w * resize_factor)
+        w, h = image.size
+        old_size = (h, w)
 
-        if new_size != image.size:
-            image = F.resize(image, new_size, self.interpolation)
+        if self.width is None:
+            w = int((self.height / image.size[1]) * image.size[0])
+            h = self.height
+        if self.height is None:
+            h = int((self.width / image.size[0]) * image.size[1])
+            w = self.width
+
+        h = round(h / self.closest_divisor) * self.closest_divisor
+        w = round(w / self.closest_divisor) * self.closest_divisor
+        new_size = (h, w)
 
-        if self.keep_aspect:
-            image = F.center_crop(image, self.target_size)
+        if new_size != old_size:
+            image = F.resize(image, new_size, self.interpolation)
 
        return image
 
@@ -376,22 +386,30 @@ def __init__(
        self.transform_input = []
        self.transform_label = []
 
-        if "image_size" in self.model_cfg:
+        if "resize" in self.model_cfg:
            self.transform_input += [
                CustomResize(
-                    tuple(self.model_cfg["image_size"]),
-                    keep_aspect=self.model_cfg.get("keep_aspect", False),
+                    width=self.model_cfg["resize"].get("width", None),
+                    height=self.model_cfg["resize"].get("height", None),
                    interpolation=F.InterpolationMode.BILINEAR,
                )
            ]
            self.transform_label += [
                CustomResize(
-                    tuple(self.model_cfg["image_size"]),
-                    keep_aspect=self.model_cfg.get("keep_aspect", False),
+                    width=self.model_cfg["resize"].get("width", None),
+                    height=self.model_cfg["resize"].get("height", None),
                    interpolation=F.InterpolationMode.NEAREST,
                )
            ]
 
+        if "crop" in self.model_cfg:
+            crop_size = (
+                self.model_cfg["crop"]["height"],
+                self.model_cfg["crop"]["width"],
+            )
+            self.transform_input += [transforms.CenterCrop(crop_size)]
+            self.transform_label += [transforms.CenterCrop(crop_size)]
+
        try:
            self.transform_input += [
                transforms.ToImage(),
@@ -447,7 +465,7 @@ def inference(self, image: Image.Image) -> Image.Image:
                dict(
                    ori_shape=tensor.shape[2:],
                    img_shape=tensor.shape[2:],
-                    pad_shape=image.shape[2:],
+                    pad_shape=tensor.shape[2:],
                    padding_size=[0, 0, 0, 0],
                )
            ]
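
With these changes, both backends read the same pre-processing keys from model_cfg: the old image_size/keep_aspect pair is replaced by independent resize and crop entries with explicit width/height fields. Here is a hedged sketch of a configuration exercising the new keys; only resize, crop, normalization, and batch_size are grounded in this diff, and the concrete values (including the common ImageNet mean/std) are illustrative:

# Illustrative model_cfg for the updated pipeline (values are assumptions).
model_cfg = {
    # Give one dimension to preserve aspect ratio (the other side is derived
    # and snapped to a multiple of 16), or give both explicitly.
    "resize": {"height": 512},
    # Optional center crop applied after resizing.
    "crop": {"width": 504, "height": 504},
    # Per-channel statistics; these are the common ImageNet values.
    "normalization": {
        "mean": [0.485, 0.456, 0.406],
        "std": [0.229, 0.224, 0.225],
    },
    "batch_size": 1,
}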
