
Commit 1650000

Remove segments (#3164)
1 parent fbd2ec7 commit 1650000

File tree

1 file changed: +5 −108 lines changed

fine-tune-segformer.md

Lines changed: 5 additions & 108 deletions
@@ -16,7 +16,7 @@ authors:
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

-**This guide shows how you can fine-tune Segformer, a state-of-the-art semantic segmentation model. Our goal is to build a model for a pizza delivery robot, so it can see where to drive and recognize obstacles 🍕🤖. We'll first label a set of sidewalk images on [Segments.ai](https://segments.ai?utm_source=hf&utm_medium=colab&utm_campaign=sem_seg). Then we'll fine-tune a pre-trained SegFormer model by using [`🤗 transformers`](https://huggingface.co/transformers), an open-source library that offers easy-to-use implementations of state-of-the-art models. Along the way, you'll learn how to work with the Hugging Face Hub, the largest open-source catalog of models and datasets.**
+**This guide shows how you can fine-tune SegFormer, a state-of-the-art semantic segmentation model. Our goal is to build a model for a pizza delivery robot, so it can see where to drive and recognize obstacles 🍕🤖. We'll first use an available segmentation dataset from the 🤗 Hub. Then we'll fine-tune a pre-trained SegFormer model by using [`🤗 transformers`](https://huggingface.co/transformers), an open-source library that offers easy-to-use implementations of state-of-the-art models. Along the way, you'll learn how to work with the Hugging Face Hub, the largest open-source catalog of models and datasets.**

Semantic segmentation is the task of classifying each pixel in an image. You can see it as a more precise way of classifying an image. It has a wide range of use cases in fields such as medical imaging and autonomous driving. For example, for our pizza delivery robot, it is important to know exactly where the sidewalk is in an image, not just whether there is a sidewalk or not.

@@ -41,118 +41,15 @@ huggingface-cli login

## 1. Create/choose a dataset

-The first step in any ML project is assembling a good dataset. In order to train a semantic segmentation model, we need a dataset with semantic segmentation labels. We can either use an existing dataset from the Hugging Face Hub, such as [ADE20k](https://huggingface.co/datasets/scene_parse_150), or create our own dataset.
+The first step in any ML project is assembling a good dataset. In order to train a semantic segmentation model, we need a dataset with semantic segmentation labels. We can either use an existing dataset from the Hugging Face Hub, such as [ADE20k](https://huggingface.co/datasets/scene_parse_150), or create our own dataset by annotating images with corresponding segmentation maps.
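
For instance, pulling an existing dataset like ADE20k off the Hub is a one-liner (a minimal sketch; `scene_parse_150` is the dataset id linked above):

```python
from datasets import load_dataset

# Download the ADE20k scene parsing dataset from the Hugging Face Hub
ade20k = load_dataset("scene_parse_150", split="train")
```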

For our pizza delivery robot, we could use an existing autonomous driving dataset such as [CityScapes](https://www.cityscapes-dataset.com/) or [BDD100K](https://bdd100k.com/). However, these datasets were captured by cars driving on the road. Since our delivery robot will be driving on the sidewalk, there will be a mismatch between the images in these datasets and the data our robot will see in the real world.

-We don't want our delivery robot to get confused, so we'll create our own semantic segmentation dataset using images captured on sidewalks. We'll show how you can label the images we captured in the next steps. If you just want to use our finished, labeled dataset, you can skip the ["Create your own dataset"](#create-your-own-dataset) section and continue from ["Use a dataset from the Hub"](#use-a-dataset-from-the-hub).
-
-### Create your own dataset
-
-To create your semantic segmentation dataset, you'll need two things:
-
-1. images covering the situations your model will encounter in the real world
-2. segmentation labels, i.e. images where each pixel represents a class/category.
-
-We went ahead and captured a thousand images of sidewalks in Belgium. Collecting and labeling such a dataset can take a long time, so you can start with a smaller dataset and expand it if the model does not perform well enough.
-
-<figure class="image table text-center m-0 w-full">
-  <medium-zoom background="rgba(0,0,0,.7)" alt="Example images from the sidewalk dataset" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/sidewalk-examples.png"></medium-zoom>
-  <figcaption>Some examples of the raw images in the sidewalk dataset.</figcaption>
-</figure>
-
-To obtain segmentation labels, we need to indicate the classes of all the regions/objects in these images. This can be a time-consuming endeavour, but using the right tools can speed up the task significantly. For labeling, we'll use [Segments.ai](https://segments.ai?utm_source=hf&utm_medium=colab&utm_campaign=sem_seg), since it has smart labeling tools for image segmentation and an easy-to-use Python SDK.
-
-#### Set up the labeling task on Segments.ai
-
-First, create an account at [https://segments.ai/join](https://segments.ai/join?utm_source=hf&utm_medium=colab&utm_campaign=sem_seg).
-Next, create a new dataset and upload your images. You can either do this from the web interface or via the Python SDK (see the [notebook](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/56_fine_tune_segformer.ipynb)).
-
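
For the SDK route, uploading an image could look roughly like this (a sketch against the `segments` Python package; the API key, file name, and dataset identifier below are placeholders — see the notebook for the exact flow):

```python
from segments import SegmentsClient

# Authenticate with your Segments.ai API key (placeholder)
client = SegmentsClient("YOUR_API_KEY")

# Upload a local image as an asset, then register it as a sample
with open("sidewalk_0001.jpg", "rb") as f:
    asset = client.upload_asset(f, filename="sidewalk_0001.jpg")

client.add_sample(
    "your_username/sidewalk-dataset",  # hypothetical dataset identifier
    name="sidewalk_0001.jpg",
    attributes={"image": {"url": asset.url}},
)
```
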
-#### Label the images
-
-Now that the raw data is loaded, go to [segments.ai/home](https://segments.ai/home) and open the newly created dataset. Click "Start labeling" and create segmentation masks. You can use the ML-powered superpixel and autosegment tools to label faster.
-
-<figure class="image table text-center m-0">
-  <video
-    alt="Labeling a sidewalk image on Segments.ai"
-    style="max-width: 70%; margin: auto;"
-    autoplay loop autobuffer muted playsinline
-  >
-  <source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/sidewalk-labeling-crop.mp4" poster="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/56_fine_tune_segformer/sidewalk-labeling-crop-poster.png" type="video/mp4">
-  </video>
-  <figcaption>Tip: when using the superpixel tool, scroll to change the superpixel size, and click and drag to select segments.</figcaption>
-</figure>
-
-#### Push the result to the Hugging Face Hub
-
-When you're done labeling, create a new dataset release containing the labeled data. You can either do this on the releases tab on Segments.ai, or programmatically through the SDK as shown in the notebook.
-
-Note that creating the release can take a few seconds. You can check the releases tab on Segments.ai to see whether your release is still being created.
-
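
Programmatically, creating the release is a single call (a sketch; the release name and description are placeholders):

```python
# Snapshot the current labels as a release (names are placeholders)
release = segments_client.add_release(
    dataset_identifier,
    "v0.1",
    description="First labeled batch of sidewalk images",
)
```
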
-Now, we'll convert the release to a [Hugging Face dataset](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset) via the Segments.ai Python SDK. If you haven't set up the Segments Python client yet, follow the instructions in the "Set up the labeling task on Segments.ai" section of the [notebook](https://colab.research.google.com/github/huggingface/blog/blob/main/notebooks/56_fine_tune_segformer.ipynb#scrollTo=9T2Jr9t9y4HD).
-
-*Note that the conversion can take a while, depending on the size of your dataset.*
-
-```python
-from segments.huggingface import release2dataset
-
-release = segments_client.get_release(dataset_identifier, release_name)
-hf_dataset = release2dataset(release)
-```
-
-If we inspect the features of the new dataset, we can see the image column and the corresponding label. The label consists of two parts: a list of annotations and a segmentation bitmap. The annotations correspond to the different objects in the image. For each object, the annotation contains an `id` and a `category_id`. The segmentation bitmap is an image where each pixel contains the `id` of the object at that pixel. More information can be found in the [relevant docs](https://docs.segments.ai/reference/sample-and-label-types/label-types#segmentation-labels).
-
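
For example, you can peek at the schema and one sample (a sketch; the dotted feature names come from the conversion above):

```python
# Inspect the dataset schema and the annotations of the first sample
print(hf_dataset.features)
print(hf_dataset[0]["label.annotations"][:3])  # e.g. [{'id': 1, 'category_id': 3}, ...]
```
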
-For semantic segmentation, we need a semantic bitmap that contains a `category_id` for each pixel. We'll use the `get_semantic_bitmap` function from the Segments.ai SDK to convert the bitmaps to semantic bitmaps. To apply this function to all the rows in our dataset, we'll use [`dataset.map`](https://huggingface.co/docs/datasets/package_reference/main_classes#datasets.Dataset.map).
-
-```python
-from segments.utils import get_semantic_bitmap
-
-def convert_segmentation_bitmap(example):
-    return {
-        "label.segmentation_bitmap":
-            get_semantic_bitmap(
-                example["label.segmentation_bitmap"],
-                example["label.annotations"],
-                id_increment=0,
-            )
-    }
-
-semantic_dataset = hf_dataset.map(
-    convert_segmentation_bitmap,
-)
-```
-
-You can also rewrite the `convert_segmentation_bitmap` function to use batches and pass `batched=True` to `dataset.map`. This will significantly speed up the mapping, but you might need to tweak the `batch_size` to ensure the process doesn't run out of memory.
-
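
A batched variant could look like this (a sketch; the `batch_size` value is an assumption to tune against your memory budget):

```python
def convert_segmentation_bitmap_batch(examples):
    # Convert a whole batch of instance bitmaps to semantic bitmaps at once
    return {
        "label.segmentation_bitmap": [
            get_semantic_bitmap(bitmap, annotations, id_increment=0)
            for bitmap, annotations in zip(
                examples["label.segmentation_bitmap"],
                examples["label.annotations"],
            )
        ]
    }

semantic_dataset = hf_dataset.map(
    convert_segmentation_bitmap_batch,
    batched=True,
    batch_size=16,  # assumption: lower this if you run out of memory
)
```
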
-The SegFormer model we're going to fine-tune later expects specific names for the features. For convenience, we'll match this format now. Thus, we'll rename the `image` feature to `pixel_values` and the `label.segmentation_bitmap` to `label` and discard the other features.
-
-```python
-semantic_dataset = semantic_dataset.rename_column('image', 'pixel_values')
-semantic_dataset = semantic_dataset.rename_column('label.segmentation_bitmap', 'label')
-semantic_dataset = semantic_dataset.remove_columns(['name', 'uuid', 'status', 'label.annotations'])
-```
-
-We can now push the transformed dataset to the Hugging Face Hub. That way, your team and the Hugging Face community can make use of it. In the next section, we'll see how you can load the dataset from the Hub.
-
-```python
-hf_dataset_identifier = f"{hf_username}/{dataset_name}"
-
-semantic_dataset.push_to_hub(hf_dataset_identifier)
-```
+We don't want our delivery robot to get confused, so we have created our own semantic segmentation dataset using images captured on sidewalks. It's available at [segments/sidewalk-semantic](https://huggingface.co/datasets/segments/sidewalk-semantic). Such a dataset can be created using annotation platforms like [CVAT](https://www.cvat.ai/) or [Segments.ai](https://segments.ai/).

### Use a dataset from the Hub

-If you don't want to create your own dataset, but found a suitable dataset for your use case on the Hugging Face Hub, you can define the identifier here.
-
-For example, you can use the full labeled sidewalk dataset. Note that you can check out the examples [directly in your browser](https://huggingface.co/datasets/segments/sidewalk-semantic).
+We'll load the full labeled sidewalk dataset here. Note that you can check out the examples [directly in your browser](https://huggingface.co/datasets/segments/sidewalk-semantic).

```python
hf_dataset_identifier = "segments/sidewalk-semantic"
```
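
From here, loading and splitting the dataset is the usual 🤗 datasets flow (a sketch; the shuffle seed and test size are assumptions):

```python
from datasets import load_dataset

# Download the labeled sidewalk dataset from the Hub
ds = load_dataset(hf_dataset_identifier)

# Hold out a test split for evaluation
ds = ds.shuffle(seed=1)
ds = ds["train"].train_test_split(test_size=0.2)
```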
@@ -449,7 +346,7 @@ We introduced you to some useful tools along the way, such as:

* [Segments.ai](https://segments.ai) for labeling your data
-* [🤗 datasets](https://huggingface.co/docs/datasets/) for creating and sharing a dataset
+* [🤗 datasets](https://huggingface.co/docs/datasets/) for loading and sharing a dataset
* [🤗 transformers](https://huggingface.co/transformers) for easily fine-tuning a state-of-the-art segmentation model
* [Hugging Face Hub](https://huggingface.co/docs/hub/main) for sharing our dataset and model, and for creating an inference widget for our model
