Image segmentation with minimal human supervision

Bibliographic Details
Author: Shin, G
Other Authors: Xie, W
Format: Thesis
Language: English
Published: 2024
Subjects:
Description

Summary: Image segmentation is an indispensable part of computer vision due to its interpretative nature. However, training a robust segmenter presents significant challenges because of the costly labour required to acquire pixel-level labels. This thesis explores various methodologies for learning strong visual representations for segmentation with minimal or no manual supervision.

The thesis is divided into three parts:
(i) Efficient Weakly-supervised Learning for Semantic Segmentation;
(ii) Semantic and Instance Segmentation without Manual Annotations; and
(iii) Self-supervised Learning for Class-agnostic Image Segmentation.

In the first part, we demonstrate that accounting for a model's uncertainty in its predictions can significantly reduce the number of annotations required to train a segmenter that performs nearly as well as a fully-supervised one. This finding underscores the critical role of pixels around semantic boundaries in helping the model learn representations suitable for segmentation.

In the second part, we propose a straightforward retrieve-and-co-segment mechanism for semantic segmentation, which mimics how humans learn the visual features of new object categories and enables a segmenter to classify an object's concept without manual supervision. We also illustrate how this mechanism can be enhanced by substituting the co-segmentation component with a more advanced object segmenter, and how it can be extended to instance segmentation.

For the final part, we demonstrate how the strong visual correspondences provided by modern self-supervised models can be effectively utilised to segment an object without any prior knowledge of the object's concept.
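The uncertainty-driven annotation strategy described in the first part can be illustrated with a minimal sketch: rank pixels by the entropy of the model's predicted class distribution and send only the most uncertain ones (often those near semantic boundaries) to an annotator. The function names and the fixed annotation budget below are illustrative assumptions, not the thesis's actual implementation.

```python
import math

def pixel_entropy(probs):
    """Shannon entropy of a per-pixel class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_pixels_to_annotate(prob_map, budget):
    """Return (row, col) coordinates of the `budget` most uncertain
    pixels, ranked by predictive entropy (highest first).

    prob_map: 2D grid where each cell holds a softmax distribution
              over classes, e.g. [0.7, 0.2, 0.1].
    """
    scored = [
        (pixel_entropy(probs), r, c)
        for r, row in enumerate(prob_map)
        for c, probs in enumerate(row)
    ]
    scored.sort(reverse=True)  # highest entropy first
    return [(r, c) for _, r, c in scored[:budget]]

# A confident interior pixel vs. ambiguous boundary-like pixels:
prob_map = [
    [[0.98, 0.01, 0.01], [0.50, 0.45, 0.05]],
    [[0.90, 0.05, 0.05], [0.34, 0.33, 0.33]],
]
print(select_pixels_to_annotate(prob_map, 2))  # → [(1, 1), (0, 1)]
```

Near-uniform distributions (like the last pixel above) score highest, so the annotation budget concentrates on exactly the ambiguous regions the abstract highlights.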
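The retrieval step of the retrieve-and-co-segment mechanism from the second part can likewise be sketched: given a global descriptor for a query image, fetch the gallery images with the most similar descriptors, on the assumption that they depict the same object category and can be co-segmented. Cosine similarity and the toy three-dimensional features here are illustrative; they stand in for whatever descriptors and metric the thesis actually uses.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve_neighbours(query_feat, gallery, k):
    """Return ids of the k gallery images whose descriptors are
    most similar to the query's, best match first.

    gallery: list of (image_id, feature_vector) pairs.
    """
    ranked = sorted(
        gallery,
        key=lambda item: cosine_similarity(query_feat, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:k]]

# Toy gallery: two dog-like descriptors, one cat-like descriptor.
gallery = [
    ("dog_1", [0.9, 0.1, 0.0]),
    ("cat_1", [0.1, 0.9, 0.1]),
    ("dog_2", [0.8, 0.2, 0.1]),
]
print(retrieve_neighbours([1.0, 0.0, 0.0], gallery, 2))  # → ['dog_1', 'dog_2']
```

The retrieved set then feeds a co-segmentation (or, as the abstract notes, a stronger standalone object segmenter) to localise the shared object in every image.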