Abstract: <p>Image segmentation is an indispensable part of computer vision owing to its interpretability. However, training a robust segmenter is challenging because of the costly labour required to acquire pixel-level labels. This thesis explores methodologies for learning strong visual representations for segmentation with minimal or no manual supervision.</p>
<br>
<p>This thesis is divided into three parts:<br>
(i) Efficient Weakly-supervised Learning for Semantic Segmentation;<br>
(ii) Semantic and Instance Segmentation without Manual Annotations; and<br>
(iii) Self-supervised Learning for Class-agnostic Image Segmentation.</p>
<br>
<p>In the first part, we demonstrate that accounting for a model's predictive uncertainty can significantly reduce the number of annotations required to train a segmenter that performs nearly as well as a fully-supervised one. This finding underscores the critical role of pixels near semantic boundaries in helping the model learn representations suited to segmentation.</p>
<br>
<p>In the second part, we propose a straightforward retrieval-and-co-segmentation mechanism for semantic segmentation, which mimics how humans learn the visual features of new object categories and enables a segmenter to label objects with their categories without manual supervision. We also show how this mechanism can be strengthened by replacing the co-segmentation component with a more advanced object segmenter, and how it can be extended to instance segmentation.</p>
<br>
<p>In the final part, we demonstrate how the strong visual correspondences provided by modern self-supervised models can be effectively exploited to segment an object without any prior knowledge of its category.</p>