Summary: | <p>In this thesis, we develop methods that exploit motion as a key cue for detecting visually challenging moving objects. In a video, two categories of motion can be observed: the background motion, usually caused by camera movement, and the motion of the foreground objects present in the scene. We first revisit geometric approaches to motion segmentation by investigating registration to compensate for camera motion. To this end, we model and learn to predict the background transformation between consecutive frames solely from optical flow. We demonstrate that this effectively enhances the moving objects in the registered image difference and improves segmentation performance. To validate our motion segmentation models, we collect a highly challenging dataset of Moving Camouflaged Animals (MoCA), which is, to date, the largest benchmark for video camouflaged object detection.</p>
<p>Second, we explore object-centric approaches to motion segmentation. In the first approach, we investigate a dual architecture that learns both conventional modal segmentation, i.e. segmenting the visible part of the object, and amodal segmentation, i.e. segmenting the object as a whole, including its occluded part. To train our model, we develop a pipeline for creating large-scale synthetic datasets in the optical flow space, simulating various object shapes and motions and incorporating occlusions. We demonstrate that the motion segmentation model, trained from scratch on this synthetic dataset, generalises well to real-world data, such as DAVIS2016, SegTrackV2, and our challenging MoCA, outperforming vision-based and motion-based methods trained on costly annotated real datasets. In the second approach, we learn a layered representation of the optical flow using slot attention in a self-supervised manner.</p>
<p>Finally, we extend our synthetic dataset generation from the optical flow space to the RGB space and train a generative model to artificially create camouflage examples. We further identify the factors that contribute to the effectiveness of a camouflage and propose score functions to quantify how challenging it is. We demonstrate that enforcing camouflage effectiveness while training the generative model improves the quality of the generated camouflage dataset, which, when used as a training set, boosts the performance of the motion segmentation model on the challenging MoCA benchmark.</p>