Pop‐net: A self‐growth network for popping out the salient object in videos

Bibliographic Details
Main Authors: Hui Yin, Ning Chen, Lin Yang, Jin Wan
Format: Article
Language: English
Published: Wiley 2021-08-01
Series: IET Computer Vision
Online Access: https://doi.org/10.1049/cvi2.12032
Description
Summary: Abstract Unsupervised video segmentation without any object annotation or prior knowledge is a major challenge. In this article, we formulate a completely unsupervised video object segmentation network, called Pop‐Net, which pops out the most salient object in an input video by self‐growth. Specifically, we introduce a novel self‐growth strategy that helps a base segmentation network gradually grow to make the salient object stand out as the video progresses. To solve the sample generation problem for the unsupervised method, we propose a sample generation module that fuses appearance and motion saliency. Furthermore, the proposed sample optimization module refines the samples using contour constraints at each self‐growth step. Experimental results on several datasets (DAVIS, DAVSOD, VideoSD, SegTrack‐v2) show the effectiveness of the proposed method. In particular, the method achieves state‐of‐the‐art performance on completely unfamiliar datasets (i.e. datasets on which it was not fine‐tuned).
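The sample generation module described in the abstract fuses appearance and motion saliency into pseudo-labels for the self-growth steps. A minimal sketch of such a fusion is shown below; the weighted combination, the normalisation, and the mean-based thresholding are illustrative assumptions, not the authors' actual formulation, and `fuse_saliency` is a hypothetical helper name.

```python
import numpy as np

def fuse_saliency(appearance, motion, alpha=0.5):
    """Hypothetical fusion of an appearance saliency map and a motion
    saliency map into a binary pseudo-label mask.

    alpha weights the appearance cue against the motion cue; the value
    0.5 is an assumption for illustration only.
    """
    def norm(m):
        # Normalise a map to [0, 1] so the two cues are comparable.
        m = m.astype(np.float64)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    fused = alpha * norm(appearance) + (1.0 - alpha) * norm(motion)
    # Threshold at the mean to binarise into a pseudo-label mask that a
    # base segmentation network could be trained on.
    return (fused >= fused.mean()).astype(np.uint8)
```

In a self-growth loop along the lines the abstract sketches, such masks would serve as training targets for the next growth step, with the paper's sample optimization module further refining them via contour constraints.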
ISSN: 1751-9632, 1751-9640