Video Object Segmentation by Latent Outcome Regression

This paper presents a novel algorithm for unsupervised video object segmentation (UVOS) in unconstrained scenarios. Although a large variety of methods have been proposed in the literature, segmenting generic objects is still challenging because different methods often perform well in different situ...

Bibliographic Details
Main Authors: Lin Zhang, Yao Lu
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Video object segmentation; latent regression; appearance modelling; unsupervised
Online Access: https://ieeexplore.ieee.org/document/8985334/
_version_ 1818932154159071232
author Lin Zhang
Yao Lu
author_facet Lin Zhang
Yao Lu
author_sort Lin Zhang
collection DOAJ
description This paper presents a novel algorithm for unsupervised video object segmentation (UVOS) in unconstrained scenarios. Although a large variety of methods has been proposed in the literature, segmenting generic objects remains challenging because different methods perform well in different situations, and no single method outperforms the others in all cases. To address this, we propose to solve the UVOS problem in a crowd-sourcing setting. We claim that one can achieve superior results by aggregating the predictions of multiple imperfect methods in a reasonable way. Specifically, we propose a latent regression algorithm for ensemble-based segmentation that jointly labels the pixels in a sequence and learns an adaptive weight for each single method in the ensemble. The pixel labellings provide the outcome (pseudo ground truth) for the regression and thus drive the weight learning, while the learnt weights provide better shape priors for labelling, resulting in more accurate segmentation. In addition, Laplacian regularization is introduced into the regression to stabilize the learning of the weights. The most distinctive feature of our algorithm is that it adaptively learns the contributions of the different single methods for each test sequence, and is thus capable of capturing the advantages of those methods while avoiding their weaknesses. In the experiments, our algorithm is built on 14 non-deep-learning segmentation methods that are based on handcrafted features and require no training data. Experimental results on popular benchmarks show that our algorithm achieves compelling performance, even in comparison with deep learning-based methods. Furthermore, thanks to the adaptive weight learning mechanism, our algorithm achieves good flexibility and usability by choosing the most complementary single methods without losing much performance.
first_indexed 2024-12-20T04:27:58Z
format Article
id doaj.art-51e3127408dc466d9ee1a7fd9128881e
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-20T04:27:58Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-51e3127408dc466d9ee1a7fd9128881e; 2022-12-21T19:53:27Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 30355; 30367; 10.1109/ACCESS.2020.2971964; 8985334; Video Object Segmentation by Latent Outcome Regression; Lin Zhang (0), https://orcid.org/0000-0001-5151-9273; Yao Lu (1); (0) Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, China; (1) Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, China; https://ieeexplore.ieee.org/document/8985334/; Video object segmentation; latent regression; appearance modelling; unsupervised
spellingShingle Lin Zhang
Yao Lu
Video Object Segmentation by Latent Outcome Regression
IEEE Access
Video object segmentation
latent regression
appearance modelling
unsupervised
title Video Object Segmentation by Latent Outcome Regression
title_sort video object segmentation by latent outcome regression
topic Video object segmentation
latent regression
appearance modelling
unsupervised
url https://ieeexplore.ieee.org/document/8985334/
work_keys_str_mv AT linzhang videoobjectsegmentationbylatentoutcomeregression
AT yaolu videoobjectsegmentationbylatentoutcomeregression
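
The description above outlines an alternating scheme: label pixels from the weighted ensemble, then regress the per-method weights against those labels, with Laplacian regularization for stability. The following is only a toy sketch of that idea, not the authors' algorithm. The function names, the 0.5 threshold, the closed-form ridge-style solve, the clipping and normalization of the weights, and the choice to build the Laplacian over methods from the similarity of their predictions are all assumptions made for illustration; the record does not specify any of them. Pixels are flattened into a single axis and the shape priors mentioned in the description are not modelled.

```python
import numpy as np


def method_laplacian(preds):
    """Graph Laplacian over methods (assumed construction, not from the paper).

    Methods whose predictions are similar get a high affinity, so the
    regularizer w^T L w discourages very different weights for them.
    """
    diff = preds[:, None, :] - preds[None, :, :]          # (M, M, N)
    affinity = np.exp(-np.mean(diff ** 2, axis=2))        # (M, M)
    np.fill_diagonal(affinity, 0.0)
    return np.diag(affinity.sum(axis=1)) - affinity


def fuse(preds, lam=0.1, n_iters=10, eps=1e-6):
    """Alternate between pseudo-labelling and weight regression.

    preds: (M, N) array, one row of foreground scores in [0, 1] per method.
    Returns (weights, labels): weights has shape (M,), labels is boolean (N,).
    """
    m, n = preds.shape
    weights = np.full(m, 1.0 / m)                         # start from a plain average
    lap = method_laplacian(preds)
    gram = preds @ preds.T / n
    for _ in range(n_iters):
        # Step 1: the current weighted vote acts as the latent outcome.
        labels = (weights @ preds > 0.5).astype(float)
        # Step 2: Laplacian-regularized least squares for the weights.
        rhs = preds @ labels / n
        w = np.linalg.solve(gram + lam * lap + eps * np.eye(m), rhs)
        # Keep the weights non-negative and summing to one (an assumption).
        w = np.clip(w, 0.0, None)
        weights = w / w.sum() if w.sum() > 0 else np.full(m, 1.0 / m)
    return weights, weights @ preds > 0.5


def make_toy_predictions(seed=0, n=2000):
    """Synthetic data: three methods that are right 90% of the time, two random ones."""
    rng = np.random.default_rng(seed)
    truth = rng.random(n) < 0.5
    good = [np.where(rng.random(n) < 0.9, truth, ~truth) for _ in range(3)]
    bad = [rng.random(n) < 0.5 for _ in range(2)]
    return truth, np.array(good + bad, dtype=float)


if __name__ == "__main__":
    truth, preds = make_toy_predictions()
    weights, labels = fuse(preds)
    print("weights:", np.round(weights, 3))
    print("accuracy:", (labels == truth).mean())
```

On this synthetic input the reliable methods should end up with larger weights than the random ones, which is the behaviour the description attributes to adaptive weight learning; real segmentation masks would also need the spatial and temporal structure that this sketch ignores.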