MoNet : deep motion exploitation for video object segmentation

In this paper, we propose a novel MoNet model to deeply exploit motion cues for boosting video object segmentation performance from two aspects, i.e., frame representation learning and segmentation refinement. Concretely, MoNet exploits computed motion cue (i.e., optical flow) to reinforce the repre...


Bibliographic Details
Main Authors:	Xiao, Huaxin; Feng, Jiashi; Lin, Guosheng; Liu, Yu; Zhang, Maojun
Other Authors:	School of Computer Science and Engineering
Format:	Conference Paper
Language:	English
Published:	2020
Subjects:	Engineering::Computer science and engineering; Motion Segmentation; Feature Extraction
Online Access:	https://hdl.handle.net/10356/143257

author	Xiao, Huaxin Feng, Jiashi Lin, Guosheng Liu, Yu Zhang, Maojun
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Xiao, Huaxin Feng, Jiashi Lin, Guosheng Liu, Yu Zhang, Maojun
author_sort	Xiao, Huaxin
collection	NTU
description	In this paper, we propose a novel MoNet model to deeply exploit motion cues for boosting video object segmentation performance from two aspects, i.e., frame representation learning and segmentation refinement. Concretely, MoNet exploits computed motion cue (i.e., optical flow) to reinforce the representation of the target frame by aligning and integrating representations from its neighbors. The new representation provides valuable temporal contexts for segmentation and improves robustness to various common contaminating factors, e.g., motion blur, appearance variation and deformation of video objects. Moreover, MoNet exploits motion inconsistency and transforms such motion cue into foreground/background prior to eliminate distraction from confusing instances and noisy regions. By introducing a distance transform layer, MoNet can effectively separate motion-inconstant instances/regions and thoroughly refine segmentation results. Integrating the proposed two motion exploitation components with a standard segmentation network, MoNet provides new state-of-the-art performance on three competitive benchmark datasets.
first_indexed	2024-10-01T06:09:01Z
format	Conference Paper
id	ntu-10356/143257
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T06:09:01Z
publishDate	2020
record_format	dspace
spelling	ntu-10356/143257 2020-08-17T05:05:17Z MoNet : deep motion exploitation for video object segmentation Xiao, Huaxin Feng, Jiashi Lin, Guosheng Liu, Yu Zhang, Maojun School of Computer Science and Engineering 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018 CVPR) Engineering::Computer science and engineering Motion Segmentation Feature Extraction In this paper, we propose a novel MoNet model to deeply exploit motion cues for boosting video object segmentation performance from two aspects, i.e., frame representation learning and segmentation refinement. Concretely, MoNet exploits computed motion cue (i.e., optical flow) to reinforce the representation of the target frame by aligning and integrating representations from its neighbors. The new representation provides valuable temporal contexts for segmentation and improves robustness to various common contaminating factors, e.g., motion blur, appearance variation and deformation of video objects. Moreover, MoNet exploits motion inconsistency and transforms such motion cue into foreground/background prior to eliminate distraction from confusing instances and noisy regions. By introducing a distance transform layer, MoNet can effectively separate motion-inconstant instances/regions and thoroughly refine segmentation results. Integrating the proposed two motion exploitation components with a standard segmentation network, MoNet provides new state-of-the-art performance on three competitive benchmark datasets. Ministry of Education (MOE) Accepted version Huaxin Xiao was supported by the China Scholarship Council under Grant 201603170287. Jiashi Feng was partially supported by NUS startup R-263-000-C08-133, MOE Tier-I R-263-000-C21-112, NUS IDS R-263-000-C67-646 and ECRA R-263-000-C87-133. 2020-08-17T05:05:16Z 2020-08-17T05:05:16Z 2018 Conference Paper Xiao, H., Feng, J., Lin, G., Liu, Y. & Zhang, M. (2018). MoNet : deep motion exploitation for video object segmentation. Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018 CVPR). doi:10.1109/CVPR.2018.00125 978-1-5386-6421-6 https://hdl.handle.net/10356/143257 10.1109/CVPR.2018.00125 2-s2.0-85062869824 1140 1148 en © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/CVPR.2018.00125. application/pdf
title	MoNet : deep motion exploitation for video object segmentation
topic	Engineering::Computer science and engineering Motion Segmentation Feature Extraction
url	https://hdl.handle.net/10356/143257

