U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation

Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in the scene, regardless of whether the object has a predefined semanti...

Bibliographic Details
Main Authors:	Chenjie Wang, Chengyuan Li, Jun Liu, Bin Luo, Xin Su, Yajun Wang, Yan Gao
Format:	Article
Language:	English
Published:	MDPI AG 2020-12-01
Series:	Remote Sensing
Subjects:	moving object segmentation, octave convolution, nested U-structure, hierarchical supervision
Online Access:	https://www.mdpi.com/2072-4292/13/1/60

_version_	1797543445099708416
author	Chenjie Wang Chengyuan Li Jun Liu Bin Luo Xin Su Yajun Wang Yan Gao
author_facet	Chenjie Wang Chengyuan Li Jun Liu Bin Luo Xin Su Yajun Wang Yan Gao
author_sort	Chenjie Wang
collection	DOAJ
description	Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in the scene, regardless of whether the object has a predefined semantic label, we propose a two-level nested octave U-structure network with a multi-scale attention mechanism, called U<sup>2</sup>-ONet. U<sup>2</sup>-ONet takes two RGB frames, the optical flow between these frames, and the instance segmentation of the frames as inputs. Each stage of U<sup>2</sup>-ONet is filled with the newly designed octave residual U-block (ORSU block) to enhance the ability to obtain more contextual information at different scales while reducing the spatial redundancy of the feature maps. In order to efficiently train the multi-scale deep network, we introduce a hierarchical training supervision strategy that calculates the loss at each level while adding knowledge-matching loss to keep the optimization consistent. The experimental results show that the proposed U<sup>2</sup>-ONet method can achieve a state-of-the-art performance in several general moving object segmentation datasets.
first_indexed	2024-03-10T13:45:39Z
format	Article
id	doaj.art-8cbadc47bc5141b3a8dca97759d57808
institution	Directory of Open Access Journals
issn	2072-4292
language	English
last_indexed	2024-03-10T13:45:39Z
publishDate	2020-12-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-8cbadc47bc5141b3a8dca97759d57808; 2023-11-21T02:37:52Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-12-01; 13; 1; 60; 10.3390/rs13010060; U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation; Chenjie Wang (0), Chengyuan Li (1), Jun Liu (2), Bin Luo (3), Xin Su (4), Yajun Wang (5), Yan Gao (6); (0) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (1) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (2) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (3) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (4) School of Remote Sensing and Information Engineering, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (5) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (6) Zhuhai Da Hengqin Science and Technology Development Co., Ltd., Unit 1, 33 Haihe Street, Hengqin New Area, Zhuhai 519031, China; Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in the scene, regardless of whether the object has a predefined semantic label, we propose a two-level nested octave U-structure network with a multi-scale attention mechanism, called U<sup>2</sup>-ONet. U<sup>2</sup>-ONet takes two RGB frames, the optical flow between these frames, and the instance segmentation of the frames as inputs. Each stage of U<sup>2</sup>-ONet is filled with the newly designed octave residual U-block (ORSU block) to enhance the ability to obtain more contextual information at different scales while reducing the spatial redundancy of the feature maps. In order to efficiently train the multi-scale deep network, we introduce a hierarchical training supervision strategy that calculates the loss at each level while adding knowledge-matching loss to keep the optimization consistent. The experimental results show that the proposed U<sup>2</sup>-ONet method can achieve a state-of-the-art performance in several general moving object segmentation datasets.; https://www.mdpi.com/2072-4292/13/1/60; moving object segmentation, octave convolution, nested U-structure, hierarchical supervision
spellingShingle	Chenjie Wang Chengyuan Li Jun Liu Bin Luo Xin Su Yajun Wang Yan Gao U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation Remote Sensing moving object segmentation octave convolution nested U-structure hierarchical supervision
title	U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation
title_sort	u sup 2 sup onet a two level nested octave u structure network with a multi scale attention mechanism for moving object segmentation
topic	moving object segmentation, octave convolution, nested U-structure, hierarchical supervision
url	https://www.mdpi.com/2072-4292/13/1/60

Similar Items