U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation
Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in the scene, regardless of whether the object has a predefined semanti...
Main Authors: | Chenjie Wang, Chengyuan Li, Jun Liu, Bin Luo, Xin Su, Yajun Wang, Yan Gao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-12-01 |
Series: | Remote Sensing |
Subjects: | moving object segmentation; octave convolution; nested U-structure; hierarchical supervision |
Online Access: | https://www.mdpi.com/2072-4292/13/1/60 |
_version_ | 1797543445099708416 |
---|---|
author | Chenjie Wang Chengyuan Li Jun Liu Bin Luo Xin Su Yajun Wang Yan Gao |
author_facet | Chenjie Wang Chengyuan Li Jun Liu Bin Luo Xin Su Yajun Wang Yan Gao |
author_sort | Chenjie Wang |
collection | DOAJ |
description | Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in the scene, regardless of whether the object has a predefined semantic label, we propose a two-level nested octave U-structure network with a multi-scale attention mechanism, called U<sup>2</sup>-ONet. U<sup>2</sup>-ONet takes two RGB frames, the optical flow between these frames, and the instance segmentation of the frames as inputs. Each stage of U<sup>2</sup>-ONet is filled with the newly designed octave residual U-block (ORSU block) to enhance the ability to obtain more contextual information at different scales while reducing the spatial redundancy of the feature maps. In order to efficiently train the multi-scale deep network, we introduce a hierarchical training supervision strategy that calculates the loss at each level while adding knowledge-matching loss to keep the optimization consistent. The experimental results show that the proposed U<sup>2</sup>-ONet method can achieve a state-of-the-art performance in several general moving object segmentation datasets. |
first_indexed | 2024-03-10T13:45:39Z |
format | Article |
id | doaj.art-8cbadc47bc5141b3a8dca97759d57808 |
institution | Directory of Open Access Journals |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-10T13:45:39Z |
publishDate | 2020-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-8cbadc47bc5141b3a8dca97759d57808; 2023-11-21T02:37:52Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-12-01; 13; 1; 60; 10.3390/rs13010060; U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation; Chenjie Wang (0), Chengyuan Li (1), Jun Liu (2), Bin Luo (3), Xin Su (4), Yajun Wang (5), Yan Gao (6); (0) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (1) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (2) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (3) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (4) School of Remote Sensing and Information Engineering, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (5) State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China; (6) Zhuhai Da Hengqin Science and Technology Development Co., Ltd., Unit 1, 33 Haihe Street, Hengqin New Area, Zhuhai 519031, China; Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in the scene, regardless of whether the object has a predefined semantic label, we propose a two-level nested octave U-structure network with a multi-scale attention mechanism, called U<sup>2</sup>-ONet. U<sup>2</sup>-ONet takes two RGB frames, the optical flow between these frames, and the instance segmentation of the frames as inputs. Each stage of U<sup>2</sup>-ONet is filled with the newly designed octave residual U-block (ORSU block) to enhance the ability to obtain more contextual information at different scales while reducing the spatial redundancy of the feature maps. In order to efficiently train the multi-scale deep network, we introduce a hierarchical training supervision strategy that calculates the loss at each level while adding knowledge-matching loss to keep the optimization consistent. The experimental results show that the proposed U<sup>2</sup>-ONet method can achieve a state-of-the-art performance in several general moving object segmentation datasets.; https://www.mdpi.com/2072-4292/13/1/60; moving object segmentation; octave convolution; nested U-structure; hierarchical supervision |
spellingShingle | Chenjie Wang Chengyuan Li Jun Liu Bin Luo Xin Su Yajun Wang Yan Gao U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation Remote Sensing moving object segmentation octave convolution nested U-structure hierarchical supervision |
title | U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation |
title_full | U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation |
title_fullStr | U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation |
title_full_unstemmed | U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation |
title_short | U<sup>2</sup>-ONet: A Two-Level Nested Octave U-Structure Network with a Multi-Scale Attention Mechanism for Moving Object Segmentation |
title_sort | u sup 2 sup onet a two level nested octave u structure network with a multi scale attention mechanism for moving object segmentation |
topic | moving object segmentation octave convolution nested U-structure hierarchical supervision |
url | https://www.mdpi.com/2072-4292/13/1/60 |