Exploring New Backbone and Attention Module for Semantic Segmentation in Street Scenes

Bibliographic Details
Main Authors: Lei Fan, Wei-Chien Wang, Fuyuan Zha, Jiapeng Yan
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Semantic segmentation, segmentation backbone, attention mechanism, street scenes
Online Access: https://ieeexplore.ieee.org/document/8531594/
_version_ 1819133500450668544
author Lei Fan
Wei-Chien Wang
Fuyuan Zha
Jiapeng Yan
author_sort Lei Fan
collection DOAJ
description Semantic segmentation, as a dense pixel-wise classification task, plays an important role in scene understanding. There are two main challenges in many state-of-the-art works: 1) the backbones of segmentation models are often taken from pretrained classification models and perform poorly on small categories because they lack spatial information, and 2) the gap between high-level and low-level features when they are combined in segmentation models leads to inaccurate predictions. To handle these challenges, in this paper we propose a new tailored backbone and an attention select module for segmentation tasks. Specifically, the new backbone is modified from the original ResNet and yields better segmentation performance. The attention select module employs spatial and channel self-attention mechanisms to reinforce the propagation of contextual features, aggregating semantic and spatial information simultaneously. Based on the new backbone and attention select module, we further propose a segmentation model for street-scene understanding. We conduct a series of ablation studies on two public benchmarks, the Cityscapes and CamVid datasets, to demonstrate the effectiveness of our proposals. Our model achieves a mIoU score of 71.5% on the Cityscapes test set using only the fine annotation data and 60.1% on the CamVid test set.
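Note on the description above: this record gives only a high-level account of the attention select module (spatial and channel self-attention used to fuse semantic and spatial features); the exact design is in the linked paper. As a rough orientation only, a minimal PyTorch sketch of such a module follows. The class names, the squeeze-and-excitation-style channel branch, the non-local-style spatial branch, and the additive fusion of low-level and high-level features are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Reweights channels using global average pooling (squeeze-and-excitation style).
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    # Non-local-style self-attention over spatial positions.
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # B x HW x C'
        k = self.key(x).view(b, -1, h * w)                      # B x C' x HW
        attn = torch.softmax(torch.bmm(q, k), dim=-1)           # B x HW x HW affinities
        v = self.value(x).view(b, -1, h * w)                    # B x C x HW
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x

class AttentionSelectModule(nn.Module):
    # Hypothetical fusion of a low-level (spatially detailed) and a high-level
    # (semantically rich) feature map; assumes both already share shape and channels.
    def __init__(self, channels):
        super().__init__()
        self.channel_attn = ChannelAttention(channels)
        self.spatial_attn = SpatialAttention(channels)

    def forward(self, low_feat, high_feat):
        fused = low_feat + high_feat
        return self.spatial_attn(self.channel_attn(fused))

# Example: fuse two 64-channel feature maps of size 64x128.
# asm = AttentionSelectModule(64)
# out = asm(torch.randn(1, 64, 64, 128), torch.randn(1, 64, 64, 128))

The reported mIoU scores (71.5% on the Cityscapes test set, 60.1% on CamVid) refer to the standard metric: per-class intersection over union, TP / (TP + FP + FN), averaged over all classes.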
first_indexed 2024-12-22T09:48:17Z
format Article
id doaj.art-f8bcf876de0e481183632296649f28ed
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T09:48:17Z
publishDate 2018-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-f8bcf876de0e481183632296649f28ed (2022-12-21T18:30:29Z)
doi 10.1109/ACCESS.2018.2880877
article_number 8531594
volume 6
pages 71566-71580
author_affiliations:
Lei Fan (https://orcid.org/0000-0001-9472-7152): School of Electrical Engineering and Automation, Hefei University of Technology, Hefei, China
Wei-Chien Wang: Science and Engineering Faculty, Queensland University of Technology, Brisbane, QLD, Australia
Fuyuan Zha: School of Electrical Engineering and Automation, Hefei University of Technology, Hefei, China
Jiapeng Yan: School of Electrical Engineering and Automation, Hefei University of Technology, Hefei, China
title Exploring New Backbone and Attention Module for Semantic Segmentation in Street Scenes
topic Semantic segmentation
segmentation backbone
attention mechanism
street scenes
url https://ieeexplore.ieee.org/document/8531594/