Depth-Wise Asymmetric Bottleneck With Point-Wise Aggregation Decoder for Real-Time Semantic Segmentation in Urban Scenes

Semantic segmentation is a process of linking each pixel in an image to a class label, and is widely used in the field of autonomous vehicles and robotics. Although deep learning methods have already made great progress for semantic segmentation, they either achieve great results with numerous param...

Bibliographic Details
Main Authors: Gen Li, Shenlu Jiang, Inyong Yun, Jonghyun Kim, Joongkyu Kim
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
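Subjects: Real-time semantic segmentation; encoder-decoder network; convolutional neural network; urban scenes; lightweight network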
Online Access: https://ieeexplore.ieee.org/document/8984359/
author Gen Li
Shenlu Jiang
Inyong Yun
Jonghyun Kim
Joongkyu Kim
author_sort Gen Li
collection DOAJ
description Semantic segmentation is the process of assigning a class label to each pixel in an image, and it is widely used in autonomous vehicles and robotics. Although deep learning methods have made great progress in semantic segmentation, they either achieve strong results with numerous parameters or use lightweight designs that heavily sacrifice segmentation accuracy. Because of the strict requirements of real-world applications, it is critical to design an effective real-time model with both competitive segmentation accuracy and small model capacity. In this paper, we propose a lightweight network named DABNet, which employs Depth-wise Asymmetric Bottleneck (DAB) and Point-wise Aggregation Decoder (PAD) modules to tackle the challenging task of real-time semantic segmentation in urban scenes. Specifically, the DAB module creates a sufficient receptive field and densely utilizes contextual information, and the PAD module aggregates feature maps of different scales to optimize performance through an attention mechanism. Compared with existing methods, our network substantially reduces the number of parameters while still achieving high accuracy with real-time inference. Extensive ablation experiments on two challenging urban scene datasets (Cityscapes and CamVid) demonstrate the effectiveness of the proposed approach for real-time semantic segmentation.
first_indexed 2024-12-23T23:39:22Z
format Article
id doaj.art-1332b954b6374aa68514bdf600b19498
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-23T23:39:22Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-1332b954b6374aa68514bdf600b19498
2022-12-21T17:25:44Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
Vol. 8, pp. 27495-27506
DOI: 10.1109/ACCESS.2020.2971760
Document no. 8984359
Gen Li (https://orcid.org/0000-0001-6636-1106)
Shenlu Jiang (https://orcid.org/0000-0001-6208-2142)
Inyong Yun (https://orcid.org/0000-0001-8082-033X)
Jonghyun Kim (https://orcid.org/0000-0002-5797-4186)
Joongkyu Kim (https://orcid.org/0000-0002-2225-1703)
Affiliation (all five authors): Department of Electronic, Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea
title Depth-Wise Asymmetric Bottleneck With Point-Wise Aggregation Decoder for Real-Time Semantic Segmentation in Urban Scenes
topic Real-time semantic segmentation
encoder-decoder network
convolutional neural network
urban scenes
lightweight network
url https://ieeexplore.ieee.org/document/8984359/