Depth-Wise Asymmetric Bottleneck With Point-Wise Aggregation Decoder for Real-Time Semantic Segmentation in Urban Scenes
Semantic segmentation is a process of linking each pixel in an image to a class label, and is widely used in the field of autonomous vehicles and robotics. Although deep learning methods have already made great progress for semantic segmentation, they either achieve great results with numerous param...
Main Authors: | Gen Li, Shenlu Jiang, Inyong Yun, Jonghyun Kim, Joongkyu Kim |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
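Subjects: | Real-time semantic segmentation; encoder-decoder network; convolutional neural network; urban scenes; lightweight network |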
Online Access: | https://ieeexplore.ieee.org/document/8984359/ |
_version_ | 1819276384656162816 |
---|---|
author | Gen Li Shenlu Jiang Inyong Yun Jonghyun Kim Joongkyu Kim |
author_sort | Gen Li |
collection | DOAJ |
description | Semantic segmentation assigns a class label to each pixel in an image and is widely used in autonomous vehicles and robotics. Although deep learning methods have made great progress in semantic segmentation, they either achieve high accuracy with numerous parameters or use lightweight designs that heavily sacrifice segmentation accuracy. Because of the strict requirements of real-world applications, it is critical to design an effective real-time model with both competitive segmentation accuracy and small model capacity. In this paper, we propose a lightweight network named DABNet, which employs Depth-wise Asymmetric Bottleneck (DAB) and Point-wise Aggregation Decoder (PAD) modules to tackle the challenging task of real-time semantic segmentation in urban scenes. Specifically, the DAB module creates a sufficient receptive field and densely utilizes the contextual information, and the PAD module aggregates feature maps of different scales through an attention mechanism to improve performance. Compared with existing methods, our network substantially reduces the number of parameters while still achieving high accuracy with real-time inference. Extensive ablation experiments on two challenging urban scene datasets (Cityscapes and CamVid) demonstrate the effectiveness of the proposed approach in real-time semantic segmentation. |
first_indexed | 2024-12-23T23:39:22Z |
format | Article |
id | doaj.art-1332b954b6374aa68514bdf600b19498 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-23T23:39:22Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-1332b954b6374aa68514bdf600b19498; 2022-12-21T17:25:44Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; vol. 8; pp. 27495-27506; 10.1109/ACCESS.2020.2971760; 8984359; Depth-Wise Asymmetric Bottleneck With Point-Wise Aggregation Decoder for Real-Time Semantic Segmentation in Urban Scenes; Gen Li (https://orcid.org/0000-0001-6636-1106), Shenlu Jiang (https://orcid.org/0000-0001-6208-2142), Inyong Yun (https://orcid.org/0000-0001-8082-033X), Jonghyun Kim (https://orcid.org/0000-0002-5797-4186), Joongkyu Kim (https://orcid.org/0000-0002-2225-1703); Department of Electronic, Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea (all five authors); https://ieeexplore.ieee.org/document/8984359/; Real-time semantic segmentation; encoder-decoder network; convolutional neural network; urban scenes; lightweight network |
title | Depth-Wise Asymmetric Bottleneck With Point-Wise Aggregation Decoder for Real-Time Semantic Segmentation in Urban Scenes |
topic | Real-time semantic segmentation; encoder-decoder network; convolutional neural network; urban scenes; lightweight network |
url | https://ieeexplore.ieee.org/document/8984359/ |
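The description states that the network "substantially reduces the number of parameters" by building on depth-wise asymmetric convolutions. The arithmetic behind that kind of saving can be sketched as follows. This is a minimal illustration, not the DABNet layer layout, which this record does not give: the channel count (64), the kernel size (3), and the point-wise 1x1 mixing layer are assumptions chosen for the example, and the function names are hypothetical.

```python
# Weight counts for one dense convolution versus a depth-wise asymmetric
# factorisation. All sizes below are illustrative assumptions, not values
# taken from the paper. Biases are ignored throughout.


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a dense k x k convolution."""
    return c_in * c_out * k * k


def depthwise_asymmetric_params(channels: int, k: int) -> int:
    """Weights in a depth-wise k x 1 convolution followed by a depth-wise
    1 x k convolution (one filter per channel in each)."""
    return channels * k + channels * k


def pointwise_params(c_in: int, c_out: int) -> int:
    """Weights in a 1 x 1 convolution that mixes information across channels."""
    return c_in * c_out


channels, kernel = 64, 3  # assumed sizes for the example

dense = standard_conv_params(channels, channels, kernel)
factored = (depthwise_asymmetric_params(channels, kernel)
            + pointwise_params(channels, channels))

print("dense 3x3 convolution weights:      ", dense)
print("depth-wise asymmetric + 1x1 weights:", factored)
```

Under these assumptions the factorised form needs far fewer weights than the dense one, because the spatial filtering is done per channel and only the 1x1 layer scales with the product of input and output channels.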