Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation

Semantic segmentation is a very important and challenging problem in computer vision. Many applications, such as automated driving and robotic navigation in urban road scenes, require accurate and efficient segmentation. Nowadays, system models are often designed with high speed but a large number o...

Bibliographic Details
Main Authors: Xuegang Hu, Yu Gong
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
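Subjects: Attention mechanism; convolutional neural network; encoder-decoder network; lightweight model; real-time semantic segmentation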
Online Access: https://ieeexplore.ieee.org/document/9399082/
_version_ 1819121338577584128
author Xuegang Hu
Yu Gong
author_facet Xuegang Hu
Yu Gong
author_sort Xuegang Hu
collection DOAJ
description Semantic segmentation is an important and challenging problem in computer vision. Many applications, such as automated driving and robotic navigation in urban road scenes, require accurate and efficient segmentation. Current models are often either fast but burdened with a large number of parameters, or memory-hungry and very slow, so they are not suitable for real-time semantic segmentation. To solve this problem, we propose a more comprehensive model, termed the Lightweight Asymmetric Dilation Network (LADNet), that is faster, has fewer parameters, and achieves higher accuracy. Our model is built on our Lightweight Asymmetric Dilation Module (LAD Module), which provides a larger receptive field than all existing lightweight models in order to learn more information: Lightweight Asymmetric Dilation-A (LAD-A) better perceives spatial and semantic information, and Lightweight Asymmetric Dilation-B (LAD-B) better perceives semantic information. Our Lightweight Downsampling Module (LDM) downsamples the feature map and greatly reduces the number of model parameters. Finally, our Attention Enhancement Decoder (AED) restores the feature map to the resolution of the original image; it enables two attentional feature maps to simultaneously guide semantic information for better semantic segmentation of images. Our extensive experiments on the Cityscapes, CamVid, and NYUv2 test sets show that our model achieves the best balance among parameters, accuracy, and speed.
first_indexed 2024-12-22T06:34:59Z
format Article
id doaj.art-ca9e138eb1814267b2c0102126221a5a
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T06:34:59Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-ca9e138eb1814267b2c0102126221a5a | 2022-12-21T18:35:36Z | eng | IEEE | IEEE Access | 2169-3536 | 2021-01-01 | 9 | 55630 | 55643 | 10.1109/ACCESS.2021.3071866 | 9399082 | Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation | Xuegang Hu [0] | Yu Gong [1] | https://orcid.org/0000-0003-2836-2890 | [0] School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China | [1] Laboratory of Intelligent Analysis and Decision on Complex Systems, Chongqing University of Posts and Telecommunications, Chongqing, China | Semantic segmentation is an important and challenging problem in computer vision. Many applications, such as automated driving and robotic navigation in urban road scenes, require accurate and efficient segmentation. Current models are often either fast but burdened with a large number of parameters, or memory-hungry and very slow, so they are not suitable for real-time semantic segmentation. To solve this problem, we propose a more comprehensive model, termed the Lightweight Asymmetric Dilation Network (LADNet), that is faster, has fewer parameters, and achieves higher accuracy. Our model is built on our Lightweight Asymmetric Dilation Module (LAD Module), which provides a larger receptive field than all existing lightweight models in order to learn more information: Lightweight Asymmetric Dilation-A (LAD-A) better perceives spatial and semantic information, and Lightweight Asymmetric Dilation-B (LAD-B) better perceives semantic information. Our Lightweight Downsampling Module (LDM) downsamples the feature map and greatly reduces the number of model parameters. Finally, our Attention Enhancement Decoder (AED) restores the feature map to the resolution of the original image; it enables two attentional feature maps to simultaneously guide semantic information for better semantic segmentation of images. Our extensive experiments on the Cityscapes, CamVid, and NYUv2 test sets show that our model achieves the best balance among parameters, accuracy, and speed. | https://ieeexplore.ieee.org/document/9399082/ | Attention mechanism | convolutional neural network | encoder-decoder network | lightweight model | real-time semantic segmentation
spellingShingle Xuegang Hu
Yu Gong
Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
IEEE Access
Attention mechanism
convolutional neural network
encoder-decoder network
lightweight model
real-time semantic segmentation
title Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
title_full Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
title_fullStr Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
title_full_unstemmed Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
title_short Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
title_sort lightweight asymmetric dilation network for real time semantic segmentation
topic Attention mechanism
convolutional neural network
encoder-decoder network
lightweight model
real-time semantic segmentation
url https://ieeexplore.ieee.org/document/9399082/
work_keys_str_mv AT xueganghu lightweightasymmetricdilationnetworkforrealtimesemanticsegmentation
AT yugong lightweightasymmetricdilationnetworkforrealtimesemanticsegmentation