EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning
Segmentation of street scenes is a key technology in the field of autonomous vehicles. However, conventional segmentation methods achieve low accuracy because of the complexity of street landscapes. Therefore, we propose an efficient atrous residual network (EAR-Net) to improve accuracy while maintaining computation costs. First, we performed feature extraction and restoration utilizing depthwise separable convolution (DSConv) and interpolation. Compared with conventional methods, DSConv and interpolation significantly reduce computation costs while minimizing performance degradation. Second, we utilized residual learning and atrous spatial pyramid pooling (ASPP) to achieve high accuracy. Residual learning increases the ability to extract context information by preventing feature and gradient losses. In addition, ASPP extracts additional context information while maintaining the resolution of the feature map. Finally, to alleviate the class imbalance between the image background and objects and to improve learning efficiency, we utilized focal loss. We evaluated EAR-Net on the Cityscapes dataset, which is commonly used for street scene segmentation studies. Experimental results showed that EAR-Net achieved better segmentation results at computation costs similar to those of conventional methods. We also conducted an ablation study to analyze the contributions of ASPP and DSConv in EAR-Net.
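The computation savings the abstract attributes to depthwise separable convolution can be illustrated with a quick parameter count. A minimal sketch follows; the layer shape (3×3 kernel, 256→256 channels) is an illustrative assumption, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k kernel per (input, output) channel pair.
    return k * k * c_in * c_out

def dsconv_params(k, c_in, c_out):
    # Depthwise separable convolution: a depthwise step (one k x k kernel
    # per input channel) followed by a 1 x 1 pointwise convolution.
    return k * k * c_in + c_in * c_out

# Illustrative layer, not from the paper: 3x3 kernel, 256 -> 256 channels.
std = conv_params(3, 256, 256)   # 589,824 parameters
ds = dsconv_params(3, 256, 256)  # 67,840 parameters
print(std, ds, ds / std)
```

The ratio works out to roughly 1/c_out + 1/k² (about 0.115 here), which is why DSConv-based encoders cut computation by nearly an order of magnitude at this layer size.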
Main Authors: | Seokyong Shin, Sanghun Lee, Hyunho Han |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-09-01 |
Series: | Applied Sciences |
Subjects: | atrous spatial pyramid pooling; deep learning; encoder–decoder; residual learning; semantic segmentation |
Online Access: | https://www.mdpi.com/2076-3417/11/19/9119 |
author | Seokyong Shin; Sanghun Lee; Hyunho Han
author_sort | Seokyong Shin
collection | DOAJ |
description | Segmentation of street scenes is a key technology in the field of autonomous vehicles. However, conventional segmentation methods achieve low accuracy because of the complexity of street landscapes. Therefore, we propose an efficient atrous residual network (EAR-Net) to improve accuracy while maintaining computation costs. First, we performed feature extraction and restoration utilizing depthwise separable convolution (DSConv) and interpolation. Compared with conventional methods, DSConv and interpolation significantly reduce computation costs while minimizing performance degradation. Second, we utilized residual learning and atrous spatial pyramid pooling (ASPP) to achieve high accuracy. Residual learning increases the ability to extract context information by preventing feature and gradient losses. In addition, ASPP extracts additional context information while maintaining the resolution of the feature map. Finally, to alleviate the class imbalance between the image background and objects and to improve learning efficiency, we utilized focal loss. We evaluated EAR-Net on the Cityscapes dataset, which is commonly used for street scene segmentation studies. Experimental results showed that EAR-Net achieved better segmentation results at computation costs similar to those of conventional methods. We also conducted an ablation study to analyze the contributions of ASPP and DSConv in EAR-Net. |
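The focal loss the abstract uses to counter class imbalance down-weights well-classified pixels so that training focuses on hard ones. A minimal scalar sketch of the standard formulation FL(p_t) = -α(1 - p_t)^γ log(p_t); the α and γ values below are the common defaults, assumed rather than taken from the paper:

```python
import math

def focal_loss(p_t, alpha=0.25, gamma=2.0):
    # p_t is the predicted probability of the true class. With gamma = 0
    # this reduces to (alpha-weighted) cross-entropy; larger gamma
    # suppresses the loss of already well-classified examples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct pixel (p_t = 0.9) contributes far less loss than
# a hard one (p_t = 0.1), which mitigates background/object imbalance.
print(focal_loss(0.9), focal_loss(0.1))
```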
first_indexed | 2024-03-10T07:05:38Z |
format | Article |
id | doaj.art-470ee410c52345fc873593c9cd4bc2d2 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T07:05:38Z |
publishDate | 2021-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-470ee410c52345fc873593c9cd4bc2d2. MDPI AG, Applied Sciences, ISSN 2076-3417, vol. 11, no. 19, art. 9119 (2021-09-01), doi:10.3390/app11199119. EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning. Seokyong Shin (Department of Plasma Bio Display, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea); Sanghun Lee (Ingenium College of Liberal Arts, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea); Hyunho Han (College of General Education, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan 44610, Korea). https://www.mdpi.com/2076-3417/11/19/9119 |
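The atrous (dilated) convolutions inside ASPP, mentioned throughout the record, enlarge the receptive field without downsampling the feature map: a k×k kernel with dilation rate d spans k + (k - 1)(d - 1) pixels while keeping only k×k weights. A small sketch; the rates 6/12/18 are the choice popularized by DeepLab's ASPP and are assumed here, not confirmed for EAR-Net:

```python
def effective_kernel(k, d):
    # A k x k kernel with dilation rate d covers the same span as a
    # (k + (k - 1) * (d - 1))-wide dense kernel, at no extra parameter cost.
    return k + (k - 1) * (d - 1)

# Typical ASPP dilation rates (assumption, following DeepLab).
for rate in (1, 6, 12, 18):
    print(rate, effective_kernel(3, rate))
```

Pooling the outputs of several such branches is what lets ASPP gather multi-scale context while, as the abstract notes, maintaining the resolution of the feature map.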
title | EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning |
topic | atrous spatial pyramid pooling; deep learning; encoder–decoder; residual learning; semantic segmentation
url | https://www.mdpi.com/2076-3417/11/19/9119 |