EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning
Segmentation of street scenes is a key technology in the field of autonomous vehicles. However, conventional segmentation methods achieve low accuracy because of the complexity of street landscapes. Therefore, we propose an efficient atrous residual network (EAR-Net) to improve accuracy while maintaining computation costs. First, we performed feature extraction and restoration utilizing depthwise separable convolution (DSConv) and interpolation. Compared with conventional methods, DSConv and interpolation significantly reduce computation costs while minimizing performance degradation. Second, we utilized residual learning and atrous spatial pyramid pooling (ASPP) to achieve high accuracy. Residual learning increases the ability to extract context information by preventing feature and gradient losses. In addition, ASPP extracts additional context information while maintaining the resolution of the feature map. Finally, to alleviate the class imbalance between the image background and objects and to improve learning efficiency, we utilized focal loss. We evaluated EAR-Net on the Cityscapes dataset, which is commonly used for street scene segmentation studies. Experimental results showed that EAR-Net achieved better segmentation results at computation costs similar to those of conventional methods. We also conducted an ablation study to analyze the contributions of ASPP and DSConv in EAR-Net.
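The computation savings the abstract attributes to depthwise separable convolution can be illustrated with a quick parameter count. A minimal sketch follows; the layer shape (3×3 kernel, 256→256 channels) is an illustrative assumption, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k kernel per (input, output) channel pair.
    return k * k * c_in * c_out

def dsconv_params(k, c_in, c_out):
    # Depthwise separable convolution: a depthwise step (one k x k kernel
    # per input channel) followed by a 1 x 1 pointwise convolution.
    return k * k * c_in + c_in * c_out

# Illustrative layer, not from the paper: 3x3 kernel, 256 -> 256 channels.
std = conv_params(3, 256, 256)   # 589,824 parameters
ds = dsconv_params(3, 256, 256)  # 67,840 parameters
print(std, ds, ds / std)
```

The ratio works out to roughly 1/c_out + 1/k² (about 0.115 here), which is why DSConv-based encoders cut computation by nearly an order of magnitude at this layer size.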
Main Authors: | Seokyong Shin, Sanghun Lee, Hyunho Han |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-09-01 |
Series: | Applied Sciences |
Subjects: | atrous spatial pyramid pooling; deep learning; encoder–decoder; residual learning; semantic segmentation |
Online Access: | https://www.mdpi.com/2076-3417/11/19/9119 |
author | Seokyong Shin; Sanghun Lee; Hyunho Han
author_sort | Seokyong Shin
collection | DOAJ |
description | Segmentation of street scenes is a key technology in the field of autonomous vehicles. However, conventional segmentation methods achieve low accuracy because of the complexity of street landscapes. Therefore, we propose an efficient atrous residual network (EAR-Net) to improve accuracy while maintaining computation costs. First, we performed feature extraction and restoration utilizing depthwise separable convolution (DSConv) and interpolation. Compared with conventional methods, DSConv and interpolation significantly reduce computation costs while minimizing performance degradation. Second, we utilized residual learning and atrous spatial pyramid pooling (ASPP) to achieve high accuracy. Residual learning increases the ability to extract context information by preventing feature and gradient losses. In addition, ASPP extracts additional context information while maintaining the resolution of the feature map. Finally, to alleviate the class imbalance between the image background and objects and to improve learning efficiency, we utilized focal loss. We evaluated EAR-Net on the Cityscapes dataset, which is commonly used for street scene segmentation studies. Experimental results showed that EAR-Net achieved better segmentation results at computation costs similar to those of conventional methods. We also conducted an ablation study to analyze the contributions of ASPP and DSConv in EAR-Net. |
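The focal loss the abstract uses to counter class imbalance down-weights well-classified pixels so that training focuses on hard ones. A minimal scalar sketch of the standard formulation FL(p_t) = -α(1 - p_t)^γ log(p_t); the α and γ values below are the common defaults, assumed rather than taken from the paper:

```python
import math

def focal_loss(p_t, alpha=0.25, gamma=2.0):
    # p_t is the predicted probability of the true class. With gamma = 0
    # this reduces to (alpha-weighted) cross-entropy; larger gamma
    # suppresses the loss of already well-classified examples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct pixel (p_t = 0.9) contributes far less loss than
# a hard one (p_t = 0.1), which mitigates background/object imbalance.
print(focal_loss(0.9), focal_loss(0.1))
```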
first_indexed | 2024-03-10T07:05:38Z |
format | Article |
id | doaj.art-470ee410c52345fc873593c9cd4bc2d2 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T07:05:38Z |
publishDate | 2021-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-470ee410c52345fc873593c9cd4bc2d2. MDPI AG, Applied Sciences, ISSN 2076-3417, vol. 11, no. 19, art. 9119 (2021-09-01), doi:10.3390/app11199119. EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning. Seokyong Shin (Department of Plasma Bio Display, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea); Sanghun Lee (Ingenium College of Liberal Arts, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea); Hyunho Han (College of General Education, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan 44610, Korea). https://www.mdpi.com/2076-3417/11/19/9119 |
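The atrous (dilated) convolutions inside ASPP, mentioned throughout the record, enlarge the receptive field without downsampling the feature map: a k×k kernel with dilation rate d spans k + (k - 1)(d - 1) pixels while keeping only k×k weights. A small sketch; the rates 6/12/18 are the choice popularized by DeepLab's ASPP and are assumed here, not confirmed for EAR-Net:

```python
def effective_kernel(k, d):
    # A k x k kernel with dilation rate d covers the same span as a
    # (k + (k - 1) * (d - 1))-wide dense kernel, at no extra parameter cost.
    return k + (k - 1) * (d - 1)

# Typical ASPP dilation rates (assumption, following DeepLab).
for rate in (1, 6, 12, 18):
    print(rate, effective_kernel(3, rate))
```

Pooling the outputs of several such branches is what lets ASPP gather multi-scale context while, as the abstract notes, maintaining the resolution of the feature map.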
title | EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning |
topic | atrous spatial pyramid pooling; deep learning; encoder–decoder; residual learning; semantic segmentation
url | https://www.mdpi.com/2076-3417/11/19/9119 |