DTS-Net: Depth-to-Space Networks for Fast and Accurate Semantic Object Segmentation

We propose Depth-to-Space Net (DTS-Net), an effective technique for semantic segmentation using the efficient sub-pixel convolutional neural network. This technique is inspired by depth-to-space (DTS) image reconstruction, which was originally used for image and video super-resolution tasks, combined with a mask enhancement filtration technique based on multi-label classification, namely, Nearest Label Filtration. In the proposed technique, we employ depth-wise separable convolution-based architectures. We propose both a deep network, DTS-Net, and a lightweight network, DTS-Net-Lite, for real-time semantic segmentation; these networks employ the Xception and MobileNetV2 architectures as feature extractors, respectively. In addition, we explore the joint semantic segmentation and depth estimation task and demonstrate that the proposed technique can efficiently perform both tasks simultaneously, outperforming state-of-the-art (SOTA) methods. We train and evaluate the proposed method on the PASCAL VOC 2012, NYUv2, and Cityscapes benchmarks, obtaining high mean intersection over union (mIoU) and mean pixel accuracy (Pix.acc.) values with the simple and lightweight convolutional neural network architectures of the developed networks. Notably, the proposed method outperforms SOTA methods that depend on encoder–decoder architectures, although our implementation and computations are far simpler.
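For readers unfamiliar with the depth-to-space (sub-pixel convolution) operation the abstract builds on, the sketch below shows how a segmentation head can predict class logits at reduced resolution and recover a full-resolution mask by rearranging channel depth into spatial positions. It is a minimal illustration in PyTorch assuming a generic backbone feature map; the channel counts, stride, and head layout are illustrative assumptions and do not reproduce the published DTS-Net architecture or its Nearest Label Filtration step.

    import torch
    import torch.nn as nn

    class DTSSegHead(nn.Module):
        """Depth-to-space segmentation head: predicts num_classes * r^2 logit
        channels at 1/r resolution, then rearranges depth into space."""
        def __init__(self, in_channels: int, num_classes: int, upscale: int):
            super().__init__()
            # 1x1 projection to C * r^2 channels at the low resolution
            self.proj = nn.Conv2d(in_channels, num_classes * upscale ** 2, kernel_size=1)
            # PixelShuffle is PyTorch's depth-to-space operator
            self.dts = nn.PixelShuffle(upscale)

        def forward(self, feats: torch.Tensor) -> torch.Tensor:
            x = self.proj(feats)   # (N, C*r^2, H/r, W/r)
            return self.dts(x)     # (N, C, H, W) full-resolution class logits

    # Dummy usage with a hypothetical stride-16 backbone feature map
    feats = torch.randn(1, 256, 32, 32)                    # 1/16-resolution features
    head = DTSSegHead(in_channels=256, num_classes=21, upscale=16)
    mask = head(feats).argmax(dim=1)                        # (1, 512, 512) label map

Because spatial detail is recovered in a single channel-to-space rearrangement rather than through a learned decoder, such a head stays small, which is consistent with the paper's claim of far simpler computation than encoder–decoder designs.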

Bibliographic Details
Main Authors: Hatem Ibrahem, Ahmed Salem, Hyun-Soo Kang
Author Affiliation: Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Korea
Format: Article
Language: English
Published: MDPI AG, 2022-01-01
Series: Sensors, Vol. 22, No. 1, Article 337
ISSN: 1424-8220
DOI: 10.3390/s22010337
Subjects: convolutional neural networks; semantic segmentation; real-time computer vision
Online Access: https://www.mdpi.com/1424-8220/22/1/337
Collection: Directory of Open Access Journals (DOAJ), record doaj.art-165914c838dd441c8d33ae472185d910