Fully convolutional multi‐scale dense networks for monocular depth estimation
Monocular depth estimation is of vital importance in understanding the 3D geometry of a scene. However, inferring the underlying depth is ill‐posed and inherently ambiguous. In this study, two improvements to existing approaches are proposed. The first is a clean, improved network architecture: the authors extend the Densely Connected Convolutional Network (DenseNet) into an end‐to‐end fully convolutional multi‐scale dense network. Dense upsampling blocks are integrated to improve the output resolution, and selected skip connections are incorporated to connect the downsampling and upsampling paths efficiently. The second is a set of edge‐preserving loss functions, encompassing the reverse Huber loss, the depth gradient loss and the feature edge loss, which are particularly suited to estimating fine details and clear object boundaries. Experiments on the NYU‐Depth‐v2 and KITTI datasets show that the proposed model is competitive with state‐of‐the‐art methods, achieving root mean squared errors of 0.506 and 4.977 respectively.
Main Authors: | Jiwei Liu, Yunzhou Zhang, Jiahua Cui, Yonghui Feng, Linzhuo Pang |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2019-08-01 |
Series: | IET Computer Vision |
Subjects: | clean improved network architecture; end-to-end fully convolutional multiscale dense networks; dense upsampling blocks; selected skip connection; edge-preserving loss functions; reverse Huber loss |
Online Access: | https://doi.org/10.1049/iet-cvi.2018.5645 |
_version_ | 1827817445072568320 |
---|---|
author | Jiwei Liu Yunzhou Zhang Jiahua Cui Yonghui Feng Linzhuo Pang |
author_facet | Jiwei Liu Yunzhou Zhang Jiahua Cui Yonghui Feng Linzhuo Pang |
author_sort | Jiwei Liu |
collection | DOAJ |
description | Monocular depth estimation is of vital importance in understanding the 3D geometry of a scene. However, inferring the underlying depth is ill‐posed and inherently ambiguous. In this study, two improvements to existing approaches are proposed. The first is a clean, improved network architecture: the authors extend the Densely Connected Convolutional Network (DenseNet) into an end‐to‐end fully convolutional multi‐scale dense network. Dense upsampling blocks are integrated to improve the output resolution, and selected skip connections are incorporated to connect the downsampling and upsampling paths efficiently. The second is a set of edge‐preserving loss functions, encompassing the reverse Huber loss, the depth gradient loss and the feature edge loss, which are particularly suited to estimating fine details and clear object boundaries. Experiments on the NYU‐Depth‐v2 dataset and KITTI dataset show that the proposed model is competitive with the state‐of‐the‐art methods, achieving 0.506 and 4.977 performance in terms of root mean squared error respectively. |
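The description above names the reverse Huber (berHu) loss and a depth gradient loss among the edge‐preserving terms. A minimal sketch of both, assuming the common berHu formulation with the threshold set to 20% of the maximum absolute residual per batch (the article's exact threshold and weighting of the terms may differ):

```python
import numpy as np

def berhu_loss(pred, target, c=None):
    """Reverse Huber (berHu) loss, commonly used for depth regression.

    L(x) = |x|                 if |x| <= c
         = (x^2 + c^2) / (2c)  otherwise
    where x = pred - target. By default c = 0.2 * max|x| over the batch
    (a common choice; the paper's constant is an assumption here).
    """
    x = np.abs(np.asarray(pred) - np.asarray(target))
    if c is None:
        c = 0.2 * float(x.max())
    if c == 0.0:  # perfect prediction: residuals are all zero
        return 0.0
    quadratic = (x ** 2 + c ** 2) / (2.0 * c)
    return float(np.where(x <= c, x, quadratic).mean())

def depth_gradient_loss(pred, target):
    """L1 difference of spatial depth gradients (one common form of a
    depth gradient term; the article's exact definition may differ)."""
    pred, target = np.asarray(pred), np.asarray(target)
    dx = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1))
    dy = np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0))
    return float(dx.mean() + dy.mean())
```

Note that the two branches of the berHu loss meet continuously at |x| = c (both evaluate to c there), which keeps the loss smooth to optimize while penalizing large residuals quadratically and small ones linearly.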
first_indexed | 2024-03-12T00:34:20Z |
format | Article |
id | doaj.art-a5ac98fa4e4d43f8a3259b9727eee2b3 |
institution | Directory Open Access Journal |
issn | 1751-9632 1751-9640 |
language | English |
last_indexed | 2024-03-12T00:34:20Z |
publishDate | 2019-08-01 |
publisher | Wiley |
record_format | Article |
series | IET Computer Vision |
spelling | doaj.art-a5ac98fa4e4d43f8a3259b9727eee2b3; 2023-09-15T10:01:39Z; eng; Wiley; IET Computer Vision; ISSN 1751-9632, 1751-9640; 2019-08-01; vol. 13, no. 5, pp. 515–522; doi:10.1049/iet-cvi.2018.5645; "Fully convolutional multi‐scale dense networks for monocular depth estimation"; Jiwei Liu, Yunzhou Zhang, Jiahua Cui, Yonghui Feng, Linzhuo Pang (all: College of Information Science and Engineering, Northeastern University, Shenyang, People's Republic of China); [abstract as in description]; https://doi.org/10.1049/iet-cvi.2018.5645; keywords: clean improved network architecture; end-to-end fully convolutional multiscale dense networks; dense upsampling blocks; selected skip connection; edge-preserving loss functions; reverse Huber loss |
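The record reports the model's scores of 0.506 (NYU‐Depth‐v2) and 4.977 (KITTI) in terms of root mean squared error. For reference, RMSE over a predicted and a ground‐truth depth map is computed as below (a standard definition; the article's evaluation may additionally mask out invalid depth pixels):

```python
import numpy as np

def rmse(pred, target):
    """Root mean squared error between predicted and ground-truth depths."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))
```

Lower is better; the metric is in the same units as the depth values (metres for both NYU‐Depth‐v2 and KITTI).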
spellingShingle | Jiwei Liu Yunzhou Zhang Jiahua Cui Yonghui Feng Linzhuo Pang Fully convolutional multi‐scale dense networks for monocular depth estimation IET Computer Vision clean improved network architecture end-to-end fully convolutional multiscale dense networks dense upsampling blocks selected skip connection edge-preserving loss functions reverse Huber loss |
title | Fully convolutional multi‐scale dense networks for monocular depth estimation |
title_full | Fully convolutional multi‐scale dense networks for monocular depth estimation |
title_fullStr | Fully convolutional multi‐scale dense networks for monocular depth estimation |
title_full_unstemmed | Fully convolutional multi‐scale dense networks for monocular depth estimation |
title_short | Fully convolutional multi‐scale dense networks for monocular depth estimation |
title_sort | fully convolutional multi scale dense networks for monocular depth estimation |
topic | clean improved network architecture end-to-end fully convolutional multiscale dense networks dense upsampling blocks selected skip connection edge-preserving loss functions reverse Huber loss |
url | https://doi.org/10.1049/iet-cvi.2018.5645 |
work_keys_str_mv | AT jiweiliu fullyconvolutionalmultiscaledensenetworksformonoculardepthestimation AT yunzhouzhang fullyconvolutionalmultiscaledensenetworksformonoculardepthestimation AT jiahuacui fullyconvolutionalmultiscaledensenetworksformonoculardepthestimation AT yonghuifeng fullyconvolutionalmultiscaledensenetworksformonoculardepthestimation AT linzhuopang fullyconvolutionalmultiscaledensenetworksformonoculardepthestimation |