Fully convolutional multi‐scale dense networks for monocular depth estimation


Bibliographic Details
Main Authors: Jiwei Liu, Yunzhou Zhang, Jiahua Cui, Yonghui Feng, Linzhuo Pang
Format: Article
Language: English
Published: Wiley 2019-08-01
Series: IET Computer Vision
Subjects:
Online Access: https://doi.org/10.1049/iet-cvi.2018.5645
collection DOAJ
description Monocular depth estimation is of vital importance in understanding the 3D geometry of a scene. However, inferring the underlying depth is ill-posed and inherently ambiguous. In this study, two improvements to existing approaches are proposed. The first is an improved network architecture: the authors extend the Densely Connected Convolutional Network (DenseNet) into an end-to-end fully convolutional multi-scale dense network. Dense upsampling blocks are integrated to improve the output resolution, and selected skip connections link the downsampling and upsampling paths efficiently. The second is a set of edge-preserving loss functions, comprising the reverse Huber loss, depth gradient loss and feature edge loss, which is particularly suited to estimating fine details and clear object boundaries. Experiments on the NYU-Depth-v2 and KITTI datasets show that the proposed model is competitive with state-of-the-art methods, achieving root mean squared errors of 0.506 and 4.977, respectively.
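Two of the loss terms named in the abstract have well-known closed forms in the depth-estimation literature. The sketch below is not the authors' released code: it shows a standard reverse Huber (berHu) loss and a simple first-difference depth gradient loss in NumPy, with the 0.2 threshold fraction assumed as a common choice rather than taken from the paper.

```python
import numpy as np

def berhu_loss(pred, target, c_frac=0.2):
    """Reverse Huber (berHu) loss: L1 for small residuals,
    scaled L2 beyond a batch-dependent threshold c."""
    err = np.abs(pred - target)
    c = max(c_frac * err.max(), 1e-8)       # guard against c == 0
    quad = (err ** 2 + c ** 2) / (2.0 * c)  # continuous at |err| == c
    return np.where(err <= c, err, quad).mean()

def gradient_loss(pred, target):
    """Depth gradient loss: L1 difference of horizontal and vertical
    depth gradients, encouraging sharp depth discontinuities."""
    dx = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1))
    dy = np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0))
    return dx.mean() + dy.mean()
```

In training, terms like these would typically be combined into a single weighted objective (together with a feature edge loss); the exact weighting used by the paper is not given in this record.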
institution Directory Open Access Journal
issn 1751-9632, 1751-9640
affiliation College of Information Science and Engineering, Northeastern University, Shenyang, People's Republic of China (all five authors)
topic clean improved network architecture
end-to-end fully convolutional multiscale dense networks
dense upsampling blocks
selected skip connection
edge-preserving loss functions
reverse Huber loss