DEMVSNet: Denoising and depth inference for unstructured multi‐view stereo on noised images

Bibliographic Details
Main Authors: Jiawei Han, Xiaomei Chen, Yongtian Zhang, Weimin Hou, Zibo Hu
Format: Article
Language: English
Published: Wiley 2022-10-01
Series: IET Computer Vision
Subjects: computer vision; neural net architecture; random noise
Online Access: https://doi.org/10.1049/cvi2.12102
author Jiawei Han
Xiaomei Chen
Yongtian Zhang
Weimin Hou
Zibo Hu
collection DOAJ
description Abstract: Most deep‐learning‐based multi‐view stereo studies focus on improving depth prediction accuracy on noise‐free images. In practice, however, clean images are difficult to obtain, and 3D convolutional neural networks require substantial computing resources. To make full use of that computing power, the network can process different types of information simultaneously. To address these two issues, this paper proposes a novel multi‐stage network architecture that performs depth inference and denoising simultaneously. Specifically, 2D feature maps are first converted into 3D cost volumes containing pixel information and depth information through differentiable homography and Gaussian probability mapping. The cost volume is then fed into the regularisation module at each network stage to obtain the predicted probability volumes. Because simple static loss weights lead to training failure, the loss function is adjusted dynamically by gradient normalisation. The proposed method handles pixel information and depth information simultaneously and performs well on both. Extensive experiments show that the authors’ method surpasses state‐of‐the‐art denoising methods on the DTU dataset (with added Gaussian–Poisson noise) and is more robust to noisy images in depth inference.
first_indexed 2024-04-11T08:19:52Z
format Article
id doaj.art-13eec5f1c270486a8fc2983a3e547c57
institution Directory Open Access Journal
issn 1751-9632
1751-9640
language English
last_indexed 2024-04-11T08:19:52Z
publishDate 2022-10-01
publisher Wiley
record_format Article
series IET Computer Vision
spelling doaj.art-13eec5f1c270486a8fc2983a3e547c57; 2022-12-22T04:34:59Z; eng; Wiley; IET Computer Vision; ISSN 1751-9632, 1751-9640; 2022-10-01; vol. 16, no. 7, pp. 570-580; doi:10.1049/cvi2.12102
DEMVSNet: Denoising and depth inference for unstructured multi‐view stereo on noised images
Jiawei Han, Xiaomei Chen, Yongtian Zhang, Weimin Hou, Zibo Hu (all: School of Optics and Photonics, Beijing Institute of Technology, Beijing, China)
https://doi.org/10.1049/cvi2.12102
computer vision; neural net architecture; random noise
title DEMVSNet: Denoising and depth inference for unstructured multi‐view stereo on noised images
topic computer vision
neural net architecture
random noise
url https://doi.org/10.1049/cvi2.12102