PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention

Multi-view based 3D reconstruction aims to obtain 3D structure information of objects in space through two-dimensional images. In this paper, we propose a new multi-view stereo network that can robustly reconstruct the scene. To enhance the feature representation ability of Point-MVSNet, a pyramid a...


Bibliographic Details
Main Authors:	Ke Zhang, Mengyu Liu, Jinlai Zhang, Zhenbiao Dong
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Multi-view stereo; pyramid attention; point cloud; depth estimate; deep learning
Online Access:	https://ieeexplore.ieee.org/document/9352763/

_version_	1818191529912565760
author	Ke Zhang, Mengyu Liu, Jinlai Zhang, Zhenbiao Dong
author_facet	Ke Zhang, Mengyu Liu, Jinlai Zhang, Zhenbiao Dong
author_sort	Ke Zhang
collection	DOAJ
description	Multi-view based 3D reconstruction aims to obtain 3D structure information of objects in space through two-dimensional images. In this paper, we propose a new multi-view stereo network that can robustly reconstruct the scene. To enhance the feature representation ability of Point-MVSNet, a pyramid attention module is introduced. Specifically, we exploit the attention mechanism for the multi-scale feature pyramid to capture larger receptive fields and richer information. Instead of constructing a feature pyramid as the input, results of the pyramid attention module at different scales are directly used for the next layer. The network eventually generates a high-quality depth estimation for 3D reconstruction from sparse to dense by an iterative refinement schedule. Experiments have been performed to evaluate 3D reconstruction quality by comparison with existing state-of-the-art methods on the DTU dataset. The experimental results indicate our method performs the best in overall quality compared with previous methods, proving the effectiveness of our method. In the end, we use the data collected by mobile devices to implement 3D reconstruction with a combination of traditional and learning-based methods, providing ideas for the 3D reconstruction technology on mobile devices.
first_indexed	2024-12-12T00:16:04Z
format	Article
id	doaj.art-abaa9ca0134d49578dc575e116f94a6c
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-12T00:16:04Z
publishDate	2021-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-abaa9ca0134d49578dc575e116f94a6c; 2022-12-22T00:44:51Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 27908; 27915; 10.1109/ACCESS.2021.3058522; 9352763; PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention; Ke Zhang (0) https://orcid.org/0000-0001-6205-4782; Mengyu Liu (1) https://orcid.org/0000-0002-0675-1572; Jinlai Zhang (2) https://orcid.org/0000-0002-3457-1982; Zhenbiao Dong (3) https://orcid.org/0000-0002-4389-5612; School of Mechanical Engineering, Shanghai Institute of Technology, Shanghai, China; College of Mechanical Engineering, Guangxi University, Nanning, China; College of Mechanical Engineering, Guangxi University, Nanning, China; School of Mechanical Engineering, Shanghai Institute of Technology, Shanghai, China; Multi-view based 3D reconstruction aims to obtain 3D structure information of objects in space through two-dimensional images. In this paper, we propose a new multi-view stereo network that can robustly reconstruct the scene. To enhance the feature representation ability of Point-MVSNet, a pyramid attention module is introduced. Specifically, we exploit the attention mechanism for the multi-scale feature pyramid to capture larger receptive fields and richer information. Instead of constructing a feature pyramid as the input, results of the pyramid attention module at different scales are directly used for the next layer. The network eventually generates a high-quality depth estimation for 3D reconstruction from sparse to dense by an iterative refinement schedule. Experiments have been performed to evaluate 3D reconstruction quality by comparison with existing state-of-the-art methods on the DTU dataset. The experimental results indicate our method performs the best in overall quality compared with previous methods, proving the effectiveness of our method. In the end, we use the data collected by mobile devices to implement 3D reconstruction with a combination of traditional and learning-based methods, providing ideas for the 3D reconstruction technology on mobile devices.; https://ieeexplore.ieee.org/document/9352763/; Multi-view stereo; pyramid attention; point cloud; depth estimate; deep learning
spellingShingle	Ke Zhang; Mengyu Liu; Jinlai Zhang; Zhenbiao Dong; PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention; IEEE Access; Multi-view stereo; pyramid attention; point cloud; depth estimate; deep learning
title	PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention
title_full	PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention
title_fullStr	PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention
title_full_unstemmed	PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention
title_short	PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention
title_sort	pa mvsnet sparse to dense multi view stereo with pyramid attention
topic	Multi-view stereo; pyramid attention; point cloud; depth estimate; deep learning
url	https://ieeexplore.ieee.org/document/9352763/


Similar Items