Coarse-to-Fine Stereo Matching Network Based on Multi-Scale Structural Information Filtrating

Stereo vision measurement is widely applied in tasks such as autonomous driving and 3D scene reconstruction. Accurately obtaining the disparity of stereo images relies on effective stereo matching algorithms. Compared with the traditional algorithm, the stereo matching algorithm based on convolution...


Bibliographic Details
Main Authors:	Yuanwei Bi, Chuanbiao Li, Qiang Zheng, Guohui Wang, Shidong Xu, Weiyuan Wang
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Stereo matching; convolutional neural network; structural information filtrating; contextual information; ill-posed regions
Online Access:	https://ieeexplore.ieee.org/document/10184007/

_version_	1797743472428449792
author	Yuanwei Bi; Chuanbiao Li; Qiang Zheng; Guohui Wang; Shidong Xu; Weiyuan Wang
author_sort	Yuanwei Bi
collection	DOAJ
description	Stereo vision measurement is widely applied in tasks such as autonomous driving and 3D scene reconstruction. Accurately obtaining the disparity of stereo images relies on effective stereo matching algorithms. Compared with traditional algorithms, stereo matching algorithms based on convolutional neural networks (CNNs) achieve higher accuracy. In this paper, we propose Cs-Net, a coarse-to-fine stereo matching framework that incorporates structural information filtering to obtain accurate disparity maps. The framework targets disparity estimation in ill-posed regions, such as texture-less and reflective surfaces, and incorporates several key modules. First, a contextual attention feature extraction module obtains context information for ill-posed regions. Second, a structural attention weight generation module is designed to alleviate the stereo matching errors caused by a lack of structural information; the structure boundaries generated by this module are shown to be related to stereo matching errors. Third, a two-stage cost aggregation module regularizes the initial cost volume and aggregates the depth information to alleviate matching errors. In ablation experiments on the KITTI2015 validation dataset, Cs-Net improves the D3 and EPE metrics over the baseline algorithm (GwcNet) by 14.4% and 0.16 px, respectively. In the reflective regions of the KITTI2012 benchmark, Cs-Net reduces the D3 and D5 metrics relative to the baseline by 15.3% and 20.1%. On the DriveStereo dataset, Cs-Net reduces the D3 and EPE metrics relative to the baseline by 23.5% and 0.09 px, respectively.
first_indexed	2024-03-12T14:55:55Z
format	Article
id	doaj.art-b5df1107b9da44109d05424b2dde8a7d
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-03-12T14:55:55Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-b5df1107b9da44109d05424b2dde8a7d; 2023-08-14T23:00:51Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 83692; 83702; 10.1109/ACCESS.2023.3294441; 10184007; Coarse-to-Fine Stereo Matching Network Based on Multi-Scale Structural Information Filtrating; Yuanwei Bi (0); Chuanbiao Li (1, https://orcid.org/0000-0002-3010-7445); Qiang Zheng (2, https://orcid.org/0000-0002-7853-8033); Guohui Wang (3); Shidong Xu (4); Weiyuan Wang (5); School of Computer and Control Engineering, Yantai University, Yantai, China (authors 0-5); https://ieeexplore.ieee.org/document/10184007/; Stereo matching; convolutional neural network; structural information filtrating; contextual information; ill-posed regions
title	Coarse-to-Fine Stereo Matching Network Based on Multi-Scale Structural Information Filtrating
topic	Stereo matching; convolutional neural network; structural information filtrating; contextual information; ill-posed regions
url	https://ieeexplore.ieee.org/document/10184007/


Similar Items