Summary: | Stereo vision systems rely on accurate feature matching for valid stereo reconstruction and pose estimation. This accuracy is achieved through outlier removal techniques such as RANSAC. However, images also contain semantic information, which can be extracted using neural networks. This paper proposes an additional outlier removal method: the images are semantically segmented using a neural network, the identified features are assigned semantic identifiers using a probabilistic data association technique, and matches are then evaluated using this added semantic information. Blending feature-based techniques with dense semantic maps ties more information to each feature than its image position alone, which opens paths to applications such as class-based clustering. The proposed approach is evaluated against a traditional outlier removal system by comparing the disparity values each produces to known ground truth measurements, and is assessed for accuracy and execution speed. The results show that adding semantic segmentation improves the accuracy of disparity measurements in stereo images, at the cost of processing speed. However, this loss can be mitigated by utilising more specialised hardware.
|