Multimodal Semantic Segmentation in Autonomous Driving: A Review of Current Approaches and Future Perspectives


Bibliographic Details
Main Authors: Giulia Rizzoli, Francesco Barbato, Pietro Zanuttigh
Format: Article
Language: English
Published: MDPI AG, 2022-07-01
Series: Technologies
Subjects:
Online Access: https://www.mdpi.com/2227-7080/10/4/90
Description
Summary: The perception of the surrounding environment is a key requirement for autonomous driving systems, yet computing an accurate semantic representation of the scene from RGB information alone is very challenging. In particular, the lack of geometric information and the strong dependence on weather and illumination conditions introduce critical challenges for approaches tackling this task. For this reason, most autonomous cars exploit a variety of sensors, including color, depth, or thermal cameras, LiDARs, and RADARs. How to efficiently combine all these sources of information to compute an accurate semantic description of the scene remains an open problem and an active research field. In this survey, we start by presenting the most commonly employed acquisition setups and datasets. Then, we review several deep learning architectures for multimodal semantic segmentation. We discuss the various techniques for combining color, depth, LiDAR, and other data modalities at different stages of the learning architectures, and we show how smart fusion strategies improve performance compared to exploiting a single source of information.
ISSN: 2227-7080
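The abstract's central distinction is where in the architecture the modalities are combined. As a rough illustration (not code from the surveyed paper, and with hypothetical function names), early fusion stacks modalities at the input level, e.g. appending a depth map as an extra channel to the RGB image, while late fusion runs modality-specific branches and merges their per-pixel class scores. A minimal NumPy sketch:

```python
import numpy as np

def early_fusion(rgb, depth):
    """Early fusion: concatenate modalities at the input level,
    e.g. stacking a 1-channel depth map onto a 3-channel RGB image
    to form a 4-channel input for a single network."""
    return np.concatenate([rgb, depth], axis=-1)

def late_fusion(logits_rgb, logits_depth, w=0.5):
    """Late fusion: combine the per-pixel class scores produced by
    two modality-specific branches with a weighted average."""
    return w * logits_rgb + (1.0 - w) * logits_depth

# Toy example: a 2x2 image with 3 RGB channels and 1 depth channel.
rgb = np.zeros((2, 2, 3))
depth = np.ones((2, 2, 1))
fused_input = early_fusion(rgb, depth)
print(fused_input.shape)  # (2, 2, 4)
```

Intermediate-feature fusion, also covered by the survey, sits between these two extremes: features from separate encoder branches are merged at one or more hidden layers rather than at the input or output.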