Multimodal Colearning Meets Remote Sensing: Taxonomy, State of the Art, and Future Works
In remote sensing (RS), multiple modalities of data are usually available, e.g., RGB, multispectral, hyperspectral, light detection and ranging (LiDAR), and synthetic aperture radar (SAR). Multimodal machine learning systems, which fuse these rich multimodal data modalities, have shown better performance compared to unimodal systems. Most multimodal research assumes that all modalities are present, aligned, and noiseless during training and testing time. However, in real-world scenarios, it is common to observe that one or more modalities are missing, noisy, or nonaligned, in training, testing, or both. In addition, acquiring large-scale, noise-free annotations is expensive; as a result, the lack of sufficiently annotated datasets and the need to deal with inconsistent labels remain open challenges. These challenges can be addressed under a learning paradigm called multimodal colearning. This article focuses on multimodal colearning techniques for RS data. We first review what data modalities are available in the RS domain and the key benefits and challenges of combining multimodal data in the RS context. We then review the RS tasks that would benefit from multimodal processing, including classification, segmentation, target detection, anomaly detection, and temporal change detection. We then dive deeper into technical details by reviewing more than 200 recent efforts in this area and provide a comprehensive taxonomy to systematically review state-of-the-art approaches in four key colearning challenges: missing modalities, noisy modalities, limited modality annotations, and weakly paired modalities. Based on these insights, we propose emerging research directions to inform potential future research in multimodal colearning for RS.
Main Authors: | Nhi Kieu, Kien Nguyen, Abdullah Nazib, Tharindu Fernando, Clinton Fookes, Sridha Sridharan |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2024-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | Multimodal colearning; multimodal learning; remote sensing (RS); satellite imagery |
Online Access: | https://ieeexplore.ieee.org/document/10474099/ |
ISSN: | 2151-1535 |
DOI: | 10.1109/JSTARS.2024.3378348 |
Volume: | 17 |
Pages: | 7386–7409 |
Author ORCIDs: | Kien Nguyen (0000-0002-3466-9218); Abdullah Nazib (0000-0003-1048-0346); Tharindu Fernando (0000-0002-6935-1816); Clinton Fookes (0000-0002-8515-6324); Sridha Sridharan (0000-0003-4316-9001) |
Affiliation: | School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, QLD, Australia (all authors) |