A Systematic Literature Review on Multimodal Machine Learning: Applications, Challenges, Gaps and Future Directions

Multimodal machine learning (MML) is a tempting multidisciplinary research area where heterogeneous data from multiple modalities and machine learning (ML) are combined to solve critical problems. Usually, research works use data from a single modality, such as images, audio, text, and signals. Howe...


Bibliographic Details
Main Authors: Arnab Barua, Mobyen Uddin Ahmed, Shahina Begum
Format: Article
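Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Multimodal machine learning; systematic literature review; representation; translation; alignment; fusion
Online Access: https://ieeexplore.ieee.org/document/10041115/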
_version_ 1797904592692838400
author Arnab Barua
Mobyen Uddin Ahmed
Shahina Begum
collection DOAJ
description Multimodal machine learning (MML) is a tempting multidisciplinary research area where heterogeneous data from multiple modalities and machine learning (ML) are combined to solve critical problems. Usually, research works use data from a single modality, such as images, audio, text, and signals. However, real-world issues have become critical now, and handling them using multiple modalities of data instead of a single modality can significantly impact finding solutions. ML algorithms play an essential role in tuning parameters in developing MML models. This paper reviews recent advancements in the challenges of MML, namely: representation, translation, alignment, fusion and co-learning, and presents the gaps and challenges. A systematic literature review (SLR) was applied to define the progress and trends on those challenges in the MML domain. In total, 1032 articles were examined in this review to extract features like source, domain, application, modality, etc. This research article will help researchers understand the constant state of MML and navigate the selection of future research directions.
first_indexed 2024-04-10T09:52:30Z
format Article
id doaj.art-d13bb5a652504f2f872a503487b4450d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-10T09:52:30Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-d13bb5a652504f2f872a503487b4450d
2023-02-17T00:00:22Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
Volume 11, pages 14804-14831
DOI 10.1109/ACCESS.2023.3243854
IEEE document 10041115
Arnab Barua [0] https://orcid.org/0000-0002-9698-8142
Mobyen Uddin Ahmed [1] https://orcid.org/0000-0003-1953-6086
Shahina Begum [2]
[0] School of Innovation, Design and Engineering, Mälardalen University, Västerås, Sweden
[1] School of Innovation, Design and Engineering, Mälardalen University, Västerås, Sweden
[2] School of Innovation, Design and Engineering, Mälardalen University, Västerås, Sweden
https://ieeexplore.ieee.org/document/10041115/
title A Systematic Literature Review on Multimodal Machine Learning: Applications, Challenges, Gaps and Future Directions
topic Multimodal machine learning
systematic literature review
representation
translation
alignment
fusion
url https://ieeexplore.ieee.org/document/10041115/