Transformer-Based Feature Fusion Approach for Multimodal Visual Sentiment Recognition Using Tweets in the Wild
We present an image-based real-time sentiment analysis system that can be used to recognize in-the-wild sentiment expressions on online social networks. The system deploys the newly proposed transformer architecture on online social network (OSN) big data to extract emotion and sentiment features using three types of images: images containing faces, images containing text, and images containing no faces/text.
Main Authors: | Fatimah Alzamzami, Abdulmotaleb El Saddik |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
Subjects: | Transformers; ViT; sentiment; online social media; transfer learning; threshold moving |
Online Access: | https://ieeexplore.ieee.org/document/10122531/ |
_version_ | 1797823426048557056 |
---|---|
author | Fatimah Alzamzami Abdulmotaleb El Saddik |
author_facet | Fatimah Alzamzami Abdulmotaleb El Saddik |
author_sort | Fatimah Alzamzami |
collection | DOAJ |
description | We present an image-based real-time sentiment analysis system that can be used to recognize in-the-wild sentiment expressions on online social networks. The system deploys the newly proposed transformer architecture on online social network (OSN) big data to extract emotion and sentiment features using three types of images: images containing faces, images containing text, and images containing no faces/text. We build three separate models, one for each type of image, and then fuse all the models to learn the online sentiment behavior. Our proposed methodology combines a supervised two-stage training approach with a threshold-moving method, which is crucial for handling the data imbalance found in OSN data. The training is carried out on existing popular datasets (i.e., for the three models) and our newly proposed dataset, the Domain Free Multimedia Sentiment Dataset (DFMSD). Our results show that introducing the threshold-moving method during training enhanced the sentiment learning performance by 5-8 percentage points compared to training without it. Combining the two-stage strategy with the threshold-moving method during training proved effective in further improving the learning performance (approximately 12% higher accuracy than the threshold-moving strategy alone). Furthermore, the proposed approach has shown a positive impact on the fusion of the three models in terms of accuracy and F-score. |
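The abstract above describes a threshold-moving step to counter the class imbalance typical of OSN data: instead of taking the argmax over the classifier's scores, the decision threshold is shifted and tuned on held-out data. The sketch below is a minimal illustration of that general idea, not the paper's actual code; the function names, the binary setup, the macro-F1 selection criterion, and the candidate grid are all assumptions for the example.

```python
import numpy as np

def predict_with_threshold(probs, threshold=0.5):
    """Binary decision: predict class 1 ("positive") only if its
    probability clears the moved threshold, else class 0 ("negative")."""
    return (probs[:, 1] >= threshold).astype(int)

def tune_threshold(probs, labels, candidates=np.linspace(0.1, 0.9, 81)):
    """Pick the threshold maximizing macro F1 on a validation split."""
    def macro_f1(y_true, y_pred):
        f1s = []
        for c in (0, 1):
            tp = np.sum((y_pred == c) & (y_true == c))
            fp = np.sum((y_pred == c) & (y_true != c))
            fn = np.sum((y_pred != c) & (y_true == c))
            p = tp / (tp + fp) if tp + fp else 0.0
            r = tp / (tp + fn) if tp + fn else 0.0
            f1s.append(2 * p * r / (p + r) if p + r else 0.0)
        return float(np.mean(f1s))
    scores = [macro_f1(labels, predict_with_threshold(probs, t))
              for t in candidates]
    return float(candidates[int(np.argmax(scores))])
```

With an imbalanced minority class, the tuned threshold typically moves below 0.5 so that weaker minority-class scores still produce minority-class predictions, which is the effect the abstract attributes to threshold moving.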
first_indexed | 2024-03-13T10:23:50Z |
format | Article |
id | doaj.art-69fb595bf4f94413aaeefda1b07f1cc6 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-13T10:23:50Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-69fb595bf4f94413aaeefda1b07f1cc6 | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2023-01-01 | vol. 11, pp. 47070-47079 | DOI: 10.1109/ACCESS.2023.3274744 | IEEE document 10122531 | Fatimah Alzamzami (https://orcid.org/0000-0002-5009-3861), Abdulmotaleb El Saddik (https://orcid.org/0000-0002-7690-8547), Multimedia Communication Research Laboratory, School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, Canada | https://ieeexplore.ieee.org/document/10122531/ | Topics: Transformers; ViT; sentiment; online social media; transfer learning; threshold moving |
title | Transformer-Based Feature Fusion Approach for Multimodal Visual Sentiment Recognition Using Tweets in the Wild |
title_sort | transformer based feature fusion approach for multimodal visual sentiment recognition using tweets in the wild |
topic | Transformers; ViT; sentiment; online social media; transfer learning; threshold moving |
url | https://ieeexplore.ieee.org/document/10122531/ |
work_keys_str_mv | AT fatimahalzamzami transformerbasedfeaturefusionapproachformultimodalvisualsentimentrecognitionusingtweetsinthewild AT abdulmotalebelsaddik transformerbasedfeaturefusionapproachformultimodalvisualsentimentrecognitionusingtweetsinthewild |