Transformer-Based Feature Fusion Approach for Multimodal Visual Sentiment Recognition Using Tweets in the Wild

We present an image-based real-time sentiment analysis system that recognizes in-the-wild sentiment expressions on online social networks (OSNs). The system deploys the recently proposed transformer architecture on OSN big data to extract emotion and sentiment features from three types of images: images containing faces, images containing text, and images containing neither faces nor text. We build a separate model for each image type and then fuse the three models to learn online sentiment behavior. Our methodology combines a supervised two-stage training approach with a threshold-moving method, which is crucial for handling the class imbalance found in OSN data. Training is carried out on existing popular datasets (used for the three models) and on our newly proposed dataset, the Domain Free Multimedia Sentiment Dataset (DFMSD). Our results show that applying the threshold-moving method during training improves sentiment learning performance by 5-8 percentage points compared to training without it. Combining the two-stage strategy with threshold moving further improves learning performance, yielding approximately 12% higher accuracy than the threshold-moving strategy alone. Furthermore, the proposed approach has a positive impact on the fusion of the three models in terms of accuracy and F-score.
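As a rough illustration of the feature-fusion idea described above, the sketch below shows a minimal PyTorch fusion head that concatenates fixed-size features from three hypothetical branches (face images, text-bearing images, and images with neither) and classifies the fused vector. The module names, projection sizes, and 768-dimensional ViT features are assumptions for illustration, not the paper's actual architecture.

```python
# Minimal, hypothetical sketch of feature-level fusion of three ViT branches.
# Each branch is assumed to expose a fixed-size feature vector (e.g., the 768-d
# [CLS] embedding of a ViT); the paper's exact fusion design may differ.
import torch
import torch.nn as nn

class FusionSentimentHead(nn.Module):
    def __init__(self, feat_dim: int = 768, num_classes: int = 2):
        super().__init__()
        # One linear projection per branch, then a classifier over the
        # concatenated projections.
        self.proj_face = nn.Linear(feat_dim, 256)
        self.proj_text = nn.Linear(feat_dim, 256)
        self.proj_scene = nn.Linear(feat_dim, 256)
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(3 * 256, num_classes),
        )

    def forward(self, face_feat, text_feat, scene_feat):
        # Concatenate the projected branch features and classify the fused vector.
        fused = torch.cat(
            [self.proj_face(face_feat),
             self.proj_text(text_feat),
             self.proj_scene(scene_feat)],
            dim=-1,
        )
        return self.classifier(fused)

if __name__ == "__main__":
    head = FusionSentimentHead()
    # Dummy 768-d features for a batch of 4 images.
    f = torch.randn(4, 768)
    logits = head(f, f, f)
    print(logits.shape)  # torch.Size([4, 2])
```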

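The threshold-moving strategy for class imbalance can likewise be sketched in a generic way: rather than cutting predicted probabilities at the default 0.5, a decision threshold is tuned on a validation split to maximize F1 and then reused at inference. The candidate grid, metric choice, and toy data below are assumptions, not the authors' exact procedure.

```python
# Generic sketch of threshold moving for an imbalanced binary sentiment task.
import numpy as np
from sklearn.metrics import f1_score

def tune_threshold(val_probs: np.ndarray, val_labels: np.ndarray) -> float:
    """Scan candidate thresholds and keep the one with the best validation F1."""
    candidates = np.linspace(0.05, 0.95, 19)
    scores = [
        f1_score(val_labels, (val_probs >= t).astype(int), zero_division=0)
        for t in candidates
    ]
    return float(candidates[int(np.argmax(scores))])

def predict_with_threshold(probs: np.ndarray, threshold: float) -> np.ndarray:
    """Apply the tuned threshold instead of the default 0.5 cut-off."""
    return (probs >= threshold).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy imbalanced validation split: roughly 10% positive class.
    val_labels = (rng.random(1000) < 0.1).astype(int)
    val_probs = np.clip(0.2 * val_labels + rng.random(1000) * 0.5, 0, 1)
    t = tune_threshold(val_probs, val_labels)
    print("tuned threshold:", t)
```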

Bibliographic Details
Main Authors: Fatimah Alzamzami (ORCID: 0000-0002-5009-3861), Abdulmotaleb El Saddik (ORCID: 0000-0002-7690-8547), Multimedia Communication Research Laboratory, School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, Canada
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access, vol. 11, pp. 47070-47079
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3274744
Subjects: Transformers; ViT; sentiment; online social media; transfer learning; threshold moving
Online Access: https://ieeexplore.ieee.org/document/10122531/