Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion
The goal of multimodal sentiment analysis is to achieve reliable and robust sentiment analysis by exploiting the complementary information provided by multiple modalities. Recently, extracting deep semantic features with neural networks has achieved remarkable results in multimodal sentiment analysis, but the...
Main Author: | CHEN Zhen
---|---|
Format: | Article
Language: | zho (Chinese)
Published: | Editorial office of Computer Science, 2023-03-01
Series: | Jisuanji kexue
Subjects: | multimodal sentiment analysis; gated information fusion networks; iterative attention; ERNIE; auto-fusion network
Online Access: | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2023-50-3-298.pdf
_version_ | 1797845083419049984 |
---|---|
author | CHEN Zhen, PU Yuanyuan, ZHAO Zhengpeng, XU Dan, QIAN Wenhua |
author_facet | CHEN Zhen, PU Yuanyuan, ZHAO Zhengpeng, XU Dan, QIAN Wenhua |
author_sort | CHEN Zhen, PU Yuanyuan, ZHAO Zhengpeng, XU Dan, QIAN Wenhua |
collection | DOAJ |
description | The goal of multimodal sentiment analysis is to achieve reliable and robust sentiment analysis by exploiting the complementary information provided by multiple modalities. Recently, extracting deep semantic features with neural networks has achieved remarkable results in multimodal sentiment analysis, but the fusion of features at different levels of multimodal information is also an important factor in determining the effectiveness of sentiment analysis. Thus, a multimodal sentiment analysis model based on adaptive gated information fusion (AGIF) is proposed. Firstly, the different levels of visual and color features extracted by Swin Transformer and ResNet are organically fused through a gated information fusion network according to their contribution to sentiment analysis. Secondly, because sentiment is abstract and complex, the sentiment of an image is often expressed by multiple subtle local regions, and these sentiment-discriminative regions can be located accurately by iterative attention based on past information. The ERNIE pre-training model is used to address the inability of Word2Vec and GloVe to handle polysemy. Finally, an auto-fusion network is used to dynamically fuse the features of each modality, solving the information redundancy caused by deterministic operations (concatenation or TFN) when constructing the multimodal joint representation. Extensive experiments on three publicly available real-world datasets demonstrate the effectiveness of the proposed model. |
first_indexed | 2024-04-09T17:32:47Z |
format | Article |
id | doaj.art-aef31f6baebb44d2a2604adecfa74cb2 |
institution | Directory Open Access Journal |
issn | 1002-137X |
language | zho |
last_indexed | 2024-04-09T17:32:47Z |
publishDate | 2023-03-01 |
publisher | Editorial office of Computer Science |
record_format | Article |
series | Jisuanji kexue |
spelling | doaj.art-aef31f6baebb44d2a2604adecfa74cb2; last updated 2023-04-18T02:33:25Z; Jisuanji kexue, Vol. 50, No. 3 (2023-03-01), pp. 298-306; ISSN 1002-137X; DOI 10.11896/jsjkx.220100156; Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion; CHEN Zhen, PU Yuanyuan, ZHAO Zhengpeng, XU Dan, QIAN Wenhua; affiliations: 1. College of Information Science and Engineering, Yunnan University, Kunming 650504, China; 2. University Key Laboratory of Internet of Things Technology and Application, Yunnan Province, Kunming 650504, China; https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2023-50-3-298.pdf |
spellingShingle | CHEN Zhen, PU Yuanyuan, ZHAO Zhengpeng, XU Dan, QIAN Wenhua Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion Jisuanji kexue multimodal sentiment analysis|gated information fusion networks|iterative attention|ernie|auto-fusion network |
title | Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion |
title_full | Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion |
title_fullStr | Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion |
title_full_unstemmed | Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion |
title_short | Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion |
title_sort | multimodal sentiment analysis based on adaptive gated information fusion |
topic | multimodal sentiment analysis|gated information fusion networks|iterative attention|ernie|auto-fusion network |
url | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2023-50-3-298.pdf |
work_keys_str_mv | AT chenzhenpuyuanyuanzhaozhengpengxudanqianwenhua multimodalsentimentanalysisbasedonadaptivegatedinformationfusion |
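The gated information fusion step described in the record's abstract, a learned sigmoid gate weighing two feature streams into one representation, can be sketched in a few lines. This is a hypothetical NumPy illustration of the general technique, not the authors' implementation: the function name `gated_fusion` and the parameters `W` and `b` are assumptions, and in the real model the gate would be trained end-to-end alongside the feature extractors.

```python
import numpy as np

def gated_fusion(visual, color, W, b):
    """Fuse two feature streams with an elementwise sigmoid gate.

    visual: stand-in for Swin Transformer visual features, shape (batch, dim)
    color:  stand-in for ResNet color features, shape (batch, dim)
    W, b:   hypothetical gate parameters (trained in a real model)
    """
    z = np.concatenate([visual, color], axis=-1) @ W + b  # gate logits
    g = 1.0 / (1.0 + np.exp(-z))                          # gate values in (0, 1)
    # Convex combination: g decides, per dimension, how much each
    # stream contributes to the fused representation.
    return g * visual + (1.0 - g) * color

rng = np.random.default_rng(0)
dim = 8
W = rng.standard_normal((2 * dim, dim)) * 0.1  # toy gate weights
b = np.zeros(dim)
visual = rng.standard_normal((4, dim))
color = rng.standard_normal((4, dim))
fused = gated_fusion(visual, color, W, b)
print(fused.shape)  # (4, 8)
```

Because the gate lies in (0, 1), every fused value stays between the corresponding visual and color values, which is what lets the network adaptively favor whichever stream contributes more to sentiment prediction.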