Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection

With the development of the Mobile Internet, more and more users publish multi-modal posts on social media platforms. Fake news detection has become an increasingly challenging task. Although there are many works using deep schemes to extract and combine textual and visual representation in the post...

Bibliographic Details
Main Authors: Long Ying, Hui Yu, Jinguang Wang, Yongze Ji, Shengsheng Qian
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Multi-level neural networks; fake news detection; multi-modal fusion
Online Access: https://ieeexplore.ieee.org/document/9541113/
_version_ 1818910526141366272
author Long Ying
Hui Yu
Jinguang Wang
Yongze Ji
Shengsheng Qian
collection DOAJ
description With the development of the mobile Internet, more and more users publish multi-modal posts on social media platforms, and fake news detection has become an increasingly challenging task. Although many works use deep learning schemes to extract and combine the textual and visual representations of a post, most existing methods do not sufficiently exploit the complementary multi-modal information, which contains semantic concepts and entities, to enhance each modality. Moreover, these methods do not model and incorporate the rich multi-level semantics of the text to improve fake news detection. In this paper, we propose a novel end-to-end Multi-level Multi-modal Cross-attention Network (MMCN), which exploits the multi-level semantics of textual content and jointly integrates the relationships within the same modality and across different modalities (textual and visual) of social multimedia posts in a unified framework. Pre-trained BERT and ResNet models are employed to generate high-quality representations for text words and image regions, respectively. A multi-modal cross-attention network is then designed to fuse the feature embeddings of the text words and image regions by simultaneously considering data relationships within the same modality and across different modalities. Specifically, because different layers of the transformer architecture produce different feature representations, we employ a multi-level encoding network to capture the rich multi-level semantics and enhance the representations of posts. Extensive experiments on two public datasets (WEIBO and PHEME) demonstrate that the proposed MMCN compares favorably with state-of-the-art models.
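The fusion step in the description above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "fusing" means scaled dot-product attention over the concatenated sequence of text-word and image-region vectors, and it omits learned query, key, and value projections. The function name `joint_attention` and the toy feature vectors are hypothetical; in the paper the inputs would come from BERT and ResNet.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(text_feats, image_feats):
    """Attend over text words and image regions in one joint sequence.

    Assumption: identity projections (no learned weights), so every
    element attends to elements of its own modality and of the other
    modality at the same time.
    """
    tokens = text_feats + image_feats          # joint sequence
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    fused, weights = [], []
    for query in tokens:
        # Scaled dot product between this element and every element.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all element vectors gives the fused vector.
        fused.append([sum(w * value[d] for w, value in zip(row, tokens))
                      for d in range(dim)])
    return fused, weights


# Toy stand-ins for BERT word vectors and ResNet region vectors.
text_feats = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
image_feats = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]]
fused, weights = joint_attention(text_feats, image_feats)
```

Each row of `weights` is a distribution over all five elements, so the first text word places as much weight on the matching image region as on itself, which is the kind of cross-modal relationship the description refers to.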
first_indexed 2024-12-19T22:44:12Z
format Article
id doaj.art-df526dbacd1e4535985ce41bc0e8fd39
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T22:44:12Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-df526dbacd1e4535985ce41bc0e8fd39
2022-12-21T20:02:59Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
Volume 9, pages 132363-132373
DOI 10.1109/ACCESS.2021.3114093
IEEE Xplore document 9541113
Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection
Long Ying (https://orcid.org/0000-0001-6834-5441), School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China
Hui Yu, School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China
Jinguang Wang, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China
Yongze Ji, School of Information Science and Engineering, China University of Petroleum, Beijing, China
Shengsheng Qian (https://orcid.org/0000-0001-9488-2208), National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
https://ieeexplore.ieee.org/document/9541113/
Multi-level neural networks
fake news detection
multi-modal fusion
title Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection
topic Multi-level neural networks
fake news detection
multi-modal fusion
url https://ieeexplore.ieee.org/document/9541113/