Visual Clue Guidance and Consistency Matching Framework for Multimodal Named Entity Recognition

The goal of multimodal named entity recognition (MNER) is to detect entity spans in given image–text pairs and classify them into corresponding entity types. Despite the success of existing works that leverage cross-modal attention mechanisms to integrate textual and visual representations, we obser...

Bibliographic Details
Main Authors: Li He, Qingxiang Wang, Jie Liu, Jianyong Duan, Hao Wang
Format: Article
Language: English
Published: MDPI AG 2024-03-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/14/6/2333

Similar Items