Thangka Image–Text Matching Based on Adaptive Pooling Layer and Improved Transformer

Image–text matching is an active research topic in multimodal tasks that integrate image and text processing. To address the difficulty of associating image and text data in the Thangka multimodal knowledge graph, we propose an image–text matching method based on the Visual Semanti...

Bibliographic Details
Main Authors: Kaijie Wang, Tiejun Wang, Xiaoran Guo, Kui Xu, Jiao Wu
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/14/2/807