Cross-Modal Retrieval and Semantic Refinement for Remote Sensing Image Captioning

Two-stage remote sensing image captioning (RSIC) methods have achieved promising results by incorporating additional pre-trained remote sensing tasks to extract supplementary information and improve caption quality. However, these methods face limitations in semantic comprehension, as pre-trained de...

Bibliographic Details
Main Authors: Zhengxin Li, Wenzhe Zhao, Xuanyi Du, Guangyao Zhou, Songlin Zhang
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/16/1/196