Separate Syntax and Semantics: Part-of-Speech-Guided Transformer for Image Captioning

Transformer-based image captioning models have recently achieved remarkable performance by using new fully attentive paradigms. However, existing models generally follow the conventional language model of predicting the next word conditioned on the visual features and partially generated words. They...

Bibliographic Details
Main Authors: Dong Wang, Bing Liu, Yong Zhou, Mingming Liu, Peng Liu, Rui Yao
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/12/23/11875