Cross Encoder-Decoder Transformer with Global-Local Visual Extractor for Medical Image Captioning

Transformer-based approaches have shown good results in image captioning tasks. However, current approaches are limited in that they generate text from the global features of an entire image. Therefore, we propose the following novel methods for generating better image captions: (1) The Global-Local Visual...

Bibliographic Details
Main Authors:	Hojun Lee, Hyunjun Cho, Jieun Park, Jinyeong Chae, Jihie Kim
Format:	Article
Language:	English
Published:	MDPI AG 2022-02-01
Series:	Sensors
Subjects:	medical image captioning; deep learning; transformer
Online Access:	https://www.mdpi.com/1424-8220/22/4/1429
