Cross-scale Feature Fusion Self-attention for Image Captioning

In recent years, the encoder-decoder framework based on the self-attention mechanism has become the mainstream model in image captioning. However, self-attention in the encoder only models the visual relations of low-scale features, ignoring some effective information in high-scale visual features, thus affe...

Bibliographic Details
Main Authors:	WANG Ming-zhan, JI Jun-zhong, JIA Ao-zhe, ZHANG Xiao-dan
Format:	Article
Language:	zho
Published:	Editorial office of Computer Science 2022-10-01
Series:	Jisuanji kexue
Subjects:	image captioning | self-attention | cross-scale feature fusion
Online Access:	https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-191.pdf

Similar Items