Cross-scale Feature Fusion Self-attention for Image Captioning
In recent years,the encoder-decoder framework based on self-attention mechanism has become the mainstream model in image captioning.However,self-attention in the encoder only models the visual relations of low-scale features,ignoring some effective information in high-scale visual features,thus affe...
Main Author: | |
---|---|
Format: | Article |
Language: | zho |
Published: |
Editorial office of Computer Science
2022-10-01
|
Series: | Jisuanji kexue |
Subjects: | |
Online Access: | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-191.pdf |