Spatial Encoding and Multi-layer Joint Encoding Enhanced Transformer for Image Captioning
Image captioning is one of the hot research topics in the field of computer vision.It is a cross-media data analysis task that combines computer vision and natural language processing.It describes the image by understanding the content of the image and generating captions that are both semantically...
Main Author: | |
---|---|
Format: | Article |
Language: | zho |
Published: |
Editorial office of Computer Science
2022-10-01
|
Series: | Jisuanji kexue |
Subjects: | |
Online Access: | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-151.pdf |