Research Progress on Vision–Language Multimodal Pretraining Model Technology

Because pretrained models are not limited by the scale of annotated data and can learn general semantic information, they perform well on tasks in natural language processing and computer vision. In recent years, increasing attention has been paid to research on the multimodal pretrain...

Bibliographic Details
Main Authors: Huansha Wang, Ruiyang Huang, Jianpeng Zhang
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/11/21/3556