Masked Autoencoders in Computer Vision: A Comprehensive Survey

Masked autoencoders (MAE) are a Transformer-based deep learning method. Originally applied to images, they have since been extended to video, audio, and other temporal prediction tasks. In computer vision, MAE performs well on classification, prediction, and object detection tasks. In t...

Bibliographic Details
Main Authors: Zexian Zhou, Xiaojing Liu
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10278410/