MFVT: Multilevel Feature Fusion Vision Transformer and RAMix Data Augmentation for Fine-Grained Visual Categorization

The introduction and application of the Vision Transformer (ViT) have advanced fine-grained visual categorization (FGVC). However, applying ViT directly to FGVC tasks raises several problems. ViT classifies using only the class token in the last layer, ignoring the local and l...

Bibliographic Details
Main Authors: Xinyao Lv, Hao Xia, Na Li, Xudong Li, Ruoming Lan
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/11/21/3552

Similar Items