MFVT: Multilevel Feature Fusion Vision Transformer and RAMix Data Augmentation for Fine-Grained Visual Categorization
The introduction and application of the Vision Transformer (ViT) has promoted the development of fine-grained visual categorization (FGVC). However, there are some problems when directly applying ViT to FGVC tasks. ViT only classifies using the class token in the last layer, ignoring the local and l...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2022-10-01
|
Series: | Electronics |
Subjects: | |
Online Access: | https://www.mdpi.com/2079-9292/11/21/3552 |