Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers

The complex backgrounds of crop disease images and the small contrast between the disease area and the background can easily cause confusion, which seriously affects the robustness and accuracy of apple disease-identification models. To solve the above problems, this paper proposes a Vision Transfo...


Bibliographic Details
Main Authors: Xiaopeng Li, Shuqin Li
Format: Article
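Language: English
Published: MDPI AG 2022-06-01
Series: Agriculture
Subjects: identification of apple diseases; image classification; lightweight model; Vision Transformer; hybrid model; complex environments
Online Access: https://www.mdpi.com/2077-0472/12/6/884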
_version_ 1797490919135510528
author Xiaopeng Li
Shuqin Li
author_facet Xiaopeng Li
Shuqin Li
author_sort Xiaopeng Li
collection DOAJ
description The complex backgrounds of crop disease images and the small contrast between the disease area and the background can easily cause confusion, which seriously affects the robustness and accuracy of apple disease-identification models. To solve these problems, this paper proposes ConvViT, a lightweight Vision Transformer-based apple leaf disease-identification model that extracts effective features of crop disease spots to identify crop diseases. ConvViT combines convolutional and Transformer structures; the convolutional structure is used to extract the global features of the image, and the Transformer structure is used to obtain the local features of the disease region to help the CNN see better. The patch embedding method is improved to retain more edge information of the image and promote the information exchange between patches in the Transformer. The parameters and FLOPs (floating-point operations) of the model are significantly reduced by using depthwise separable convolution and linear-complexity multi-head attention operations. Experimental results on a self-built apple leaf disease dataset with complex backgrounds show that ConvViT achieves identification results (96.85%) comparable to those of the state-of-the-art Swin-Tiny. Its parameters and FLOPs are only 32.7% and 21.7% of those of Swin-Tiny, and ConvViT is significantly ahead of MobileNetV3, EfficientNet-B0, and other models, which indicates that the proposed model is an effective disease-identification model with practical application value.
first_indexed 2024-03-10T00:39:56Z
format Article
id doaj.art-13bb872cc41443948c2a7ccd1fee8a3c
institution Directory Open Access Journal
issn 2077-0472
language English
last_indexed 2024-03-10T00:39:56Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Agriculture
spelling doaj.art-13bb872cc41443948c2a7ccd1fee8a3c; 2023-11-23T15:08:14Z; eng; MDPI AG; Agriculture; 2077-0472; 2022-06-01; vol. 12, no. 6, art. 884; doi:10.3390/agriculture12060884; Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers; Xiaopeng Li, Shuqin Li (both: College of Information Engineering, Northwest A&F University, Xianyang 712100, China); https://www.mdpi.com/2077-0472/12/6/884; identification of apple diseases; image classification; lightweight model; Vision Transformer; hybrid model; complex environments
title Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers
title_sort transformer help cnn see better a lightweight hybrid apple disease identification model based on transformers
topic identification of apple diseases
image classification
lightweight model
Vision Transformer
hybrid model
complex environments
url https://www.mdpi.com/2077-0472/12/6/884
work_keys_str_mv AT xiaopengli transformerhelpcnnseebetteralightweighthybridapplediseaseidentificationmodelbasedontransformers
AT shuqinli transformerhelpcnnseebetteralightweighthybridapplediseaseidentificationmodelbasedontransformers
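
Note on the description field: the abstract attributes part of the parameter reduction to depthwise separable convolution. The sketch below illustrates only the general parameter arithmetic of that technique, not ConvViT's actual layers. The function names, the 3x3 kernel, and the 64 and 128 channel counts are hypothetical values chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution (bias ignored)."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
KERNEL, C_IN, C_OUT = 3, 64, 128

standard = standard_conv_params(KERNEL, C_IN, C_OUT)
separable = depthwise_separable_params(KERNEL, C_IN, C_OUT)
ratio = separable / standard

print(f"standard conv weights:  {standard}")
print(f"separable conv weights: {separable}")
print(f"ratio: {ratio:.4f}")
```

For these assumed sizes the ratio equals 1/C_OUT + 1/KERNEL^2, so the saving grows with the number of output channels and with the kernel size.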