Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers

The complex backgrounds of crop disease images and the small contrast between the disease area and the background can easily cause confusion, which seriously affects the robustness and accuracy of apple disease-identification models. To solve the above problems, this paper proposes a Vision Transfo...

Bibliographic Details
Main Authors:	Xiaopeng Li, Shuqin Li
Format:	Article
Language:	English
Published:	MDPI AG 2022-06-01
Series:	Agriculture
Subjects:	identification of apple diseases; image classification; lightweight model; Vision Transformer; hybrid model; complex environments
Online Access:	https://www.mdpi.com/2077-0472/12/6/884

collection	DOAJ
description	The complex backgrounds of crop disease images and the low contrast between the disease area and the background can easily cause confusion, which seriously affects the robustness and accuracy of apple disease-identification models. To solve these problems, this paper proposes ConvViT, a lightweight Vision Transformer-based apple leaf disease-identification model that extracts effective features of disease spots in order to identify crop diseases. ConvViT combines convolutional and Transformer structures; the convolutional structure is used to extract the global features of the image, and the Transformer structure is used to obtain the local features of the disease region to help the CNN see better. The patch embedding method is improved to retain more edge information of the image and to promote information exchange between patches in the Transformer. The parameters and FLOPs (floating-point operations) of the model are significantly reduced by using depthwise separable convolution and linear-complexity multi-head attention. Experimental results on a self-built apple leaf disease dataset with complex backgrounds show that ConvViT achieves identification results (96.85%) comparable to those of the state-of-the-art Swin-Tiny. Its parameters and FLOPs are only 32.7% and 21.7% of those of Swin-Tiny, and it is significantly ahead of MobileNetV3, EfficientNet-b0, and other models, which indicates that the proposed model is an effective disease-identification model with practical application value.
first_indexed	2024-03-10T00:39:56Z
format	Article
id	doaj.art-13bb872cc41443948c2a7ccd1fee8a3c
institution	Directory Open Access Journal
issn	2077-0472
language	English
last_indexed	2024-03-10T00:39:56Z
publishDate	2022-06-01
publisher	MDPI AG
record_format	Article
series	Agriculture
spelling	doaj.art-13bb872cc41443948c2a7ccd1fee8a3c; 2023-11-23T15:08:14Z; eng; MDPI AG; Agriculture; ISSN 2077-0472; 2022-06-01; vol. 12, no. 6, article 884; DOI 10.3390/agriculture12060884; Xiaopeng Li and Shuqin Li, both of the College of Information Engineering, Northwest A&F University, Xianyang 712100, China; https://www.mdpi.com/2077-0472/12/6/884
title	Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers

Similar Items