Apple Detection in Complex Scene Using the Improved YOLOv4 Model

To enable apple-picking robots to detect apples quickly and accurately against complex orchard backgrounds, we propose an improved You Only Look Once version 4 (YOLOv4) model and data augmentation methods. First, crawler technology is used to collect pertinent apple images from the...


Bibliographic Details
Main Authors: Lin Wu, Jie Ma, Yuehua Zhao, Hong Liu
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Agronomy
Subjects: apple detection; YOLOv4; EfficientNet; picking robot; data augmentation
Online Access: https://www.mdpi.com/2073-4395/11/3/476
_version_ 1797414538157490176
author Lin Wu
Jie Ma
Yuehua Zhao
Hong Liu
author_sort Lin Wu
collection DOAJ
description To enable apple-picking robots to detect apples quickly and accurately against complex orchard backgrounds, we propose an improved You Only Look Once version 4 (YOLOv4) model and data augmentation methods. First, crawler technology is used to collect pertinent apple images from the Internet for labeling. To address the shortage of image data caused by random occlusion between leaves, a leaf illustration data augmentation method is proposed in this paper in addition to traditional data augmentation techniques. Second, because the YOLOv4 model is large and computationally expensive, its backbone network, Cross Stage Partial Darknet53 (CSPDarknet53), is replaced by EfficientNet, and a convolution layer (Conv2D) is added to the three outputs to further adjust and extract the features, which makes the model lighter and reduces the computational complexity. Finally, the apple detection experiment is performed on 2670 expanded samples. The test results show that the EfficientNet-B0-YOLOv4 model proposed in this paper has better detection performance than YOLOv3, YOLOv4, and Faster R-CNN with ResNet, which are state-of-the-art apple detection models. The average values of Recall, Precision, and F1 reach 97.43%, 95.52%, and 96.54%, respectively, and the average detection time per frame is 0.338 s, which shows that the proposed method can be applied effectively in the vision systems of picking robots in the apple industry.
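As a reading aid for the Recall, Precision, and F1 figures quoted in the description, the following minimal Python sketch shows the standard definitions of these metrics in terms of detection counts. The function name `detection_metrics` and the counts used are hypothetical and are not taken from the paper; this record does not state how the authors computed or averaged their scores.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) from true-positive, false-positive,
    and false-negative detection counts."""
    # Precision: fraction of predicted apples that are real apples.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of real apples that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only (not data from the paper).
p, r, f1 = detection_metrics(tp=90, fp=10, fn=5)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

Because F1 is a harmonic mean, an F1 averaged over several test runs generally differs slightly from the F1 computed from averaged precision and recall, so averaged figures need not satisfy the formula exactly.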
first_indexed 2024-03-09T05:34:45Z
format Article
id doaj.art-b5bbdc07dc074b29af49a41b4a0166d0
institution Directory Open Access Journal
issn 2073-4395
language English
last_indexed 2024-03-09T05:34:45Z
publishDate 2021-03-01
publisher MDPI AG
record_format Article
series Agronomy
spelling doaj.art-b5bbdc07dc074b29af49a41b4a0166d0; 2023-12-03T12:29:48Z; eng; MDPI AG; Agronomy; 2073-4395; 2021-03-01; 11; 3; 476; 10.3390/agronomy11030476
Apple Detection in Complex Scene Using the Improved YOLOv4 Model
Lin Wu (0), Jie Ma (1), Yuehua Zhao (2), Hong Liu (3)
(0-3) School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China
https://www.mdpi.com/2073-4395/11/3/476
apple detection; YOLOv4; EfficientNet; picking robot; data augmentation
title Apple Detection in Complex Scene Using the Improved YOLOv4 Model
topic apple detection
YOLOv4
EfficientNet
picking robot
data augmentation
url https://www.mdpi.com/2073-4395/11/3/476