YOLOFig detection model development using deep learning

Abstract: Accurate and fast fruit detection is of great significance for robotic harvesting. Nevertheless, factors such as illumination variation and occlusion make fruit detection a challenging task. A robust YOLOFig detection model was proposed to solve detection...

Bibliographic Details
Main Authors: Olarewaju Mubashiru Lawal, Huamin Zhao
Format: Article
Language: English
Published: Wiley 2021-11-01
Series: IET Image Processing
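Subjects: Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques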
Online Access: https://doi.org/10.1049/ipr2.12293
author Olarewaju Mubashiru Lawal
Huamin Zhao
collection DOAJ
description Abstract: Accurate and fast fruit detection is of great significance for robotic harvesting. Nevertheless, factors such as illumination variation and occlusion make fruit detection a challenging task. A robust YOLOFig detection model was proposed to address these challenges and to improve detection accuracy and speed. The model combines a Leaky-activated ResNet43 backbone with a new 2,3,4,3,2 residual block arrangement, a spatial pyramid pooling network (SPPNet), a feature pyramid network (FPN), complete IoU (CIoU) loss, and distance IoU non-maximum suppression (DIoU-NMS) to improve fruit detection performance. The average precision (AP) and speed (frames per second, fps) obtained were, respectively: with the 2,3,4,3,2 residual block backbone, 78.6% and 69.8 fps for YOLOv3b, 87.6% and 57.1 fps for YOLOv4b, and 89.3% and 96.8 fps for YOLOFig; with the 1,2,8,8,4 residual block backbone, 77.1% and 56.3 fps for YOLOv3, 87.1% and 52.5 fps for YOLOv4, and 87.3% and 79 fps for YOLOResNet70; and with the 3,4,6,3 residual block backbone, 85.4% and 77.1 fps for YOLOResNet50. This indicates that the new 2,3,4,3,2 backbone outperformed the 1,2,8,8,4 backbone by an average of 1.33% in AP and 15.2% in detection speed. Finally, the comparison showed that the YOLOFig detection model performed better than the other models with the same residual block arrangement. It generalizes better and is highly suitable for real-time harvesting robots.
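The averaged gains quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only one possible reading, not the authors' stated method: it assumes the models are compared pairwise (YOLOv3b with YOLOv3, YOLOv4b with YOLOv4, YOLOFig with YOLOResNet70), that the AP gain is an absolute difference in percentage points, and that the speed gain is expressed relative to the fps of the 2,3,4,3,2 model. The names new_backbone, old_backbone, pairs and average_gains are hypothetical and introduced purely for illustration.

```python
# Figures copied from the abstract: model -> (AP in %, speed in fps).
new_backbone = {"YOLOv3b": (78.6, 69.8), "YOLOv4b": (87.6, 57.1), "YOLOFig": (89.3, 96.8)}
old_backbone = {"YOLOv3": (77.1, 56.3), "YOLOv4": (87.1, 52.5), "YOLOResNet70": (87.3, 79.0)}

# ASSUMPTION: this pairing of models is not stated in the abstract.
pairs = [("YOLOv3b", "YOLOv3"), ("YOLOv4b", "YOLOv4"), ("YOLOFig", "YOLOResNet70")]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def average_gains(new, old, model_pairs):
    """Return (mean AP gain in points, mean speed gain in % of the new model's fps)."""
    ap_gain = mean(new[a][0] - old[b][0] for a, b in model_pairs)
    # ASSUMPTION: relative speed gain is normalised by the new model's fps.
    speed_gain = mean((new[a][1] - old[b][1]) / new[a][1] * 100 for a, b in model_pairs)
    return ap_gain, speed_gain


ap_gain, speed_gain = average_gains(new_backbone, old_backbone, pairs)
print(f"average AP gain: {ap_gain:.2f} percentage points")
print(f"average speed gain: {speed_gain:.1f}% of the new backbone's fps")
```

Under these assumptions the AP figure matches the reported 1.33, and the speed figure comes out at roughly 15.3%, close to but not identical to the reported 15.2%, so the authors' exact formula or rounding may differ.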
first_indexed 2024-04-10T09:01:55Z
format Article
id doaj.art-25b35de54bd64d7d8142fbde597a0f29
institution Directory Open Access Journal
issn 1751-9659
1751-9667
language English
last_indexed 2024-04-10T09:01:55Z
publishDate 2021-11-01
publisher Wiley
record_format Article
series IET Image Processing
spelling doaj.art-25b35de54bd64d7d8142fbde597a0f29; 2023-02-21T11:57:05Z; eng; Wiley; IET Image Processing; 1751-9659; 1751-9667; 2021-11-01; vol. 15, no. 13, pp. 3071-3079; 10.1049/ipr2.12293; YOLOFig detection model development using deep learning; Olarewaju Mubashiru Lawal (College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong, China); Huamin Zhao (College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong, China); https://doi.org/10.1049/ipr2.12293; Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques
title YOLOFig detection model development using deep learning
topic Optical, image and video signal processing
Image and video coding
Computer vision and image processing techniques
url https://doi.org/10.1049/ipr2.12293