YOLOFig detection model development using deep learning

Abstract: Fruit detection, in both accuracy and speed, is of great significance for robotic harvesting. Nevertheless, factors such as illumination variation and occlusion make fruit detection a challenging task. A robust YOLOFig detection model was proposed to solve detection...

Bibliographic Details
Main Authors:	Olarewaju Mubashiru Lawal, Huamin Zhao
Format:	Article
Language:	English
Published:	Wiley 2021-11-01
Series:	IET Image Processing
Subjects:	Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques
Online Access:	https://doi.org/10.1049/ipr2.12293

_version_	1828010565548638208
author	Olarewaju Mubashiru Lawal; Huamin Zhao
author_facet	Olarewaju Mubashiru Lawal; Huamin Zhao
author_sort	Olarewaju Mubashiru Lawal
collection	DOAJ
description	Abstract: Fruit detection, in both accuracy and speed, is of great significance for robotic harvesting. Nevertheless, factors such as illumination variation and occlusion make fruit detection a challenging task. A robust YOLOFig detection model was proposed to address these challenges and to improve detection accuracy and speed. The model incorporates a Leaky-activated ResNet43 backbone with a new 2,3,4,3,2 residual block arrangement, a spatial pyramid pooling network (SPPNet), a feature pyramid network (FPN), complete IoU (CIoU) loss, and distance IoU non-maximum suppression (DIoU-NMS) to improve fruit detection performance. The average precision (AP) and speed (frames per second, fps) obtained were as follows. With the 2,3,4,3,2 residual block backbone: YOLOv3b, 78.6% and 69.8 fps; YOLOv4b, 87.6% and 57.1 fps; YOLOFig, 89.3% and 96.8 fps. With the 1,2,8,8,4 residual block backbone: YOLOv3, 77.1% and 56.3 fps; YOLOv4, 87.1% and 52.5 fps; YOLOResNet70, 87.3% and 79 fps. With the 3,4,6,3 residual block backbone: YOLOResNet50, 85.4% and 77.1 fps. This indicates that the new 2,3,4,3,2 backbone outperformed the 1,2,8,8,4 backbone by an average of 1.33% in AP and 15.2% in detection speed. Finally, the comparison showed that the YOLOFig detection model performed better than the other models at the same level of residual block arrangement. It generalizes better and is highly suitable for real-time harvesting robots.
first_indexed	2024-04-10T09:01:55Z
format	Article
id	doaj.art-25b35de54bd64d7d8142fbde597a0f29
institution	Directory of Open Access Journals
issn	1751-9659 1751-9667
language	English
last_indexed	2024-04-10T09:01:55Z
publishDate	2021-11-01
publisher	Wiley
record_format	Article
series	IET Image Processing
spelling	doaj.art-25b35de54bd64d7d8142fbde597a0f29; 2023-02-21T11:57:05Z; eng; Wiley; IET Image Processing; 1751-9659, 1751-9667; 2021-11-01; vol. 15, no. 13, pp. 3071-3079; 10.1049/ipr2.12293; YOLOFig detection model development using deep learning; Olarewaju Mubashiru Lawal (College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong, China); Huamin Zhao (College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong, China); https://doi.org/10.1049/ipr2.12293; Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques
spellingShingle	Olarewaju Mubashiru Lawal; Huamin Zhao; YOLOFig detection model development using deep learning; IET Image Processing; Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques
title	YOLOFig detection model development using deep learning
title_full	YOLOFig detection model development using deep learning
title_fullStr	YOLOFig detection model development using deep learning
title_full_unstemmed	YOLOFig detection model development using deep learning
title_short	YOLOFig detection model development using deep learning
title_sort	yolofig detection model development using deep learning
topic	Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques
url	https://doi.org/10.1049/ipr2.12293
work_keys_str_mv	AT olarewajumubashirulawal yolofigdetectionmodeldevelopmentusingdeeplearning AT huaminzhao yolofigdetectionmodeldevelopmentusingdeeplearning
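
The backbone comparison quoted in the description field (an average gain of 1.33% in AP and 15.2% in speed for the 2,3,4,3,2 arrangement over 1,2,8,8,4) can be roughly reproduced from the per-model figures in the same abstract. The sketch below is only an illustration: the pairing of models (YOLOv3b with YOLOv3, YOLOv4b with YOLOv4, YOLOFig with YOLOResNet70) and both formulas are assumptions, not the authors' stated method. Under those assumptions the AP gain is an absolute difference in percentage points, and the speed gain is a relative difference normalized by the new backbone's fps.

```python
# AP (%) and speed (fps) as quoted in the abstract.
# The model pairing and both formulas below are assumptions for illustration.
new_backbone = {  # 2,3,4,3,2 residual block arrangement
    "YOLOv3b": (78.6, 69.8),
    "YOLOv4b": (87.6, 57.1),
    "YOLOFig": (89.3, 96.8),
}
old_backbone = {  # 1,2,8,8,4 residual block arrangement
    "YOLOv3": (77.1, 56.3),
    "YOLOv4": (87.1, 52.5),
    "YOLOResNet70": (87.3, 79.0),
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Pair the models in insertion order (dicts keep insertion order in Python 3.7+).
pairs = list(zip(new_backbone.values(), old_backbone.values()))

# Mean absolute AP difference, in percentage points.
ap_gain = mean(new_ap - old_ap for (new_ap, _), (old_ap, _) in pairs)

# Mean relative speed difference, as a fraction of the new backbone's fps.
speed_gain = mean(
    (new_fps - old_fps) / new_fps for (_, new_fps), (_, old_fps) in pairs
)

print(f"mean AP gain: {ap_gain:.2f} percentage points")
print(f"mean speed gain: {speed_gain * 100:.2f}% of the new backbone's fps")
```

With these assumptions the AP result agrees with the quoted 1.33, and the speed result comes out near 15.26%, close to the quoted 15.2%. Normalizing by the older backbone's fps instead gives a noticeably larger figure, so the choice of denominator here is a guess that happens to match the abstract.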

Similar Items