YOLOFig detection model development using deep learning
Abstract: Accurate and fast fruit detection is of great significance for robotic harvesting. Nevertheless, factors such as illumination variation and occlusion make fruit detection a challenging task. A robust YOLOFig detection model was proposed to solve detection...
Main Authors: | Olarewaju Mubashiru Lawal, Huamin Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2021-11-01 |
Series: | IET Image Processing |
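Subjects: | Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques |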
Online Access: | https://doi.org/10.1049/ipr2.12293 |
_version_ | 1828010565548638208 |
---|---|
author | Olarewaju Mubashiru Lawal; Huamin Zhao |
author_facet | Olarewaju Mubashiru Lawal; Huamin Zhao |
author_sort | Olarewaju Mubashiru Lawal |
collection | DOAJ |
description | Abstract: Accurate and fast fruit detection is of great significance for robotic harvesting. Nevertheless, factors such as illumination variation and occlusion make fruit detection a challenging task. A robust YOLOFig detection model was proposed to address these challenges and to improve detection accuracy and speed. The YOLOFig detection model incorporates a Leaky-activated ResNet43 backbone with a new 2,3,4,3,2 residual block arrangement, a spatial pyramid pooling network (SPPNet), a feature pyramid network (FPN), complete IoU (CIoU) loss, and distance IoU non-maximum suppression (DIoU-NMS) to improve fruit detection performance. The average precision (AP) and speed (frames per second, fps) obtained were as follows. Under the 2,3,4,3,2 residual block arrangement: YOLOv3b, 78.6% and 69.8 fps; YOLOv4b, 87.6% and 57.1 fps; YOLOFig, 89.3% and 96.8 fps. Under the 1,2,8,8,4 arrangement: YOLOv3, 77.1% and 56.3 fps; YOLOv4, 87.1% and 52.5 fps; YOLOResNet70, 87.3% and 79 fps. Under the 3,4,6,3 arrangement: YOLOResNet50, 85.4% and 77.1 fps. These results indicate that the new 2,3,4,3,2 backbone outperformed the 1,2,8,8,4 backbone by 1.33% in average AP and by 15.2% in detection speed. Finally, the comparison showed that the YOLOFig detection model performed better than the other models at the same level of residual block arrangement. It generalizes better and is highly suitable for real-time harvesting robots. |
first_indexed | 2024-04-10T09:01:55Z |
format | Article |
id | doaj.art-25b35de54bd64d7d8142fbde597a0f29 |
institution | Directory Open Access Journal |
issn | 1751-9659 1751-9667 |
language | English |
last_indexed | 2024-04-10T09:01:55Z |
publishDate | 2021-11-01 |
publisher | Wiley |
record_format | Article |
series | IET Image Processing |
spelling | doaj.art-25b35de54bd64d7d8142fbde597a0f29; 2023-02-21T11:57:05Z; eng; Wiley; IET Image Processing; 1751-9659; 1751-9667; 2021-11-01; 15; 13; 3071; 3079; 10.1049/ipr2.12293; YOLOFig detection model development using deep learning; Olarewaju Mubashiru Lawal (0); Huamin Zhao (1); College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong, China (0); College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong, China (1); https://doi.org/10.1049/ipr2.12293; Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques |
spellingShingle | Olarewaju Mubashiru Lawal; Huamin Zhao; YOLOFig detection model development using deep learning; IET Image Processing; Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques |
title | YOLOFig detection model development using deep learning |
title_full | YOLOFig detection model development using deep learning |
title_fullStr | YOLOFig detection model development using deep learning |
title_full_unstemmed | YOLOFig detection model development using deep learning |
title_short | YOLOFig detection model development using deep learning |
title_sort | yolofig detection model development using deep learning |
topic | Optical, image and video signal processing; Image and video coding; Computer vision and image processing techniques |
url | https://doi.org/10.1049/ipr2.12293 |
work_keys_str_mv | AT olarewajumubashirulawal yolofigdetectionmodeldevelopmentusingdeeplearning AT huaminzhao yolofigdetectionmodeldevelopmentusingdeeplearning |
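
The backbone comparison reported in the abstract can be checked with a little arithmetic. The sketch below uses only the AP and fps figures quoted in the description field. The AP gap is the difference between the mean AP of the three 2,3,4,3,2 models and the mean AP of the three 1,2,8,8,4 models. The abstract does not say how the speed figure was derived, so the speed calculation is an assumption: it pairs the models by position (YOLOv3b with YOLOv3, YOLOv4b with YOLOv4, YOLOFig with YOLOResNet70) and averages the fps difference relative to the faster 2,3,4,3,2 model. The names `new_backbone`, `old_backbone`, `ap_gap`, and `speed_gap` are illustrative only.

```python
# (AP in %, speed in fps) pairs copied from the abstract.
new_backbone = {  # 2,3,4,3,2 residual block arrangement
    "YOLOv3b": (78.6, 69.8),
    "YOLOv4b": (87.6, 57.1),
    "YOLOFig": (89.3, 96.8),
}
old_backbone = {  # 1,2,8,8,4 residual block arrangement
    "YOLOv3": (77.1, 56.3),
    "YOLOv4": (87.1, 52.5),
    "YOLOResNet70": (87.3, 79.0),
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Difference between the mean APs of the two backbone groups (percentage points).
ap_gap = (
    mean(ap for ap, _ in new_backbone.values())
    - mean(ap for ap, _ in old_backbone.values())
)

# Assumption: models are paired by position, and each fps difference is
# taken relative to the 2,3,4,3,2 model before averaging.
pairs = zip(new_backbone.values(), old_backbone.values())
speed_gap = 100 * mean(
    (new_fps - old_fps) / new_fps for (_, new_fps), (_, old_fps) in pairs
)

print(f"AP gap: {ap_gap:.2f} points, speed gap: {speed_gap:.1f}%")
```

Under these assumptions the AP gap comes out at 1.33, matching the abstract, and the speed gap comes out at roughly 15.3%, close to the reported 15.2%. Other ways of averaging the speeds give noticeably different values, so the pairing used here should be read as one plausible reconstruction rather than the authors' stated method.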