YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception

Bibliographic Details
Main Authors: Yipu Li, Yuan Rao, Xiu Jin, Zhaohui Jiang, Yuwei Wang, Tan Wang, Fengyi Wang, Qing Luo, Lu Liu
Format: Article
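Language: English
Published: MDPI AG 2022-12-01
Series: Sensors
Subjects: object detection; agricultural application; transformer encoder; multi-scale feature; collaboration perception
Online Access: https://www.mdpi.com/1424-8220/23/1/30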
author Yipu Li
Yuan Rao
Xiu Jin
Zhaohui Jiang
Yuwei Wang
Tan Wang
Fengyi Wang
Qing Luo
Lu Liu
collection DOAJ
description Precise pear detection and recognition is an essential step toward modernizing orchard management. However, because occlusion is ubiquitous in orchards and images are acquired from varied locations, the pears in the acquired images may be quite small and occluded, causing high false detection and object loss rates. In this paper, a multi-scale collaborative perception network, YOLOv5s-FP (Fusion and Perception), which couples local and global features, was proposed for pear detection. Specifically, a pear dataset with a high proportion of small and occluded pears was constructed, comprising 3680 images acquired with cameras mounted on a ground tripod and a UAV platform. The cross-stage partial (CSP) module was optimized to extract global features through a transformer encoder; these global features were then fused with local features by an attentional feature fusion mechanism. Subsequently, a modified path aggregation network oriented to collaboration perception of multi-scale features was proposed by incorporating a transformer encoder, the optimized CSP, and new skip connections. In quantitative comparisons with other typical object detection networks of the YOLO series, YOLOv5s-FP recorded the highest average precision, 96.12%, with less detection time and computational cost. In qualitative experiments, the proposed network achieved superior visual performance with stronger robustness to changes in occlusion and illumination conditions, in particular detecting pears of different sizes in highly dense, overlapping environments and in areas with non-normal illumination. Therefore, the proposed YOLOv5s-FP network was practicable for accurate, real-time detection of in-field pears and could be an advantageous component of the technology for monitoring pear growth status and implementing automated harvesting in unmanned orchards.
first_indexed 2024-03-09T09:41:33Z
format Article
id doaj.art-4129c139789545ccad842e768101510f
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T09:41:33Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-4129c139789545ccad842e768101510f
2023-12-02T00:52:43Z
eng
MDPI AG
Sensors
1424-8220
2022-12-01
23(1), 30
10.3390/s23010030
YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
Yipu Li (0): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Yuan Rao (1): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Xiu Jin (2): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Zhaohui Jiang (3): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Yuwei Wang (4): Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
Tan Wang (5): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Fengyi Wang (6): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Qing Luo (7): College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
Lu Liu (8): Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
https://www.mdpi.com/1424-8220/23/1/30
object detection; agricultural application; transformer encoder; multi-scale feature; collaboration perception
title YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
topic object detection
agricultural application
transformer encoder
multi-scale feature
collaboration perception
url https://www.mdpi.com/1424-8220/23/1/30