An Improved Mask RCNN Model for Segmentation of ‘Kyoho’ (<i>Vitis labruscana</i>) Grape Bunch and Detection of Its Maturity Level

The ‘Kyoho’ (<i>Vitis labruscana</i>) grape is one of the main fresh fruits, and accurately segmenting the grape bunch and detecting its maturity level are important for building an intelligent grape orchard. Grapes in the natural environment have different shapes, occlusion,...

Bibliographic Details
Main Authors: Yane Li, Ying Wang, Dayu Xu, Jiaojiao Zhang, Jun Wen
Format: Article
Language: English
Published: MDPI AG 2023-04-01
Series: Agriculture
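Subjects: Mask RCNN algorithm; instance segmentation; attention module; convolutional neural network; grape maturity level detection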
Online Access: https://www.mdpi.com/2077-0472/13/4/914
author Yane Li
Ying Wang
Dayu Xu
Jiaojiao Zhang
Jun Wen
collection DOAJ
description The ‘Kyoho’ (<i>Vitis labruscana</i>) grape is one of the main fresh fruits, and accurately segmenting the grape bunch and detecting its maturity level are important for building an intelligent grape orchard. Grapes in the natural environment vary in shape and are subject to occlusion, complex backgrounds, and varying illumination, which leads to poor accuracy in grape maturity detection. In this paper, an improved Mask RCNN-based algorithm that adds attention mechanism modules is proposed to establish a grape bunch segmentation and maturity level detection model. The dataset comprised 656 grape bunches with different backgrounds, acquired in a grape-growing environment under natural conditions, and was divided into four groups according to maturity level. We first compared grape bunch segmentation and maturity level detection models established with YoloV3, Solov2, Yolact, and Mask RCNN to select the backbone network; based on their performance, Mask RCNN was selected as the backbone network. Then, three attention mechanism modules, namely squeeze-and-excitation attention (SE), the convolutional block attention module (CBAM), and coordinate attention (CA), were each introduced into the ResNet50/101 backbone network of Mask RCNN. The mean average precision (<i>mAP</i>), <i>mAP</i><sub>0.75</sub>, and average accuracy of the model established with ResNet101 + CA reached 0.934, 0.891, and 0.944, which were 6.1%, 4.4%, and 9.4% higher, respectively, than those of the ResNet101-based model. The error rate of this model was 5.6%, lower than that of the ResNet101-based model. In addition, we compared the performances of the Mask RCNN models with different attention mechanism modules added.
The <i>mAP</i>, <i>mAP</i><sub>0.75</sub>, and accuracy of the Mask RCNN50/101 + CA-based models were higher than those of the Mask RCNN50/101 + SE- and Mask RCNN50/101 + CBAM-based models. Furthermore, the performances of models constructed by combining attention mechanism modules with different network layers of ResNet50 and ResNet101 were compared; the ResNet101-based combination with CA performed better than the ResNet50-based combination with CA. These results indicate that the proposed Mask RCNN ResNet101 + CA model captures the features of a grape bunch well. The proposed model has practical significance for the segmentation of grape bunches and the evaluation of the grape maturity level, which contributes to the construction of intelligent vineyards.
first_indexed 2024-03-11T05:20:08Z
format Article
id doaj.art-a642fbf06979427f8aa9e82b5e3188fd
institution Directory Open Access Journal
issn 2077-0472
language English
last_indexed 2024-03-11T05:20:08Z
publishDate 2023-04-01
publisher MDPI AG
record_format Article
series Agriculture
spelling doaj.art-a642fbf06979427f8aa9e82b5e3188fd 2023-11-17T17:55:17Z eng MDPI AG Agriculture 2077-0472 2023-04-01 13(4), 914 10.3390/agriculture13040914
Yane Li, College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China
Ying Wang, College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China
Dayu Xu, College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China
Jiaojiao Zhang, College of Food and Health, Zhejiang A&F University, Hangzhou 311300, China
Jun Wen, Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
title An Improved Mask RCNN Model for Segmentation of ‘Kyoho’ (<i>Vitis labruscana</i>) Grape Bunch and Detection of Its Maturity Level
topic Mask RCNN algorithm
instance segmentation
attention module
convolutional neural network
grape maturity level detection
url https://www.mdpi.com/2077-0472/13/4/914