An Improved Mask RCNN Model for Segmentation of ‘Kyoho’ (<i>Vitis labruscana</i>) Grape Bunch and Detection of Its Maturity Level
The ‘Kyoho’ (<i>Vitis labruscana</i>) grape is one of the main fresh fruits; accurately segmenting the grape bunch and detecting its maturity level is important for building an intelligent grape orchard. Grapes in the natural environment have different shapes, occlusion,...
Main Authors: | Yane Li, Ying Wang, Dayu Xu, Jiaojiao Zhang, Jun Wen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-04-01 |
Series: | Agriculture |
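Subjects: | Mask RCNN algorithm, instance segmentation, attention module, convolutional neural network, grape maturity level detection |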
Online Access: | https://www.mdpi.com/2077-0472/13/4/914 |
_version_ | 1797606795595743232 |
---|---|
author | Yane Li; Ying Wang; Dayu Xu; Jiaojiao Zhang; Jun Wen |
author_sort | Yane Li |
collection | DOAJ |
description | The ‘Kyoho’ (<i>Vitis labruscana</i>) grape is one of the main fresh fruits; accurately segmenting the grape bunch and detecting its maturity level is important for building an intelligent grape orchard. Grapes in the natural environment vary in shape and appear under occlusion, complex backgrounds, and varying illumination, which leads to poor accuracy in grape maturity detection. In this paper, an improved Mask RCNN-based algorithm is proposed that adds attention mechanism modules to establish a grape bunch segmentation and maturity level detection model. The dataset contained 656 grape bunches against different backgrounds, acquired in a grape-growing environment under natural conditions, and was divided into four groups according to maturity level. We first compared grape bunch segmentation and maturity level detection models built with YOLOv3, SOLOv2, YOLACT, and Mask RCNN; based on their performance, Mask RCNN was selected as the backbone network. Then, three attention mechanism modules, squeeze-and-excitation attention (SE), the convolutional block attention module (CBAM), and coordinate attention (CA), were each introduced into the ResNet50/101 backbone of Mask RCNN. The mean average precision (<i>mAP</i>), <i>mAP</i><sub>0.75</sub>, and average accuracy of the model built with ResNet101 + CA reached 0.934, 0.891, and 0.944, which were 6.1%, 4.4%, and 9.4% higher, respectively, than those of the ResNet101-based model. The error rate of this model was 5.6%, lower than that of the ResNet101-based model. We also compared the Mask RCNN models with different attention mechanism modules: the <i>mAP</i>, <i>mAP</i><sub>0.75</sub>, and accuracy of the Mask RCNN50/101 + CA-based models were higher than those of the Mask RCNN50/101 + SE- and Mask RCNN50/101 + CBAM-based models. Furthermore, models built by combining the attention mechanism modules with ResNet50 and ResNet101, which differ in the number of network layers, were compared; the ResNet101-based model combined with CA performed better than the ResNet50-based model combined with CA. These results indicate that the proposed Mask RCNN ResNet101 + CA model captures the features of a grape bunch well. The proposed model has practical significance for segmenting grape bunches and evaluating the grape maturity level, which contributes to the construction of intelligent vineyards. |
first_indexed | 2024-03-11T05:20:08Z |
format | Article |
id | doaj.art-a642fbf06979427f8aa9e82b5e3188fd |
institution | Directory Open Access Journal |
issn | 2077-0472 |
language | English |
last_indexed | 2024-03-11T05:20:08Z |
publishDate | 2023-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Agriculture |
spelling | doaj.art-a642fbf06979427f8aa9e82b5e3188fd; 2023-11-17T17:55:17Z; eng; MDPI AG; Agriculture; ISSN 2077-0472; 2023-04-01; vol. 13, no. 4, article 914; doi:10.3390/agriculture13040914; An Improved Mask RCNN Model for Segmentation of ‘Kyoho’ (<i>Vitis labruscana</i>) Grape Bunch and Detection of Its Maturity Level; Yane Li (0), Ying Wang (1), Dayu Xu (2), Jiaojiao Zhang (3), Jun Wen (4); affiliations in the order listed: College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China; College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China; College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, China; College of Food and Health, Zhejiang A&F University, Hangzhou 311300, China; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; https://www.mdpi.com/2077-0472/13/4/914; Mask RCNN algorithm; instance segmentation; attention module; convolutional neural network; grape maturity level detection |
title | An Improved Mask RCNN Model for Segmentation of ‘Kyoho’ (<i>Vitis labruscana</i>) Grape Bunch and Detection of Its Maturity Level |
topic | Mask RCNN algorithm; instance segmentation; attention module; convolutional neural network; grape maturity level detection |
url | https://www.mdpi.com/2077-0472/13/4/914 |
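
Note on the attention modules named in the description: squeeze-and-excitation (SE) attention reweights the channels of a feature map using a small gating network. The sketch below is a minimal pure-Python illustration of that general idea, not the authors' implementation. The function name `se_attention`, the weight matrices `w1` and `w2`, and the toy input are all hypothetical; biases are omitted, and the record does not say where in ResNet50/101 the modules were inserted.

```python
import math


def se_attention(feature_map, w1, w2):
    """Illustrative squeeze-and-excitation channel attention (hypothetical helper).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: reduction weights, shape (C // r) x C   (assumed values, no bias).
    w2: expansion weights, shape C x (C // r)   (assumed values, no bias).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling collapses each channel to one number.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of size 2 x 2, reduced to 1 hidden unit (made-up numbers).
toy_map = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
toy_scaled, toy_gates = se_attention(toy_map, [[1.0, -1.0]], [[1.0], [-1.0]])
print(toy_gates)
```

In a trained network the weights are learned, so the gates emphasize channels that help the task; CBAM and coordinate attention extend this idea with spatial or positional information and are not shown here.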