High Quality Object Detection for Multiresolution Remote Sensing Imagery Using Cascaded Multi-Stage Detectors

Deep-learning-based object detectors have substantially improved state-of-the-art object detection in remote sensing images in terms of precision and degree of automation. Nevertheless, the large variation of the object scales makes it difficult to achieve high-quality detection across multiresoluti...


Bibliographic Details
Main Authors: Binglong Wu, Yuan Shen, Shanxin Guo, Jinsong Chen, Luyi Sun, Hongzhong Li, Yong Ao
Format: Article
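Language: English
Published: MDPI AG 2022-04-01
Series: Remote Sensing
Subjects: object detection; cascaded detectors; Intersection over Union (IoU) threshold; classification ensemble; bounding box regression; multiresolution remote sensing images
Online Access: https://www.mdpi.com/2072-4292/14/9/2091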
_version_ 1797503082902323200
author Binglong Wu
Yuan Shen
Shanxin Guo
Jinsong Chen
Luyi Sun
Hongzhong Li
Yong Ao
author_sort Binglong Wu
collection DOAJ
description Deep-learning-based object detectors have substantially improved the state of the art in object detection for remote sensing images, in terms of both precision and degree of automation. Nevertheless, the large variation in object scales makes it difficult to achieve high-quality detection across multiresolution remote sensing images, where quality is defined by the Intersection over Union (IoU) threshold used in training. In addition, the imbalance between positive and negative samples across multiresolution images degrades detection precision. Recently, it was found that a Cascade region-based convolutional neural network (R-CNN) can potentially achieve higher-quality detection by introducing a cascaded three-stage structure with progressively increasing IoU thresholds. However, the performance of Cascade R-CNN degraded when a fourth stage was added. We investigated the cause and found that a mismatch between the region of interest (RoI) features and the classifier could be responsible for the degradation. Herein, we propose a Cascade R-CNN++ structure to address this issue and to extend the three-stage architecture to multiple stages for general use. Specifically, for cascaded classification, we propose a new ensemble strategy for the classifier and RoI features to improve classification accuracy at inference. For localization, we modified the loss function of the bounding box regressor to obtain higher sensitivity around zero. Experiments on the DOTA dataset demonstrated that Cascade R-CNN++ outperforms Cascade R-CNN in terms of precision and detection quality. We conducted further analysis on multiresolution remote sensing images to verify model transferability across different object scales.
first_indexed 2024-03-10T03:45:26Z
format Article
id doaj.art-f2dd7a03ec354c3cbbe10f4006b8c0c7
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T03:45:26Z
publishDate 2022-04-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-f2dd7a03ec354c3cbbe10f4006b8c0c7
2023-11-23T09:10:26Z
eng
MDPI AG
Remote Sensing
2072-4292
2022-04-01
14(9), 2091
10.3390/rs14092091
High Quality Object Detection for Multiresolution Remote Sensing Imagery Using Cascaded Multi-Stage Detectors
Binglong Wu [0]
Yuan Shen [1]
Shanxin Guo [2]
Jinsong Chen [3]
Luyi Sun [4]
Hongzhong Li [5]
Yong Ao [6]
[0]-[5] Center for Geo-Spatial Information, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
[6] School of Earth Science and Resources, Chang’an University, 126 Yanta Road, Xi’an 710054, China
Deep-learning-based object detectors have substantially improved the state of the art in object detection for remote sensing images, in terms of both precision and degree of automation. Nevertheless, the large variation in object scales makes it difficult to achieve high-quality detection across multiresolution remote sensing images, where quality is defined by the Intersection over Union (IoU) threshold used in training. In addition, the imbalance between positive and negative samples across multiresolution images degrades detection precision. Recently, it was found that a Cascade region-based convolutional neural network (R-CNN) can potentially achieve higher-quality detection by introducing a cascaded three-stage structure with progressively increasing IoU thresholds. However, the performance of Cascade R-CNN degraded when a fourth stage was added. We investigated the cause and found that a mismatch between the region of interest (RoI) features and the classifier could be responsible for the degradation. Herein, we propose a Cascade R-CNN++ structure to address this issue and to extend the three-stage architecture to multiple stages for general use. Specifically, for cascaded classification, we propose a new ensemble strategy for the classifier and RoI features to improve classification accuracy at inference. For localization, we modified the loss function of the bounding box regressor to obtain higher sensitivity around zero. Experiments on the DOTA dataset demonstrated that Cascade R-CNN++ outperforms Cascade R-CNN in terms of precision and detection quality. We conducted further analysis on multiresolution remote sensing images to verify model transferability across different object scales.
https://www.mdpi.com/2072-4292/14/9/2091
object detection
cascaded detectors
Intersection over Union (IoU) threshold
classification ensemble
bounding box regression
multiresolution remote sensing images
title High Quality Object Detection for Multiresolution Remote Sensing Imagery Using Cascaded Multi-Stage Detectors
title_sort high quality object detection for multiresolution remote sensing imagery using cascaded multi stage detectors
topic object detection
cascaded detectors
Intersection over Union (IoU) threshold
classification ensemble
bounding box regression
multiresolution remote sensing images
url https://www.mdpi.com/2072-4292/14/9/2091