A Dual-Model Architecture with Grouping-Attention-Fusion for Remote Sensing Scene Classification
Remote sensing images contain complex backgrounds and multi-scale objects, which pose a challenging task for scene classification. The performance is highly dependent on the capacity of the scene representation as well as the discriminability of the classifier. Although multiple models possess bette...
Main Authors: | Junge Shen, Tong Zhang, Yichen Wang, Ruxin Wang, Qi Wang, Min Qi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-01-01 |
Series: | Remote Sensing |
Subjects: | remote sensing; dual-model architecture; grouping-attention-fusion; scene classification |
Online Access: | https://www.mdpi.com/2072-4292/13/3/433 |
_version_ | 1827597652259241984 |
---|---|
author | Junge Shen; Tong Zhang; Yichen Wang; Ruxin Wang; Qi Wang; Min Qi |
author_sort | Junge Shen |
collection | DOAJ |
description | Remote sensing images contain complex backgrounds and multi-scale objects, which pose a challenging task for scene classification. The performance is highly dependent on the capacity of the scene representation as well as the discriminability of the classifier. Although multiple models possess better properties than a single model on these aspects, the fusion strategy for these models is a key component to maximize the final accuracy. In this paper, we construct a novel dual-model architecture with a grouping-attention-fusion strategy to improve the performance of scene classification. Specifically, the model employs two different convolutional neural networks (CNNs) for feature extraction, where the grouping-attention-fusion strategy is used to fuse the features of the CNNs in a fine and multi-scale manner. In this way, the resultant feature representation of the scene is enhanced. Moreover, to address the issue of similar appearances between different scenes, we develop a loss function which encourages small intra-class diversities and large inter-class distances. Extensive experiments are conducted on four scene classification datasets: the UCM land-use dataset, the WHU-RS19 dataset, the AID dataset, and the OPTIMAL-31 dataset. The experimental results demonstrate the superiority of the proposed method in comparison with state-of-the-art methods. |
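As a rough, hypothetical sketch of the approach described in the abstract, the code below pairs two standard CNN backbones, fuses their feature maps channel-group by channel-group with small attention heads, and trains with cross-entropy plus a center-loss-style term (small intra-class diversity) and a center-separation hinge (large inter-class distance). The backbone choices (ResNet-18 and VGG-16), module names, group count, loss weights, and margin are illustrative assumptions, not the paper's actual grouping-attention-fusion module or loss; see the article at the URL above for the authors' definitions.

```python
# Hypothetical illustration only: backbone choices, module layout, group count,
# and loss weights are assumptions, not the paper's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class GroupingAttentionFusion(nn.Module):
    """Fuse two feature maps group-wise with one small attention head per
    channel group (a sketch of the grouping-attention-fusion idea)."""

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        self.attn = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(2 * channels // groups, channels // groups, kernel_size=1),
                nn.Sigmoid(),
            )
            for _ in range(groups)
        )

    def forward(self, fa: torch.Tensor, fb: torch.Tensor) -> torch.Tensor:
        a_groups = fa.chunk(self.groups, dim=1)
        b_groups = fb.chunk(self.groups, dim=1)
        fused = []
        for a, b, attn in zip(a_groups, b_groups, self.attn):
            w = attn(torch.cat([a, b], dim=1))    # per-group channel weights in (0, 1)
            fused.append(w * a + (1.0 - w) * b)   # attention-weighted mix of the branches
        return torch.cat(fused, dim=1)


class DualModelClassifier(nn.Module):
    """Two different CNN backbones whose features are fused before the classifier."""

    def __init__(self, num_classes: int):
        super().__init__()
        # Two different backbones, as the abstract suggests (specific choices assumed).
        self.branch_a = nn.Sequential(*list(models.resnet18(weights=None).children())[:-2])
        self.branch_b = models.vgg16(weights=None).features
        self.align_b = nn.Conv2d(512, 512, kernel_size=1)    # align channel layout
        self.fusion = GroupingAttentionFusion(channels=512, groups=4)
        self.head = nn.Linear(512, num_classes)

    def forward(self, x: torch.Tensor):
        fa = self.branch_a(x)                                # (B, 512, h, w)
        fb = self.align_b(self.branch_b(x))
        fb = F.interpolate(fb, size=fa.shape[-2:])           # match spatial size
        emb = F.adaptive_avg_pool2d(self.fusion(fa, fb), 1).flatten(1)
        return self.head(emb), emb


def discriminative_loss(logits, embeddings, labels, centers,
                        lam=0.01, mu=0.01, margin=10.0):
    """Cross-entropy plus (i) a center-loss-style pull toward class centers
    (small intra-class diversity) and (ii) a hinge pushing centers at least
    `margin` apart (large inter-class distance). Weights are made-up defaults;
    `centers` would normally be a learnable (num_classes, dim) parameter."""
    ce = F.cross_entropy(logits, labels)
    intra = ((embeddings - centers[labels]) ** 2).sum(dim=1).mean()
    dists = torch.cdist(centers, centers)
    off_diag = dists[~torch.eye(len(centers), dtype=torch.bool, device=dists.device)]
    inter = F.relu(margin - off_diag).mean()
    return ce + lam * intra + mu * inter


# Minimal usage check with random data:
# model = DualModelClassifier(num_classes=30)
# centers = torch.randn(30, 512)
# logits, emb = model(torch.randn(2, 3, 224, 224))
# loss = discriminative_loss(logits, emb, torch.tensor([0, 1]), centers)
```

The attention-weighted convex mix keeps the fused map at the same channel count as each branch, so the classifier head is unchanged regardless of how many groups are used.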
first_indexed | 2024-03-09T03:36:22Z |
format | Article |
id | doaj.art-50d6772456fd4f098b332a03f5a2a559 |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-09T03:36:22Z |
publishDate | 2021-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-50d6772456fd4f098b332a03f5a2a559 (indexed 2023-12-03T14:47:38Z); eng; MDPI AG; Remote Sensing; ISSN 2072-4292; published 2021-01-01; vol. 13, no. 3, art. 433; doi:10.3390/rs13030433. Author affiliations: Junge Shen, Tong Zhang, Yichen Wang, and Qi Wang (Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an 710072, China); Ruxin Wang (National Pilot School of Software, Yunnan University, Kunming 650504, China); Min Qi (School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China). |
title | A Dual-Model Architecture with Grouping-Attention-Fusion for Remote Sensing Scene Classification |
title_short | A Dual-Model Architecture with Grouping-Attention-Fusion for Remote Sensing Scene Classification |
title_sort | dual model architecture with grouping attention fusion for remote sensing scene classification |
topic | remote sensing dual-model architecture grouping-attention-fusion scene classification |
url | https://www.mdpi.com/2072-4292/13/3/433 |