Ensemble Learning Approaches Based on Covariance Pooling of CNN Features for High Resolution Remote Sensing Scene Classification

Remote sensing image scene classification, which consists of labeling remote sensing images with a set of categories based on their content, has received remarkable attention for many applications such as land use mapping. Standard approaches are based on the multi-layer representation of first-orde...

Bibliographic Details
Main Authors:	Sara Akodad, Lionel Bombrun, Junshi Xia, Yannick Berthoumieu, Christian Germain
Format:	Article
Language:	English
Published:	MDPI AG 2020-10-01
Series:	Remote Sensing
Subjects:	transfer learning; covariance matrices; log-Euclidean metric; ensemble learning; remote sensing scene classification; Fisher vector
Online Access:	https://www.mdpi.com/2072-4292/12/20/3292

_version_	1797551433489317888
author	Sara Akodad, Lionel Bombrun, Junshi Xia, Yannick Berthoumieu, Christian Germain
author_sort	Sara Akodad
collection	DOAJ
description	Remote sensing image scene classification, which consists of labeling remote sensing images with a set of categories based on their content, has received considerable attention in many applications such as land use mapping. Standard approaches are based on the multi-layer representation of first-order convolutional neural network (CNN) features. However, second-order CNNs have recently been shown to outperform traditional first-order CNNs for many computer vision tasks. Hence, the aim of this paper is to show the use of second-order statistics of CNN features for remote sensing scene classification. These statistics take the form of covariance matrices computed locally or globally on the output of a CNN. However, these data points do not lie in a Euclidean space but on a Riemannian manifold, so Euclidean tools are not suited to manipulating them. Other metrics, such as the log-Euclidean one, should be considered. The log-Euclidean approach consists of projecting the set of covariance matrices onto a tangent space defined at a reference point. In this tangent space, which is a vector space, conventional machine learning algorithms can be used, such as the Fisher vector encoding or an SVM classifier. Based on this log-Euclidean framework, we propose a novel transfer learning approach composed of two hybrid architectures based on covariance pooling of CNN features; the first is local and the second is global. They rely on features extracted from models pre-trained on the ImageNet dataset and then processed with machine learning algorithms. The first hybrid architecture consists of an ensemble learning approach with the log-Euclidean Fisher vector encoding of region covariance matrices computed locally on the first layers of a CNN. The second is an ensemble learning approach based on the covariance pooling of CNN features extracted globally from the deepest layers. These two ensemble learning approaches are then combined based on the strategy of the most diverse ensembles. For validation and comparison purposes, the proposed approach is tested on various challenging remote sensing datasets. Experimental results exhibit a significant gain of approximately 2% in overall accuracy for the proposed approach compared to a similar state-of-the-art method based on covariance pooling of CNN features (on the UC Merced dataset).
first_indexed	2024-03-10T15:44:39Z
format	Article
id	doaj.art-b3d5a766f29d44af8f3dc2b5b0b5f5f1
institution	Directory of Open Access Journals
issn	2072-4292
language	English
last_indexed	2024-03-10T15:44:39Z
publishDate	2020-10-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-b3d5a766f29d44af8f3dc2b5b0b5f5f1; 2023-11-20T16:32:30Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-10-01; 12; 20; 3292; 10.3390/rs12203292; Ensemble Learning Approaches Based on Covariance Pooling of CNN Features for High Resolution Remote Sensing Scene Classification; Sara Akodad (CNRS, IMS, UMR n°5218, Groupe Signal et Image, University of Bordeaux, F-33405 Talence, France); Lionel Bombrun (CNRS, IMS, UMR n°5218, Groupe Signal et Image, University of Bordeaux, F-33405 Talence, France); Junshi Xia (RIKEN Center for Advanced Intelligence Project (AIP), RIKEN, Tokyo 103-0027, Japan); Yannick Berthoumieu (CNRS, IMS, UMR n°5218, Groupe Signal et Image, University of Bordeaux, F-33405 Talence, France); Christian Germain (CNRS, IMS, UMR n°5218, Groupe Signal et Image, University of Bordeaux, F-33405 Talence, France); https://www.mdpi.com/2072-4292/12/20/3292; transfer learning; covariance matrices; log-Euclidean metric; ensemble learning; remote sensing scene classification; Fisher vector
title	Ensemble Learning Approaches Based on Covariance Pooling of CNN Features for High Resolution Remote Sensing Scene Classification
topic	transfer learning; covariance matrices; log-Euclidean metric; ensemble learning; remote sensing scene classification; Fisher vector
url	https://www.mdpi.com/2072-4292/12/20/3292

Similar Items