Less is more: optimizing classification performance through feature selection in a very-high-resolution remote sensing object-based urban application

This study evaluates the impact of four feature selection (FS) algorithms in an object-based image analysis framework for very-high-resolution land use-land cover classification. The selected FS algorithms, correlation-based feature selection, mean decrease in accuracy, random forest (RF) based recu...

Bibliographic Details
Main Authors: Stefanos Georganos, Tais Grippa, Sabine Vanhuysse, Moritz Lennert, Michal Shimoni, Stamatis Kalogirou, Eleonore Wolff
Format: Article
Language: English
Published: Taylor & Francis Group 2018-03-01
Series: GIScience & Remote Sensing
Subjects: obia, land cover classification, extreme gradient boosting, feature selection, machine learning
Online Access: http://dx.doi.org/10.1080/15481603.2017.1408892
author Stefanos Georganos
Tais Grippa
Sabine Vanhuysse
Moritz Lennert
Michal Shimoni
Stamatis Kalogirou
Eleonore Wolff
collection DOAJ
description This study evaluates the impact of four feature selection (FS) algorithms in an object-based image analysis framework for very-high-resolution land use-land cover classification. The selected FS algorithms, correlation-based feature selection, mean decrease in accuracy, random forest (RF) based recursive feature elimination, and variable selection using random forest, were tested on the extreme gradient boosting, support vector machine, K-nearest neighbor, RF, and recursive partitioning classifiers. The results demonstrate that the selection of an appropriate FS method can be crucial to the performance of a machine learning classifier in terms of both accuracy and parsimony. In this scope, we propose a new model selection metric, the classification optimization score (COS), that rewards model simplicity and indirectly penalizes increased computational time and processing requirements, using the number of features of a given classification model as a surrogate. Our findings suggest that applying rigorous FS together with the COS metric may significantly reduce processing time and storage space while producing higher classification accuracy than using the initial dataset.
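The abstract does not give the COS formula, so the sketch below only illustrates the general idea it describes: choosing among models with a score that trades accuracy against the number of features used. The function `parsimony_score`, its `weight` parameter, the weighted-mean form, and the candidate values are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One classifier trained on one feature subset (made-up values)."""
    name: str
    accuracy: float   # overall accuracy in [0, 1]
    n_features: int   # number of features the model uses


def parsimony_score(accuracy: float, n_features: int, n_total: int,
                    weight: float = 0.8) -> float:
    """Hypothetical parsimony-aware score, NOT the article's COS definition.

    Weighted mean of accuracy and the fraction of features discarded,
    so that smaller feature sets score higher at equal accuracy.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be in [0, 1]")
    if not 0 < n_features <= n_total:
        raise ValueError("n_features must be in (0, n_total]")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    parsimony = 1.0 - n_features / n_total
    return weight * accuracy + (1.0 - weight) * parsimony


def select_model(candidates, n_total: int, weight: float = 0.8) -> Candidate:
    """Return the candidate with the highest parsimony-aware score."""
    return max(
        candidates,
        key=lambda c: parsimony_score(c.accuracy, c.n_features, n_total, weight),
    )


# Illustrative numbers only; they are not results from the study.
candidates = [
    Candidate("all features", accuracy=0.80, n_features=100),
    Candidate("selected subset", accuracy=0.79, n_features=20),
]
best = select_model(candidates, n_total=100)
print(best.name)
```

With `weight=1.0` the score reduces to plain accuracy, so the full feature set would win in this toy case; lowering the weight shifts the choice toward the smaller model.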
first_indexed 2024-03-11T23:09:45Z
format Article
id doaj.art-bfaeb9b3c9464fd0b26ad3e75ed49db0
institution Directory Open Access Journal
issn 1548-1603
1943-7226
language English
last_indexed 2024-03-11T23:09:45Z
publishDate 2018-03-01
publisher Taylor & Francis Group
record_format Article
series GIScience & Remote Sensing
spelling doaj.art-bfaeb9b3c9464fd0b26ad3e75ed49db0; 2023-09-21T12:34:14Z; eng; Taylor & Francis Group; GIScience & Remote Sensing; ISSN 1548-1603, 1943-7226; 2018-03-01; vol. 55, no. 2, pp. 221-242; doi:10.1080/15481603.2017.1408892
Stefanos Georganos (Universite Libre de Bruxelles)
Tais Grippa (Universite Libre de Bruxelles)
Sabine Vanhuysse (Universite Libre de Bruxelles)
Moritz Lennert (Universite Libre de Bruxelles)
Michal Shimoni (Royal Military Academy)
Stamatis Kalogirou (Harokopio University of Athens)
Eleonore Wolff (Universite Libre de Bruxelles)
title Less is more: optimizing classification performance through feature selection in a very-high-resolution remote sensing object-based urban application
topic obia
land cover classification
extreme gradient boosting
feature selection
machine learning
url http://dx.doi.org/10.1080/15481603.2017.1408892