Permuted KPCA and SMOTE to Guide GAN-Based Oversampling for Imbalanced HSI Classification

Lack of sufficient and balanced data is one of the major challenges in hyperspectral image classification. This problem can cause poor classification performance, especially in detecting or classifying samples of minority classes. The easiest way to overcome the problem is by resampling or creating...


Bibliographic Details
Main Authors: Tajul Miftahushudur, Bruce Grieve, Hujun Yin
Format: Article
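Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Generative adversarial network (GAN); hyperspectral image (HSI); imbalance classification; kernel principal component analysis (kernel PCA); synthetic minority oversampling technique (SMOTE)
Online Access: https://ieeexplore.ieee.org/document/10312814/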
_version_ 1797448896536903680
author Tajul Miftahushudur
Bruce Grieve
Hujun Yin
collection DOAJ
description Lack of sufficient and balanced data is one of the major challenges in hyperspectral image classification. This problem can cause poor classification performance, especially in detecting or classifying samples of minority classes. The easiest way to overcome the problem is by resampling or creating synthetic samples to balance the class distributions. As the most advanced generative method, generative adversarial networks (GANs) have been used for generating synthetic data. However, GANs need a sufficiently large amount of minority class data to train. In this article, we propose to leverage the synthetic minority oversampling technique (SMOTE) in GANs for creating high-quality synthetic data to tackle the imbalance problem. The main idea is to train the generator of the GAN to synthesize data from pattern vectors instead of random noise vectors, so as to guide the GAN to produce data that can expand the minority class data on the decision boundaries. We used kernel principal component analysis and SMOTE to create the pattern vectors and used a silhouette score to control and prevent overlapping issues. In addition, we applied a self-attention module and an automatic data filter to further minimize potentially wrongly labeled or overlapping samples before they are added to the training set. Experimental results on both hyperspectral and remote sensing datasets show that the proposed technique can generate more realistic, diverse, and unambiguous synthetic data, resulting in significantly improved classification performance over existing oversampling techniques.
first_indexed 2024-03-09T14:17:03Z
format Article
id doaj.art-b14549cee4c545e8868ad08fd85bcad2
institution Directory Open Access Journal
issn 2151-1535
language English
last_indexed 2024-03-09T14:17:03Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj.art-b14549cee4c545e8868ad08fd85bcad2
2023-11-29T00:00:45Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
2151-1535
2024-01-01
Volume 17, pages 489-505
DOI 10.1109/JSTARS.2023.3326963
10312814
Permuted KPCA and SMOTE to Guide GAN-Based Oversampling for Imbalanced HSI Classification
Tajul Miftahushudur, https://orcid.org/0000-0002-9184-8183, Department of Electrical and Electronic Engineering, The University of Manchester, Manchester, U.K.
Bruce Grieve, https://orcid.org/0000-0002-5130-3592, Department of Electrical and Electronic Engineering, The University of Manchester, Manchester, U.K.
Hujun Yin, https://orcid.org/0000-0002-9198-5401, Department of Electrical and Electronic Engineering, The University of Manchester, Manchester, U.K.
title Permuted KPCA and SMOTE to Guide GAN-Based Oversampling for Imbalanced HSI Classification
topic Generative adversarial network (GAN)
hyperspectral image (HSI)
imbalance classification
kernel principal component analysis (kernel PCA)
synthetic minority oversampling technique (SMOTE)
url https://ieeexplore.ieee.org/document/10312814/
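
The description above says that SMOTE is used to build the pattern vectors that guide the generator. As a rough illustration of the SMOTE step only, the following minimal sketch interpolates between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation: the function name smote_sample, its parameters, and the toy 2-D vectors (standing in for kernel PCA projections) are all assumptions made for this example, and the kernel PCA, silhouette-score, and GAN stages are not shown.

```python
import math
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Plain SMOTE interpolation (hypothetical helper, not from the paper).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


# Toy 2-D vectors assumed to stand in for kernel PCA projections
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sample(minority, k=2, n_new=4, seed=0)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region already spanned by the minority class. The description suggests this is the property the pattern vectors rely on, with a silhouette score used afterwards to limit overlap with other classes.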