CDEST: Class Distinguishability-Enhanced Self-Training Method for Adopting Pre-Trained Models to Downstream Remote Sensing Image Semantic Segmentation


Bibliographic Details
Main Authors: Ming Zhang, Xin Gu, Ji Qi, Zhenshi Zhang, Hemeng Yang, Jun Xu, Chengli Peng, Haifeng Li
Format: Article
Language: English
Published: MDPI AG, 2024-04-01
Series: Remote Sensing
Subjects: semantic segmentation; remote sensing (RS); transfer learning; fine-tuning method; contrastive learning; self-training
Online Access: https://www.mdpi.com/2072-4292/16/7/1293
author Ming Zhang
Xin Gu
Ji Qi
Zhenshi Zhang
Hemeng Yang
Jun Xu
Chengli Peng
Haifeng Li
collection DOAJ
description The self-supervised learning (SSL) technique, driven by massive unlabeled data, is expected to be a promising solution for semantic segmentation of remote sensing images (RSIs) with limited labeled data, revolutionizing transfer learning. Traditional ‘local-to-local’ transfer from small, local datasets to another target dataset plays an ever-shrinking role due to RSIs’ diverse distribution shifts. Instead, SSL promotes a ‘global-to-local’ transfer paradigm, in which generalized models pre-trained on arbitrarily large unlabeled datasets are fine-tuned to the target dataset to overcome data distribution shifts. However, the SSL pre-trained models may contain both useful and useless features for the downstream semantic segmentation task, due to the gap between the SSL tasks and the downstream task. To adapt such pre-trained models to semantic segmentation tasks, traditional supervised fine-tuning methods that use only a small number of labeled samples may drop out useful features due to overfitting. The main reason behind this is that supervised fine-tuning aims to map a few training samples from the high-dimensional, sparse image space to the low-dimensional, compact semantic space defined by the downstream labels, resulting in a degradation of the distinguishability. To address the above issues, we propose a class distinguishability-enhanced self-training (CDEST) method to support global-to-local transfer. First, the self-training module in CDEST introduces a semi-supervised learning mechanism to fully utilize the large amount of unlabeled data in the downstream task to increase the size and diversity of the training data, thus alleviating the problem of biased overfitting of the model. Second, the supervised and semi-supervised contrastive learning modules of CDEST can explicitly enhance the class distinguishability of features, helping to preserve the useful features learned from pre-training while adapting to downstream tasks. 
We evaluate the proposed CDEST method on four RSI semantic segmentation datasets, and our method achieves optimal experimental results on all four datasets compared to supervised fine-tuning as well as three semi-supervised fine-tuning methods.
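The two mechanisms the abstract describes, a self-training module that converts confident model predictions on unlabeled data into pseudo-labels, and contrastive learning modules that explicitly enhance class distinguishability, can be sketched generically as below. This is an illustrative NumPy sketch, not the authors' released implementation; the function names, the 0.9 confidence threshold, and the temperature value are all assumptions for demonstration.

```python
import numpy as np

def pseudo_labels(probs: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Self-training step: turn softmax outputs on unlabeled samples into
    pseudo-labels, keeping only confident predictions (-1 marks ignored ones)."""
    labels = probs.argmax(axis=-1)
    labels[probs.max(axis=-1) < threshold] = -1
    return labels

def supervised_contrastive_loss(features: np.ndarray, labels: np.ndarray,
                                temperature: float = 0.1) -> float:
    """SupCon-style loss: for each anchor, pull same-class features together
    and push the rest apart, enhancing class distinguishability."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature          # pairwise cosine similarity / T
    n, total, count = len(labels), 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        log_denom = np.log(np.exp(sim[i, others]).sum())
        for j in others:
            if labels[j] == labels[i]:   # positive pair: same class
                total += log_denom - sim[i, j]
                count += 1
    return total / max(count, 1)
```

In a semi-supervised variant, the same contrastive loss would be applied with the pseudo-labels standing in for ground truth on the unlabeled pool, so both objectives act on the full downstream dataset rather than only the few labeled samples.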
first_indexed 2024-04-24T10:35:18Z
format Article
id doaj.art-23aa0c3aadeb4f0fbacc1655848c7594
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-04-24T10:35:18Z
publishDate 2024-04-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-23aa0c3aadeb4f0fbacc1655848c7594
doi 10.3390/rs16071293
citation Remote Sensing, vol. 16, no. 7, article 1293, MDPI AG, 2024-04-01
affiliation Ming Zhang: School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
affiliation Xin Gu: China Academy of Launch Vehicle Technology Research and Development Center, Beijing 100076, China
affiliation Ji Qi: School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
affiliation Zhenshi Zhang: Undergraduate School, National University of Defense Technology, Changsha 410080, China
affiliation Hemeng Yang: Tianjin Zhongwei Aerospace Data System Technology Co., Ltd., Tianjin 300301, China
affiliation Jun Xu: Electric Power Research Institute of State Grid Fujian Electric Power Co., Ltd., Fuzhou 350007, China
affiliation Chengli Peng: School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
affiliation Haifeng Li: School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
title CDEST: Class Distinguishability-Enhanced Self-Training Method for Adopting Pre-Trained Models to Downstream Remote Sensing Image Semantic Segmentation
topic semantic segmentation
remote sensing (RS)
transfer learning
fine-tuning method
contrastive learning
self-training
url https://www.mdpi.com/2072-4292/16/7/1293