Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation

RGB-thermal semantic segmentation enables unmanned platforms to perceive and characterize their surrounding environment, which is critical for autonomous driving tasks. Deep-learning-based algorithms have achieved dominance in terms of accuracy and robustness. However, their large parameter size...


Bibliographic Details
Main Authors:	Guo, Xiaodong, Zhou, Wujie, Liu, Tong
Other Authors:	School of Computer Science and Engineering
Format:	Journal Article
Language:	English
Published:	2024
Subjects:	Computer and Information Science Urban scene Autonomous driving
Online Access:	https://hdl.handle.net/10356/180181

_version_	1811691806711087104
author	Guo, Xiaodong Zhou, Wujie Liu, Tong
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Guo, Xiaodong Zhou, Wujie Liu, Tong
author_sort	Guo, Xiaodong
collection	NTU
description	RGB-thermal semantic segmentation enables unmanned platforms to perceive and characterize their surrounding environment, which is critical for autonomous driving tasks. Deep-learning-based algorithms have achieved dominance in terms of accuracy and robustness. However, their large parameter sizes and significant computational demands impede their application on terminal devices. To address this challenge, we propose a novel strategy for achieving a balance between effectiveness and compactness. It includes a robust teacher network, CLNet-T, and a streamlined student network, CLNet-S. Using knowledge distillation (KD), we obtained an optimized model called CLNet-S. Specifically, CLNet-T and CLNet-S are identical in all aspects except for the feature extraction component. Both include a multi-attribute hierarchical feature interaction module (MHFI) and a detail-guided semantic decoder (DGSD). The MHFI first filters features according to the characteristics of the low- and high-level features. It then gradually combines complementary and common features from the different modalities in distinct receptive fields. The DGSD uses edge and distribution information to guide semantic decoding, thereby improving the segmentation accuracy at class boundaries. To enhance the performance of the compact student model, our KD strategy includes detail and semantic response distillation (DSRD) and contrastive learning-based feature distillation (CLFD). In practice, DSRD enables the student model to gain knowledge from the teacher model at both the detail and semantic levels. At the same time, CLFD increases the similarity of features within the same categories and emphasizes the distinctiveness of features between different categories in both the student and teacher models. Extensive experiments conducted on two standard datasets consistently demonstrate that both CLNet-T and CLNet-S outperform other state-of-the-art methods. The code and results are available at https://github.com/xiaodonguo/CLNet.
first_indexed	2024-10-01T06:25:45Z
format	Journal Article
id	ntu-10356/180181
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T06:25:45Z
publishDate	2024
record_format	dspace
spelling	ntu-10356/180181 2024-09-23T05:02:29Z This work was supported by the National Natural Science Foundation of China (62371422). 2024-09-23T05:02:28Z 2024-09-23T05:02:28Z 2024 Journal Article Guo, X., Zhou, W. & Liu, T. (2024). Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation. Knowledge-Based Systems, 292, 111588. https://dx.doi.org/10.1016/j.knosys.2024.111588 0950-7051 https://hdl.handle.net/10356/180181 10.1016/j.knosys.2024.111588 2-s2.0-85188845184 292 111588 en Knowledge-Based Systems © 2024 Elsevier B.V. All rights reserved.
spellingShingle	Computer and Information Science Urban scene Autonomous driving Guo, Xiaodong Zhou, Wujie Liu, Tong Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
title	Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
title_full	Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
title_fullStr	Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
title_full_unstemmed	Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
title_short	Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
title_sort	contrastive learning based knowledge distillation for rgb thermal urban scene semantic segmentation
topic	Computer and Information Science Urban scene Autonomous driving
url	https://hdl.handle.net/10356/180181
work_keys_str_mv	AT guoxiaodong contrastivelearningbasedknowledgedistillationforrgbthermalurbanscenesemanticsegmentation AT zhouwujie contrastivelearningbasedknowledgedistillationforrgbthermalurbanscenesemanticsegmentation AT liutong contrastivelearningbasedknowledgedistillationforrgbthermalurbanscenesemanticsegmentation
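
Illustrative note on the abstract: CLFD is described as raising the similarity of features within a category and the distinctiveness of features between categories. The sketch below shows one generic way such an objective could look, as a prototype-based InfoNCE-style loss in plain Python. It is not the authors' implementation (see the repository linked in the record for that). The function names, the use of class prototypes computed from teacher features, and the temperature value are all assumptions made for illustration, and the sketch only covers the student-to-teacher direction.

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm (epsilon guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v)) + 1e-12
    return [x / norm for x in v]


def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def class_prototypes(features, labels):
    """Average the normalized features of each class into one unit-length prototype.

    Hypothetical helper: the abstract does not say how categories are summarized.
    """
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        feat = _normalize(feat)
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {c: _normalize([s / counts[c] for s in sums[c]]) for c in sums}


def contrastive_feature_loss(student_feats, teacher_feats, labels, temperature=0.1):
    """InfoNCE-style loss over class prototypes (assumed form, not from the paper).

    Each student feature is pulled toward the teacher prototype of its own class
    and pushed away from the teacher prototypes of all other classes.
    """
    protos = class_prototypes(teacher_feats, labels)
    classes = sorted(protos)
    total = 0.0
    for feat, label in zip(student_feats, labels):
        feat = _normalize(feat)
        logits = [_dot(feat, protos[c]) / temperature for c in classes]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denom - logits[classes.index(label)]
    return total / len(labels)


# Toy example: four "pixel" features from two categories.
teacher = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
aligned_student = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
swapped_student = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]

loss_aligned = contrastive_feature_loss(aligned_student, teacher, labels)
loss_swapped = contrastive_feature_loss(swapped_student, teacher, labels)
```

In this toy setting the loss is small when the student features agree with the teacher's per-category prototypes and large when the categories are swapped, which is the behavior the abstract attributes to contrastive feature distillation.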
