Cross-domain robustness in visual place recognition: an adversarial-based domain alignment approach

Visual Place Recognition (VPR) serves as an important component in robotic sensing, mainly utilized within navigation systems, such as autonomous vehicles. VPR enables large-scale localization by comparing current query visual cues within a geo-tagged database of previously visited locations. The ma...

Bibliographic Details
Main Author:	Lin, Yingying
Other Authors:	Wang Dan Wei
Format:	Thesis-Master by Coursework
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering; Visual place recognition; Domain adaptation; Image retrieval; Adversarial network
Online Access:	https://hdl.handle.net/10356/175346

_version_	1811689134335459328
author	Lin, Yingying
author2	Wang Dan Wei
collection	NTU
description	Visual Place Recognition (VPR) is an important component of robotic sensing, used mainly in navigation systems such as those of autonomous vehicles. VPR enables large-scale localization by comparing the visual cues of the current query against a geo-tagged database of previously visited locations. The main approach in VPR uses weakly supervised representation learning to generate compact but discriminative place descriptors, which are then used to retrieve the closest matching location from the database. NetVLAD has achieved high recall on datasets sampled from the same distribution, such as when inference is conducted within the same city and under the same conditions. However, real applications may face significant challenges due to environmental changes, such as variations in illumination, viewpoint, and architectural style. These different image styles can be viewed as samples from another data distribution, or a different domain. A VPR model trained on a source-domain dataset may suffer a sharp decline in performance when tested on a different target domain. This dissertation aims to relieve this low cross-domain robustness problem and enhance domain generalization capability. It focuses on improving the widely used NetVLAD architecture by employing Domain Adaptation strategies to obtain descriptors that are place-discriminative yet domain-invariant. An exhaustive theoretical exploration of VPR and Domain Adaptation is first conducted, identifying that resolving the inconsistency between geographic supervision and semantic information is key to better place descriptors. Meanwhile, mitigating the impact of domain-specific features relies most on adjusting the clustering centers and soft-alignment parameters in the NetVLAD aggregation layer.
Based on these theoretical insights, this work performs domain alignment by introducing small-scale target-domain guidance that adds prior information about the target domain during training, thus adapting the VPR model to new environments. Inspired by the GAN principle of blurring domain-specific information and by Domain Adversarial Neural Networks (DANN), this dissertation proposes three levels of domain alignment: pixel level, local-feature level, and representation level. Each approach is analyzed theoretically and empirically, and their advantages and limitations are compared. Experimental results show that representation-level alignment most effectively meets the research objectives, outperforming both pixel-level and local-feature-level alignment. This improvement is attributed to its consistency with the essence of representation learning: it is highly task-relevant and directly modifies the descriptors, thus enhancing target-domain robustness while preserving source-domain performance. To address some limitations of this modification, the dissertation also gives recommendations for future work, such as enriching dataset information or reducing model complexity to ensure model generalization ability.
first_indexed	2024-10-01T05:43:16Z
format	Thesis-Master by Coursework
id	ntu-10356/175346
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T05:43:16Z
publishDate	2024
publisher	Nanyang Technological University
record_format	dspace
spelling	ntu-10356/175346 2024-04-26T16:00:18Z Cross-domain robustness in visual place recognition: an adversarial-based domain alignment approach Lin, Yingying Wang Dan Wei School of Electrical and Electronic Engineering EDWWANG@ntu.edu.sg Engineering Visual place recognition Domain adaptation Image retrieval Adversarial network Master's degree 2024-04-22T04:38:38Z 2024-04-22T04:38:38Z 2024 Thesis-Master by Coursework Lin, Y. (2024). Cross-domain robustness in visual place recognition: an adversarial-based domain alignment approach. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175346 en application/pdf Nanyang Technological University
title	Cross-domain robustness in visual place recognition: an adversarial-based domain alignment approach
topic	Engineering; Visual place recognition; Domain adaptation; Image retrieval; Adversarial network
url	https://hdl.handle.net/10356/175346
