UAV’s Status Is Worth Considering: A Fusion Representations Matching Method for Geo-Localization

Visual geo-localization plays a crucial role in positioning and navigation for unmanned aerial vehicles, whose goal is to match the same geographic target from different views. This is a challenging task due to the drastic variations in different viewpoints and appearances. Previous methods have bee...

Bibliographic Details
Main Authors: Runzhe Zhu, Mingze Yang, Ling Yin, Fei Wu, Yuncheng Yang
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Sensors
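Subjects: cross-view image matching, geo-localization, UAV image localization, multimodal, transformer, bilinear pooling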
Online Access: https://www.mdpi.com/1424-8220/23/2/720
author Runzhe Zhu
Mingze Yang
Ling Yin
Fei Wu
Yuncheng Yang
author_sort Runzhe Zhu
collection DOAJ
description Visual geo-localization plays a crucial role in positioning and navigation for unmanned aerial vehicles (UAVs); its goal is to match the same geographic target across different views. The task is challenging because viewpoint and appearance vary drastically between views. Previous methods have focused on mining features inside the images, but they underestimate the influence of external elements and the interaction between different representations. Inspired by multimodal learning and bilinear pooling, we propose a feature fusion network (MBF) to address the inherent differences between drone and satellite views. We observe that the UAV’s status, such as flight height, changes the size of the image’s field of view. In addition, local parts of the target scene play an important role in extracting discriminative features. We therefore present two modules that exploit these priors. The first adds status information to the network by transforming it into word embeddings, which are concatenated with the image embeddings in the Transformer block to learn status-aware features. In the second, global and local part feature maps from the same viewpoint are correlated and reinforced by hierarchical bilinear pooling (HBP) to improve the robustness of the feature representation. Together, these approaches yield more discriminative deep representations and more effective geo-localization. Experiments on existing benchmark datasets show significant performance gains and a new state-of-the-art result: recall@1 accuracy reaches 89.05% on the drone localization task and 93.15% on the drone navigation task on University-1652, and the method shows strong robustness at different flight heights on the SUES-200 dataset.
first_indexed 2024-03-09T11:18:24Z
format Article
id doaj.art-9364a56c3bad4797b1d0b2dbffe4d80c
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T11:18:24Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-9364a56c3bad4797b1d0b2dbffe4d80c
2023-12-01T00:26:25Z
eng
MDPI AG
Sensors
1424-8220
2023-01-01
Volume 23, Issue 2, Article 720
DOI 10.3390/s23020720
UAV’s Status Is Worth Considering: A Fusion Representations Matching Method for Geo-Localization
Runzhe Zhu, Mingze Yang, Ling Yin, Fei Wu, Yuncheng Yang (all: School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201602, China)
https://www.mdpi.com/1424-8220/23/2/720
cross-view image matching
geo-localization
UAV image localization
multimodal
transformer
bilinear pooling
title UAV’s Status Is Worth Considering: A Fusion Representations Matching Method for Geo-Localization
topic cross-view image matching
geo-localization
UAV image localization
multimodal
transformer
bilinear pooling
url https://www.mdpi.com/1424-8220/23/2/720
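
The description in this record reports recall@1 accuracy for cross-view retrieval. As a rough illustration of how such a metric is commonly computed, the sketch below ranks gallery features by cosine similarity for each query and checks whether the top match carries the correct label. It is not the authors' evaluation code: the function name `recall_at_k`, the toy feature arrays, and the labels are all hypothetical and exist only for this example.

```python
import numpy as np


def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose k most similar gallery items include the true label.

    Hypothetical helper for illustration; features are compared by cosine similarity.
    """
    # L2-normalise rows so that the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)

    sims = q @ g.T                               # shape: (num_queries, num_gallery)
    topk = np.argsort(-sims, axis=1)[:, :k]      # indices of the k best matches
    hits = (gallery_labels[topk] == query_labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data (made up): three gallery images, e.g. satellite views, one per location.
gallery_feats = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
gallery_labels = np.array([0, 1, 2])

# Four query images, e.g. drone views; the third one resembles the wrong location.
query_feats = np.array([[0.9, 0.1], [0.1, 0.9], [0.2, 1.0], [-0.8, 0.1]])
query_labels = np.array([0, 1, 2, 2])

r1 = recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1)
print(f"recall@1 = {r1:.2f}")
```

In this toy setup three of the four queries retrieve the correct location first. Swapping the roles of the query and gallery sets gives the reverse retrieval direction under the same metric.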