City-scale Cross-view Geolocalization with Generalization to Unseen Environments
Cross-view geolocalization, a supplement or replacement for GPS, provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image. It is challenging to reliably match these two sets of images in part because they have significantly different vie...
Main Author: | Downes, Lena M. |
---|---|
Other Authors: | How, Jonathan P. |
Format: | Thesis |
Published: | Massachusetts Institute of Technology 2024 |
Online Access: | https://hdl.handle.net/1721.1/153793 https://orcid.org/0000-0001-5753-0617 |
_version_ | 1826198816136626176 |
---|---|
author | Downes, Lena M. |
author2 | How, Jonathan P. |
author_facet | How, Jonathan P. Downes, Lena M. |
author_sort | Downes, Lena M. |
collection | MIT |
description | Cross-view geolocalization, a supplement or replacement for GPS, provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image. It is challenging to reliably match these two sets of images in part because they have significantly different viewpoints. Existing works have demonstrated geolocalization in constrained scenarios over small areas using panoramic cameras, yielding methods that have limited generalization to unseen environments or conditions and that do not quantify uncertainty. This thesis details Wide-Area Geolocalization (WAG) and Restricted FOV Wide-Area Geolocalization (ReWAG) that combine a neural network with a particle filter to achieve global position estimates for a moving agent in a GPS-denied environment while scaling efficiently to city-sized regions in unseen environments and working with either panoramic or non-panoramic cameras. One contribution is a trinomial loss function that enables accurate and computation-efficient localization across city-scale search areas of nearly 300 km^2 in size by improving image retrieval on the off-center image pairs that result from a coarsely discretized satellite image database. Another contribution is a computationally efficient method to incorporate pose information with input image pairs, which improves localization accuracy with non-panoramic cameras and off-center ground images. An additional contribution is the GKL uncertainty measure for localization outputs, which enables detection of particle filter false convergence through characterization of the particle distribution. The final contribution is a demonstration of ReWAG's ability to generalize across different times of day, seasons, weather, and cameras on data collected from a moving car in Cambridge, Massachusetts, as well as the public release of a challenging imagery dataset collected on this vehicle platform. 
WAG and ReWAG reduce localization error from over 1 km to less than 100 m while performing particle filter updates with less than 1% of the computation required by previous approaches. |
first_indexed | 2024-09-23T11:10:06Z |
format | Thesis |
id | mit-1721.1/153793 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T11:10:06Z |
publishDate | 2024 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1537932024-03-16T03:43:21Z City-scale Cross-view Geolocalization with Generalization to Unseen Environments Downes, Lena M. How, Jonathan P. Steiner III, Theodore J. Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Cross-view geolocalization, a supplement or replacement for GPS, provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image. It is challenging to reliably match these two sets of images in part because they have significantly different viewpoints. Existing works have demonstrated geolocalization in constrained scenarios over small areas using panoramic cameras, yielding methods that have limited generalization to unseen environments or conditions and that do not quantify uncertainty. This thesis details Wide-Area Geolocalization (WAG) and Restricted FOV Wide-Area Geolocalization (ReWAG) that combine a neural network with a particle filter to achieve global position estimates for a moving agent in a GPS-denied environment while scaling efficiently to city-sized regions in unseen environments and working with either panoramic or non-panoramic cameras. One contribution is a trinomial loss function that enables accurate and computation-efficient localization across city-scale search areas of nearly 300 km^2 in size by improving image retrieval on the off-center image pairs that result from a coarsely discretized satellite image database. Another contribution is a computationally efficient method to incorporate pose information with input image pairs, which improves localization accuracy with non-panoramic cameras and off-center ground images. An additional contribution is the GKL uncertainty measure for localization outputs, which enables detection of particle filter false convergence through characterization of the particle distribution. 
The final contribution is a demonstration of ReWAG's ability to generalize across different times of day, seasons, weather, and cameras on data collected from a moving car in Cambridge, Massachusetts, as well as the public release of a challenging imagery dataset collected on this vehicle platform. WAG and ReWAG localize from over 1 km to less than 100 m of localization error while performing particle filter updates with less than 1% of the computation required for previous approaches. Ph.D. 2024-03-15T19:24:25Z 2024-03-15T19:24:25Z 2024-02 2024-02-16T20:55:50.865Z Thesis https://hdl.handle.net/1721.1/153793 https://orcid.org/0000-0001-5753-0617 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Downes, Lena M. City-scale Cross-view Geolocalization with Generalization to Unseen Environments |
title | City-scale Cross-view Geolocalization with Generalization to Unseen Environments |
title_full | City-scale Cross-view Geolocalization with Generalization to Unseen Environments |
title_fullStr | City-scale Cross-view Geolocalization with Generalization to Unseen Environments |
title_full_unstemmed | City-scale Cross-view Geolocalization with Generalization to Unseen Environments |
title_short | City-scale Cross-view Geolocalization with Generalization to Unseen Environments |
title_sort | city scale cross view geolocalization with generalization to unseen environments |
url | https://hdl.handle.net/1721.1/153793 https://orcid.org/0000-0001-5753-0617 |
work_keys_str_mv | AT downeslenam cityscalecrossviewgeolocalizationwithgeneralizationtounseenenvironments |
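
The description above mentions combining a neural network with a particle filter over a coarsely discretized satellite image database. The following is a minimal sketch of that general idea, not the thesis's implementation: the network is replaced by a hypothetical `synthetic_similarity` function that scores each satellite tile, and every name, grid size, and noise value here is an assumption chosen for illustration. It shows why a coarse grid can be cheap: similarity is computed once per tile, and each particle only looks up the score of the tile it occupies.

```python
import math
import random


def tile_index(x, y, tile_size, n_cols, n_rows):
    """Map a continuous position to the (col, row) of the grid tile containing it,
    clamping positions that fall outside the grid to the nearest edge tile."""
    col = min(max(int(x // tile_size), 0), n_cols - 1)
    row = min(max(int(y // tile_size), 0), n_rows - 1)
    return col, row


def predict(particles, dx, dy, noise_std, rng):
    """Move every particle by the odometry step plus Gaussian noise."""
    return [(x + dx + rng.gauss(0.0, noise_std),
             y + dy + rng.gauss(0.0, noise_std)) for x, y in particles]


def update(particles, weights, similarity, tile_size):
    """Reweight each particle by the similarity score of the tile it occupies."""
    n_rows, n_cols = len(similarity), len(similarity[0])
    new_w = []
    for (x, y), w in zip(particles, weights):
        col, row = tile_index(x, y, tile_size, n_cols, n_rows)
        new_w.append(w * similarity[row][col])
    total = sum(new_w)
    if total == 0.0:  # all particles landed on zero-score tiles: reset to uniform
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in new_w]


def resample(particles, weights, rng):
    """Multinomial resampling; returns the new particles with uniform weights."""
    n = len(particles)
    return rng.choices(particles, weights=weights, k=n), [1.0 / n] * n


def estimate(particles, weights):
    """Weighted mean position of the particle set."""
    mx = sum(w * x for (x, _), w in zip(particles, weights))
    my = sum(w * y for (_, y), w in zip(particles, weights))
    return mx, my


def synthetic_similarity(true_xy, n_rows, n_cols, tile_size, sigma):
    """Hypothetical stand-in for a learned ground-to-satellite similarity:
    a Gaussian of the distance between each tile center and the true position."""
    grid = []
    for row in range(n_rows):
        cy = (row + 0.5) * tile_size
        grid.append([
            math.exp(-(((col + 0.5) * tile_size - true_xy[0]) ** 2
                       + (cy - true_xy[1]) ** 2) / (2.0 * sigma ** 2))
            for col in range(n_cols)
        ])
    return grid


if __name__ == "__main__":
    rng = random.Random(0)
    tile_size, n = 100.0, 10  # illustrative 10 x 10 grid of 100 m tiles
    particles = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(500)]
    weights = [1.0 / 500] * 500
    true_xy = (220.0, 340.0)
    for _ in range(20):
        true_xy = (true_xy[0] + 15.0, true_xy[1] + 10.0)
        particles = predict(particles, 15.0, 10.0, 5.0, rng)
        sim = synthetic_similarity(true_xy, n, n, tile_size, 150.0)
        weights = update(particles, weights, sim, tile_size)
        particles, weights = resample(particles, weights, rng)
    print("estimate:", estimate(particles, weights), "truth:", true_xy)
```

In a real system the similarity grid would come from comparing a ground-image embedding against precomputed satellite-tile embeddings; the trinomial loss, pose handling, and GKL uncertainty measure named in the record are not modeled here.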