Performance of bandit methods in acoustic relay positioning

We consider the problem of maximizing underwater acoustic data transmission, by adaptively positioning a mobile relay. This is a classic exploration vs. exploitation scenario well-described by a multi-armed bandit formulation, which in its canonical form is optimally solved by the Gittins index rule...


Bibliographic Details
Main Authors: Cheung, Mei Yi; Leighton, Joshua C.; Mitra, Urbashi; Singh, Hanumant; Hover, Franz S.
Other Authors: Massachusetts Institute of Technology. Department of Mechanical Engineering
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers (IEEE) 2015
Online Access: http://hdl.handle.net/1721.1/98404
https://orcid.org/0000-0002-2621-7633
https://orcid.org/0000-0002-3138-7346
author Cheung, Mei Yi
Leighton, Joshua C.
Mitra, Urbashi
Singh, Hanumant
Hover, Franz S.
collection MIT
description We consider the problem of maximizing underwater acoustic data transmission, by adaptively positioning a mobile relay. This is a classic exploration vs. exploitation scenario well-described by a multi-armed bandit formulation, which in its canonical form is optimally solved by the Gittins index rule. For an ocean vehicle traveling between distant waypoints, however, switching costs are significant, and the MAB with switching costs has no optimal index policy. To address this we have developed a strong adaptation of the Gittins index rule that employs limited-horizon enumeration. We describe autonomous shallow-water field experiments conducted in the Charles River (Boston, MA) with unmanned vehicles and acoustic modems, and compare the performance of different algorithms. Our switching-costs-aware MAB heuristic offers both superior real-time performance in decision-making and efficient learning of the unknown field.
first_indexed 2024-09-23T12:36:10Z
format Article
id mit-1721.1/98404
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T12:36:10Z
publishDate 2015
publisher Institute of Electrical and Electronics Engineers (IEEE)
record_format dspace
spelling mit-1721.1/98404
2022-10-01T10:01:23Z
Performance of bandit methods in acoustic relay positioning
Cheung, Mei Yi
Leighton, Joshua C.
Mitra, Urbashi
Singh, Hanumant
Hover, Franz S.
Massachusetts Institute of Technology. Department of Mechanical Engineering
Hover, Franz S.
Cheung, Mei Yi
Leighton, Joshua C.
We consider the problem of maximizing underwater acoustic data transmission, by adaptively positioning a mobile relay. This is a classic exploration vs. exploitation scenario well-described by a multi-armed bandit formulation, which in its canonical form is optimally solved by the Gittins index rule. For an ocean vehicle traveling between distant waypoints, however, switching costs are significant, and the MAB with switching costs has no optimal index policy. To address this we have developed a strong adaptation of the Gittins index rule that employs limited-horizon enumeration. We describe autonomous shallow-water field experiments conducted in the Charles River (Boston, MA) with unmanned vehicles and acoustic modems, and compare the performance of different algorithms. Our switching-costs-aware MAB heuristic offers both superior real-time performance in decision-making and efficient learning of the unknown field.
United States. Office of Naval Research (Grant N00014-09-1-0700)
National Science Foundation (U.S.) (Contract CNS-1212597)
Finmeccanica
2015-09-08T17:52:45Z
2015-09-08T17:52:45Z
2014-06
Article
http://purl.org/eprint/type/ConferencePaper
978-1-4799-3274-0
978-1-4799-3272-6
978-1-4799-3271-9
0743-1619
http://hdl.handle.net/1721.1/98404
Cheung, Mei Yi, Joshua Leighton, Urbashi Mitra, Hanumant Singh, and Franz S. Hover. “Performance of Bandit Methods in Acoustic Relay Positioning.” 2014 American Control Conference (June 2014).
https://orcid.org/0000-0002-2621-7633
https://orcid.org/0000-0002-3138-7346
en_US
http://dx.doi.org/10.1109/ACC.2014.6859385
Proceedings of the 2014 American Control Conference
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
Institute of Electrical and Electronics Engineers (IEEE)
Prof. Hover via Angie Locknar
title Performance of bandit methods in acoustic relay positioning