DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs

© 2020 Inst. Sci. inf., Univ. Defence in Belgrade. All rights reserved. A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty. We cast POMDP filtering and planni...

Bibliographic Details
Main Authors: Wang, Yunbo; Liu, Bo; Wu, Jiajun; Zhu, Yuke; Du, Simon S; Fei-Fei, Li; Tenenbaum, Joshua B
Other Authors: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Format: Article
Language: English
Published: International Joint Conferences on Artificial Intelligence Organization 2021
Online Access: https://hdl.handle.net/1721.1/138359
_version_ 1811084178326814720
author Wang, Yunbo
Liu, Bo
Wu, Jiajun
Zhu, Yuke
Du, Simon S
Fei-Fei, Li
Tenenbaum, Joshua B
author2 Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
author_facet Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Wang, Yunbo
Liu, Bo
Wu, Jiajun
Zhu, Yuke
Du, Simon S
Fei-Fei, Li
Tenenbaum, Joshua B
author_sort Wang, Yunbo
collection MIT
description © 2020 Inst. Sci. inf., Univ. Defence in Belgrade. All rights reserved. A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty. We cast POMDP filtering and planning problems as two closely related Sequential Monte Carlo (SMC) processes, one over the real states and the other over the future optimal trajectories, and combine the merits of these two parts in a new model named the DualSMC network. In particular, we first introduce an adversarial particle filter that leverages the adversarial relationship between its internal components. Based on the filtering results, we then propose a planning algorithm that extends the previous SMC planning approach [Piche et al., 2018] to continuous POMDPs with an uncertainty-dependent policy. Crucially, not only can DualSMC handle complex observations such as image input but also it remains highly interpretable. It is shown to be effective in three continuous POMDP domains: the floor positioning domain, the 3D light-dark navigation domain, and a modified Reacher domain.
first_indexed 2024-09-23T12:46:08Z
format Article
id mit-1721.1/138359
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T12:46:08Z
publishDate 2021
publisher International Joint Conferences on Artificial Intelligence Organization
record_format dspace
spelling mit-1721.1/1383592023-02-03T20:00:47Z DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs Wang, Yunbo Liu, Bo Wu, Jiajun Zhu, Yuke Du, Simon S Fei-Fei, Li Tenenbaum, Joshua B Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory © 2020 Inst. Sci. inf., Univ. Defence in Belgrade. All rights reserved. A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty. We cast POMDP filtering and planning problems as two closely related Sequential Monte Carlo (SMC) processes, one over the real states and the other over the future optimal trajectories, and combine the merits of these two parts in a new model named the DualSMC network. In particular, we first introduce an adversarial particle filter that leverages the adversarial relationship between its internal components. Based on the filtering results, we then propose a planning algorithm that extends the previous SMC planning approach [Piche et al., 2018] to continuous POMDPs with an uncertainty-dependent policy. Crucially, not only can DualSMC handle complex observations such as image input but also it remains highly interpretable. It is shown to be effective in three continuous POMDP domains: the floor positioning domain, the 3D light-dark navigation domain, and a modified Reacher domain. 2021-12-07T19:14:34Z 2021-12-07T19:14:34Z 2020 2021-12-07T19:08:50Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/138359 Wang, Yunbo, Liu, Bo, Wu, Jiajun, Zhu, Yuke, Du, Simon S et al. 2020. "DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs." IJCAI International Joint Conference on Artificial Intelligence, 2021-January. 
en 10.24963/IJCAI.2020/579 IJCAI International Joint Conference on Artificial Intelligence Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf International Joint Conferences on Artificial Intelligence Organization arXiv
title DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
url https://hdl.handle.net/1721.1/138359