Safe Reinforcement Learning With Model Uncertainty Estimates
Main Authors: | Lutjens, Bjorn; Everett, Michael F; How, Jonathan P
---|---
Other Authors: | Massachusetts Institute of Technology. Aerospace Controls Laboratory
Format: | Article
Language: | English
Published: | IEEE, 2020
Online Access: | https://hdl.handle.net/1721.1/125488
author | Lutjens, Bjorn; Everett, Michael F; How, Jonathan P
author2 | Massachusetts Institute of Technology. Aerospace Controls Laboratory |
collection | MIT |
description | Many current autonomous systems are being designed with a strong reliance on black box predictions from deep neural networks (DNNs). However, DNNs tend to be overconfident in predictions on unseen data and can give unpredictable results for far-from-distribution test data. The importance of predictions that are robust to this distributional shift is evident for safety-critical applications, such as collision avoidance around pedestrians. Measures of model uncertainty can be used to identify unseen data, but the state-of-the-art extraction methods such as Bayesian neural networks are mostly intractable to compute. This paper uses MC-Dropout and Bootstrapping to give computationally tractable and parallelizable uncertainty estimates. The methods are embedded in a Safe Reinforcement Learning framework to form uncertainty-aware navigation around pedestrians. The result is a collision avoidance policy that knows what it does not know and cautiously avoids pedestrians that exhibit unseen behavior. The policy is demonstrated in simulation to be more robust to novel observations and take safer actions than an uncertainty-unaware baseline. Keywords: Uncertainty; Collision avoidance; Neural networks; Computational modeling; Training; Data models; Reinforcement learning |
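The abstract describes extracting model uncertainty with MC-Dropout (and bootstrapping) and using it for cautious action selection. The following is a minimal, illustrative sketch of MC-Dropout uncertainty estimation for a Q-network, not the paper's implementation; the network architecture, dimensions, sample count, and uncertainty penalty weight are assumptions for demonstration only.

```python
# Illustrative MC-Dropout sketch (assumed setup, not the authors' code).
import torch
import torch.nn as nn

# A small Q-network with dropout layers; in the paper's setting the input would
# encode the pedestrian scene and the output would be Q-values over actions.
q_net = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 5),          # 5 hypothetical discrete actions
)

def mc_dropout_q(obs: torch.Tensor, num_samples: int = 20):
    """Return per-action mean and variance of Q over stochastic forward passes."""
    q_net.train()              # keep dropout active at inference time
    with torch.no_grad():
        samples = torch.stack([q_net(obs) for _ in range(num_samples)])
    return samples.mean(dim=0), samples.var(dim=0)

obs = torch.randn(1, 16)       # placeholder observation
q_mean, q_var = mc_dropout_q(obs)

# Uncertainty-aware action selection: penalize actions whose Q-estimate is
# highly uncertain, so far-from-distribution observations lead to more
# cautious behavior. The penalty weight (1.0) is an example value.
cautious_q = q_mean - 1.0 * q_var.sqrt()
action = int(cautious_q.argmax(dim=1))
```

High variance across the stochastic forward passes flags observations the network has not seen during training, which is the signal the framework uses to act more conservatively around pedestrians exhibiting unseen behavior.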
format | Article |
id | mit-1721.1/125488 |
institution | Massachusetts Institute of Technology |
language | English |
publishDate | 2020 |
publisher | IEEE |
record_format | dspace |
spelling | Record: mit-1721.1/125488 (2022-09-30T18:37:14Z). Affiliations: Massachusetts Institute of Technology. Aerospace Controls Laboratory; Massachusetts Institute of Technology. Department of Aeronautics and Astronautics. Record dates: 2020-05-27T13:08:41Z; 2019-08; 2019-10-28T17:45:18Z. Type: Article (http://purl.org/eprint/type/ConferencePaper). ISBN: 9781538660270; 978-1-5386-6026-3. Handle: https://hdl.handle.net/1721.1/125488. Citation: Lutjens, Bjorn, Everett, Michael, and How, Jonathan P., "Safe Reinforcement Learning With Model Uncertainty Estimates." 2019 International Conference on Robotics and Automation (ICRA), May 2019, Montreal, Canada, IEEE, August 2019. DOI: https://dx.doi.org/10.1109/icra.2019.8793611. Conference: 2019 International Conference on Robotics and Automation. Language: en. License: Creative Commons Attribution-Noncommercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/4.0/). File format: application/pdf. Publisher: IEEE. Source: arXiv. |
title | Safe Reinforcement Learning With Model Uncertainty Estimates |
url | https://hdl.handle.net/1721.1/125488 |