Importance-Weighted Variational Inference Model Estimation for Offline Bayesian Model-Based Reinforcement Learning

This paper proposes a model estimation method for offline Bayesian model-based reinforcement learning (MBRL). Learning a Bayes-adaptive Markov decision process (BAMDP) model with standard variational inference often yields poor predictive performance because of covariate shift between the offline data distribution and the future data distribution. To address this problem, the paper applies an importance-weighting technique for covariate shift to variational inference learning of the BAMDP model. This leads to a unified objective function for optimizing both the model and the policy: it can be viewed as an importance-weighted variational objective for model training, and also as the expected return for policy planning penalized by the model's error, which is a standard objective in MBRL. The paper proposes an algorithm that optimizes this unified objective, and numerical experiments demonstrate that it outperforms algorithms using standard variational inference without importance weighting.
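
As a rough illustration of the importance-weighting idea described above (the notation here is illustrative and may not match the paper's exact formulation): a standard variational objective for learning model parameters \theta from offline transitions drawn from a data distribution d_{\mathcal{D}} is

    \mathcal{L}(q) = \mathbb{E}_{(s,a,s') \sim d_{\mathcal{D}}} \big[ \mathbb{E}_{q(\theta)} [ \log p(s' \mid s, a, \theta) ] \big] - \mathrm{KL}\big( q(\theta) \,\|\, p(\theta) \big),

whereas an importance-weighted variant reweights each transition by w(s,a) \approx d_{\pi}(s,a) / d_{\mathcal{D}}(s,a), the ratio of the state-action distribution induced by the planned policy to the offline data distribution, so that model accuracy is emphasized where the policy will actually visit:

    \mathcal{L}_{w}(q) = \mathbb{E}_{(s,a,s') \sim d_{\mathcal{D}}} \big[ w(s,a) \, \mathbb{E}_{q(\theta)} [ \log p(s' \mid s, a, \theta) ] \big] - \mathrm{KL}\big( q(\theta) \,\|\, p(\theta) \big).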

Bibliographic Details
Main Authors: Toru Hishinuma, Kei Senda
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Bayesian model-based reinforcement learning; decision-aware reinforcement learning; offline reinforcement learning
Online Access: https://ieeexplore.ieee.org/document/10368011/
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3345799
Volume/Pages: IEEE Access, vol. 11, pp. 145579-145590 (2023)
Author ORCIDs: Toru Hishinuma (https://orcid.org/0000-0002-6922-1595); Kei Senda (https://orcid.org/0000-0001-5720-6155)
Affiliation: Department of Aeronautics and Astronautics, Kyoto University, Kyoto, Japan (both authors)
Collection: Directory of Open Access Journals (DOAJ)