A Hybrid PAC Reinforcement Learning Algorithm for Human-Robot Interaction

This paper offers a new hybrid probably approximately correct (PAC) reinforcement learning (RL) algorithm for Markov decision processes (MDPs) that intelligently maintains favorable features of both model-based and model-free methodologies. The designed algorithm, referred to as the Dyna-Delayed Q-l...

Bibliographic Details
Main Authors:	Ashkan Zehfroosh, Herbert G. Tanner
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2022-03-01
Series:	Frontiers in Robotics and AI
Subjects:	reinforcement learning; probably approximately correct; Markov decision process; human-robot interaction; sample complexity
Online Access:	https://www.frontiersin.org/articles/10.3389/frobt.2022.797213/full
