Deep Reinforcement Learning by Balancing Offline Monte Carlo and Online Temporal Difference Use Based on Environment Experiences

Owing to the complexity involved in training an agent in a real-time environment, e.g., using the Internet of Things (IoT), reinforcement learning (RL) using a deep neural network, i.e., deep reinforcement learning (DRL), has been widely adopted on an online basis without prior knowledge and complica...

Bibliographic Details
Main Author:	Chayoung Kim
Format:	Article
Language:	English
Published:	MDPI AG 2020-10-01
Series:	Symmetry
Subjects:	Q-learning (off-policy); SARSA (on-policy); reinforcement learning (RL); Internet of Things (IoT); Monte Carlo (offline); Q-learning (online)
Online Access:	https://www.mdpi.com/2073-8994/12/10/1685

Similar Items