Target-Network Update Linked with Learning Rate Decay Based on Mutual Information and Reward in Deep Reinforcement Learning

This study proposes a target-network update for deep reinforcement learning (DRL) based on mutual information (MI) and rewards. In DRL, the target network is updated from the Q network to reduce training diversity and contribute to the stability of learning. If it is not properly update...

Bibliographic Details
Main Author: Chayoung Kim
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Symmetry
Online Access: https://www.mdpi.com/2073-8994/15/10/1840