Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning

Training task-oriented dialog agents based on reinforcement learning is time-consuming and requires a large number of interactions with real users. How to grasp dialog policy within limited dialog experiences remains an obstacle that makes the agent training process less efficient. In addition, most...

Bibliographic Details
Main Authors: Xuecheng Niu, Akinori Ito, Takashi Nose
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10468605/