Variational Reward Estimator Bottleneck: Towards Robust Reward Estimator for Multidomain Task-Oriented Dialogue

Despite its significant effectiveness in adversarial training approaches to multidomain task-oriented dialogue systems, adversarial inverse reinforcement learning of the dialogue policy frequently fails to balance the performance of the reward estimator and policy generator. During the optimization...

Bibliographic Details
Main Authors:	Jeiyoon Park, Chanhee Lee, Chanjun Park, Kuekyeng Kim, Heuiseok Lim
Format:	Article
Language:	English
Published:	MDPI AG 2021-07-01
Series:	Applied Sciences
Subjects:	task-oriented dialogue; dialogue policy; reinforcement learning; inverse reinforcement learning
Online Access:	https://www.mdpi.com/2076-3417/11/14/6624
