Curriculum Reinforcement Learning Based on K-Fold Cross Validation
With the continuous development of deep reinforcement learning in intelligent control, combining automatic curriculum learning and deep reinforcement learning can improve the training performance and efficiency of algorithms from easy to difficult. Most existing automatic curriculum learning algorithms perform curriculum ranking through expert experience and a single network, which has the problems of difficult curriculum task ranking and slow convergence speed.
Main Authors: | Zeyang Lin, Jun Lai, Xiliang Chen, Lei Cao, Jun Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-12-01 |
Series: | Entropy |
Subjects: | deep reinforcement learning; automatic curriculum learning; K-fold cross validation; replay buffer |
Online Access: | https://www.mdpi.com/1099-4300/24/12/1787 |
_version_ | 1797459232776257536 |
---|---|
author | Zeyang Lin, Jun Lai, Xiliang Chen, Lei Cao, Jun Wang |
author_facet | Zeyang Lin, Jun Lai, Xiliang Chen, Lei Cao, Jun Wang |
author_sort | Zeyang Lin |
collection | DOAJ |
description | With the continuous development of deep reinforcement learning in intelligent control, combining automatic curriculum learning with deep reinforcement learning can improve the training performance and efficiency of algorithms by progressing from easy to difficult tasks. Most existing automatic curriculum learning algorithms perform curriculum ranking through expert experience and a single network, which leads to difficult curriculum task ranking and slow convergence. In this paper, we propose a curriculum reinforcement learning method based on K-Fold Cross Validation that can estimate the relative difficulty score of curriculum tasks. Drawing on the human concept of learning from easy to difficult, the method divides automatic curriculum learning into a curriculum difficulty assessment stage and a curriculum sorting stage. Through parallel training of teacher models and cross-evaluation of task sample difficulty, the method can better sequence curriculum learning tasks. Finally, comparative simulation experiments were carried out in two types of multi-agent environments. The experimental results show that the automatic curriculum learning method based on K-Fold cross-validation can improve the training speed of the MADDPG algorithm, and at the same time has a certain generality for multi-agent deep reinforcement learning algorithms based on the replay buffer mechanism. |
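The two-stage idea in the abstract (difficulty assessment, then sorting) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `train_teacher` and `evaluate` are hypothetical stand-ins for the paper's teacher-model training and per-task performance evaluation.

```python
import random

def kfold_difficulty_ranking(tasks, k, train_teacher, evaluate, seed=0):
    """Rank curriculum tasks from easy to hard via K-fold cross-evaluation.

    Each fold is held out in turn; a teacher model is trained on the
    remaining folds and then scores the held-out tasks, so no task is
    judged by a teacher that trained on it. `train_teacher` and
    `evaluate` are placeholders (assumptions, not the paper's API).
    """
    tasks = list(tasks)
    random.Random(seed).shuffle(tasks)
    folds = [tasks[i::k] for i in range(k)]
    difficulty = {}
    for i, held_out in enumerate(folds):
        # Train the teacher on everything except the held-out fold.
        train_split = [t for j, fold in enumerate(folds) if j != i for t in fold]
        teacher = train_teacher(train_split)
        for task in held_out:
            # Lower evaluated return -> harder task -> higher difficulty.
            difficulty[task] = -evaluate(teacher, task)
    # Sorting stage: easy tasks first.
    return sorted(tasks, key=difficulty.__getitem__)

# Toy usage: pretend a higher task id means a harder task.
ranking = kfold_difficulty_ranking(
    range(10), k=5,
    train_teacher=lambda split: None,
    evaluate=lambda teacher, t: -t,
)
```

The held-out evaluation is what makes the score a cross-validated estimate rather than a self-assessment by a single network.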
first_indexed | 2024-03-09T16:48:30Z |
format | Article |
id | doaj.art-064b3677fd734f60bf2dce501af09aab |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-09T16:48:30Z |
publishDate | 2022-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-064b3677fd734f60bf2dce501af09aab2023-11-24T14:42:56ZengMDPI AGEntropy1099-43002022-12-012412178710.3390/e24121787Curriculum Reinforcement Learning Based on K-Fold Cross ValidationZeyang Lin0Jun Lai1Xiliang Chen2Lei Cao3Jun Wang4Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, ChinaCommand & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, ChinaCommand & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, ChinaCommand & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, ChinaCommand & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, ChinaWith the continuous development of deep reinforcement learning in intelligent control, combining automatic curriculum learning and deep reinforcement learning can improve the training performance and efficiency of algorithms from easy to difficult. Most existing automatic curriculum learning algorithms perform curriculum ranking through expert experience and a single network, which has the problems of difficult curriculum task ranking and slow convergence speed. In this paper, we propose a curriculum reinforcement learning method based on K-Fold Cross Validation that can estimate the relativity score of task curriculum difficulty. Drawing lessons from the human concept of curriculum learning from easy to difficult, this method divides automatic curriculum learning into a curriculum difficulty assessment stage and a curriculum sorting stage. Through parallel training of the teacher model and cross-evaluation of task sample difficulty, the method can better sequence curriculum learning tasks. Finally, simulation comparison experiments were carried out in two types of multi-agent experimental environments. The experimental results show that the automatic curriculum learning method based on K-Fold cross-validation can improve the training speed of the MADDPG algorithm, and at the same time has a certain generality for multi-agent deep reinforcement learning algorithm based on the replay buffer mechanism.https://www.mdpi.com/1099-4300/24/12/1787deep reinforcement learningautomatic curriculum learningK-fold cross validationreplay buffer |
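The claimed generality rests on the replay-buffer mechanism: the curriculum only changes the order in which task transitions are generated and stored, while the learner samples from the buffer as usual. A minimal sketch of such a buffer (generic, not MADDPG-specific) looks like:

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay buffer.

    Curriculum ordering only changes which tasks generate the
    transitions pushed here; any algorithm that samples minibatches
    from such a buffer (e.g. MADDPG) can reuse the curriculum.
    """

    def __init__(self, capacity):
        # Oldest transitions are evicted automatically once full.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

Because the learner never sees the curriculum directly, only the buffer contents, the same easy-to-hard ordering can be plugged in front of other replay-based multi-agent algorithms unchanged.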
spellingShingle | Zeyang Lin Jun Lai Xiliang Chen Lei Cao Jun Wang Curriculum Reinforcement Learning Based on K-Fold Cross Validation Entropy deep reinforcement learning automatic curriculum learning K-fold cross validation replay buffer |
title | Curriculum Reinforcement Learning Based on K-Fold Cross Validation |
title_full | Curriculum Reinforcement Learning Based on K-Fold Cross Validation |
title_fullStr | Curriculum Reinforcement Learning Based on K-Fold Cross Validation |
title_full_unstemmed | Curriculum Reinforcement Learning Based on K-Fold Cross Validation |
title_short | Curriculum Reinforcement Learning Based on K-Fold Cross Validation |
title_sort | curriculum reinforcement learning based on k fold cross validation |
topic | deep reinforcement learning; automatic curriculum learning; K-fold cross validation; replay buffer |
url | https://www.mdpi.com/1099-4300/24/12/1787 |
work_keys_str_mv | AT zeyanglin curriculumreinforcementlearningbasedonkfoldcrossvalidation AT junlai curriculumreinforcementlearningbasedonkfoldcrossvalidation AT xiliangchen curriculumreinforcementlearningbasedonkfoldcrossvalidation AT leicao curriculumreinforcementlearningbasedonkfoldcrossvalidation AT junwang curriculumreinforcementlearningbasedonkfoldcrossvalidation |