Feasibility Analysis and Application of Reinforcement Learning Algorithm Based on Dynamic Parameter Adjustment

Bibliographic Details
Main Authors: Menglin Li, Xueqiang Gu, Chengyi Zeng, Yuan Feng
Format: Article
Language: English
Published: MDPI AG, 2020-09-01
Series: Algorithms
Subjects: reinforcement learning; control system; parameter adjustment
Online Access: https://www.mdpi.com/1999-4893/13/9/239
Description
Reinforcement learning, a branch of machine learning, is gradually being applied in the control field. In practical applications, however, the hyperparameters of deep reinforcement learning networks are still set through the empirical trial-and-error of traditional machine learning (supervised and unsupervised learning). This approach ignores information generated by the agent's exploration of the environment, which is contained in the updates of the reinforcement learning value function, and this affects both convergence and the cumulative return. The reinforcement learning algorithm based on dynamic parameter adjustment is a new method for setting the learning rate of deep reinforcement learning. Building on the traditional way of setting parameters, the method analyzes the advantages that different learning rates offer at different stages of training and dynamically adjusts the learning rate according to the temporal-difference (TD) error, so that each stage benefits from an appropriate learning rate and the algorithm behaves more reasonably in practical applications. Furthermore, by combining the Robbins–Monro stochastic approximation algorithm with deep reinforcement learning, it is proved that dynamically regulating the learning rate can theoretically satisfy the convergence requirements of the intelligent control algorithm. In experiments, the method is evaluated in the continuous control scenario of the standard "Car-on-the-Hill" reinforcement learning environment, and it is verified that the new method achieves better results than traditional reinforcement learning in practical applications.

Based on the model characteristics of deep reinforcement learning, a more suitable method for setting the learning rate of the deep reinforcement learning network is proposed. The feasibility of the method is demonstrated both in theory and in application. This way of setting the learning rate parameter therefore merits further development and research.
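The abstract does not reproduce the paper's exact update rule, but the core idea, a Robbins–Monro-style decaying learning rate whose step size is additionally modulated by the magnitude of the TD error, can be sketched with tabular Q-learning. The following is a minimal illustration on a toy chain MDP (a stand-in for Car-on-the-Hill); the `dynamic_alpha` modulation rule and all constants are this sketch's assumptions, not the authors' published algorithm.

```python
import random

N_STATES = 6          # states 0..5; reaching state 5 yields reward 1 and ends the episode
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.95          # discount factor

def step(state, action):
    """Deterministic chain dynamics: left is clipped at 0, right moves toward the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    if nxt == N_STATES - 1:
        return nxt, 1.0, True
    return nxt, 0.0, False

def dynamic_alpha(base, visits, td_error, hi=0.5):
    """Illustrative dynamic learning rate (assumed form, not the paper's rule).

    base / (1 + 0.1 * visits) is a Robbins-Monro-style schedule (its sum
    diverges while the sum of its squares converges); scaling by |TD error|
    gives large steps early, when errors are big, and smaller steps as the
    value estimates settle.
    """
    alpha = base / (1.0 + 0.1 * visits)
    alpha *= min(1.0, abs(td_error))
    return min(hi, alpha)

def train(episodes=500, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]       # Q-table
    visits = [[0, 0] for _ in range(N_STATES)]      # per-(state, action) visit counts
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = random.choice(ACTIONS)              # random behavior policy (off-policy)
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(q[s2])
            td = target - q[s][a]                   # TD error drives the step size
            visits[s][a] += 1
            q[s][a] += dynamic_alpha(0.5, visits[s][a], td) * td
            s = s2
    return q

q = train()
print([round(max(row), 3) for row in q])  # learned state values along the chain
```

Because Q-learning is off-policy, a purely random behavior policy suffices here; the learned values grow toward the goal state, with the TD-error scaling concentrating large updates in the early, high-error phase of training.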
ISSN: 1999-4893
DOI: 10.3390/a13090239
Collection: DOAJ (Directory of Open Access Journals)