Hidden Markov Random Field for Multi-Agent Optimal Decision in Top-Coal Caving

Applying model-based learning to optimal decision-making in a multi-agent system is not trivial, because obtaining the ground truth for training a model of the complex environment is expensive or even impossible. Such as learning the optimal action of hydraulic supports in the top-c...

Bibliographic Details
Main Authors:	Yi Yang, Zhiwei Lin, Bingfeng Li, Xinwei Li, Lizhi Cui, Keping Wang
Format:	Article
Language:	English
Published:	IEEE 2020-01-01
Series:	IEEE Access
Subjects:	Hidden Markov random field; optimal decision; multi-agent; top-coal caving
Online Access:	https://ieeexplore.ieee.org/document/9052760/

author	Yi Yang, Zhiwei Lin, Bingfeng Li, Xinwei Li, Lizhi Cui, Keping Wang
author_sort	Yi Yang
collection	DOAJ
description	Applying model-based learning to optimal decision-making in a multi-agent system is not trivial, because obtaining the ground truth for training a model of the complex environment is expensive or even impossible. For example, when learning the optimal action of hydraulic supports in top-coal caving, the optimal action is not accessible as the ground truth for the corresponding state in the intricate process. Treating the latent ground truth as a hidden variable is an effective approach in the hidden Markov model. This paper extends the hidden-variable treatment of the ground truth to the multi-agent system and proposes a hidden Markov random field (HMRF) model with reinforcement learning for optimizing the action decisions of the agents. In the HMRF model, the input states and the output actions of the agents are treated as an observable random field and a latent Markov random field, respectively. Based on the HMRF model, the optimal decision is inferred by maximum a posteriori (MAP) estimation, with the prior probability obtained by Q-learning. The parameters of the HMRF model are estimated by the expectation-maximization (EM) algorithm. In the top-coal caving experiment, the proposed method markedly improves the recall of top coal at the cost of only a very small increase in the rock rate. The proposed method is also applied to the predator-prey problem in Gym. The experimental results show that communication between agents through the HMRF increases the reward of the prey.
first_indexed	2024-12-14T00:07:05Z
format	Article
id	doaj.art-b34a57531f9a4e6aa572048826ff3250
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-14T00:07:05Z
publishDate	2020-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-b34a57531f9a4e6aa572048826ff3250; 2022-12-21T23:25:58Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 76596-76609; DOI 10.1109/ACCESS.2020.2984786; document 9052760
	Hidden Markov Random Field for Multi-Agent Optimal Decision in Top-Coal Caving
	Yi Yang (0), https://orcid.org/0000-0003-1858-4641, School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
	Zhiwei Lin (1), https://orcid.org/0000-0002-9060-4021, School of Computing, Ulster University, Newtownabbey, U.K.
	Bingfeng Li (2), https://orcid.org/0000-0002-7140-1122, School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
	Xinwei Li (3), https://orcid.org/0000-0003-0911-5130, School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
	Lizhi Cui (4), https://orcid.org/0000-0003-0633-2295, School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
	Keping Wang (5), https://orcid.org/0000-0002-8138-6181, School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo, China
	https://ieeexplore.ieee.org/document/9052760/
	Hidden Markov random field; optimal decision; multi-agent; top-coal caving
title	Hidden Markov Random Field for Multi-Agent Optimal Decision in Top-Coal Caving
topic	Hidden Markov random field; optimal decision; multi-agent; top-coal caving
url	https://ieeexplore.ieee.org/document/9052760/

Similar Items