Pruning Stochastic Game Trees Using Neural Networks for Reduced Action Space Approximation

Monte Carlo Tree Search has proved to be very efficient in the broad domain of Game AI, though it suffers from high dimensionality in cases of large branching factors. Several pruning techniques have been proposed to tackle this problem, most of which require explicit domain knowledge. In this study...

Bibliographic Details
Main Authors:	Tasos Papagiannis, Georgios Alexandridis, Andreas Stafylopatis
Format:	Article
Language:	English
Published:	MDPI AG 2022-05-01
Series:	Mathematics
Subjects:	Monte Carlo Tree Search; pruning; neural networks; multi-armed bandit; Upper Confidence Bound; Hearthstone
Online Access:	https://www.mdpi.com/2227-7390/10/9/1509

Description:	Monte Carlo Tree Search has proved to be very efficient in the broad domain of Game AI, though it suffers from high dimensionality in cases of large branching factors. Several pruning techniques have been proposed to tackle this problem, most of which require explicit domain knowledge. In this study, an approach using neural networks to determine the number of actions to be pruned, depending on the iterations run and the total number of possible actions, is proposed. Multi-armed bandit simulations with the UCB1 formula are employed to generate suitable datasets for the networks’ training and a specifically designed process is followed to select the best combination of the number of iterations and actions for pruning. Two pruning Monte Carlo Tree Search variants are investigated, based on different actions’ expected rewards’ distributions, and they are evaluated in the collectible card game Hearthstone. The proposed technique improves the performance of the Monte Carlo Tree Search algorithm in different setups of computational limitations regarding the available number of tree search iterations and is significantly boosted when combined with supervised learning trained-state value predicting models.
Author Affiliation:	Zografou Campus, School of Electrical & Computer Engineering, National Technical University of Athens, 15780 Athens, Greece (all three authors)
ISSN:	2227-7390
DOI:	10.3390/math10091509
Collection:	DOAJ (Directory of Open Access Journals)
