Enhanced DQN Framework for Selecting Actions and Updating Replay Memory Considering Massive Non-Executable Actions
A Deep-Q-Network (DQN) controls a virtual agent at the level of a human player, using only screenshots as inputs. Replay memory selects a limited number of experience replays according to an arbitrary batch size and updates them using the associated Q-function. Hence, relatively few experience replays of...
Main Authors: | Bonwoo Gu, Yunsick Sung |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-11-01 |
Series: | Applied Sciences |
Subjects: | Gomoku; game artificial intelligence; replay memory; Deep-Q-Network; reinforcement learning |
Online Access: | https://www.mdpi.com/2076-3417/11/23/11162 |
_version_ | 1797508173761871872 |
---|---|
author | Bonwoo Gu; Yunsick Sung |
author_facet | Bonwoo Gu; Yunsick Sung |
author_sort | Bonwoo Gu |
collection | DOAJ |
description | A Deep-Q-Network (DQN) controls a virtual agent at the level of a human player, using only screenshots as inputs. Replay memory selects a limited number of experience replays according to an arbitrary batch size and updates them using the associated Q-function. Hence, relatively few experience replays of different states are utilized when the number of states is fixed and the states of the randomly selected transitions become identical or similar. The DQN may not be applicable in some environments where the learning process must use more experience replays than the limited batch size allows. In addition, because it is unknown whether each action can be executed, the amount of repetitive learning increases as more non-executable actions are selected. In this study, an enhanced DQN framework is proposed to resolve the batch-size problem and reduce the learning time of a DQN in an environment with numerous non-executable actions. In the proposed framework, non-executable actions are filtered out to reduce the number of selectable actions and identify the optimal action for the current state. The proposed method was validated in Gomoku, a strategy board game in which the application of a traditional DQN would be difficult. An illustrative action-masking sketch follows the record below. |
first_indexed | 2024-03-10T04:58:32Z |
format | Article |
id | doaj.art-c26b7fd444b14e07a39d85efa1ccb4b9 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T04:58:32Z |
publishDate | 2021-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
affiliation | Department of Multimedia Engineering, Dongguk University-Seoul, Seoul 04620, Korea (Bonwoo Gu; Yunsick Sung) |
doi | 10.3390/app112311162 |
citation | Applied Sciences, vol. 11, no. 23, article 11162 (2021-11-01) |
title | Enhanced DQN Framework for Selecting Actions and Updating Replay Memory Considering Massive Non-Executable Actions |
topic | Gomoku; game artificial intelligence; replay memory; Deep-Q-Network; reinforcement learning |
url | https://www.mdpi.com/2076-3417/11/23/11162 |
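The core idea in the abstract, filtering non-executable actions so the agent only chooses among moves it can actually play, can be illustrated with a minimal sketch. The Python/NumPy code below is an assumption-laden illustration, not the authors' implementation: the board encoding, the `executable_mask` and `select_action` helpers, and the random Q-values standing in for a trained network are all hypothetical. In Gomoku, the executable actions are exactly the empty intersections of the board, so non-executable actions can be masked out before the epsilon-greedy choice.

```python
import numpy as np

BOARD_SIZE = 15                       # standard Gomoku board size (assumption)
N_ACTIONS = BOARD_SIZE * BOARD_SIZE   # one action per board intersection


def executable_mask(board: np.ndarray) -> np.ndarray:
    """Boolean mask over all actions: only empty cells (value 0) are executable."""
    return board.reshape(-1) == 0


def select_action(q_values: np.ndarray, board: np.ndarray,
                  epsilon: float, rng: np.random.Generator) -> int:
    """Epsilon-greedy action selection restricted to executable actions.

    q_values : shape (N_ACTIONS,), Q-network output for the current state
    board    : shape (BOARD_SIZE, BOARD_SIZE); 0 = empty, +1/-1 = placed stones
    """
    mask = executable_mask(board)
    if rng.random() < epsilon:
        # Explore only among executable actions, so no interaction step is
        # wasted on a move the environment would reject.
        return int(rng.choice(np.flatnonzero(mask)))
    # Exploit: push Q-values of non-executable actions to -inf before the argmax.
    masked_q = np.where(mask, q_values, -np.inf)
    return int(np.argmax(masked_q))


# Usage example with random Q-values standing in for a trained network's output.
rng = np.random.default_rng(seed=0)
board = np.zeros((BOARD_SIZE, BOARD_SIZE), dtype=int)
board[7, 7] = 1                                   # one stone already on the board
q_values = rng.normal(size=N_ACTIONS)
action = select_action(q_values, board, epsilon=0.1, rng=rng)
assert executable_mask(board)[action]             # the chosen move is always legal
```

The same mask would also be applied when forming the bootstrapped target term (the maximum over next-state Q-values), so that non-executable moves never contribute to an update; how the proposed framework additionally manages replay memory under the batch-size constraint is described in the paper itself.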