Enhanced DQN Framework for Selecting Actions and Updating Replay Memory Considering Massive Non-Executable Actions

A Deep-Q-Network (DQN) can control a virtual agent at the level of a human player using only screenshots as inputs. Replay memory selects a limited number of experience replays according to an arbitrary batch size and updates them using the associated Q-function. Hence, relatively few experience replays of different states are utilized when the number of states is fixed and the states of the randomly selected transitions are identical or similar. The DQN may therefore not be applicable in environments where learning must draw on more experience replays than the limited batch size allows. In addition, because it is not known in advance whether each action can be executed, the amount of repetitive learning grows as more non-executable actions are selected. In this study, an enhanced DQN framework is proposed to resolve the batch-size problem and reduce the learning time of a DQN in an environment with numerous non-executable actions. In the proposed framework, non-executable actions are filtered out to reduce the number of selectable actions and identify the optimal action for the current state. The proposed method was validated in Gomoku, a strategy board game in which applying a traditional DQN would be difficult.
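
The abstract describes two mechanisms: sampling a limited batch from replay memory, and filtering non-executable actions before action selection. Below is a minimal sketch, in Python, of how such a setup could look for a Gomoku-style board where an occupied cell is a non-executable action. The network architecture, board encoding, and all identifiers here are illustrative assumptions, not the authors' implementation; the paper's specific replay-update rule is not detailed in this record.

    # Illustrative sketch only; not the implementation from Gu & Sung (2021).
    # Assumes a 15x15 Gomoku board encoded as a NumPy int array, 0 = empty cell.
    import random
    from collections import deque

    import numpy as np
    import torch
    import torch.nn as nn

    BOARD = 15  # 15x15 board -> 225 candidate actions, many non-executable

    class QNet(nn.Module):
        """Tiny Q-network mapping a flattened board state to per-cell Q-values."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(BOARD * BOARD, 256), nn.ReLU(),
                nn.Linear(256, BOARD * BOARD),
            )

        def forward(self, x):
            return self.net(x)

    def select_action(qnet, state, epsilon=0.1):
        """Epsilon-greedy selection restricted to executable (empty) cells.

        Masking the Q-values of occupied cells keeps the agent from ever
        proposing a non-executable move, which is the filtering step the
        abstract describes.
        """
        executable = np.flatnonzero(state.reshape(-1) == 0)  # empty cells only
        if random.random() < epsilon:
            return int(np.random.choice(executable))
        with torch.no_grad():
            q = qnet(torch.from_numpy(state.reshape(1, -1)).float())[0].numpy()
        mask = np.full(q.shape, -np.inf)  # -inf disqualifies occupied cells
        mask[executable] = 0.0
        return int(np.argmax(q + mask))   # best executable action

    class ReplayMemory:
        """Fixed-capacity replay memory sampled with an arbitrary batch size."""
        def __init__(self, capacity=10_000):
            self.buffer = deque(maxlen=capacity)

        def push(self, transition):       # (state, action, reward, next_state, done)
            self.buffer.append(transition)

        def sample(self, batch_size=32):
            return random.sample(self.buffer, min(batch_size, len(self.buffer)))

Because every stored transition already passed the executability filter, the sampled batch never wastes gradient updates on non-executable moves, which is the learning-time saving the abstract claims.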

Bibliographic Details
Main Authors: Bonwoo Gu, Yunsick Sung (Department of Multimedia Engineering, Dongguk University-Seoul, Seoul 04620, Korea)
Format: Article
Language: English
Published: MDPI AG, 2021-11-01
Series: Applied Sciences, vol. 11, no. 23, article 11162
ISSN: 2076-3417
DOI: 10.3390/app112311162
Subjects: Gomoku; game artificial intelligence; replay memory; Deep-Q-Network; reinforcement learning
Online Access: https://www.mdpi.com/2076-3417/11/23/11162