Enhanced DQN Framework for Selecting Actions and Updating Replay Memory Considering Massive Non-Executable Actions
A Deep-Q-Network (DQN) controls a virtual agent at the level of a human player, using only screenshots as inputs. Replay memory selects a limited number of experience replays according to an arbitrary batch size and updates them using the associated Q-function. Hence, relatively few experience replays of...
Main Authors: | Bonwoo Gu, Yunsick Sung |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-11-01 |
Series: | Applied Sciences |
Subjects: | Gomoku; game artificial intelligence; replay memory; Deep-Q-Network; reinforcement learning |
Online Access: | https://www.mdpi.com/2076-3417/11/23/11162 |
_version_ | 1797508173761871872 |
---|---|
author | Bonwoo Gu; Yunsick Sung |
author_facet | Bonwoo Gu; Yunsick Sung |
author_sort | Bonwoo Gu |
collection | DOAJ |
description | A Deep-Q-Network (DQN) controls a virtual agent at the level of a human player, using only screenshots as inputs. Replay memory selects a limited number of experience replays according to an arbitrary batch size and updates them using the associated Q-function. Hence, relatively few experience replays of different states are utilized when the number of states is fixed and the states of the randomly selected transitions become identical or similar. The DQN may not be applicable in some environments where the learning process must use more experience replays than the limited batch size allows. In addition, because it is unknown whether each action can be executed, the amount of repetitive learning increases as more non-executable actions are selected. In this study, an enhanced DQN framework is proposed to resolve the batch-size problem and reduce the learning time of a DQN in an environment with numerous non-executable actions. In the proposed framework, non-executable actions are filtered out to reduce the number of selectable actions and identify the optimal action for the current state. The proposed method was validated in Gomoku, a strategy board game in which the application of a traditional DQN would be difficult. An illustrative action-masking sketch follows the record below. |
first_indexed | 2024-03-10T04:58:32Z |
format | Article |
id | doaj.art-c26b7fd444b14e07a39d85efa1ccb4b9 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T04:58:32Z |
publishDate | 2021-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
affiliation | Department of Multimedia Engineering, Dongguk University-Seoul, Seoul 04620, Korea (Bonwoo Gu; Yunsick Sung) |
doi | 10.3390/app112311162 |
citation | Applied Sciences, vol. 11, no. 23, article 11162 (2021-11-01) |
title | Enhanced DQN Framework for Selecting Actions and Updating Replay Memory Considering Massive Non-Executable Actions |
topic | Gomoku; game artificial intelligence; replay memory; Deep-Q-Network; reinforcement learning |
url | https://www.mdpi.com/2076-3417/11/23/11162 |
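The core idea in the abstract, filtering non-executable actions so the agent only chooses among moves it can actually play, can be illustrated with a minimal sketch. The Python/NumPy code below is an assumption-laden illustration, not the authors' implementation: the board encoding, the `executable_mask` and `select_action` helpers, and the random Q-values standing in for a trained network are all hypothetical. In Gomoku, the executable actions are exactly the empty intersections of the board, so non-executable actions can be masked out before the epsilon-greedy choice.

```python
import numpy as np

BOARD_SIZE = 15                       # standard Gomoku board size (assumption)
N_ACTIONS = BOARD_SIZE * BOARD_SIZE   # one action per board intersection


def executable_mask(board: np.ndarray) -> np.ndarray:
    """Boolean mask over all actions: only empty cells (value 0) are executable."""
    return board.reshape(-1) == 0


def select_action(q_values: np.ndarray, board: np.ndarray,
                  epsilon: float, rng: np.random.Generator) -> int:
    """Epsilon-greedy action selection restricted to executable actions.

    q_values : shape (N_ACTIONS,), Q-network output for the current state
    board    : shape (BOARD_SIZE, BOARD_SIZE); 0 = empty, +1/-1 = placed stones
    """
    mask = executable_mask(board)
    if rng.random() < epsilon:
        # Explore only among executable actions, so no interaction step is
        # wasted on a move the environment would reject.
        return int(rng.choice(np.flatnonzero(mask)))
    # Exploit: push Q-values of non-executable actions to -inf before the argmax.
    masked_q = np.where(mask, q_values, -np.inf)
    return int(np.argmax(masked_q))


# Usage example with random Q-values standing in for a trained network's output.
rng = np.random.default_rng(seed=0)
board = np.zeros((BOARD_SIZE, BOARD_SIZE), dtype=int)
board[7, 7] = 1                                   # one stone already on the board
q_values = rng.normal(size=N_ACTIONS)
action = select_action(q_values, board, epsilon=0.1, rng=rng)
assert executable_mask(board)[action]             # the chosen move is always legal
```

The same mask would also be applied when forming the bootstrapped target term (the maximum over next-state Q-values), so that non-executable moves never contribute to an update; how the proposed framework additionally manages replay memory under the batch-size constraint is described in the paper itself.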