Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning
Control intelligence is a field in which target objectives typically trade off against one another, and researchers have long sought artificial intelligence that can achieve those objectives. Multi-objective deep reinforcement learning has proven able to meet this need. In particular, mul...
Main Authors: | Man-Je Kim, Hyunsoo Park, Chang Wook Ahn |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-03-01 |
Series: | Electronics |
Subjects: | reinforcement learning, multi-objective optimization, real-time environment |
Online Access: | https://www.mdpi.com/2079-9292/11/7/1069 |
_version_ | 1797439721643704320 |
---|---|
author | Man-Je Kim Hyunsoo Park Chang Wook Ahn |
author_facet | Man-Je Kim Hyunsoo Park Chang Wook Ahn |
author_sort | Man-Je Kim |
collection | DOAJ |
description | Control intelligence is a field in which target objectives typically trade off against one another, and researchers have long sought artificial intelligence that can achieve those objectives. Multi-objective deep reinforcement learning has proven able to meet this need. In particular, multi-objective deep reinforcement learning methods based on policy optimization lead the way in optimizing control intelligence. However, multi-objective reinforcement learning has difficulty finding diverse Pareto-optimal solutions because of the greedy nature of reinforcement learning. We propose a policy assimilation method to address this problem. The method was applied to MO-V-MPO, a preference-based multi-objective reinforcement learning algorithm, to increase diversity. Its performance was verified through experiments in a continuous control environment. |
first_indexed | 2024-03-09T11:57:14Z |
format | Article |
id | doaj.art-1429c01f35154a499f9688c9eb8850ad |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
language | English |
last_indexed | 2024-03-09T11:57:14Z |
publishDate | 2022-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-1429c01f35154a499f9688c9eb8850ad; 2023-11-30T23:06:57Z; eng; MDPI AG; Electronics; 2079-9292; 2022-03-01; 11; 7; 1069; 10.3390/electronics11071069; Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning; Man-Je Kim (0), Hyunsoo Park (1), Chang Wook Ahn (2); (0) AI Graduate School, Gwangju Institute of Science and Technology, Gwangju 61005, Korea; (1) NCSOFT, Seongnam-si 13494, Korea; (2) AI Graduate School, Gwangju Institute of Science and Technology, Gwangju 61005, Korea; Control intelligence is a typical field where there is a trade-off between target objectives, and researchers in this field have longed for artificial intelligence that achieves the target objectives. Multi-objective deep reinforcement learning was sufficient to satisfy this need. In particular, multi-objective deep reinforcement learning methods based on policy optimization are leading the optimization of control intelligence. However, multi-objective reinforcement learning has difficulties when finding various Pareto optimals of multi-objectives due to the greedy nature of reinforcement learning. We propose a method of policy assimilation to solve this problem. This method was applied to MO-V-MPO, one of preference-based multi-objective reinforcement learning, to increase diversity. The performance of this method has been verified through experiments in a continuous control environment.; https://www.mdpi.com/2079-9292/11/7/1069; reinforcement learning; multi-objective optimization; real-time environment |
spellingShingle | Man-Je Kim Hyunsoo Park Chang Wook Ahn Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning Electronics reinforcement learning multi-objective optimization real-time environment |
title | Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning |
title_full | Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning |
title_fullStr | Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning |
title_full_unstemmed | Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning |
title_short | Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning |
title_sort | nondominated policy guided learning in multi objective reinforcement learning |
topic | reinforcement learning multi-objective optimization real-time environment |
url | https://www.mdpi.com/2079-9292/11/7/1069 |
work_keys_str_mv | AT manjekim nondominatedpolicyguidedlearninginmultiobjectivereinforcementlearning AT hyunsoopark nondominatedpolicyguidedlearninginmultiobjectivereinforcementlearning AT changwookahn nondominatedpolicyguidedlearninginmultiobjectivereinforcementlearning |