Practical Application of Deep Reinforcement Learning to Optimal Trade Execution
Although deep reinforcement learning (DRL) has recently emerged as a promising technique for optimal trade execution, two problems remain unsolved: (1) the lack of a generalized model for a large collection of stocks and execution time horizons; and (2) the inability to accurately train algorithms due to the discrepancy between the simulation environment and the real market. In this article, we address the two issues by utilizing a widely used reinforcement learning (RL) algorithm called proximal policy optimization (PPO) with a long short-term memory (LSTM) network and by building our proprietary order execution simulation environment based on historical level 3 market data of the Korea Stock Exchange (KRX). This paper, to the best of our knowledge, is the first to achieve generalization across 50 stocks and across an execution time horizon ranging from 165 to 380 min along with dynamic target volume. The experimental results demonstrate that the proposed algorithm outperforms the popular benchmark, the volume-weighted average price (VWAP), highlighting the potential use of DRL for optimal trade execution in real-world financial markets. Furthermore, our algorithm is the first commercialized DRL-based optimal trade execution algorithm in the South Korean stock market.
Main Authors: | Woo Jae Byun, Bumkyu Choi, Seongmin Kim, Joohyun Jo |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | FinTech |
Subjects: | deep reinforcement learning; optimal trade execution; artificial intelligence; market microstructure; financial application |
Online Access: | https://www.mdpi.com/2674-1032/2/3/23 |
_version_ | 1797580185818628096 |
author | Woo Jae Byun; Bumkyu Choi; Seongmin Kim; Joohyun Jo |
author_sort | Woo Jae Byun |
collection | DOAJ |
description | Although deep reinforcement learning (DRL) has recently emerged as a promising technique for optimal trade execution, two problems remain unsolved: (1) the lack of a generalized model for a large collection of stocks and execution time horizons; and (2) the inability to accurately train algorithms due to the discrepancy between the simulation environment and the real market. In this article, we address the two issues by utilizing a widely used reinforcement learning (RL) algorithm called proximal policy optimization (PPO) with a long short-term memory (LSTM) network and by building our proprietary order execution simulation environment based on historical level 3 market data of the Korea Stock Exchange (KRX). This paper, to the best of our knowledge, is the first to achieve generalization across 50 stocks and across an execution time horizon ranging from 165 to 380 min along with dynamic target volume. The experimental results demonstrate that the proposed algorithm outperforms the popular benchmark, the volume-weighted average price (VWAP), highlighting the potential use of DRL for optimal trade execution in real-world financial markets. Furthermore, our algorithm is the first commercialized DRL-based optimal trade execution algorithm in the South Korean stock market. (A short code sketch of the VWAP benchmark comparison appears after this record.) |
first_indexed | 2024-03-10T22:46:45Z |
format | Article |
id | doaj.art-e3f606a1cce14107a7deaf4fe8c897bc |
institution | Directory Open Access Journal |
issn | 2674-1032 |
language | English |
last_indexed | 2024-03-10T22:46:45Z |
publishDate | 2023-06-01 |
publisher | MDPI AG |
record_format | Article |
series | FinTech |
spelling | doaj.art-e3f606a1cce14107a7deaf4fe8c897bc; 2023-11-19T10:39:43Z; eng; MDPI AG; FinTech; ISSN 2674-1032; 2023-06-01; vol. 2, no. 3, pp. 414-429; 10.3390/fintech2030023; Practical Application of Deep Reinforcement Learning to Optimal Trade Execution; Woo Jae Byun, Bumkyu Choi, Seongmin Kim, Joohyun Jo (Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea); https://www.mdpi.com/2674-1032/2/3/23 |
title | Practical Application of Deep Reinforcement Learning to Optimal Trade Execution |
topic | deep reinforcement learning; optimal trade execution; artificial intelligence; market microstructure; financial application |
url | https://www.mdpi.com/2674-1032/2/3/23 |
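Because the abstract evaluates the agent against the volume-weighted average price (VWAP), the sketch below shows how such a comparison is commonly computed. This is not the authors' implementation: the function names, the sign convention, and the per-interval price/volume inputs are assumptions; the paper's own simulator and metric definitions are described in the full text at the URL above.

```python
# Minimal sketch (not the paper's implementation): market VWAP over an execution
# horizon and the agent's performance relative to it, in basis points.
# Function names, sign convention, and input format are assumptions.
import numpy as np

def vwap(prices, volumes):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    return float((prices * volumes).sum() / volumes.sum())

def performance_vs_vwap_bps(fill_prices, fill_volumes,
                            market_prices, market_volumes, side="buy"):
    """Signed gap between the agent's average fill price and the market VWAP,
    in basis points. Positive means the agent beat the benchmark
    (bought below / sold above the market VWAP)."""
    agent_price = vwap(fill_prices, fill_volumes)
    market_vwap = vwap(market_prices, market_volumes)
    sign = 1.0 if side == "sell" else -1.0
    return sign * (agent_price - market_vwap) / market_vwap * 1e4

# Toy usage: a buy order filled slightly below the market VWAP scores positive.
if __name__ == "__main__":
    fills = ([100.0, 99.8, 99.9], [300, 400, 300])                 # agent fill prices / shares
    market = ([100.2, 100.0, 99.9, 100.1], [5e4, 6e4, 4e4, 5e4])   # per-interval market trades
    print(round(performance_vs_vwap_bps(*fills, *market, side="buy"), 2), "bps")
```

A consistently positive gap of this kind over many stocks and horizons is what "outperforms VWAP" refers to in the abstract; the exact reward and evaluation formulas used by the PPO-LSTM agent in the paper may differ from this sketch.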