Practical Application of Deep Reinforcement Learning to Optimal Trade Execution

Although deep reinforcement learning (DRL) has recently emerged as a promising technique for optimal trade execution, two problems remain unsolved: (1) the lack of a generalized model for a large collection of stocks and execution time horizons; and (2) the inability to accurately train algorithms due to the discrepancy between the simulation environment and the real market. In this article, we address the two issues by utilizing a widely used reinforcement learning (RL) algorithm called proximal policy optimization (PPO) with a long short-term memory (LSTM) network and by building our proprietary order execution simulation environment based on historical level 3 market data of the Korea Stock Exchange (KRX). This paper, to the best of our knowledge, is the first to achieve generalization across 50 stocks and across an execution time horizon ranging from 165 to 380 min along with dynamic target volume. The experimental results demonstrate that the proposed algorithm outperforms the popular benchmark, the volume-weighted average price (VWAP), highlighting the potential use of DRL for optimal trade execution in real-world financial markets. Furthermore, our algorithm is the first commercialized DRL-based optimal trade execution algorithm in the South Korean stock market.
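
The abstract benchmarks the learned execution policy against VWAP. Purely as an illustration (not taken from the article, and with hypothetical trade data and field names), the following minimal Python sketch shows how an execution's average fill price can be compared with the market VWAP over the same window:

```python
# Minimal sketch (not from the article): comparing an execution's average
# fill price against the market VWAP over the same window. All trades below
# are hypothetical stand-ins for real KRX level 3 trade records.

from dataclasses import dataclass
from typing import List


@dataclass
class Trade:
    price: float   # traded price
    volume: float  # traded quantity


def vwap(trades: List[Trade]) -> float:
    """Volume-weighted average price: sum(p_i * v_i) / sum(v_i)."""
    total_volume = sum(t.volume for t in trades)
    if total_volume == 0:
        raise ValueError("no volume traded in the window")
    return sum(t.price * t.volume for t in trades) / total_volume


if __name__ == "__main__":
    # Hypothetical market trades during the execution horizon.
    market = [Trade(100.0, 500), Trade(100.5, 300), Trade(99.8, 200)]
    # Hypothetical fills produced by the execution algorithm (a sell order).
    fills = [Trade(100.2, 60), Trade(100.4, 40)]

    market_vwap = vwap(market)
    avg_fill = vwap(fills)
    # For a sell order, an average fill price above VWAP is favourable;
    # for a buy order the sign flips.
    print(f"market VWAP : {market_vwap:.4f}")
    print(f"avg fill    : {avg_fill:.4f}")
    print(f"slippage vs VWAP (bps): {(avg_fill - market_vwap) / market_vwap * 1e4:.2f}")
```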

Bibliographic Details
Main Authors: Woo Jae Byun, Bumkyu Choi, Seongmin Kim, Joohyun Jo
Format: Article
Language: English
Published: MDPI AG, 2023-06-01
Series: FinTech
Subjects: deep reinforcement learning; optimal trade execution; artificial intelligence; market microstructure; financial application
Online Access: https://www.mdpi.com/2674-1032/2/3/23
ISSN: 2674-1032
DOI: 10.3390/fintech2030023
Citation: FinTech 2023, 2(3), 414-429
Author Affiliations: Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea (all four authors)