Practical Application of Deep Reinforcement Learning to Optimal Trade Execution
Although deep reinforcement learning (DRL) has recently emerged as a promising technique for optimal trade execution, two problems remain unsolved: (1) the lack of a generalized model for a large collection of stocks and execution time horizons; and (2) the inability to accurately train algorithms due to the discrepancy between the simulation environment and the real market. In this article, we address the two issues by utilizing a widely used reinforcement learning (RL) algorithm called proximal policy optimization (PPO) with a long short-term memory (LSTM) network and by building our proprietary order execution simulation environment based on historical level 3 market data of the Korea Stock Exchange (KRX). This paper, to the best of our knowledge, is the first to achieve generalization across 50 stocks and across an execution time horizon ranging from 165 to 380 min along with dynamic target volume. The experimental results demonstrate that the proposed algorithm outperforms the popular benchmark, the volume-weighted average price (VWAP), highlighting the potential use of DRL for optimal trade execution in real-world financial markets. Furthermore, our algorithm is the first commercialized DRL-based optimal trade execution algorithm in the South Korean stock market.
Main Authors: | Woo Jae Byun, Bumkyu Choi, Seongmin Kim, Joohyun Jo |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | FinTech |
Subjects: | deep reinforcement learning; optimal trade execution; artificial intelligence; market microstructure; financial application |
Online Access: | https://www.mdpi.com/2674-1032/2/3/23 |
_version_ | 1797580185818628096 |
author | Woo Jae Byun; Bumkyu Choi; Seongmin Kim; Joohyun Jo |
author_sort | Woo Jae Byun |
collection | DOAJ |
description | Although deep reinforcement learning (DRL) has recently emerged as a promising technique for optimal trade execution, two problems remain unsolved: (1) the lack of a generalized model for a large collection of stocks and execution time horizons; and (2) the inability to accurately train algorithms due to the discrepancy between the simulation environment and the real market. In this article, we address the two issues by utilizing a widely used reinforcement learning (RL) algorithm called proximal policy optimization (PPO) with a long short-term memory (LSTM) network and by building our proprietary order execution simulation environment based on historical level 3 market data of the Korea Stock Exchange (KRX). This paper, to the best of our knowledge, is the first to achieve generalization across 50 stocks and across an execution time horizon ranging from 165 to 380 min along with dynamic target volume. The experimental results demonstrate that the proposed algorithm outperforms the popular benchmark, the volume-weighted average price (VWAP), highlighting the potential use of DRL for optimal trade execution in real-world financial markets. Furthermore, our algorithm is the first commercialized DRL-based optimal trade execution algorithm in the South Korean stock market. (A short code sketch of the VWAP benchmark comparison appears after this record.) |
first_indexed | 2024-03-10T22:46:45Z |
format | Article |
id | doaj.art-e3f606a1cce14107a7deaf4fe8c897bc |
institution | Directory Open Access Journal |
issn | 2674-1032 |
language | English |
last_indexed | 2024-03-10T22:46:45Z |
publishDate | 2023-06-01 |
publisher | MDPI AG |
record_format | Article |
series | FinTech |
spelling | doaj.art-e3f606a1cce14107a7deaf4fe8c897bc; 2023-11-19T10:39:43Z; eng; MDPI AG; FinTech; ISSN 2674-1032; 2023-06-01; vol. 2, no. 3, pp. 414-429; 10.3390/fintech2030023; Practical Application of Deep Reinforcement Learning to Optimal Trade Execution; Woo Jae Byun, Bumkyu Choi, Seongmin Kim, Joohyun Jo (Qraft Technologies, Inc., 3040 Three IFC, 10 Gukjegeumyung-ro, Yeongdeungpo-gu, Seoul 07326, Republic of Korea); https://www.mdpi.com/2674-1032/2/3/23 |
title | Practical Application of Deep Reinforcement Learning to Optimal Trade Execution |
topic | deep reinforcement learning; optimal trade execution; artificial intelligence; market microstructure; financial application |
url | https://www.mdpi.com/2674-1032/2/3/23 |
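Because the abstract evaluates the agent against the volume-weighted average price (VWAP), the sketch below shows how such a comparison is commonly computed. This is not the authors' implementation: the function names, the sign convention, and the per-interval price/volume inputs are assumptions; the paper's own simulator and metric definitions are described in the full text at the URL above.

```python
# Minimal sketch (not the paper's implementation): market VWAP over an execution
# horizon and the agent's performance relative to it, in basis points.
# Function names, sign convention, and input format are assumptions.
import numpy as np

def vwap(prices, volumes):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    return float((prices * volumes).sum() / volumes.sum())

def performance_vs_vwap_bps(fill_prices, fill_volumes,
                            market_prices, market_volumes, side="buy"):
    """Signed gap between the agent's average fill price and the market VWAP,
    in basis points. Positive means the agent beat the benchmark
    (bought below / sold above the market VWAP)."""
    agent_price = vwap(fill_prices, fill_volumes)
    market_vwap = vwap(market_prices, market_volumes)
    sign = 1.0 if side == "sell" else -1.0
    return sign * (agent_price - market_vwap) / market_vwap * 1e4

# Toy usage: a buy order filled slightly below the market VWAP scores positive.
if __name__ == "__main__":
    fills = ([100.0, 99.8, 99.9], [300, 400, 300])                 # agent fill prices / shares
    market = ([100.2, 100.0, 99.9, 100.1], [5e4, 6e4, 4e4, 5e4])   # per-interval market trades
    print(round(performance_vs_vwap_bps(*fills, *market, side="buy"), 2), "bps")
```

A consistently positive gap of this kind over many stocks and horizons is what "outperforms VWAP" refers to in the abstract; the exact reward and evaluation formulas used by the PPO-LSTM agent in the paper may differ from this sketch.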