Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets


Bibliographic Details
Main Authors: Peer Nagy, Jan-Peter Calliess, Stefan Zohren
Format: Article
Language: English
Published: Frontiers Media S.A., 2023-09-01
Series: Frontiers in Artificial Intelligence
Subjects: limit order books; quantitative finance; reinforcement learning; LOBSTER; algorithmic trading
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2023.1151003/full
Collection: DOAJ
Description: We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently of a concrete forecasting algorithm, we study the performance of our approach utilizing synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placement that outperforms a heuristic benchmark trading strategy having access to the same signal.
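The dueling and double Q-learning components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function names, the 3-action example, and the discount factor are all assumptions made for illustration. The dueling architecture decomposes Q(s, a) into a state value V(s) and advantages A(s, a), while double Q-learning lets the online network select the next action and the target network evaluate it:

```python
import numpy as np

def dueling_q_values(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a').

    Subtracting the mean advantage makes the decomposition identifiable.
    """
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

def double_q_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double Q-learning target: the online network picks the next action,
    the target network evaluates it, which reduces overestimation bias."""
    best_action = int(np.argmax(q_online_next))
    return reward + gamma * (1.0 - done) * q_target_next[best_action]

# Illustrative example with 3 hypothetical actions
q = dueling_q_values(np.array([1.0]), np.array([0.5, -0.5, 0.0]))
target = double_q_target(1.0, 0.0,
                         q_online_next=np.array([0.2, 0.9, 0.1]),
                         q_target_next=np.array([0.3, 0.8, 0.2]))
```

In the APEX setup the abstract refers to, many asynchronous actors would generate transitions like these and feed a shared prioritized replay buffer, from which a single learner samples.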
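The synthetic alpha signals described above, i.e. forward-looking returns perturbed with varying levels of noise, might be generated along these lines. This is a hedged sketch only: the horizon, the noise parameterization, and all names are assumptions, not details taken from the paper.

```python
import numpy as np

def synthetic_alpha(prices, horizon=10, noise_std=0.5, seed=0):
    """Perturb forward-looking returns with Gaussian noise to obtain a
    directional signal of controllable quality: noise_std=0 yields a
    perfect-foresight signal, while large noise_std approaches pure noise."""
    rng = np.random.default_rng(seed)
    prices = np.asarray(prices, dtype=float)
    # Forward return over `horizon` steps (undefined for the last `horizon` points)
    fwd_ret = prices[horizon:] / prices[:-horizon] - 1.0
    # Noise scaled relative to the dispersion of the true forward returns
    noise = rng.normal(0.0, noise_std * fwd_ret.std(), size=fwd_ret.shape)
    return fwd_ret + noise
```

Sweeping `noise_std` then lets one study how trading performance degrades as signal quality falls, independently of any concrete forecasting model.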
Record ID: doaj.art-0f9348e74457474ba9d2f4e4778cb098
Institution: Directory of Open Access Journals
ISSN: 2624-8212
DOI: 10.3389/frai.2023.1151003
Volume: 6
Affiliations:
Peer Nagy: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
Jan-Peter Calliess: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
Stefan Zohren: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom; Man Group, London, United Kingdom; Alan Turing Institute, London, United Kingdom
Keywords: limit order books; quantitative finance; reinforcement learning; LOBSTER; algorithmic trading