Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets


Bibliographic Details
Main Authors: Peer Nagy, Jan-Peter Calliess, Stefan Zohren
Format: Article
Language: English
Published: Frontiers Media S.A., 2023-09-01
Series: Frontiers in Artificial Intelligence
Subjects: limit order books; quantitative finance; reinforcement learning; LOBSTER; algorithmic trading
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2023.1151003/full
Collection: DOAJ
Description: We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently of a concrete forecasting algorithm, we study the performance of our approach utilizing synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placement that outperforms a heuristic benchmark trading strategy having access to the same signal.
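The dueling and double Q-learning components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function names, the 3-action example, and the discount factor are all assumptions made for illustration. The dueling architecture decomposes Q(s, a) into a state value V(s) and advantages A(s, a), while double Q-learning lets the online network select the next action and the target network evaluate it:

```python
import numpy as np

def dueling_q_values(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a').

    Subtracting the mean advantage makes the decomposition identifiable.
    """
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

def double_q_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double Q-learning target: the online network picks the next action,
    the target network evaluates it, which reduces overestimation bias."""
    best_action = int(np.argmax(q_online_next))
    return reward + gamma * (1.0 - done) * q_target_next[best_action]

# Illustrative example with 3 hypothetical actions
q = dueling_q_values(np.array([1.0]), np.array([0.5, -0.5, 0.0]))
target = double_q_target(1.0, 0.0,
                         q_online_next=np.array([0.2, 0.9, 0.1]),
                         q_target_next=np.array([0.3, 0.8, 0.2]))
```

In the APEX setup the abstract refers to, many asynchronous actors would generate transitions like these and feed a shared prioritized replay buffer, from which a single learner samples.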
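The synthetic alpha signals described above, i.e. forward-looking returns perturbed with varying levels of noise, might be generated along these lines. This is a hedged sketch only: the horizon, the noise parameterization, and all names are assumptions, not details taken from the paper.

```python
import numpy as np

def synthetic_alpha(prices, horizon=10, noise_std=0.5, seed=0):
    """Perturb forward-looking returns with Gaussian noise to obtain a
    directional signal of controllable quality: noise_std=0 yields a
    perfect-foresight signal, while large noise_std approaches pure noise."""
    rng = np.random.default_rng(seed)
    prices = np.asarray(prices, dtype=float)
    # Forward return over `horizon` steps (undefined for the last `horizon` points)
    fwd_ret = prices[horizon:] / prices[:-horizon] - 1.0
    # Noise scaled relative to the dispersion of the true forward returns
    noise = rng.normal(0.0, noise_std * fwd_ret.std(), size=fwd_ret.shape)
    return fwd_ret + noise
```

Sweeping `noise_std` then lets one study how trading performance degrades as signal quality falls, independently of any concrete forecasting model.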
Record ID: doaj.art-0f9348e74457474ba9d2f4e4778cb098
Institution: Directory of Open Access Journals
ISSN: 2624-8212
DOI: 10.3389/frai.2023.1151003
Volume: 6
Affiliations:
Peer Nagy: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
Jan-Peter Calliess: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
Stefan Zohren: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom; Man Group, London, United Kingdom; Alan Turing Institute, London, United Kingdom
Keywords: limit order books; quantitative finance; reinforcement learning; LOBSTER; algorithmic trading