Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets
We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it...
Main Authors: | Peer Nagy, Jan-Peter Calliess, Stefan Zohren |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2023-09-01 |
Series: | Frontiers in Artificial Intelligence |
Subjects: | limit order books, quantitative finance, reinforcement learning, LOBSTER, algorithmic trading |
Online Access: | https://www.frontiersin.org/articles/10.3389/frai.2023.1151003/full |
_version_ | 1797673968432316416 |
---|---|
author | Peer Nagy Jan-Peter Calliess Stefan Zohren |
author_facet | Peer Nagy Jan-Peter Calliess Stefan Zohren |
author_sort | Peer Nagy |
collection | DOAJ |
description | We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilizing synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal. |
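The synthetic alpha signals described above are forward-looking returns perturbed with varying levels of noise. A minimal sketch of that construction is below; the function name `synthetic_alpha` and its parameters are illustrative assumptions, not code from the paper, which does not specify the exact noise model:

```python
import numpy as np

def synthetic_alpha(prices: np.ndarray, horizon: int, noise_std: float,
                    rng=None) -> np.ndarray:
    """Build a noisy directional signal from forward-looking returns.

    prices    -- mid-price series
    horizon   -- look-ahead window (in steps) for the 'true' forward return
    noise_std -- std. dev. of the Gaussian perturbation; larger = weaker signal
    """
    rng = rng or np.random.default_rng(0)
    # Forward return over `horizon` steps; the last `horizon` entries have no
    # look-ahead available and are left at zero.
    fwd_ret = np.zeros_like(prices, dtype=float)
    fwd_ret[:-horizon] = prices[horizon:] / prices[:-horizon] - 1.0
    # Perturb the perfect-foresight return to control signal quality.
    return fwd_ret + rng.normal(0.0, noise_std, size=prices.shape)

# Example: a random-walk price path and a moderately noisy signal.
prices = 100 + np.cumsum(np.random.default_rng(1).normal(0, 0.1, 1000))
signal = synthetic_alpha(prices, horizon=10, noise_std=0.001)
```

With `noise_std=0` the signal is perfect foresight; sweeping `noise_std` upward mimics the paper's study of RL performance under progressively weaker forecasts.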
first_indexed | 2024-03-11T21:53:18Z |
format | Article |
id | doaj.art-0f9348e74457474ba9d2f4e4778cb098 |
institution | Directory Open Access Journal |
issn | 2624-8212 |
language | English |
last_indexed | 2024-03-11T21:53:18Z |
publishDate | 2023-09-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Artificial Intelligence |
spelling | doaj.art-0f9348e74457474ba9d2f4e4778cb098 | 2023-09-26T06:07:26Z | eng | Frontiers Media S.A. | Frontiers in Artificial Intelligence | 2624-8212 | 2023-09-01 | vol. 6 | 10.3389/frai.2023.1151003 | 1151003 | Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets | Peer Nagy (Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom); Jan-Peter Calliess (Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom); Stefan Zohren (Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom; Man Group, London, United Kingdom; Alan Turing Institute, London, United Kingdom) | [abstract as in the description field above] | https://www.frontiersin.org/articles/10.3389/frai.2023.1151003/full | limit order books; quantitative finance; reinforcement learning; LOBSTER; algorithmic trading |
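The Dueling Q-learning architecture named in the record's title decomposes the action value into a state value V(s) and per-action advantages A(s, a). A minimal numeric sketch of the standard mean-subtracted aggregation follows; the helper name `dueling_q` and the numbers are illustrative, not the authors' network:

```python
import numpy as np

def dueling_q(value: float, advantages: np.ndarray) -> np.ndarray:
    # Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    # Subtracting the mean advantage makes the V/A split identifiable,
    # since adding a constant to A and subtracting it from V would
    # otherwise leave Q unchanged.
    return value + advantages - advantages.mean()

# Three actions (e.g. buy / hold / sell); values are illustrative only.
q = dueling_q(value=1.0, advantages=np.array([0.5, 0.0, -0.5]))
# Here the advantages already have zero mean, so q == [1.5, 1.0, 0.5].
```

In the full agent this aggregation sits atop a neural network with separate value and advantage heads; the Double-Q and APEX components then govern how the targets are computed and how experience is replayed asynchronously.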
spellingShingle | Peer Nagy Jan-Peter Calliess Stefan Zohren Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets Frontiers in Artificial Intelligence limit order books quantitative finance reinforcement learning LOBSTER algorithmic trading |
title | Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets |
title_full | Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets |
title_fullStr | Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets |
title_full_unstemmed | Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets |
title_short | Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets |
title_sort | asynchronous deep double dueling q learning for trading signal execution in limit order book markets |
topic | limit order books quantitative finance reinforcement learning LOBSTER algorithmic trading |
url | https://www.frontiersin.org/articles/10.3389/frai.2023.1151003/full |
work_keys_str_mv | AT peernagy asynchronousdeepdoubleduelingqlearningfortradingsignalexecutioninlimitorderbookmarkets AT janpetercalliess asynchronousdeepdoubleduelingqlearningfortradingsignalexecutioninlimitorderbookmarkets AT stefanzohren asynchronousdeepdoubleduelingqlearningfortradingsignalexecutioninlimitorderbookmarkets |