Double Deep Q-Network with Dynamic Bootstrapping for Real-Time Isolated Signal Control: A Traffic Engineering Perspective

Real-time isolated signal control (RISC) at an intersection is of interest in the field of traffic engineering. Energizing RISC with reinforcement learning (RL) is feasible and necessary. Previous studies paid less attention to traffic engineering considerations and under-utilized traffic expertise...

Bibliographic Details
Main Authors: Qiming Zheng, Hongfeng Xu, Jingyun Chen, Dong Zhang, Kun Zhang, Guolei Tang
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Applied Sciences
Subjects: double deep Q-network; traffic signal control; traffic simulation; reinforcement learning
Online Access: https://www.mdpi.com/2076-3417/12/17/8641
_version_ 1797496321252261888
author Qiming Zheng
Hongfeng Xu
Jingyun Chen
Dong Zhang
Kun Zhang
Guolei Tang
author_sort Qiming Zheng
collection DOAJ
description Real-time isolated signal control (RISC) at an intersection is of interest in the field of traffic engineering. Energizing RISC with reinforcement learning (RL) is feasible and necessary. Previous studies paid less attention to traffic engineering considerations and under-utilized traffic expertise to construct RL tasks. This study profiles the single-ring RISC problem from the perspective of traffic engineers and improves a prevailing RL method for solving it. Through a qualitative applicability analysis, we choose the double deep Q-network (DDQN) as the base method. A single agent is deployed at the intersection. The reward is defined in terms of vehicle departures so that the agent’s behavior is properly encouraged or punished. The action is to determine the remaining green time for the current vehicle phase. The state is represented in a grid-based form. To update action values in time-varying environments, we present a temporal-difference algorithm, TD(Dyn), which performs dynamic bootstrapping over the variable interval between selected actions. To accelerate training, we propose a data augmentation method based on intersection symmetry. Our improved DDQN, termed D3ynQN, is subject to the signal timing constraints used in engineering practice. Experiments at a close-to-reality intersection indicate that, with D3ynQN and a non-delay-based reward, the agent acquires useful knowledge and significantly outperforms a fully-actuated control technique in reducing average vehicle delay.
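The description says action values are updated by bootstrapping over a variable interval between selected actions, but this record does not give the TD(Dyn) update rule. The sketch below is therefore only one plausible reading, not the authors' algorithm: a standard double-DQN target in which the discount exponent equals the number of simulation steps that elapsed between two decisions. The function names, the per-step reward list, and the Q-value lists are all hypothetical and introduced purely for illustration.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum gamma**k * r_k over the per-step rewards collected between two decisions."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def variable_interval_ddqn_target(
    rewards: Sequence[float],        # one reward per step since the last action (assumed layout)
    next_q_online: Sequence[float],  # online-network Q-values at the next decision point
    next_q_target: Sequence[float],  # target-network Q-values at the next decision point
    gamma: float,
    done: bool,
) -> float:
    """Double-DQN target whose bootstrap term is discounted by the elapsed interval.

    Assumption: the interval length equals len(rewards). This is a generic
    semi-Markov style target, not the TD(Dyn) rule from the paper.
    """
    g = discounted_return(rewards, gamma)
    if done:
        return g  # no bootstrapping past a terminal state
    # Double DQN: the online network selects the action, the target network evaluates it.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    interval = len(rewards)
    return g + (gamma ** interval) * next_q_target[best_action]


if __name__ == "__main__":
    # Three steps elapsed between decisions, so the bootstrap term uses gamma**3.
    target = variable_interval_ddqn_target(
        rewards=[1.0, 2.0, 3.0],
        next_q_online=[0.1, 0.9, 0.3],
        next_q_target=[4.0, 8.0, 6.0],
        gamma=0.5,
        done=False,
    )
    print(target)
```

With a fixed decision interval of one step this reduces to the usual double-DQN target; the only change here is that the exponent on the discount factor follows the length of the interval.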
first_indexed 2024-03-10T03:02:59Z
format Article
id doaj.art-ae960cd2231c4cd6953a8ede25f21c96
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T03:02:59Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-ae960cd2231c4cd6953a8ede25f21c96; 2023-11-23T12:43:40Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-08-01; vol. 12, no. 17, art. 8641; DOI 10.3390/app12178641
Qiming Zheng (School of Transportation and Logistics, Dalian University of Technology, Dalian 116024, China)
Hongfeng Xu (School of Transportation and Logistics, Dalian University of Technology, Dalian 116024, China)
Jingyun Chen (School of Transportation and Logistics, Dalian University of Technology, Dalian 116024, China)
Dong Zhang (School of Transportation and Logistics, Dalian University of Technology, Dalian 116024, China)
Kun Zhang (School of Transportation and Logistics, Dalian University of Technology, Dalian 116024, China)
Guolei Tang (School of Port, Waterway and Ocean Engineering, Dalian University of Technology, Dalian 116024, China)
title Double Deep Q-Network with Dynamic Bootstrapping for Real-Time Isolated Signal Control: A Traffic Engineering Perspective
topic double deep Q-network
traffic signal control
traffic simulation
reinforcement learning
url https://www.mdpi.com/2076-3417/12/17/8641