Dynamic Fleet Management With Rewriting Deep Reinforcement Learning
Inefficient supply-demand matching makes fleet management a research hotspot in ride-sharing platforms. With the boom of mobile network services, effective vehicle dispatching is a promising way to narrow the supply-demand gap. In this article, we propose a QRewriter - Dueling Deep Q-Network...
Main Authors: | Wenqi Zhang, Qiang Wang, Jingjing Li, Chen Xu
---|---
Format: | Article
Language: | English
Published: | IEEE, 2020-01-01
Series: | IEEE Access
Subjects: | Deep reinforcement learning (DRL); fleet management; learn to improve; multi-agent
Online Access: | https://ieeexplore.ieee.org/document/9157835/
_version_ | 1811209776307109888 |
author | Wenqi Zhang; Qiang Wang; Jingjing Li; Chen Xu
author_facet | Wenqi Zhang; Qiang Wang; Jingjing Li; Chen Xu
author_sort | Wenqi Zhang |
collection | DOAJ |
description | Inefficient supply-demand matching makes fleet management a research hotspot in ride-sharing platforms. With the boom of mobile network services, effective vehicle dispatching is a promising way to narrow the supply-demand gap. In this article, we propose a QRewriter - Dueling Deep Q-Network (QRewriter-DDQN) algorithm that dispatches multiple available vehicles in advance to locations with high demand, so that more orders can be served. The QRewriter-DDQN algorithm factorizes into a Dueling Deep Q-Network (DDQN) module and a QRewriter module, which are parameterized by a neural network and a Q-table, respectively, and trained with Reinforcement Learning (RL) methods. In particular, the DDQN module uses the Kullback-Leibler (KL) divergence between the supply (available vehicles) and demand (orders) distributions as the excitation signal, capturing the complex dynamic variations of supply and demand. The QRewriter module then learns to improve the DDQN dispatching policy with a streamlined and effective Q-table learned by RL. Importantly, aggregating the QRewriter state into a low-dimensional meta state gives the DDQN dispatching policy more room for performance improvement. A simulator is designed to train and test QRewriter-DDQN; the experimental results show a significant improvement of QRewriter-DDQN in terms of order response rate.
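The abstract above outlines two learnable pieces: a DDQN whose reward ("excitation") is the KL distance between the supply and demand distributions, and a tabular QRewriter that keeps or rewrites the DDQN's dispatch decisions over a low-dimensional meta state. Below is a minimal Python sketch of those two ideas only; the grid discretization, the `kl_excitation` and `QRewriter` names, and all hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def kl_excitation(supply_counts, demand_counts, eps=1e-8):
    """Negative KL divergence between demand and supply distributions.

    A smaller gap between supply and demand yields a higher reward,
    in the spirit of the KL-based excitation the abstract describes;
    the paper's exact reward shaping may differ.
    """
    p = demand_counts / (demand_counts.sum() + eps)  # demand distribution
    q = supply_counts / (supply_counts.sum() + eps)  # supply distribution
    kl = np.sum(p * np.log((p + eps) / (q + eps)))
    return -kl  # reward rises as supply matches demand

class QRewriter:
    """Tabular Q-learning over a low-dimensional meta state.

    Observes the DDQN's proposed dispatch action and learns whether
    to keep or rewrite it, using an ordinary one-step Q-table update.
    """
    def __init__(self, n_meta_states, n_actions, alpha=0.1, gamma=0.9):
        self.q = np.zeros((n_meta_states, n_actions))
        self.alpha, self.gamma = alpha, gamma

    def rewrite(self, meta_state, ddqn_action, epsilon=0.05):
        # Follow the learned rewrite policy, with an epsilon fallback
        # to the DDQN's own proposal.
        if np.random.rand() < epsilon:
            return ddqn_action
        return int(np.argmax(self.q[meta_state]))

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning backup on the meta state.
        td_target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (td_target - self.q[s, a])
```

In use, the DDQN would propose a dispatch action per zone, `kl_excitation` would score the resulting supply-demand alignment as the reward, and `QRewriter.update` would be trained on the aggregated meta state so it can later keep or overwrite the DDQN proposal.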
first_indexed | 2024-04-12T04:45:49Z |
format | Article |
id | doaj.art-61fa1c3406614c5e9c4d6bab8b6b6af1 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-12T04:45:49Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-61fa1c3406614c5e9c4d6bab8b6b6af1; indexed 2022-12-22T03:47:31Z; English; IEEE; IEEE Access; ISSN 2169-3536; published 2020-01-01; Vol. 8, pp. 143333-143341; DOI 10.1109/ACCESS.2020.3014076; article no. 9157835; Dynamic Fleet Management With Rewriting Deep Reinforcement Learning; Wenqi Zhang (https://orcid.org/0000-0002-4482-6715), Qiang Wang (https://orcid.org/0000-0002-9392-475X), Jingjing Li, Chen Xu; National Engineering Laboratory for Mobile Network Technologies, Beijing University of Posts and Telecommunications, Beijing, China; https://ieeexplore.ieee.org/document/9157835/; keywords: Deep reinforcement learning (DRL); fleet management; learn to improve; multi-agent
spellingShingle | Wenqi Zhang; Qiang Wang; Jingjing Li; Chen Xu; Dynamic Fleet Management With Rewriting Deep Reinforcement Learning; IEEE Access; Deep reinforcement learning (DRL); fleet management; learn to improve; multi-agent
title | Dynamic Fleet Management With Rewriting Deep Reinforcement Learning |
title_full | Dynamic Fleet Management With Rewriting Deep Reinforcement Learning |
title_fullStr | Dynamic Fleet Management With Rewriting Deep Reinforcement Learning |
title_full_unstemmed | Dynamic Fleet Management With Rewriting Deep Reinforcement Learning |
title_short | Dynamic Fleet Management With Rewriting Deep Reinforcement Learning |
title_sort | dynamic fleet management with rewriting deep reinforcement learning |
topic | Deep reinforcement learning (DRL); fleet management; learn to improve; multi-agent
url | https://ieeexplore.ieee.org/document/9157835/ |
work_keys_str_mv | AT wenqizhang dynamicfleetmanagementwithrewritingdeepreinforcementlearning AT qiangwang dynamicfleetmanagementwithrewritingdeepreinforcementlearning AT jingjingli dynamicfleetmanagementwithrewritingdeepreinforcementlearning AT chenxu dynamicfleetmanagementwithrewritingdeepreinforcementlearning |