Reinforcement Learning Approach to Stochastic Vehicle Routing Problem With Correlated Demands
We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental d...
Main Authors: | Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
Subjects: | Reinforcement learning, stochastic optimization, vehicle routing problem |
Online Access: | https://ieeexplore.ieee.org/document/10223206/ |
_version_ | 1797736961196163072 |
---|---|
author | Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac |
author_sort | Zangir Iklassov |
collection | DOAJ |
description | We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities for improved routing solutions. Our approach bridges the gap in the application of RL to VRPSD and consists of a parameterized stochastic policy, optimized with a policy gradient algorithm, that generates the sequence of actions forming the solution. Our model outperforms previous state-of-the-art metaheuristics and is robust to changes in the environment, such as the supply type, vehicle capacity, correlation, and noise levels of demand. Moreover, the model can be easily retrained for different VRPSD scenarios by observing the reward signals and following feasibility constraints, making it highly flexible and scalable. These findings highlight the potential of RL to enhance transportation efficiency and mitigate its environmental impact in stochastic routing problems. Our implementation is available at https://github.com/Zangir/SVRP. |
first_indexed | 2024-03-12T13:21:29Z |
format | Article |
id | doaj.art-8106d37416784508906865d6f0a2a522 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-12T13:21:29Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-8106d37416784508906865d6f0a2a522; 2023-08-25T23:00:57Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 87958-87969; DOI 10.1109/ACCESS.2023.3306076; document 10223206; Reinforcement Learning Approach to Stochastic Vehicle Routing Problem With Correlated Demands; Zangir Iklassov (https://orcid.org/0000-0002-2835-990X), Ikboljon Sobirov (https://orcid.org/0000-0002-0476-6359), Ruben Solozabal, Martin Takac (https://orcid.org/0000-0001-7455-2025); all authors: Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, United Arab Emirates; https://ieeexplore.ieee.org/document/10223206/; Reinforcement learning, stochastic optimization, vehicle routing problem |
spellingShingle | Zangir Iklassov Ikboljon Sobirov Ruben Solozabal Martin Takac Reinforcement Learning Approach to Stochastic Vehicle Routing Problem With Correlated Demands IEEE Access Reinforcement learning stochastic optimization vehicle routing problem |
title | Reinforcement Learning Approach to Stochastic Vehicle Routing Problem With Correlated Demands |
title_sort | reinforcement learning approach to stochastic vehicle routing problem with correlated demands |
topic | Reinforcement learning, stochastic optimization, vehicle routing problem |
url | https://ieeexplore.ieee.org/document/10223206/ |