Deep reinforcement learning for optimal resource allocation (II)
As vending machines become increasingly intelligent, enabling real-time updates of stock levels, the manual task of restocking remains a logistical challenge. This paper addresses the efficient restocking of vending machines via the Capacitated Vehicle Routing Problem (CVRP), focusing on optimiz...
Main Author: | Uday, Nihal Arya |
---|---|
Other Authors: | Zhang Jie |
Format: | Final Year Project (FYP) |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/174967 |
_version_ | 1826129997524369408 |
---|---|
author | Uday, Nihal Arya |
author2 | Zhang Jie |
author_facet | Zhang Jie Uday, Nihal Arya |
author_sort | Uday, Nihal Arya |
collection | NTU |
description | As vending machines become increasingly intelligent, enabling real-time updates of stock levels, the manual task of restocking remains a logistical challenge. This paper addresses the efficient restocking of vending machines via the Capacitated Vehicle Routing Problem (CVRP), focusing on optimizing routes for limited-capacity vehicles to meet demand without exceeding capacity, while minimizing costs. Traditional heuristics such as LKH-3 have shown robust performance in CVRP but face limitations in scalability and adaptability. This study compares two advanced learning-based approaches (L2D, employing deep reinforcement learning, and NCO, with its innovative light encoder and heavy decoder architecture) against the LKH-3 algorithm. Through detailed experimentation, we evaluate their scalability, computational efficiency, and solution quality. Our findings reveal that while L2D and NCO exhibit superior generalization capabilities and demonstrate promising scalability to large-scale problem instances, nuances in performance and efficiency metrics highlight their respective strengths and areas for improvement. The comparative analysis not only underscores the potential of learning-based models in overcoming the limitations of traditional heuristics but also delineates the path for future research in integrating the computational intelligence of machine learning with the intuitive problem-solving prowess of heuristic algorithms. This synthesis aims to pave the way for innovative solutions to CVRP and other combinatorial optimization challenges, marking a significant stride toward leveraging artificial intelligence in operational research. |
first_indexed | 2024-10-01T07:49:20Z |
format | Final Year Project (FYP) |
id | ntu-10356/174967 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T07:49:20Z |
publishDate | 2024 |
publisher | Nanyang Technological University |
record_format | dspace |
spelling | ntu-10356/174967 2024-04-19T15:46:25Z Deep reinforcement learning for optimal resource allocation (II) Uday, Nihal Arya Zhang Jie School of Computer Science and Engineering ZhangJ@ntu.edu.sg Computer and Information Science Bachelor's degree 2024-04-17T07:39:00Z 2024-04-17T07:39:00Z 2024 Final Year Project (FYP) Uday, N. A. (2024). Deep reinforcement learning for optimal resource allocation (II). Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174967 https://hdl.handle.net/10356/174967 en application/pdf Nanyang Technological University |
spellingShingle | Computer and Information Science Uday, Nihal Arya Deep reinforcement learning for optimal resource allocation (II) |
title | Deep reinforcement learning for optimal resource allocation (II) |
title_sort | deep reinforcement learning for optimal resource allocation ii |
topic | Computer and Information Science |
url | https://hdl.handle.net/10356/174967 |
work_keys_str_mv | AT udaynihalarya deepreinforcementlearningforoptimalresourceallocationii |