Generative inverse reinforcement learning for learning 2-opt heuristics without extrinsic rewards in routing problems

Deep reinforcement learning (DRL) has shown promise in solving challenging combinatorial optimization (CO) problems, such as the traveling salesman problem (TSP) and vehicle routing problem (VRP). However, existing DRL methods rely on manually designed reward functions, which may be inaccurate or un...

Bibliographic Details
Main Authors: Qi Wang, Yongsheng Hao, Jiawei Zhang
Format: Article
Language: English
Published: Elsevier 2023-10-01
Series: Journal of King Saud University: Computer and Information Sciences
Online Access: http://www.sciencedirect.com/science/article/pii/S1319157823003415