Learning to rank for synthesizing planning heuristics

Bibliographic Details
Main Authors: Garrett, Caelan Reed; Kaelbling, Leslie P; Lozano-Perez, Tomas
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: AAAI Press 2018
Online Access: http://hdl.handle.net/1721.1/115313
https://orcid.org/0000-0002-6474-1276
https://orcid.org/0000-0001-6054-7145
https://orcid.org/0000-0002-8657-2450
Description
Summary: We investigate learning heuristics for domain-specific planning. Prior work framed learning a heuristic as an ordinary regression problem. However, in a greedy best-first search, the ordering of states induced by a heuristic is more indicative of the resulting planner’s performance than mean squared error. Thus, we instead frame learning a heuristic as a learning-to-rank problem, which we solve using a RankSVM formulation. Additionally, we introduce new methods for computing features that capture temporal interactions in an approximate plan. Our experiments on recent International Planning Competition problems show that the RankSVM-learned heuristics outperform both the original heuristics and heuristics learned through ordinary regression.
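
Illustration (not part of the record): the summary's central claim, that the ordering a heuristic induces matters more to greedy best-first search than its squared error, can be seen on a toy example. The sketch below is an assumption-laden illustration, not the authors' implementation: the four states, their cost values, and the function names are all hypothetical, and the pairwise hinge loss is only the general kind of surrogate that RankSVM-style methods minimize, computed here over all state pairs.

```python
from itertools import combinations


def mean_squared_error(predicted, target):
    """Average squared difference between predicted and true cost-to-go."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def misordered_pairs(predicted, target):
    """Count state pairs that `predicted` ranks in the opposite order to `target`."""
    count = 0
    for i, j in combinations(range(len(target)), 2):
        if (predicted[i] - predicted[j]) * (target[i] - target[j]) < 0:
            count += 1
    return count


def pairwise_hinge_loss(predicted, target, margin=1.0):
    """Sum of hinge penalties over pairs where state i is truly closer to the goal than j.

    A pair incurs no penalty once predicted[j] exceeds predicted[i] by `margin`.
    """
    loss = 0.0
    for i in range(len(target)):
        for j in range(len(target)):
            if target[i] < target[j]:
                loss += max(0.0, margin - (predicted[j] - predicted[i]))
    return loss


# Hypothetical true costs-to-go for four states.
true_cost = [1, 2, 3, 4]
# Heuristic A: badly calibrated (constant offset of 10) but perfectly ordered.
heuristic_a = [11, 12, 13, 14]
# Heuristic B: numerically close to the truth but swaps neighbouring states.
heuristic_b = [2, 1, 4, 3]

for name, h in (("A", heuristic_a), ("B", heuristic_b)):
    print(name,
          mean_squared_error(h, true_cost),
          misordered_pairs(h, true_cost),
          pairwise_hinge_loss(h, true_cost))
```

On this toy data, heuristic A has the larger squared error (100 versus 1) yet misorders no pairs, while heuristic B misorders two. A regression objective would prefer B; a ranking objective would prefer A, which is the heuristic that would guide a greedy search in the correct order here.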