The Case for Learned Index Structures

© 2018 Association for Computing Machinery. Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model to indicate if...

Full description

Bibliographic Details
Main Authors: Kraska, Tim, Beutel, Alex, Chi, Ed H, Dean, Jeffrey, Polyzotis, Neoklis
Format: Article
Language:English
Published: Association for Computing Machinery (ACM) 2021
Online Access:https://hdl.handle.net/1721.1/132272
Description
Summary:© 2018 Association for Computing Machinery. Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model to indicate if a data record exists or not. In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. We theoretically analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures. Our initial results show that our learned indexes can have significant advantages over traditional indexes. More importantly, we believe that the idea of replacing core components of a data management system through learned models has far reaching implications for future systems designs and that this work provides just a glimpse of what might be possible.