Instance-Optimized Data Structures for Membership Queries

Bibliographic Details
Main Author: Vaidya, Kapil Eknath
Other Authors: Kraska, Tim
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/150074
Description
Summary: We are near the end of Moore’s law, and hardware performance growth has stagnated. Modern data processing systems need to continuously improve their performance to keep pace with the rapid growth of data. Data structures and algorithms such as sorting, indexes, filters, hash tables, and query optimization are the fundamental building blocks of these systems and dictate their performance. Traditional data structures and algorithms provide worst-case guarantees by making no assumptions about the data or workload. Thus, the resulting data processing system gives adequate performance in the average case but may not be optimal for a particular use case. In this thesis, we look at how to redesign membership query data structures so that they can automatically adapt to an individual use case. These instance-optimized data structures act as drop-in replacements for their counterparts in existing systems and improve performance without any significant overhaul of the system or labor-intensive manual tuning.