Optimizing Disjunctive Queries with Tagged Execution

Bibliographic Details
Main Authors: Kim, Albert; Madden, Samuel
Format: Article
Language: English
Published: Association for Computing Machinery 2024
Online Access: https://hdl.handle.net/1721.1/155205
Description: Despite decades of research into query optimization, optimizing queries with disjunctive predicate expressions remains a challenge. Solutions employed by existing systems (if any) are often simplistic and lead to much redundant work being performed by the execution engine. To address these problems, we propose a novel form of query execution called tagged execution. Tagged execution groups tuples into subrelations based on which predicates in the query they satisfy (or don't satisfy) and tags them with that information. These tags then provide additional context for query operators to take advantage of during runtime, allowing them to eliminate much of the redundant work performed by traditional engines and realize predicate pushdown optimizations for disjunctive predicates. However, tagged execution brings its own challenges, and the question of what tags to create is a nontrivial one. Careless creation of tags can lead to an exponential blowup in the tag space, with the overhead outweighing the benefits. To address this issue, we present a technique called tag generalization to minimize the space of tags. We implemented the tagged execution model with tag generalization in our system Basilisk, and our evaluation shows an average 2.7x speedup in runtime over the traditional execution model with up to a 19x speedup in certain situations.
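
As a rough illustration of the tagging idea in the description above, the following Python sketch groups rows into subrelations keyed by which predicates they satisfy, then answers a disjunction from the tags alone instead of re-evaluating the predicates. It is not Basilisk's implementation: the function names (tag_tuples, select_disjunction), the sample rows, and the two predicates are all hypothetical, and tag generalization is not modeled.

```python
from collections import defaultdict

# Hypothetical sample relation; not data from the paper.
rows = [
    {"id": 1, "year": 1995, "rating": 9.0},
    {"id": 2, "year": 2005, "rating": 6.5},
    {"id": 3, "year": 2010, "rating": 8.7},
    {"id": 4, "year": 1990, "rating": 5.0},
]

# Hypothetical predicates, as in: WHERE year > 2000 OR rating > 8.0
predicates = [
    lambda r: r["year"] > 2000,
    lambda r: r["rating"] > 8.0,
]


def tag_tuples(rows, predicates):
    """Group rows into subrelations keyed by a tag.

    A tag is a tuple of booleans, one per predicate, recording which
    predicates the rows in that subrelation satisfy.
    """
    subrelations = defaultdict(list)
    for row in rows:
        tag = tuple(bool(p(row)) for p in predicates)
        subrelations[tag].append(row)
    return dict(subrelations)


def select_disjunction(subrelations):
    """Keep every subrelation whose tag satisfies at least one predicate.

    The decision is made once per tag, not once per row, and no
    predicate is evaluated again.
    """
    selected = []
    for tag, group in subrelations.items():
        if any(tag):
            selected.extend(group)
    return selected


subrelations = tag_tuples(rows, predicates)
result = select_disjunction(subrelations)
print(sorted(subrelations))
print([r["id"] for r in result])
```

With n predicates, this naive scheme can produce up to 2^n distinct tags, which is the kind of blowup in the tag space that the description says tag generalization is meant to limit.
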
Institution: Massachusetts Institute of Technology
Citation: Kim, Albert and Madden, Samuel. 2024. "Optimizing Disjunctive Queries with Tagged Execution." Proceedings of the ACM on Management of Data, 2 (3).
DOI: 10.1145/3654961
ISSN: 2836-6573
Type: Article (http://purl.org/eprint/type/JournalArticle)
Record dates: 2024-09-20T04:24:16Z; 2024-06-06T16:23:06Z; 2024-05-29; 2024-06-01T07:58:03Z
Rights: Creative Commons Attribution-Noncommercial-ShareAlike, https://creativecommons.org/licenses/by-sa/4.0/ (PUBLISHER_CC); The author(s)
File format: application/pdf