Optimizations to a massively parallel database and support of a shared scan architecture

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.

Bibliographic Details
Main Author: Ahwal, Saher B
Other Authors: Samuel Madden.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2014
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/91454
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Notes: Cataloged from PDF version of thesis. Includes bibliographical references (pages 92-94).
Abstract: This thesis presents a new architecture and optimizations for MapD, a database server that uses a hybrid multi-CPU/multi-GPU architecture for query execution and analysis. We tackle the challenge of partitioning the data across multiple nodes with many CPUs and GPUs by means of an indexing framework. We implement a QuadTree spatial partitioning scheme and demonstrate how using the index improves the latencies of many queries compared with using no index. Moreover, we tackle the challenge of processing many queries (perhaps issued concurrently) that have very tight latency constraints, e.g., for visualization. We implement a software architecture that schedules concurrent client query requests so that many queries share processing in a single pass through the data ("shared scans"). Our experiments exhibit orders of magnitude improvement in query throughput, for both skewed and non-skewed workloads, with shared scans as opposed to serial execution.
Statement of Responsibility: by Saher B. Ahwal.
Degree: M. Eng.
Physical Description: 94 pages; application/pdf
Identifier: 893859361
Record Timestamps: 2014-11-04T21:37:51Z; 2019-04-10T14:32:48Z
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission: http://dspace.mit.edu/handle/1721.1/7582
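Illustrative note: the "shared scans" idea named in the abstract, answering many queries in a single pass through the data, can be shown with a minimal Python sketch. This is not MapD's code or the thesis's implementation. The function names, the dictionary row layout, and the restriction to count-only queries are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical row and query types, chosen only for this sketch.
Row = Dict[str, float]
Predicate = Callable[[Row], bool]


def serial_counts(rows: List[Row], queries: List[Predicate]) -> List[int]:
    """Baseline: each query makes its own full pass over the data."""
    return [sum(1 for row in rows if q(row)) for q in queries]


def shared_scan_counts(rows: Iterable[Row], queries: List[Predicate]) -> List[int]:
    """Shared scan: one pass over the data, evaluating every query on each row."""
    counts = [0] * len(queries)
    for row in rows:
        for i, q in enumerate(queries):
            if q(row):
                counts[i] += 1
    return counts


# Small synthetic table (assumed data, not from the thesis).
rows = [{"x": float(i), "y": float(i % 5)} for i in range(20)]
queries = [
    lambda r: r["x"] < 10,
    lambda r: r["y"] == 0,
    lambda r: r["x"] >= 5 and r["y"] > 2,
]

print(shared_scan_counts(rows, queries))
```

Both functions return the same counts. The difference is that the shared version reads each row once regardless of how many queries are pending, which is the property the abstract credits for the throughput gain.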