Incremental elasticity for array databases

Relational databases benefit significantly from elasticity, whereby they execute on a set of changing hardware resources provisioned to match their storage and processing requirements. Such flexibility is especially attractive for scientific databases because their users often have a no-overwrite st...


Bibliographic Details
Main Authors: Duggan, Jennie, Stonebraker, Michael
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Association for Computing Machinery (ACM) 2014
Online Access: http://hdl.handle.net/1721.1/90874
https://orcid.org/0000-0001-9184-9058
_version_ 1811087730964168704
author Duggan, Jennie
Stonebraker, Michael
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
collection MIT
description Relational databases benefit significantly from elasticity, whereby they execute on a set of changing hardware resources provisioned to match their storage and processing requirements. Such flexibility is especially attractive for scientific databases because their users often have a no-overwrite storage model, in which they delete data only when their available space is exhausted. This results in a database that is regularly growing and expanding its hardware proportionally. Also, scientific databases frequently store their data as multidimensional arrays optimized for spatial querying. This brings about several novel challenges in clustered, skew-aware data placement on an elastic shared-nothing database. In this work, we design and implement elasticity for an array database. We address this challenge on two fronts: determining when to expand a database cluster and how to partition the data within it. In both steps we propose incremental approaches, affecting a minimum set of data and nodes, while maintaining high performance. We introduce an algorithm for gradually augmenting an array database's hardware using a closed-loop control system. After the cluster adds nodes, we optimize data placement for n-dimensional arrays. Many of our elastic partitioners incrementally reorganize an array, redistributing data only to new nodes. By combining these two tools, the scientific database efficiently and seamlessly manages its monotonically increasing hardware resources.
first_indexed 2024-09-23T13:51:06Z
format Article
id mit-1721.1/90874
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T13:51:06Z
publishDate 2014
publisher Association for Computing Machinery (ACM)
record_format dspace
spelling mit-1721.1/90874
2022-09-28T16:36:11Z
Incremental elasticity for array databases
Duggan, Jennie
Stonebraker, Michael
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Duggan, Jennie
Stonebraker, Michael
Intel Corporation (Science and Technology Center for Big Data)
2014-10-10T12:17:09Z
2014-10-10T12:17:09Z
2014-06
Article
http://purl.org/eprint/type/ConferencePaper
9781450323765
http://hdl.handle.net/1721.1/90874
Jennie Duggan and Michael Stonebraker. 2014. Incremental elasticity for array databases. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data (SIGMOD '14). ACM, New York, NY, USA, 409-420.
https://orcid.org/0000-0001-9184-9058
en_US
http://dx.doi.org/10.1145/2588555.2588569
Proceedings of the 2014 ACM SIGMOD international conference on Management of data (SIGMOD '14)
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
Association for Computing Machinery (ACM)
MIT web domain
title Incremental elasticity for array databases