Incremental elasticity for array databases
Relational databases benefit significantly from elasticity, whereby they execute on a set of changing hardware resources provisioned to match their storage and processing requirements. Such flexibility is especially attractive for scientific databases because their users often have a no-overwrite st...
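Main Authors: | Duggan, Jennie; Stonebraker, Michael |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |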
Format: | Article |
Language: | en_US |
Published: | Association for Computing Machinery (ACM) 2014 |
Online Access: | http://hdl.handle.net/1721.1/90874 https://orcid.org/0000-0001-9184-9058 |
_version_ | 1811087730964168704 |
---|---|
author | Duggan, Jennie; Stonebraker, Michael |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Duggan, Jennie; Stonebraker, Michael |
author_sort | Duggan, Jennie |
collection | MIT |
description | Relational databases benefit significantly from elasticity, whereby they execute on a set of changing hardware resources provisioned to match their storage and processing requirements. Such flexibility is especially attractive for scientific databases because their users often have a no-overwrite storage model, in which they delete data only when their available space is exhausted. This results in a database that is regularly growing and expanding its hardware proportionally. Also, scientific databases frequently store their data as multidimensional arrays optimized for spatial querying. This brings about several novel challenges in clustered, skew-aware data placement on an elastic shared-nothing database. In this work, we design and implement elasticity for an array database. We address this challenge on two fronts: determining when to expand a database cluster and how to partition the data within it. In both steps we propose incremental approaches, affecting a minimum set of data and nodes, while maintaining high performance. We introduce an algorithm for gradually augmenting an array database's hardware using a closed-loop control system. After the cluster adds nodes, we optimize data placement for n-dimensional arrays. Many of our elastic partitioners incrementally reorganize an array, redistributing data only to new nodes. By combining these two tools, the scientific database efficiently and seamlessly manages its monotonically increasing hardware resources. |
first_indexed | 2024-09-23T13:51:06Z |
format | Article |
id | mit-1721.1/90874 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:51:06Z |
publishDate | 2014 |
publisher | Association for Computing Machinery (ACM) |
record_format | dspace |
spelling | mit-1721.1/90874; 2022-09-28T16:36:11Z; Incremental elasticity for array databases; Duggan, Jennie; Stonebraker, Michael; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Intel Corporation (Science and Technology Center for Big Data); 2014-10-10T12:17:09Z; 2014-06; Article; http://purl.org/eprint/type/ConferencePaper; ISBN 9781450323765; http://hdl.handle.net/1721.1/90874; Jennie Duggan and Michael Stonebraker. 2014. Incremental elasticity for array databases. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data (SIGMOD '14). ACM, New York, NY, USA, 409-420; https://orcid.org/0000-0001-9184-9058; en_US; http://dx.doi.org/10.1145/2588555.2588569; Proceedings of the 2014 ACM SIGMOD international conference on Management of data (SIGMOD '14); Creative Commons Attribution-Noncommercial-Share Alike; http://creativecommons.org/licenses/by-nc-sa/4.0/; application/pdf; Association for Computing Machinery (ACM); MIT web domain |
title | Incremental elasticity for array databases |
url | http://hdl.handle.net/1721.1/90874 https://orcid.org/0000-0001-9184-9058 |