Multi-Dimensional Data Compression and Query Processing in Array Databases

In recent times, the production of multidimensional data in various domains and their storage in array databases has witnessed a sharp increase; this rapid growth in data volumes necessitates compression in array databases. However, existing compression schemes used in array databases are general-pu...

Bibliographic Details
Main Authors: Minsoo Kim, Hyubjin Lee, Yon Dohn Chung
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
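Subjects: Arrays; data compression; data structures; database systems; discrete wavelet transforms; Huffman coding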
Online Access: https://ieeexplore.ieee.org/document/9923935/
Collection: DOAJ
Description: In recent times, the production of multidimensional data in various domains and their storage in array databases has witnessed a sharp increase; this rapid growth in data volumes necessitates compression in array databases. However, existing compression schemes used in array databases are general-purpose and not designed specifically for array databases. They could degrade query performance with complex analytical tasks, which incur huge computing costs. Thus, a compression scheme that considers the workflow of array databases is required. This study presents a compression scheme, SEACOW, for storing and querying multidimensional array data. The scheme is specially designed to be efficient for both dimension-based and value-based exploration. It considers data access patterns for exploration queries and embeds a synopsis, which can be utilized as an index, in the compressed array. In addition, we implement an array storage system, namely MSDB, to perform experiments. We evaluate query performance on real scientific datasets and compare it with that of existing compression schemes. Finally, our experiments demonstrate that SEACOW provides high compression rates compared to existing compression schemes, and the synopsis improves analytical query processing performance.
ISSN: 2169-3536
Citation: IEEE Access, vol. 10, pp. 111528-111544, 2022
DOI: 10.1109/ACCESS.2022.3215525
Authors and affiliations:
Minsoo Kim (https://orcid.org/0000-0003-3450-9721), Department of Computer Science and Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea
Hyubjin Lee (https://orcid.org/0000-0003-0046-7316), Department of Computer Science and Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea
Yon Dohn Chung (https://orcid.org/0000-0003-2070-5123), Department of Computer Science and Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea