Analyses of indexing techniques on uncertain data with high dimensionality

Bibliographic Details
Main Authors: Mohammed Lawal, Ma’aruf; Ibrahim, Hamidah; Mohd Sani, Nor Fazlida; Yaakob, Razali
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers 2020
Online Access: http://psasir.upm.edu.my/id/eprint/87851/1/ABSTRACT.pdf
collection UPM
description Deploying a solution for handling critical decision-based problems efficiently requires processing high-dimensional data. Over the years, modern technological advances have led to the capture of an unprecedented volume of uncertain data, which has made it necessary to organize such data for better data access performance. To this end, the use of indexing techniques for supporting, organizing, and storing uncertain data with high dimensionality has become pertinent. However, the choice of an indexing technique to improve search performance is highly influenced by the properties of the underlying data set, the data construction methods employed by the indexing structure, and the query types it supports. This paper conducts an extensive performance analysis of existing indexing techniques, namely the R-tree, R*-tree, and X-tree, to identify the most efficient indexing structure for organizing, storing, and ultimately improving search performance over uncertain data with high dimensionality. The results of the analyses, in terms of CPU processing time and number of nodes visited, show that the X-tree outperforms the R-tree and R*-tree, and that this advantage holds across different data set sizes, data distributions, numbers of dimensions, and varying selectivity ratios.
id upm.eprints-87851
institution Universiti Putra Malaysia
citation Mohammed Lawal, Ma’aruf and Ibrahim, Hamidah and Mohd Sani, Nor Fazlida and Yaakob, Razali (2020) Analyses of indexing techniques on uncertain data with high dimensionality. IEEE Access, 8. 74101-74117. ISSN 2169-3536 https://ieeexplore.ieee.org/document/9069901