k-NN Query Optimization for High-Dimensional Index Using Machine Learning

Bibliographic Details
Main Authors: Dojin Choi, Jiwon Wee, Sangho Song, Hyeonbyeong Lee, Jongtae Lim, Kyoungsoo Bok, Jaesoo Yoo
Format: Article
Language: English
Published: MDPI AG 2023-05-01
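Series: Electronics
Subjects: query optimization, data distribution, image retrieval, k-NN, high-dimensional index, machine learning
Online Access: https://www.mdpi.com/2079-9292/12/11/2375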
author Dojin Choi
Jiwon Wee
Sangho Song
Hyeonbyeong Lee
Jongtae Lim
Kyoungsoo Bok
Jaesoo Yoo
collection DOAJ
description In this study, we propose three k-nearest neighbor (k-NN) optimization techniques for a distributed, in-memory-based, high-dimensional indexing method to speed up content-based image retrieval. The proposed techniques perform distributed, in-memory, high-dimensional indexing-based k-NN query optimization: a density-based optimization technique that performs k-NN optimization using data distribution; a cost-based optimization technique using query processing cost statistics; and a learning-based optimization technique using a deep learning model, based on query logs. The proposed techniques were implemented on Spark, which supports a master/slave model for large-scale distributed processing. We showed the superiority and validity of the proposed techniques through various performance evaluations, based on high-dimensional data.
first_indexed 2024-03-11T03:09:20Z
format Article
id doaj.art-3111c10d6f1c439e9d479be56b3a28d7
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-11T03:09:20Z
publishDate 2023-05-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-3111c10d6f1c439e9d479be56b3a28d7
2023-11-18T07:44:05Z
eng
MDPI AG
Electronics
2079-9292
2023-05-01
Volume 12, Issue 11, Article 2375
10.3390/electronics12112375
k-NN Query Optimization for High-Dimensional Index Using Machine Learning
Dojin Choi (Department of Computer Engineering, Changwon National University, Changwon 51140, Republic of Korea)
Jiwon Wee (Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea)
Sangho Song (Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea)
Hyeonbyeong Lee (Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea)
Jongtae Lim (Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea)
Kyoungsoo Bok (Department of Artificial Intelligence Convergence, Wonkwang University, Iksan 54538, Republic of Korea)
Jaesoo Yoo (Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea)
In this study, we propose three k-nearest neighbor (k-NN) optimization techniques for a distributed, in-memory-based, high-dimensional indexing method to speed up content-based image retrieval. The proposed techniques perform distributed, in-memory, high-dimensional indexing-based k-NN query optimization: a density-based optimization technique that performs k-NN optimization using data distribution; a cost-based optimization technique using query processing cost statistics; and a learning-based optimization technique using a deep learning model, based on query logs. The proposed techniques were implemented on Spark, which supports a master/slave model for large-scale distributed processing. We showed the superiority and validity of the proposed techniques through various performance evaluations, based on high-dimensional data.
https://www.mdpi.com/2079-9292/12/11/2375
query optimization
data distribution
image retrieval
k-NN
high-dimensional index
machine learning
title k-NN Query Optimization for High-Dimensional Index Using Machine Learning
topic query optimization
data distribution
image retrieval
k-NN
high-dimensional index
machine learning
url https://www.mdpi.com/2079-9292/12/11/2375