LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values

Locality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbors (NN) search. However, most of these LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable perfor...

Full description

Bibliographic Details
Main Authors: Jiwen Ding, Zhuojin Liu, Yanfeng Zhang, Shufeng Gong, Ge Yu
Format: Article
Language:English
Published: IEEE 2022-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9795034/
_version_ 1811341867651956736
author Jiwen Ding
Zhuojin Liu
Yanfeng Zhang
Shufeng Gong
Ge Yu
author_facet Jiwen Ding
Zhuojin Liu
Yanfeng Zhang
Shufeng Gong
Ge Yu
author_sort Jiwen Ding
collection DOAJ
description Locality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbors (NN) search. However, most of these LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable performance when the data are skewed. As known, most real life data are skewed, which makes LSH suffer. In this paper, we observe that the skewness of hash values resulted from skewed data is a potential reason for performance degradation. To address this problem, we propose to rebuild LSH indices by exploring the density of hash values. The hash values in dense/sparse ranges are carefully reorganized using a multi-layered structure, so that more efforts are put into indexing the dense hash values. We further discuss the benefit in distributed computing. Extensive experiments are conducted to show the effectiveness and efficiency of the reconstructed LSH indices.
first_indexed 2024-04-13T19:00:34Z
format Article
id doaj.art-a564a5545c0a4767964fc1b3fd1785fb
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-13T19:00:34Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-a564a5545c0a4767964fc1b3fd1785fb2022-12-22T02:34:06ZengIEEEIEEE Access2169-35362022-01-0110698516986510.1109/ACCESS.2022.31828029795034LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash ValuesJiwen Ding0Zhuojin Liu1Yanfeng Zhang2https://orcid.org/0000-0002-9871-0304Shufeng Gong3https://orcid.org/0000-0001-5898-5621Ge Yu4https://orcid.org/0000-0002-3171-8889School of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaLocality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbors (NN) search. However, most of these LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable performance when the data are skewed. As known, most real life data are skewed, which makes LSH suffer. In this paper, we observe that the skewness of hash values resulted from skewed data is a potential reason for performance degradation. To address this problem, we propose to rebuild LSH indices by exploring the density of hash values. The hash values in dense/sparse ranges are carefully reorganized using a multi-layered structure, so that more efforts are put into indexing the dense hash values. We further discuss the benefit in distributed computing. Extensive experiments are conducted to show the effectiveness and efficiency of the reconstructed LSH indices.https://ieeexplore.ieee.org/document/9795034/LSHnearest neighbors searchmulti-layered structuredata skewness
spellingShingle Jiwen Ding
Zhuojin Liu
Yanfeng Zhang
Shufeng Gong
Ge Yu
LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
IEEE Access
LSH
nearest neighbors search
multi-layered structure
data skewness
title LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
title_full LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
title_fullStr LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
title_full_unstemmed LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
title_short LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
title_sort layerlsh rebuilding locality sensitive hashing indices by exploring density of hash values
topic LSH
nearest neighbors search
multi-layered structure
data skewness
url https://ieeexplore.ieee.org/document/9795034/
work_keys_str_mv AT jiwending layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues
AT zhuojinliu layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues
AT yanfengzhang layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues
AT shufenggong layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues
AT geyu layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues