LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values
Locality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbors (NN) search. However, most of these LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable perfor...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2022-01-01
|
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/9795034/ |
_version_ | 1811341867651956736 |
---|---|
author | Jiwen Ding Zhuojin Liu Yanfeng Zhang Shufeng Gong Ge Yu |
author_facet | Jiwen Ding Zhuojin Liu Yanfeng Zhang Shufeng Gong Ge Yu |
author_sort | Jiwen Ding |
collection | DOAJ |
description | Locality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbors (NN) search. However, most of these LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable performance when the data are skewed. As known, most real life data are skewed, which makes LSH suffer. In this paper, we observe that the skewness of hash values resulted from skewed data is a potential reason for performance degradation. To address this problem, we propose to rebuild LSH indices by exploring the density of hash values. The hash values in dense/sparse ranges are carefully reorganized using a multi-layered structure, so that more efforts are put into indexing the dense hash values. We further discuss the benefit in distributed computing. Extensive experiments are conducted to show the effectiveness and efficiency of the reconstructed LSH indices. |
first_indexed | 2024-04-13T19:00:34Z |
format | Article |
id | doaj.art-a564a5545c0a4767964fc1b3fd1785fb |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-13T19:00:34Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-a564a5545c0a4767964fc1b3fd1785fb2022-12-22T02:34:06ZengIEEEIEEE Access2169-35362022-01-0110698516986510.1109/ACCESS.2022.31828029795034LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash ValuesJiwen Ding0Zhuojin Liu1Yanfeng Zhang2https://orcid.org/0000-0002-9871-0304Shufeng Gong3https://orcid.org/0000-0001-5898-5621Ge Yu4https://orcid.org/0000-0002-3171-8889School of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaSchool of Computer Science and Engineering, Northeastern University, Shenyang, ChinaLocality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbors (NN) search. However, most of these LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable performance when the data are skewed. As known, most real life data are skewed, which makes LSH suffer. In this paper, we observe that the skewness of hash values resulted from skewed data is a potential reason for performance degradation. To address this problem, we propose to rebuild LSH indices by exploring the density of hash values. The hash values in dense/sparse ranges are carefully reorganized using a multi-layered structure, so that more efforts are put into indexing the dense hash values. We further discuss the benefit in distributed computing. Extensive experiments are conducted to show the effectiveness and efficiency of the reconstructed LSH indices.https://ieeexplore.ieee.org/document/9795034/LSHnearest neighbors searchmulti-layered structuredata skewness |
spellingShingle | Jiwen Ding Zhuojin Liu Yanfeng Zhang Shufeng Gong Ge Yu LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values IEEE Access LSH nearest neighbors search multi-layered structure data skewness |
title | LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values |
title_full | LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values |
title_fullStr | LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values |
title_full_unstemmed | LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values |
title_short | LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values |
title_sort | layerlsh rebuilding locality sensitive hashing indices by exploring density of hash values |
topic | LSH nearest neighbors search multi-layered structure data skewness |
url | https://ieeexplore.ieee.org/document/9795034/ |
work_keys_str_mv | AT jiwending layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues AT zhuojinliu layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues AT yanfengzhang layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues AT shufenggong layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues AT geyu layerlshrebuildinglocalitysensitivehashingindicesbyexploringdensityofhashvalues |