Space and time efficient kernel density estimation in high dimensions
Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-lin...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
Morgan Kaufmann Publishers
2021
|
Online Access: | https://hdl.handle.net/1721.1/129407 |
_version_ | 1811089339903377408 |
---|---|
author | Indyk, Piotr Wagner, Tal |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Indyk, Piotr Wagner, Tal |
author_sort | Indyk, Piotr |
collection | MIT |
description | Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-linear preprocessing time. These limitations inhibit the practical applicability of their approach on large datasets. In this work, we present an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time. We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property. Our experiments on various datasets verify that our approach attains accuracy and query time similar to Charikar and Siminelakis (2017), with significantly improved space and preprocessing time. |
first_indexed | 2024-09-23T14:17:38Z |
format | Article |
id | mit-1721.1/129407 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T14:17:38Z |
publishDate | 2021 |
publisher | Morgan Kaufmann Publishers |
record_format | dspace |
spelling | mit-1721.1/1294072022-10-01T20:25:57Z Space and time efficient kernel density estimation in high dimensions Indyk, Piotr Wagner, Tal Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-linear preprocessing time. These limitations inhibit the practical applicability of their approach on large datasets. In this work, we present an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time. We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property. Our experiments on various datasets verify that our approach attains accuracy and query time similar to Charikar and Siminelakis (2017), with significantly improved space and preprocessing time. National Science Foundation (U.S.). Transdisciplinary Research in Principles of Data Science (Award 1740751) 2021-01-13T19:01:08Z 2021-01-13T19:01:08Z 2019-12 2020-12-18T16:35:13Z Article http://purl.org/eprint/type/ConferencePaper 1049-5258 https://hdl.handle.net/1721.1/129407 Backurs, Arturs et al. “Space and time efficient kernel density estimation in high dimensions.” Advances in Neural Information Processing Systems, 32 (December 2019) © 2019 The Author(s) en https://papers.nips.cc/paper/2019/hash/a2ce8f1706e52936dfad516c23904e3e-Abstract.html Advances in Neural Information Processing Systems Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Morgan Kaufmann Publishers Neural Information Processing Systems (NIPS) |
spellingShingle | Indyk, Piotr Wagner, Tal Space and time efficient kernel density estimation in high dimensions |
title | Space and time efficient kernel density estimation in high dimensions |
title_full | Space and time efficient kernel density estimation in high dimensions |
title_fullStr | Space and time efficient kernel density estimation in high dimensions |
title_full_unstemmed | Space and time efficient kernel density estimation in high dimensions |
title_short | Space and time efficient kernel density estimation in high dimensions |
title_sort | space and time efficient kernel density estimation in high dimensions |
url | https://hdl.handle.net/1721.1/129407 |
work_keys_str_mv | AT indykpiotr spaceandtimeefficientkerneldensityestimationinhighdimensions AT wagnertal spaceandtimeefficientkerneldensityestimationinhighdimensions |