More accurate cardinality estimation in data streams

Bibliographic Details
Main Authors: Jie Lu, Hongchang Chen, Zheng Zhang, Jichao Xie
Format: Article
Language: English
Published: Wiley 2022-12-01
Series: Electronics Letters
Online Access: https://doi.org/10.1049/ell2.12671
Description
Summary: Many sketches based on estimator sharing have been proposed to estimate flow cardinality in data streams that contain a huge number of flows. However, existing sketches suffer from large estimation errors because they allocate the same memory size to every estimator without considering the skewed cardinality distribution. Here, a filtering method called SuperFilter is proposed to enhance existing sketches. SuperFilter identifies high‐cardinality flows in the data stream and records them with a large estimator, while the remaining low‐cardinality flows are recorded using a traditional sketch with small estimators. The experimental results show that SuperFilter can reduce the average absolute error of cardinality estimation by over 81% compared with existing approaches.
ISSN: 0013-5194, 1350-911X
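
Illustration: The summary above does not say how SuperFilter detects high-cardinality flows or which estimators it uses, so the following Python sketch only illustrates the general two-tier idea (small estimators for most flows, a large estimator for the few heavy ones). It is not the authors' algorithm. The names `BitmapEstimator` and `TwoTierCounter`, the linear-counting bitmaps, the bitmap sizes, and the promotion rule (promote a flow once its small bitmap is half full) are all assumptions made for the example. For simplicity each flow gets its own small bitmap rather than sharing estimators as real sketches do.

```python
import hashlib
import math


def _bit_index(element: str, size: int) -> int:
    """Map an element to a bit position using a stable hash."""
    digest = hashlib.blake2b(element.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % size


class BitmapEstimator:
    """Linear-counting cardinality estimator over a bitmap of `size` bits."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.bits = 0  # a Python int used as the bitmap

    def add(self, element: str) -> None:
        self.bits |= 1 << _bit_index(element, self.size)

    def ones(self) -> int:
        return bin(self.bits).count("1")

    def estimate(self) -> float:
        zeros = self.size - self.ones()
        if zeros == 0:
            # Saturated bitmap: return the largest value linear counting can express.
            return self.size * math.log(self.size)
        return -self.size * math.log(zeros / self.size)


class TwoTierCounter:
    """Hypothetical two-tier layout: small bitmaps first, large ones after promotion."""

    def __init__(self, small_bits: int = 64, large_bits: int = 8192,
                 promote_at: float = 0.5) -> None:
        self.small_bits = small_bits
        self.large_bits = large_bits
        self.promote_at = promote_at  # assumed rule: promote at this fill ratio
        self.small = {}   # flow -> small BitmapEstimator
        self.large = {}   # flow -> large BitmapEstimator
        self.base = {}    # flow -> estimate frozen at promotion time

    def record(self, flow: str, element: str) -> None:
        if flow in self.large:
            self.large[flow].add(element)
            return
        est = self.small.setdefault(flow, BitmapEstimator(self.small_bits))
        est.add(element)
        if est.ones() / self.small_bits >= self.promote_at:
            # Freeze the small estimate and continue counting in a large bitmap.
            # Limitation of this toy: an element seen both before and after
            # promotion is counted twice.
            self.base[flow] = est.estimate()
            self.large[flow] = BitmapEstimator(self.large_bits)
            del self.small[flow]

    def estimate(self, flow: str) -> float:
        if flow in self.large:
            return self.base[flow] + self.large[flow].estimate()
        if flow in self.small:
            return self.small[flow].estimate()
        return 0.0


if __name__ == "__main__":
    counter = TwoTierCounter()
    for i in range(5):
        counter.record("mouse", f"m{i}")        # low-cardinality flow
    for i in range(2000):
        counter.record("elephant", f"e{i}")     # high-cardinality flow
    print("mouse:", round(counter.estimate("mouse"), 1))
    print("elephant:", round(counter.estimate("elephant"), 1))
```

With a single 64-bit bitmap per flow, the high-cardinality flow would saturate and be badly underestimated. Moving it to a larger bitmap keeps its load factor low, which is the intuition behind giving heavy flows more memory than light ones.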