Fast approximation of the top‐k items in data streams using FPGAs

Abstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over...

Full description

Bibliographic Details
Main Authors: Ali Ebrahim, Jalal Khalifat
Format: Article
Language:English
Published: Hindawi-IET 2023-03-01
Series:IET Computers & Digital Techniques
Subjects:
Online Access:https://doi.org/10.1049/cdt2.12053
Description
Summary:Abstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well‐known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture shows better scalability compared to other architectures that are based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, when implementing the accelerator on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to the existing software and hardware‐accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch‐based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre‐processing the stream to remove much of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.
ISSN:1751-8601
1751-861X