Fast approximation of the top‐k items in data streams using FPGAs

Abstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over...

Full description

Bibliographic Details
Main Authors: Ali Ebrahim, Jalal Khalifat
Format: Article
Language:English
Published: Hindawi-IET 2023-03-01
Series:IET Computers & Digital Techniques
Subjects:
Online Access:https://doi.org/10.1049/cdt2.12053
_version_ 1797430504779153408
author Ali Ebrahim
Jalal Khalifat
author_facet Ali Ebrahim
Jalal Khalifat
author_sort Ali Ebrahim
collection DOAJ
description Abstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well‐known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture shows better scalability compared to other architectures that are based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, when implementing the accelerator on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to the existing software and hardware‐accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch‐based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre‐processing the stream to remove much of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.
first_indexed 2024-03-09T09:28:28Z
format Article
id doaj.art-eea8d3c0f2e049c4b1786ec172a79b1f
institution Directory Open Access Journal
issn 1751-8601
1751-861X
language English
last_indexed 2024-03-09T09:28:28Z
publishDate 2023-03-01
publisher Hindawi-IET
record_format Article
series IET Computers & Digital Techniques
spelling doaj.art-eea8d3c0f2e049c4b1786ec172a79b1f2023-12-02T05:21:03ZengHindawi-IETIET Computers & Digital Techniques1751-86011751-861X2023-03-01172607310.1049/cdt2.12053Fast approximation of the top‐k items in data streams using FPGAsAli Ebrahim0Jalal Khalifat1Department of Computer Engineering University of Bahrain College of Information Technology Sakhir BahrainDepartment of Computer Engineering University of Bahrain College of Information Technology Sakhir BahrainAbstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well‐known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture shows better scalability compared to other architectures that are based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, when implementing the accelerator on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to the existing software and hardware‐accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch‐based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre‐processing the stream to remove much of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.https://doi.org/10.1049/cdt2.12053field programmable gate arrayshardware description languageslogic arrays
spellingShingle Ali Ebrahim
Jalal Khalifat
Fast approximation of the top‐k items in data streams using FPGAs
IET Computers & Digital Techniques
field programmable gate arrays
hardware description languages
logic arrays
title Fast approximation of the top‐k items in data streams using FPGAs
title_full Fast approximation of the top‐k items in data streams using FPGAs
title_fullStr Fast approximation of the top‐k items in data streams using FPGAs
title_full_unstemmed Fast approximation of the top‐k items in data streams using FPGAs
title_short Fast approximation of the top‐k items in data streams using FPGAs
title_sort fast approximation of the top k items in data streams using fpgas
topic field programmable gate arrays
hardware description languages
logic arrays
url https://doi.org/10.1049/cdt2.12053
work_keys_str_mv AT aliebrahim fastapproximationofthetopkitemsindatastreamsusingfpgas
AT jalalkhalifat fastapproximationofthetopkitemsindatastreamsusingfpgas