Fast approximation of the top‐k items in data streams using FPGAs

Abstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over...

Full description

Bibliographic Details
Main Authors:	Ali Ebrahim, Jalal Khalifat
Format:	Article
Language:	English
Published:	Hindawi-IET 2023-03-01
Series:	IET Computers & Digital Techniques
Subjects:	field programmable gate arrays hardware description languages logic arrays
Online Access:	https://doi.org/10.1049/cdt2.12053

_version_	1797430504779153408
author	Ali Ebrahim Jalal Khalifat
author_facet	Ali Ebrahim Jalal Khalifat
author_sort	Ali Ebrahim
collection	DOAJ
description	Abstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well‐known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture shows better scalability compared to other architectures that are based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, when implementing the accelerator on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to the existing software and hardware‐accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch‐based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre‐processing the stream to remove much of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.
first_indexed	2024-03-09T09:28:28Z
format	Article
id	doaj.art-eea8d3c0f2e049c4b1786ec172a79b1f
institution	Directory Open Access Journal
issn	1751-8601 1751-861X
language	English
last_indexed	2024-03-09T09:28:28Z
publishDate	2023-03-01
publisher	Hindawi-IET
record_format	Article
series	IET Computers & Digital Techniques
spelling	doaj.art-eea8d3c0f2e049c4b1786ec172a79b1f2023-12-02T05:21:03ZengHindawi-IETIET Computers & Digital Techniques1751-86011751-861X2023-03-01172607310.1049/cdt2.12053Fast approximation of the top‐k items in data streams using FPGAsAli Ebrahim0Jalal Khalifat1Department of Computer Engineering University of Bahrain College of Information Technology Sakhir BahrainDepartment of Computer Engineering University of Bahrain College of Information Technology Sakhir BahrainAbstract Two methods are presented for finding the top‐k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the topmost frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well‐known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture shows better scalability compared to other architectures that are based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, when implementing the accelerator on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to the existing software and hardware‐accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch‐based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre‐processing the stream to remove much of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.https://doi.org/10.1049/cdt2.12053field programmable gate arrayshardware description languageslogic arrays
spellingShingle	Ali Ebrahim Jalal Khalifat Fast approximation of the top‐k items in data streams using FPGAs IET Computers & Digital Techniques field programmable gate arrays hardware description languages logic arrays
title	Fast approximation of the top‐k items in data streams using FPGAs
title_full	Fast approximation of the top‐k items in data streams using FPGAs
title_fullStr	Fast approximation of the top‐k items in data streams using FPGAs
title_full_unstemmed	Fast approximation of the top‐k items in data streams using FPGAs
title_short	Fast approximation of the top‐k items in data streams using FPGAs
title_sort	fast approximation of the top k items in data streams using fpgas
topic	field programmable gate arrays hardware description languages logic arrays
url	https://doi.org/10.1049/cdt2.12053
work_keys_str_mv	AT aliebrahim fastapproximationofthetopkitemsindatastreamsusingfpgas AT jalalkhalifat fastapproximationofthetopkitemsindatastreamsusingfpgas

Fast approximation of the top‐k items in data streams using FPGAs

Similar Items