Efficient hardware-accelerated pseudoinverse computation through algorithm restructuring for parallelization in high-level synthesis

This paper describes a fast and efficient hardware-accelerated pseudoinverse computation through algorithm restructuring and leveraging FPGA synthesis directives for parallelism prior to high-level synthesis (HLS). The algorithm, which is composed of modified Gram–Schmidt QR decomposition (MGS-QRD),...

Bibliographic Details
Main Authors:	Tan, Chong Yeam, Ooi, Chia Yee, Choo, Hau Sim, Ismail, Nordinah
Format:	Article
Published:	John Wiley and Sons Ltd 2022
Subjects:	T Technology (General)

collection	ePrints
description	This paper describes a fast and efficient hardware-accelerated pseudoinverse computation achieved by restructuring the algorithm and applying FPGA synthesis directives for parallelism prior to high-level synthesis (HLS). The algorithm, which is composed of modified Gram–Schmidt QR decomposition (MGS-QRD), triangular matrix inversion (TMI), and matrix multiplication (MM), is synthesized and implemented on a field-programmable gate array (FPGA). MGS-QRD is restructured and augmented with parallelism directives before synthesis, which yields an MGS-QRD hardware accelerator with high throughput. Modifications to the existing TMI algorithm are also proposed: redundant computational tasks are removed to speed up the overall operation. Data dependencies in the MM algorithm are analyzed so that appropriate parallelism directives can be inserted, and the data flow of MM is matched to that of the MGS-QRD and TMI modules to accelerate the pseudoinverse computation. The results show that the proposed pseudoinverse module outperforms a naïve implementation composed of existing MGS-QRD, TMI, and standard MM modules in terms of maximum frequency (1.24× speedup), hardware resources (48% reduction in DSP usage), latency (23% reduction), and throughput (62% increase).
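The three stages named in the description can be illustrated in software. For a matrix A with full column rank and QR factorization A = QR, the pseudoinverse is R^-1 Q^T, so the stages map to MGS-QRD, TMI, and MM in that order. The following is a minimal reference sketch under those assumptions, written in plain Python with floating-point arithmetic. The function names (mgs_qr, upper_tri_inverse, pseudoinverse) are hypothetical, the TMI step is ordinary back substitution rather than the modified algorithm the paper proposes, and nothing here reflects the paper's HLS restructuring or directives.

```python
import math


def mgs_qr(A):
    """Modified Gram-Schmidt QR of an m x n matrix A (assumed full column rank).

    Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular).
    """
    m, n = len(A), len(A[0])
    # Column-major working copy: V[j] is column j of A.
    V = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    Qcols = [[0.0] * m for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        R[k][k] = math.sqrt(sum(v * v for v in V[k]))
        Qcols[k] = [v / R[k][k] for v in V[k]]
        # "Modified" step: remove the q_k component from every remaining
        # column right away, instead of projecting against the original A.
        for j in range(k + 1, n):
            R[k][j] = sum(q * v for q, v in zip(Qcols[k], V[j]))
            V[j] = [v - R[k][j] * q for v, q in zip(V[j], Qcols[k])]
    Q = [[Qcols[j][i] for j in range(n)] for i in range(m)]
    return Q, R


def upper_tri_inverse(R):
    """Invert an upper-triangular matrix by back substitution.

    Only entries on or above the diagonal are computed; the rest stay zero.
    """
    n = len(R)
    X = [[0.0] * n for _ in range(n)]
    for j in range(n):
        X[j][j] = 1.0 / R[j][j]
        for i in range(j - 1, -1, -1):
            s = sum(R[i][k] * X[k][j] for k in range(i + 1, j + 1))
            X[i][j] = -s / R[i][i]
    return X


def pseudoinverse(A):
    """Compute A^+ = R^-1 Q^T for a full-column-rank matrix A."""
    Q, R = mgs_qr(A)
    Rinv = upper_tri_inverse(R)
    n, m = len(R), len(Q)
    # Rinv is upper triangular, so the inner sum can start at k = i.
    return [
        [sum(Rinv[i][k] * Q[j][k] for k in range(i, n)) for j in range(m)]
        for i in range(n)
    ]
```

Calling pseudoinverse on a tall matrix such as [[1, 0], [0, 1], [1, 1]] returns a 2 x 3 matrix P for which P times A is the 2 x 2 identity up to rounding. Starting the inner sum of the final multiplication at k = i is one small example of skipping work that is known to be zero; it is an illustration only and not a claim about which redundancies the paper removes.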
first_indexed	2024-03-05T21:20:22Z
format	Article
id	utm.eprints-101017
institution	Universiti Teknologi Malaysia - ePrints
last_indexed	2024-03-05T21:20:22Z
spelling	utm.eprints-101017 2023-05-23T10:37:43Z http://eprints.utm.my/101017/ Tan, Chong Yeam and Ooi, Chia Yee and Choo, Hau Sim and Ismail, Nordinah (2022) Efficient hardware-accelerated pseudoinverse computation through algorithm restructuring for parallelization in high-level synthesis. International Journal of Circuit Theory and Applications, 50 (2). pp. 394-416. ISSN 0098-9886. John Wiley and Sons Ltd, 2022-02. Article, peer reviewed. DOI: 10.1002/cta.3155 http://dx.doi.org/10.1002/cta.3155
title	Efficient hardware-accelerated pseudoinverse computation through algorithm restructuring for parallelization in high-level synthesis
topic	T Technology (General)

Similar Items