TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU

Sparse matrix-vector multiplication (SpMV) is a crucial computing process in the field of science and engineering. CSR (compressed sparse row) format is one of the most commonly used storage formats for sparse matrix. In the process of implementing parallel SpMV on the graphics processing unit (GPU)...

Full description

Bibliographic Details
Main Author: WANG Yuhua, ZHANG Yuqi, HE Junfei, XU Yuezhu, CUI Huanyu
Format: Article
Language:zho
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 2024-04-01
Series:Jisuanji kexue yu tansuo
Subjects:
Online Access:http://fcst.ceaj.org/fileup/1673-9418/PDF/2304039.pdf
_version_ 1797231006553473024
author WANG Yuhua, ZHANG Yuqi, HE Junfei, XU Yuezhu, CUI Huanyu
author_facet WANG Yuhua, ZHANG Yuqi, HE Junfei, XU Yuezhu, CUI Huanyu
author_sort WANG Yuhua, ZHANG Yuqi, HE Junfei, XU Yuezhu, CUI Huanyu
collection DOAJ
description Sparse matrix-vector multiplication (SpMV) is a crucial computing process in the field of science and engineering. CSR (compressed sparse row) format is one of the most commonly used storage formats for sparse matrix. In the process of implementing parallel SpMV on the graphics processing unit (GPU), it only stores non-zero elements of sparse matrix, avoiding computational redundancy caused by zero element filling, and saving storage space. But there is a problem of load imbalance, which wastes computing resources. To address the aforementioned issues, storage formats with good performance in recent years have been studied, and a row by row decomposition and reorganization storage format—TEB (threshold-exchangeorder block) format has been proposed. The format first uses a heuristic threshold selection algorithm to determine the appropriate segmentation threshold, and combines the row merging algorithm based on reordering to reconstruct and decompose the sparse matrix, so that the number of non-zero elements between blocks is as close as possible. Furthermore, combined with CUDA (computer unified device architecture) thread technology, a parallel SpMV algorithm between sub blocks based on TEB storage format is proposed, which can reasonably allocate computing resources and solve the problem of load imbalance, thus improving the parallel computing efficiency of SpMV. In order to verify the effectiveness of the TEB storage format, experiments are conducted on the NVIDIA Tesla V100 platform. The results show that compared to PBC (partition-block-CSR), AMF-CSR (adaptive multi-row folding of CSR), CSR-Scalar (compressed sparse row-scalar), and CSR5 (compressed sparse row 5) storage formats, TEB can improve SpMV time performance by an average of 3.23×, 5.83×, 2.33×, and 2.21×. In terms of floating-point computing performance, the average improvement can be 3.36×, 5.95×, 2.29×, and 2.13×
first_indexed 2024-04-24T15:37:31Z
format Article
id doaj.art-9297e580bf2045719bc7b8950dbba89d
institution Directory Open Access Journal
issn 1673-9418
language zho
last_indexed 2024-04-24T15:37:31Z
publishDate 2024-04-01
publisher Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
record_format Article
series Jisuanji kexue yu tansuo
spelling doaj.art-9297e580bf2045719bc7b8950dbba89d2024-04-02T01:27:23ZzhoJournal of Computer Engineering and Applications Beijing Co., Ltd., Science PressJisuanji kexue yu tansuo1673-94182024-04-011841094110810.3778/j.issn.1673-9418.2304039TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPUWANG Yuhua, ZHANG Yuqi, HE Junfei, XU Yuezhu, CUI Huanyu0School of Computer Science and Technology, Harbin Engineering University, Harbin 150000, ChinaSparse matrix-vector multiplication (SpMV) is a crucial computing process in the field of science and engineering. CSR (compressed sparse row) format is one of the most commonly used storage formats for sparse matrix. In the process of implementing parallel SpMV on the graphics processing unit (GPU), it only stores non-zero elements of sparse matrix, avoiding computational redundancy caused by zero element filling, and saving storage space. But there is a problem of load imbalance, which wastes computing resources. To address the aforementioned issues, storage formats with good performance in recent years have been studied, and a row by row decomposition and reorganization storage format—TEB (threshold-exchangeorder block) format has been proposed. The format first uses a heuristic threshold selection algorithm to determine the appropriate segmentation threshold, and combines the row merging algorithm based on reordering to reconstruct and decompose the sparse matrix, so that the number of non-zero elements between blocks is as close as possible. Furthermore, combined with CUDA (computer unified device architecture) thread technology, a parallel SpMV algorithm between sub blocks based on TEB storage format is proposed, which can reasonably allocate computing resources and solve the problem of load imbalance, thus improving the parallel computing efficiency of SpMV. In order to verify the effectiveness of the TEB storage format, experiments are conducted on the NVIDIA Tesla V100 platform. The results show that compared to PBC (partition-block-CSR), AMF-CSR (adaptive multi-row folding of CSR), CSR-Scalar (compressed sparse row-scalar), and CSR5 (compressed sparse row 5) storage formats, TEB can improve SpMV time performance by an average of 3.23×, 5.83×, 2.33×, and 2.21×. In terms of floating-point computing performance, the average improvement can be 3.36×, 5.95×, 2.29×, and 2.13×http://fcst.ceaj.org/fileup/1673-9418/PDF/2304039.pdfsparse matrix-vector multiplication (spmv); reorder; compressed sparse row (csr) format; load balancing; storage format; graphics processing unit (gpu)
spellingShingle WANG Yuhua, ZHANG Yuqi, HE Junfei, XU Yuezhu, CUI Huanyu
TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU
Jisuanji kexue yu tansuo
sparse matrix-vector multiplication (spmv); reorder; compressed sparse row (csr) format; load balancing; storage format; graphics processing unit (gpu)
title TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU
title_full TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU
title_fullStr TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU
title_full_unstemmed TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU
title_short TEB: Efficient SpMV Storage Format for Matrix Decomposition and Reconstruction on GPU
title_sort teb efficient spmv storage format for matrix decomposition and reconstruction on gpu
topic sparse matrix-vector multiplication (spmv); reorder; compressed sparse row (csr) format; load balancing; storage format; graphics processing unit (gpu)
url http://fcst.ceaj.org/fileup/1673-9418/PDF/2304039.pdf
work_keys_str_mv AT wangyuhuazhangyuqihejunfeixuyuezhucuihuanyu tebefficientspmvstorageformatformatrixdecompositionandreconstructionongpu