Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs

Optimizing sparse matrix–vector multiplication (SpMV) is challenging due to the non-uniform distribution of the non-zero elements of the sparse matrix. The best-performing SpMV format changes depending on the input matrix and the underlying architecture, and there is no “one-size-fit-for-all” format...

Full description

Bibliographic Details
Main Authors: Shizhao Chen, Jianbin Fang, Chuanfu Xu, Zheng Wang
Format: Article
Language:English
Published: MDPI AG 2022-09-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/12/19/9812
_version_ 1797480648834809856
author Shizhao Chen
Jianbin Fang
Chuanfu Xu
Zheng Wang
author_facet Shizhao Chen
Jianbin Fang
Chuanfu Xu
Zheng Wang
author_sort Shizhao Chen
collection DOAJ
description Optimizing sparse matrix–vector multiplication (SpMV) is challenging due to the non-uniform distribution of the non-zero elements of the sparse matrix. The best-performing SpMV format changes depending on the input matrix and the underlying architecture, and there is no “one-size-fit-for-all” format. A hybrid scheme combining multiple SpMV storage formats allows one to choose an appropriate format to use for the target matrix and hardware. However, existing hybrid approaches are inadequate for utilizing the SIMD cores of modern multi-core CPUs with SIMDs, and it remains unclear how to best mix different SpMV formats for a given matrix. This paper presents a new hybrid storage format for sparse matrices, specifically targeting multi-core CPUs with SIMDs. Our approach partitions the target sparse matrix into two segmentations based on the regularities of the memory access pattern, where each segmentation is stored in a format suitable for its memory access patterns. Unlike prior hybrid storage schemes that rely on the user to determine the data partition among storage formats, we employ machine learning to build a predictive model to automatically determine the partition threshold on a per matrix basis. Our predictive model is first trained off line, and the trained model can be applied to any new, unseen sparse matrix. We apply our approach to 956 matrices and evaluate its performance on three distinct multi-core CPU platforms: a 72-core Intel Knights Landing (KNL) CPU, a 128-core AMD EPYC CPU, and a 64-core Phytium ARMv8 CPU. Experimental results show that our hybrid scheme, combined with the predictive model, outperforms the best-performing alternative by 2.9%, 17.5% and 16% on average on KNL, AMD, and Phytium, respectively.
first_indexed 2024-03-09T22:04:05Z
format Article
id doaj.art-f74c73edd91843f48c81a99a28b6558c
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T22:04:05Z
publishDate 2022-09-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-f74c73edd91843f48c81a99a28b6558c2023-11-23T19:46:02ZengMDPI AGApplied Sciences2076-34172022-09-011219981210.3390/app12199812Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUsShizhao Chen0Jianbin Fang1Chuanfu Xu2Zheng Wang3Institute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, ChinaCollege of Computer Science and Technology, National University of Defense Technology, Changsha 410073, ChinaInstitute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, ChinaSchool of Computing, University of Leeds, Leeds LS2 9JT, UKOptimizing sparse matrix–vector multiplication (SpMV) is challenging due to the non-uniform distribution of the non-zero elements of the sparse matrix. The best-performing SpMV format changes depending on the input matrix and the underlying architecture, and there is no “one-size-fit-for-all” format. A hybrid scheme combining multiple SpMV storage formats allows one to choose an appropriate format to use for the target matrix and hardware. However, existing hybrid approaches are inadequate for utilizing the SIMD cores of modern multi-core CPUs with SIMDs, and it remains unclear how to best mix different SpMV formats for a given matrix. This paper presents a new hybrid storage format for sparse matrices, specifically targeting multi-core CPUs with SIMDs. Our approach partitions the target sparse matrix into two segmentations based on the regularities of the memory access pattern, where each segmentation is stored in a format suitable for its memory access patterns. Unlike prior hybrid storage schemes that rely on the user to determine the data partition among storage formats, we employ machine learning to build a predictive model to automatically determine the partition threshold on a per matrix basis. Our predictive model is first trained off line, and the trained model can be applied to any new, unseen sparse matrix. We apply our approach to 956 matrices and evaluate its performance on three distinct multi-core CPU platforms: a 72-core Intel Knights Landing (KNL) CPU, a 128-core AMD EPYC CPU, and a 64-core Phytium ARMv8 CPU. Experimental results show that our hybrid scheme, combined with the predictive model, outperforms the best-performing alternative by 2.9%, 17.5% and 16% on average on KNL, AMD, and Phytium, respectively.https://www.mdpi.com/2076-3417/12/19/9812SpMVperformancemachine learningsparse matrix format
spellingShingle Shizhao Chen
Jianbin Fang
Chuanfu Xu
Zheng Wang
Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs
Applied Sciences
SpMV
performance
machine learning
sparse matrix format
title Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs
title_full Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs
title_fullStr Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs
title_full_unstemmed Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs
title_short Adaptive Hybrid Storage Format for Sparse Matrix–Vector Multiplication on Multi-Core SIMD CPUs
title_sort adaptive hybrid storage format for sparse matrix vector multiplication on multi core simd cpus
topic SpMV
performance
machine learning
sparse matrix format
url https://www.mdpi.com/2076-3417/12/19/9812
work_keys_str_mv AT shizhaochen adaptivehybridstorageformatforsparsematrixvectormultiplicationonmulticoresimdcpus
AT jianbinfang adaptivehybridstorageformatforsparsematrixvectormultiplicationonmulticoresimdcpus
AT chuanfuxu adaptivehybridstorageformatforsparsematrixvectormultiplicationonmulticoresimdcpus
AT zhengwang adaptivehybridstorageformatforsparsematrixvectormultiplicationonmulticoresimdcpus