Design and Implementation of Multithreaded Reproducible DGEMV for Phytium Processor

In high-performance computing, the accumulation of rounding errors when solving large-scale, long-running, ill-conditioned problems can invalidate results. Reproducible results help developers debug programs and check their correctness. Therefore, the reproducibility of...


Bibliographic Details
Main Authors: CHEN Lei, TANG Tao, QI Hai-jun, JIANG Hao, HE Kang
Format: Article
Language: zho
Published: Editorial office of Computer Science 2022-10-01
Series: Jisuanji kexue
Subjects: reproducibility, round-off error, error-free transformation, dgemv
Online Access: https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-27.pdf
_version_ 1797845110477553664
author CHEN Lei, TANG Tao, QI Hai-jun, JIANG Hao, HE Kang
collection DOAJ
description In high-performance computing, the accumulation of rounding errors when solving large-scale, long-running, ill-conditioned problems can invalidate results. Reproducible results help developers debug programs and check their correctness, so the reproducibility of an algorithm's numerical results is very important. Building on the OpenBLAS framework, and combining Demmel's reproducible method from ReproBLAS with the multilayer blocking technique proposed by Castaldo, this paper designs a reproducible multithreaded DGEMV algorithm for Phytium processors, supported by rounding error analysis and error-free transformation. Numerical experiments show that the output of the algorithm is identical to that of ReproBLAS, which verifies its reproducibility. The algorithm is up to 2x faster than the ReproBLAS implementation. Compared with the DGEMV function of OzBLAS proposed by Mukunoki, it runs at least 20x faster with a single thread and 9x faster with multiple threads. Theoretical analysis and numerical experiments show that the improved algorithm is accurate, valid, and efficient.
first_indexed 2024-04-09T17:33:16Z
format Article
id doaj.art-1c7197b429894b7da7d1c60c33279d01
institution Directory Open Access Journal
issn 1002-137X
language zho
last_indexed 2024-04-09T17:33:16Z
publishDate 2022-10-01
publisher Editorial office of Computer Science
record_format Article
series Jisuanji kexue
spelling doaj.art-1c7197b429894b7da7d1c60c33279d01; 2023-04-18T02:32:39Z; zho; Editorial office of Computer Science; Jisuanji kexue; 1002-137X; 2022-10-01; vol. 49, no. 10, pp. 27-35; doi:10.11896/jsjkx.220100125; Design and Implementation of Multithreaded Reproducible DGEMV for Phytium Processor; CHEN Lei, TANG Tao, QI Hai-jun, JIANG Hao, HE Kang; College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-27.pdf; reproducibility|round-off error|error-free transformation|dgemv
title Design and Implementation of Multithreaded Reproducible DGEMV for Phytium Processor
topic reproducibility|round-off error|error-free transformation|dgemv
url https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-27.pdf
work_keys_str_mv AT chenleitangtaoqihaijunjianghaohekang designandimplementationofmultithreadedreproducibledgemvforphytiumprocessor
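
The description field above refers to error-free transformation and to order-independent (reproducible) results for DGEMV. The following is only a minimal Python sketch of those two ideas, not the paper's Phytium implementation and not the ReproBLAS API. The names `two_sum`, `prerounded_sum`, and `repro_gemv` are hypothetical. `prerounded_sum` uses a single pre-rounding bin, which is a deliberate simplification: it assumes finite inputs of moderate magnitude and trades accuracy for a sum that is bitwise identical in any order.

```python
import math


def two_sum(a, b):
    """Knuth's TwoSum, an error-free transformation.

    Returns (s, e) where s is the rounded sum fl(a + b) and e is the
    rounding error, so that s + e equals a + b exactly (barring overflow).
    """
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def prerounded_sum(values):
    """Order-independent sum using one pre-rounding bin (simplified sketch).

    Every value is first rounded to a multiple of a common grid spacing
    chosen from n and max|v|. Sums of such grid values stay exactly
    representable, so every addition is exact and the order cannot matter.
    """
    n = len(values)
    if n == 0:
        return 0.0
    bound = max(abs(v) for v in values)  # max is order-independent
    if bound == 0.0:
        return 0.0
    # frexp gives 2 * n * bound < 2**e; one extra bit leaves headroom
    # for the per-element rounding introduced below.
    k = math.frexp(2.0 * n * bound)[1] + 1
    big = math.ldexp(1.5, k)  # 1.5 * 2**k fixes the grid at 2**(k - 52)
    total = 0.0
    for v in values:
        q = (v + big) - big  # v rounded to the common grid
        total += q           # exact: no rounding error can accumulate
    return total


def repro_gemv(matrix, x):
    """y = A x, with each row reduced by the order-independent sum."""
    return [
        prerounded_sum([a * xj for a, xj in zip(row, x)])
        for row in matrix
    ]
```

Each product `a * xj` rounds the same way regardless of where it is computed, so in this sketch only the reduction has to be made order-independent. Shuffling the inputs of `prerounded_sum`, or permuting the columns of `matrix` together with `x`, leaves the result unchanged, which is the property that lets different thread partitions agree. The single bin discards low-order bits; multi-bin schemes recover that accuracy at extra cost.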