Comparative Study of CUDA GPU Implementations in Python With the Fast Iterative Shrinkage-Thresholding Algorithm for LASSO

A general-purpose GPU (GPGPU) is employed in a variety of domains, including accelerating the spread of deep neural network models; however, further research into its effective implementation is needed. When using the compute unified device architecture (CUDA), which has recently gained popularity,...


Bibliographic Details
Main Authors: Younsang Cho, Jaeoh Kim, Donghyeon Yu
Format: Article
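Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Compute unified device architecture; graphics processing unit; fast iterative shrinkage-thresholding algorithm; LASSO; Python
Online Access: https://ieeexplore.ieee.org/document/9777722/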
author Younsang Cho
Jaeoh Kim
Donghyeon Yu
collection DOAJ
description A general-purpose GPU (GPGPU) is employed in a variety of domains, including accelerating the spread of deep neural network models; however, further research into its effective implementation is needed. The same holds for the compute unified device architecture (CUDA), which has recently gained popularity: how best to use the GPUs and their memory space remains an open question, because there is no gold standard for selecting the most efficient approach to CUDA GPU parallel computation. In contrast, because solving the least absolute shrinkage and selection operator (LASSO) regression consists entirely of basic linear algebra operations, GPGPU computation is more effective for it than for other models. Additionally, its optimization problem often requires fast and efficient calculations. The purpose of this study is to briefly introduce the implementation approaches and to compare their computational efficiency numerically in GPU parallel computation, using the fast iterative shrinkage-thresholding algorithm for LASSO. This study contributes gold standards for CUDA GPU parallel computation that consider both computational efficiency and ease of implementation. Based on our comparison results, we recommend implementing the CUDA GPU parallel computation using Python, with either a dynamic-link library or PyTorch for the iterative algorithms.
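The abstract names the fast iterative shrinkage-thresholding algorithm (FISTA) for LASSO but does not show it. The following is a minimal sketch of the standard textbook form of that algorithm, not the authors' code. It assumes the objective (1/2)||y - X b||^2 + lam * ||b||_1 and runs on the CPU with NumPy; the names `soft_threshold`, `fista_lasso`, `lam`, `n_iter`, and `tol` are illustrative choices that do not come from the record.

```python
import numpy as np


def soft_threshold(z, tau):
    """Elementwise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def fista_lasso(X, y, lam, n_iter=500, tol=1e-10):
    """Sketch of FISTA for min_b 0.5 * ||y - X b||^2 + lam * ||b||_1.

    Assumes X is a nonzero 2-D array, so the Lipschitz constant is positive.
    """
    p = X.shape[1]
    # Lipschitz constant of the gradient: squared spectral norm of X.
    lipschitz = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)   # current iterate
    w = beta.copy()      # extrapolated (momentum) point
    t = 1.0              # momentum scalar
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)  # gradient of the smooth part at w
        beta_new = soft_threshold(w - grad / lipschitz, lam / lipschitz)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = beta_new + ((t - 1.0) / t_new) * (beta_new - beta)
        converged = np.linalg.norm(beta_new - beta) <= tol
        beta, t = beta_new, t_new
        if converged:
            break
    return beta
```

Every step is a matrix-vector product or an elementwise operation, which is why the method maps naturally onto GPU linear algebra. Porting the sketch to PyTorch tensors on a CUDA device would be one way to try that, though the record does not say how the authors' implementations are written.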
first_indexed 2024-04-12T17:18:23Z
format Article
id doaj.art-00215b0d322a42408f2d75dfbd8aaac5
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-12T17:18:23Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-00215b0d322a42408f2d75dfbd8aaac5; 2022-12-22T03:23:33Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; vol. 10, pp. 53324-53343; DOI 10.1109/ACCESS.2022.3175987; document 9777722
Younsang Cho (Department of Statistics, Inha University, Incheon, South Korea)
Jaeoh Kim (Department of Data Science, Inha University, Incheon, South Korea; https://orcid.org/0000-0001-7831-6353)
Donghyeon Yu (Department of Statistics, Inha University, Incheon, South Korea; https://orcid.org/0000-0003-4519-8500)
title Comparative Study of CUDA GPU Implementations in Python With the Fast Iterative Shrinkage-Thresholding Algorithm for LASSO
topic Compute unified device architecture
graphics processing unit
fast iterative shrinkage-thresholding algorithm
LASSO
Python
url https://ieeexplore.ieee.org/document/9777722/