Comparative Study of CUDA GPU Implementations in Python With the Fast Iterative Shrinkage-Thresholding Algorithm for LASSO
A general-purpose GPU (GPGPU) is employed in a variety of domains, including accelerating the spread of deep neural network models; however, further research into its effective implementation is needed. When using the compute unified device architecture (CUDA), which has recently gained popularity,...
Main Authors: | Younsang Cho, Jaeoh Kim, Donghyeon Yu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2022-01-01 |
Series: | IEEE Access |
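Subjects: | Compute unified device architecture; graphics processing unit; fast iterative shrinkage-thresholding algorithm; LASSO; Python |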
Online Access: | https://ieeexplore.ieee.org/document/9777722/ |
_version_ | 1828224350870831104 |
---|---|
author | Younsang Cho Jaeoh Kim Donghyeon Yu |
author_sort | Younsang Cho |
collection | DOAJ |
description | A general-purpose GPU (GPGPU) is employed in a variety of domains, including accelerating the spread of deep neural network models; however, further research into its effective implementation is needed. When using the compute unified device architecture (CUDA), which has recently gained popularity, the same question arises of how best to use the GPU and its memory space, because there is no gold standard for selecting the most efficient approach for CUDA GPU parallel computation. On the other hand, because solving the least absolute shrinkage and selection operator (LASSO) regression consists entirely of basic linear algebra operations, GPGPU computation is more effective for it than for other models. Additionally, its optimization problem often requires fast and efficient calculations. The purpose of this study is to briefly introduce the implementation approaches and numerically compare the computational efficiency of GPU parallel computation across them, using the fast iterative shrinkage-thresholding algorithm (FISTA) for LASSO. This study contributes toward a gold standard for CUDA GPU parallel computation, considering both computational efficiency and ease of implementation. Based on our comparison results, we recommend implementing CUDA GPU parallel computation in Python, with either a dynamic-link library or PyTorch, for iterative algorithms. |
first_indexed | 2024-04-12T17:18:23Z |
format | Article |
id | doaj.art-00215b0d322a42408f2d75dfbd8aaac5 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-12T17:18:23Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-00215b0d322a42408f2d75dfbd8aaac5; 2022-12-22T03:23:33Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; vol. 10, pp. 53324-53343; DOI 10.1109/ACCESS.2022.3175987; document 9777722; Comparative Study of CUDA GPU Implementations in Python With the Fast Iterative Shrinkage-Thresholding Algorithm for LASSO; Younsang Cho (Department of Statistics, Inha University, Incheon, South Korea); Jaeoh Kim (https://orcid.org/0000-0001-7831-6353; Department of Data Science, Inha University, Incheon, South Korea); Donghyeon Yu (https://orcid.org/0000-0003-4519-8500; Department of Statistics, Inha University, Incheon, South Korea); https://ieeexplore.ieee.org/document/9777722/; Compute unified device architecture; graphics processing unit; fast iterative shrinkage-thresholding algorithm; LASSO; Python |
title | Comparative Study of CUDA GPU Implementations in Python With the Fast Iterative Shrinkage-Thresholding Algorithm for LASSO |
topic | Compute unified device architecture; graphics processing unit; fast iterative shrinkage-thresholding algorithm; LASSO; Python |
url | https://ieeexplore.ieee.org/document/9777722/ |
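
The description notes that solving LASSO with the fast iterative shrinkage-thresholding algorithm consists entirely of basic linear algebra operations. The following is a minimal CPU sketch in NumPy that illustrates this; it is not the authors' implementation. The objective scaling (1/2)||y - Xb||^2 + lam * ||b||_1, the absence of an intercept, the fixed step size 1/L, and the names `soft_threshold` and `fista_lasso` are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fista_lasso(X, y, lam, n_iter=500, tol=1e-8):
    """Approximately minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 with FISTA.

    Assumed formulation (not taken from the record): no intercept and a
    fixed step size 1/L, where L is the largest eigenvalue of X^T X.
    """
    p = X.shape[1]
    L = np.linalg.norm(X, 2) ** 2  # squared spectral norm = Lipschitz constant
    beta = np.zeros(p)             # current estimate
    z = beta.copy()                # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)                        # gradient of the smooth part at z
        beta_new = soft_threshold(z - grad / L, lam / L)  # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = beta_new + ((t - 1.0) / t_new) * (beta_new - beta)  # momentum update
        step = np.linalg.norm(beta_new - beta)
        beta, t = beta_new, t_new
        if step <= tol * max(1.0, np.linalg.norm(beta)):
            break
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    true_beta = np.zeros(20)
    true_beta[:3] = [2.0, -1.5, 1.0]
    y = X @ true_beta + 0.1 * rng.standard_normal(100)
    print(np.round(fista_lasso(X, y, lam=5.0), 2))
```

Each iteration uses only two matrix-vector products and a few elementwise operations, which is the kind of workload the description says suits a GPGPU. Swapping the NumPy arrays for GPU tensors (for example in PyTorch, one of the options the description recommends) would be one way to move these operations to the device, though the record itself does not show how the authors did so.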