Fast Implementation of SHA-3 in GPU Environment

Bibliographic Details
Main Authors: Hojin Choi, Seog Chung Seo
Format: Article
Language: English
Published: IEEE 2021-01-01
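Series: IEEE Access
Subjects: Graphic Processing Unit (GPU); secure hash function; Secure Hash Algorithm (SHA)-3; software optimization; NVIDIA CUDA; parallel processing
Online Access: https://ieeexplore.ieee.org/document/9585122/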
author Hojin Choi
Seog Chung Seo
collection DOAJ
description Recently, Graphics Processing Units (GPUs) have been widely used for general-purpose applications such as machine learning and the acceleration of cryptographic applications (especially blockchains). The development of CUDA made this general-purpose computing on GPUs possible. In particular, GPU technology is now widely used in server-side applications to provide fast and efficient service to a large number of clients. Such servers need to process large amounts of user data and carry out authentication. Verifying the integrity of transmitted data is essential for ensuring that the data has not been modified in transit. Hash functions are cryptographic algorithms that can verify data integrity; the standard hash functions are SHA-1, SHA-2, and SHA-3. In 2015, NIST standardized the Keccak algorithm, the winner of its SHA-3 competition, as SHA-3. However, software implementations of SHA-3 have so far not provided sufficient performance for many applications. In addition, SHA-3 and the SHA-3-based SHAKE functions are used in many Post-Quantum Cryptosystems (PQC) submitted to the NIST PQC competition. Research on optimizing SHA-3 in software is therefore needed. We propose an optimized SHA-3 software implementation for the GPU environment. To improve performance, we propose several techniques, including optimization of the SHA-3 internal process, inline PTX optimization, optimized memory usage, and the application of asynchronous CUDA streams. With the proposed optimizations applied, our SHA-3(512) (resp. SHA-3(256)) implementation without CUDA streams provides a maximum throughput of 88.51 Gb/s (resp. 171.62 Gb/s) on an RTX2080Ti GPU. Furthermore, without CUDA streams, our SHA-3(512) software on a GTX1070 provides about 49.73% higher throughput than the previous best work on a GTX1080, which shows the superiority of the proposed optimization methods. Our optimized SHA-3 software on GPU can be used efficiently for blockchain applications and several PQCs (especially the key generation process in lattice-based cryptosystems).
first_indexed 2024-12-19T00:21:57Z
format Article
id doaj.art-04dc217ff4864ed68ffb285f866b75ea
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T00:21:57Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-04dc217ff4864ed68ffb285f866b75ea; 2022-12-21T20:45:28Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 144574; 144586; 10.1109/ACCESS.2021.3122466; 9585122
Fast Implementation of SHA-3 in GPU Environment
Hojin Choi (0), https://orcid.org/0000-0002-7298-3689
Seog Chung Seo (1), https://orcid.org/0000-0001-8016-2808
Department of Financial Information Security, Kookmin University, Seoul, South Korea
Department of Financial Information Security, Kookmin University, Seoul, South Korea
title Fast Implementation of SHA-3 in GPU Environment
topic Graphic Processing Unit (GPU)
secure hash function
Secure Hash Algorithm (SHA)-3
software optimization
NVIDIA CUDA
parallel processing