Implementation of a Fully-Parallel Turbo Decoder on a General-Purpose Graphics Processing Unit

Turbo codes comprising a parallel concatenation of upper and lower convolutional codes are widely employed in the state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing t...


Bibliographic Details
Main Authors: An Li, Robert G. Maunder, Bashir M. Al-Hashimi, Lajos Hanzo
Format: Article
Language: English
Published: IEEE 2016-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/7501831/
author An Li
Robert G. Maunder
Bashir M. Al-Hashimi
Lajos Hanzo
author_sort An Li
collection DOAJ
description Turbo codes comprising a parallel concatenation of upper and lower convolutional codes are widely employed in state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing throughputs in order for the turbo code to support real-time communications. In state-of-the-art turbo code implementations, the processing throughput is typically limited by the data dependences that occur within the forward and backward recursions of the Log-BCJR algorithm, which is employed during turbo decoding. In contrast to the highly serial Log-BCJR turbo decoder, we have recently proposed a novel fully parallel turbo decoder (FPTD) algorithm, which can eliminate the data dependences and perform fully parallel processing. In this paper, we propose an optimized FPTD algorithm, which reformulates the operation of the FPTD algorithm so that the upper and lower decoders have identical operation, in order to support single-instruction, multiple-data (SIMD) operation. This allows us to develop a novel general-purpose graphics processing unit (GPGPU) implementation of the FPTD, which has application in software-defined radios and virtualized cloud-radio access networks. As a benefit of its higher degree of parallelism, we show that our FPTD improves the processing throughput of the Log-BCJR turbo decoder by between 2.3 and 9.2 times, when employing a high-specification GPGPU. However, this is achieved at the cost of a moderate increase of the overall complexity by between 1.7 and 3.3 times.
first_indexed 2024-12-14T11:31:58Z
format Article
id doaj.art-9727601940b442fd8923aa76156eb86f
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-14T11:31:58Z
publishDate 2016-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-9727601940b442fd8923aa76156eb86f
2022-12-21T23:03:16Z
eng
IEEE
IEEE Access
2169-3536
2016-01-01
4
5624
5639
10.1109/ACCESS.2016.2586309
7501831
Implementation of a Fully-Parallel Turbo Decoder on a General-Purpose Graphics Processing Unit
An Li, Department of Electronics and Computer Science, University of Southampton, Southampton, U.K.
Robert G. Maunder, Department of Electronics and Computer Science, University of Southampton, Southampton, U.K.
Bashir M. Al-Hashimi, Department of Electronics and Computer Science, University of Southampton, Southampton, U.K.
Lajos Hanzo, Department of Electronics and Computer Science, University of Southampton, Southampton, U.K.
Turbo codes comprising a parallel concatenation of upper and lower convolutional codes are widely employed in state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing throughputs in order for the turbo code to support real-time communications. In state-of-the-art turbo code implementations, the processing throughput is typically limited by the data dependences that occur within the forward and backward recursions of the Log-BCJR algorithm, which is employed during turbo decoding. In contrast to the highly serial Log-BCJR turbo decoder, we have recently proposed a novel fully parallel turbo decoder (FPTD) algorithm, which can eliminate the data dependences and perform fully parallel processing. In this paper, we propose an optimized FPTD algorithm, which reformulates the operation of the FPTD algorithm so that the upper and lower decoders have identical operation, in order to support single-instruction, multiple-data (SIMD) operation. This allows us to develop a novel general-purpose graphics processing unit (GPGPU) implementation of the FPTD, which has application in software-defined radios and virtualized cloud-radio access networks. As a benefit of its higher degree of parallelism, we show that our FPTD improves the processing throughput of the Log-BCJR turbo decoder by between 2.3 and 9.2 times, when employing a high-specification GPGPU. However, this is achieved at the cost of a moderate increase of the overall complexity by between 1.7 and 3.3 times.
https://ieeexplore.ieee.org/document/7501831/
Fully-parallel turbo decoder
parallel processing
GPGPU computing
software defined radio
cloud radio access network
spellingShingle An Li
Robert G. Maunder
Bashir M. Al-Hashimi
Lajos Hanzo
Implementation of a Fully-Parallel Turbo Decoder on a General-Purpose Graphics Processing Unit
IEEE Access
Fully-parallel turbo decoder
parallel processing
GPGPU computing
software defined radio
cloud radio access network
title Implementation of a Fully-Parallel Turbo Decoder on a General-Purpose Graphics Processing Unit
title_sort implementation of a fully parallel turbo decoder on a general purpose graphics processing unit
topic Fully-parallel turbo decoder
parallel processing
GPGPU computing
software defined radio
cloud radio access network
url https://ieeexplore.ieee.org/document/7501831/