A graphics processing unit-accelerated meshless method for two-dimensional compressible flows

A graphics processing unit (GPU)-accelerated meshless method is presented for solving two-dimensional compressible flows over aerodynamic bodies. The Compute Unified Device Architecture (CUDA) Fortran programming model is used to port the meshless method from the central processing unit (CPU) to the GPU as a...

Bibliographic Details
Main Authors: Jiale Zhang, Hongquan Chen, Cheng Cao
Format: Article
Language: English
Published: Taylor & Francis Group 2017-01-01
Series: Engineering Applications of Computational Fluid Mechanics
Subjects: GPU parallel computing; meshless method; Navier–Stokes equations; CUDA Fortran
Online Access: http://dx.doi.org/10.1080/19942060.2017.1317027
author Jiale Zhang
Hongquan Chen
Cheng Cao
collection DOAJ
description A graphics processing unit (GPU)-accelerated meshless method is presented for solving two-dimensional compressible flows over aerodynamic bodies. The Compute Unified Device Architecture (CUDA) Fortran programming model is used to port the meshless method from the central processing unit (CPU) to the GPU to improve efficiency, which involves implementing CUDA kernels and managing the data storage structure and thread hierarchy. The CUDA kernel subroutines are designed to match the point-based computation of the meshless method. The corresponding point-based data structure and thread hierarchy are illustrated through two specific GPU implementations of the meshless method, both developed for solving the Navier–Stokes equations. The Jameson–Schmidt–Turkel scheme is used to estimate the flux terms of the Navier–Stokes equations, and an explicit four-stage Runge–Kutta scheme is applied to advance the solution in time. After the performance of the two resulting GPU-accelerated meshless solvers is tuned by changing the number of threads per block, a set of typical flows over aerodynamic bodies is simulated for validation. Numerical results are compared with available experimental data or computed values from the literature, together with an analysis of code performance. The comparison shows that computing time for the presented test cases is significantly reduced for both solvers without loss of accuracy, and speedups of up to 64 times are achieved through careful management of memory access.
first_indexed 2024-12-17T01:07:41Z
format Article
id doaj.art-162ee2b3bdbd48c18836b9611231f475
institution Directory Open Access Journal
issn 1994-2060
1997-003X
language English
last_indexed 2024-12-17T01:07:41Z
publishDate 2017-01-01
publisher Taylor & Francis Group
record_format Article
series Engineering Applications of Computational Fluid Mechanics
spelling doaj.art-162ee2b3bdbd48c18836b9611231f475; 2022-12-21T22:09:13Z; eng; Taylor & Francis Group; Engineering Applications of Computational Fluid Mechanics; ISSN 1994-2060, 1997-003X; 2017-01-01; vol. 11, no. 1, pp. 526-543; doi:10.1080/19942060.2017.1317027; 1317027; A graphics processing unit-accelerated meshless method for two-dimensional compressible flows; Jiale Zhang, Hongquan Chen, Cheng Cao (all Nanjing University of Aeronautics and Astronautics); http://dx.doi.org/10.1080/19942060.2017.1317027; GPU parallel computing; meshless method; Navier–Stokes equations; CUDA Fortran
title A graphics processing unit-accelerated meshless method for two-dimensional compressible flows
topic GPU parallel computing
meshless method
Navier–Stokes equations
CUDA Fortran
url http://dx.doi.org/10.1080/19942060.2017.1317027
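
The description mentions an explicit four-stage Runge–Kutta scheme that updates the solution point by point. The sketch below illustrates that kind of update under stated assumptions; it is not the authors' implementation. It is written in Python rather than CUDA Fortran, the stage coefficients (1/4, 1/3, 1/2, 1) are a common choice alongside Jameson–Schmidt–Turkel fluxes but are not given in this record, and `rk4_stage_update`, `decay_residual`, and the scalar model problem are hypothetical stand-ins for the solver and its flux residual.

```python
import math

# Assumed stage coefficients; the record does not state the ones the authors used.
ALPHAS = (0.25, 1.0 / 3.0, 0.5, 1.0)


def rk4_stage_update(u0, residual, dt):
    """Advance per-point values u0 by one time step of du/dt = -R(u).

    Every stage restarts from u0: u_k = u0 - alpha_k * dt * R(u_{k-1}).
    """
    u = list(u0)
    for alpha in ALPHAS:
        r = residual(u)
        # Each point is updated independently of the others; in a GPU port
        # this per-point update is the work one thread would typically do.
        u = [u0_i - alpha * dt * r_i for u0_i, r_i in zip(u0, r)]
    return u


def decay_residual(u):
    """Hypothetical stand-in residual R(u) = u, whose exact solution is exp(-t)."""
    return list(u)


if __name__ == "__main__":
    values = [1.0, 2.0, -0.5]
    dt = 0.1
    for _ in range(10):
        values = rk4_stage_update(values, decay_residual, dt)
    # Compare the first point with the exact solution at t = 1.
    print(values[0], math.exp(-1.0))
```

In a real solver the residual at a point would be assembled from JST fluxes over that point's cloud of neighbours. The scalar decay problem is used here only because its exact solution makes the update easy to check.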