A graphics processing unit-accelerated meshless method for two-dimensional compressible flows
A graphics processing unit (GPU)-accelerated meshless method is presented for solving two-dimensional compressible flows over aerodynamic bodies. The Compute Unified Device Architecture (CUDA) Fortran programming model is employed to port the meshless method from the central processing unit (CPU) to the GPU as a...
Main Authors: | Jiale Zhang, Hongquan Chen, Cheng Cao |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2017-01-01 |
Series: | Engineering Applications of Computational Fluid Mechanics |
Subjects: | GPU parallel computing; meshless method; Navier–Stokes equations; CUDA Fortran |
Online Access: | http://dx.doi.org/10.1080/19942060.2017.1317027 |
_version_ | 1818647762079580160 |
---|---|
author | Jiale Zhang; Hongquan Chen; Cheng Cao |
author_facet | Jiale Zhang; Hongquan Chen; Cheng Cao |
author_sort | Jiale Zhang |
collection | DOAJ |
description | A graphics processing unit (GPU)-accelerated meshless method is presented for solving two-dimensional compressible flows over aerodynamic bodies. The Compute Unified Device Architecture (CUDA) Fortran programming model is employed to port the meshless method from the central processing unit (CPU) to the GPU to improve efficiency, which involves implementing CUDA kernels and managing the data storage structure and thread hierarchy. The CUDA kernel subroutines are designed to suit the point-based computation of the meshless method. The corresponding point-based data structure and thread hierarchy are described through two specific GPU implementations of the meshless method, both developed for solving the Navier–Stokes equations. The Jameson–Schmidt–Turkel scheme is used to estimate the flux terms of the Navier–Stokes equations, and an explicit four-stage Runge–Kutta scheme is applied to advance the solution in time. After the performance of the two resulting GPU-accelerated meshless solvers is tuned by changing the number of threads in a block, a set of typical flows over aerodynamic bodies is simulated for validation. Numerical results are compared with available experimental data or computational values from the literature, together with an analysis of code performance. The results show that the computing time of the presented test cases is significantly reduced for both solvers without loss of accuracy, and speedups of up to 64 times are achieved through careful management of memory access. |
first_indexed | 2024-12-17T01:07:41Z |
format | Article |
id | doaj.art-162ee2b3bdbd48c18836b9611231f475 |
institution | Directory of Open Access Journals |
issn | 1994-2060 1997-003X |
language | English |
last_indexed | 2024-12-17T01:07:41Z |
publishDate | 2017-01-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | Engineering Applications of Computational Fluid Mechanics |
spelling | doaj.art-162ee2b3bdbd48c18836b9611231f475; 2022-12-21T22:09:13Z; eng; Taylor & Francis Group; Engineering Applications of Computational Fluid Mechanics; ISSN 1994-2060, 1997-003X; 2017-01-01; vol. 11, no. 1, pp. 526–543; doi:10.1080/19942060.2017.1317027; Jiale Zhang, Hongquan Chen, Cheng Cao (all Nanjing University of Aeronautics and Astronautics); http://dx.doi.org/10.1080/19942060.2017.1317027 |
title | A graphics processing unit-accelerated meshless method for two-dimensional compressible flows |
topic | GPU parallel computing; meshless method; Navier–Stokes equations; CUDA Fortran |
url | http://dx.doi.org/10.1080/19942060.2017.1317027 |