Acceleration of the Parameterization of Unified Microphysics Across Scales (PUMAS) on the Graphics Processing Unit (GPU) With Directive‐Based Methods

Abstract Cloud microphysics is one of the most time‐consuming components in a climate model. In this study, we port the cloud microphysics parameterization in the Community Atmosphere Model (CAM), known as Parameterization of Unified Microphysics Across Scales (PUMAS), from CPU to GPU to seek a comp...

Bibliographic Details
Main Authors: Jian Sun, John M. Dennis, Sheri A. Mickelson, Brian Vanderwende, Andrew Gettelman, Katherine Thayer‐Calder
Format: Article
Language: English
Published: American Geophysical Union (AGU) 2023-05-01
Series: Journal of Advances in Modeling Earth Systems
Subjects: GPU, OpenACC, OpenMP target offload, PUMAS, CAM
Online Access: https://doi.org/10.1029/2022MS003515
_version_ 1827921691289845760
author Jian Sun
John M. Dennis
Sheri A. Mickelson
Brian Vanderwende
Andrew Gettelman
Katherine Thayer‐Calder
author_sort Jian Sun
collection DOAJ
description Abstract Cloud microphysics is one of the most time‐consuming components in a climate model. In this study, we port the cloud microphysics parameterization in the Community Atmosphere Model (CAM), known as Parameterization of Unified Microphysics Across Scales (PUMAS), from CPU to GPU to seek a computational speedup. The directive‐based methods (OpenACC and OpenMP target offload) are determined as the best fit specifically for our development practices, which enable a single version of source code to run either on the CPU or GPU, and yield a better portability and maintainability. Their performance is first examined in a PUMAS stand‐alone kernel and the directive‐based methods can outperform a CPU node as long as there is enough computational burden on the GPU. A consistent behavior is observed when we run PUMAS on the GPU in a practical CAM simulation. A 3.6× speedup of the PUMAS execution time, including data movement between CPU and GPU, is achieved at a coarse horizontal resolution (8 NVIDIA V100 GPUs against 36 Intel Skylake CPU cores). This speedup further increases up to 5.4× at a high resolution (24 NVIDIA V100 GPUs against 108 Intel Skylake CPU cores), which highlights the fact that GPU favors larger problem size. This study demonstrates that using GPU in a CAM simulation can save noticeable computational costs even with a small portion of code being GPU‐enabled. Therefore, we are encouraged to port more parameterizations to GPU to take advantage of its computational benefit.
first_indexed 2024-03-13T04:31:04Z
format Article
id doaj.art-81895c2ebdef4b69b2d37ff4e66334c7
institution Directory Open Access Journal
issn 1942-2466
language English
last_indexed 2024-03-13T04:31:04Z
publishDate 2023-05-01
publisher American Geophysical Union (AGU)
record_format Article
series Journal of Advances in Modeling Earth Systems
spelling doaj.art-81895c2ebdef4b69b2d37ff4e66334c7
2023-06-19T13:40:46Z
eng
American Geophysical Union (AGU)
Journal of Advances in Modeling Earth Systems
1942-2466
2023-05-01
15
5
n/a
n/a
10.1029/2022MS003515
Acceleration of the Parameterization of Unified Microphysics Across Scales (PUMAS) on the Graphics Processing Unit (GPU) With Directive‐Based Methods
Jian Sun, National Center for Atmospheric Research, Boulder, CO, USA
John M. Dennis, National Center for Atmospheric Research, Boulder, CO, USA
Sheri A. Mickelson, National Center for Atmospheric Research, Boulder, CO, USA
Brian Vanderwende, National Center for Atmospheric Research, Boulder, CO, USA
Andrew Gettelman, National Center for Atmospheric Research, Boulder, CO, USA
Katherine Thayer‐Calder, National Center for Atmospheric Research, Boulder, CO, USA
Abstract Cloud microphysics is one of the most time‐consuming components in a climate model. In this study, we port the cloud microphysics parameterization in the Community Atmosphere Model (CAM), known as Parameterization of Unified Microphysics Across Scales (PUMAS), from CPU to GPU to seek a computational speedup. The directive‐based methods (OpenACC and OpenMP target offload) are determined as the best fit specifically for our development practices, which enable a single version of source code to run either on the CPU or GPU, and yield a better portability and maintainability. Their performance is first examined in a PUMAS stand‐alone kernel and the directive‐based methods can outperform a CPU node as long as there is enough computational burden on the GPU. A consistent behavior is observed when we run PUMAS on the GPU in a practical CAM simulation. A 3.6× speedup of the PUMAS execution time, including data movement between CPU and GPU, is achieved at a coarse horizontal resolution (8 NVIDIA V100 GPUs against 36 Intel Skylake CPU cores). This speedup further increases up to 5.4× at a high resolution (24 NVIDIA V100 GPUs against 108 Intel Skylake CPU cores), which highlights the fact that GPU favors larger problem size. This study demonstrates that using GPU in a CAM simulation can save noticeable computational costs even with a small portion of code being GPU‐enabled. Therefore, we are encouraged to port more parameterizations to GPU to take advantage of its computational benefit.
https://doi.org/10.1029/2022MS003515
GPU
OpenACC
OpenMP target offload
PUMAS
CAM
title Acceleration of the Parameterization of Unified Microphysics Across Scales (PUMAS) on the Graphics Processing Unit (GPU) With Directive‐Based Methods
topic GPU
OpenACC
OpenMP target offload
PUMAS
CAM
url https://doi.org/10.1029/2022MS003515