Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0
The best hope for reducing long-standing global climate model biases is by increasing resolution to the kilometer scale. Here we present results from an ultrahigh-resolution non-hydrostatic climate model for a near-global setup running on the full Piz Daint supercomputer on 4888 GPUs (graphics p...
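Main Authors: | O. Fuhrer, T. Chadha, T. Hoefler, G. Kwasniewski, X. Lapillonne, D. Leutwyler, D. Lüthi, C. Osuna, C. Schär, T. C. Schulthess, H. Vogt |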
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications 2018-05-01 |
Series: | Geoscientific Model Development |
Online Access: | https://www.geosci-model-dev.net/11/1665/2018/gmd-11-1665-2018.pdf |
_version_ | 1819295513972834304 |
---|---|
author | O. Fuhrer, T. Chadha, T. Hoefler, G. Kwasniewski, X. Lapillonne, D. Leutwyler, D. Lüthi, C. Osuna, C. Schär, T. C. Schulthess, H. Vogt |
author_facet | O. Fuhrer, T. Chadha, T. Hoefler, G. Kwasniewski, X. Lapillonne, D. Leutwyler, D. Lüthi, C. Osuna, C. Schär, T. C. Schulthess, H. Vogt |
author_sort | O. Fuhrer |
collection | DOAJ |
description | The best hope for reducing long-standing global climate model biases is by
increasing resolution to the kilometer scale. Here we present results from an
ultrahigh-resolution non-hydrostatic climate model for a near-global setup
running on the full Piz Daint supercomputer on 4888 GPUs (graphics
processing units). The dynamical core of the model has been completely
rewritten using a domain-specific language (DSL) for performance portability
across different hardware architectures. Physical parameterizations and
diagnostics have been ported using compiler directives. To our knowledge this
represents the first complete atmospheric model being run entirely on
accelerators on this scale. At a grid spacing of 930 m (1.9 km), we achieve
a simulation throughput of 0.043 (0.23) simulated years per day and an energy
consumption of 596 MWh per simulated year. Furthermore, we propose a new
memory usage efficiency (MUE) metric that considers how efficiently the
memory bandwidth – the dominant bottleneck of climate codes – is being
used. |
first_indexed | 2024-12-24T04:43:25Z |
format | Article |
id | doaj.art-1ec5a114e15548aa9600fac3da5334a8 |
institution | Directory Open Access Journal |
issn | 1991-959X 1991-9603 |
language | English |
last_indexed | 2024-12-24T04:43:25Z |
publishDate | 2018-05-01 |
publisher | Copernicus Publications |
record_format | Article |
series | Geoscientific Model Development |
spelling | doaj.art-1ec5a114e15548aa9600fac3da5334a8; 2022-12-21T17:14:45Z; eng; Copernicus Publications; Geoscientific Model Development; ISSN 1991-959X, 1991-9603; 2018-05-01; vol. 11, pp. 1665-1681; doi:10.5194/gmd-11-1665-2018; Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0; O. Fuhrer (Federal Institute of Meteorology and Climatology, MeteoSwiss, Zurich, Switzerland), T. Chadha (ITS Research Informatics, ETH Zurich, Switzerland), T. Hoefler (Scalable Parallel Computing Lab, ETH Zurich, Switzerland), G. Kwasniewski (Scalable Parallel Computing Lab, ETH Zurich, Switzerland), X. Lapillonne (Federal Institute of Meteorology and Climatology, MeteoSwiss, Zurich, Switzerland), D. Leutwyler (Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland), D. Lüthi (Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland), C. Osuna (Federal Institute of Meteorology and Climatology, MeteoSwiss, Zurich, Switzerland), C. Schär (Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland), T. C. Schulthess (Institute for Theoretical Physics, ETH Zurich, Switzerland, and Swiss National Supercomputing Centre, CSCS, Lugano, Switzerland), H. Vogt (Swiss National Supercomputing Centre, CSCS, Lugano, Switzerland); The best hope for reducing long-standing global climate model biases is by increasing resolution to the kilometer scale. Here we present results from an ultrahigh-resolution non-hydrostatic climate model for a near-global setup running on the full Piz Daint supercomputer on 4888 GPUs (graphics processing units). The dynamical core of the model has been completely rewritten using a domain-specific language (DSL) for performance portability across different hardware architectures. Physical parameterizations and diagnostics have been ported using compiler directives. To our knowledge this represents the first complete atmospheric model being run entirely on accelerators on this scale. At a grid spacing of 930 m (1.9 km), we achieve a simulation throughput of 0.043 (0.23) simulated years per day and an energy consumption of 596 MWh per simulated year. Furthermore, we propose a new memory usage efficiency (MUE) metric that considers how efficiently the memory bandwidth – the dominant bottleneck of climate codes – is being used.; https://www.geosci-model-dev.net/11/1665/2018/gmd-11-1665-2018.pdf |
title | Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0 |
url | https://www.geosci-model-dev.net/11/1665/2018/gmd-11-1665-2018.pdf |
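
The throughput and energy figures quoted in the description can be turned into wall-clock terms with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article: the function names are hypothetical, and it assumes that the 596 MWh figure refers to the 930 m configuration, which the abstract does not state explicitly.

```python
# Figures quoted in the record's description (abstract).
SYPD_930M = 0.043    # simulated years per wall-clock day at 930 m grid spacing
SYPD_1900M = 0.23    # simulated years per wall-clock day at 1.9 km grid spacing
ENERGY_MWH_PER_SIM_YEAR = 596.0  # ASSUMPTION: taken to apply to the 930 m run


def wallclock_days_per_simulated_year(sypd: float) -> float:
    """Wall-clock days needed to simulate one year at a given throughput."""
    if sypd <= 0:
        raise ValueError("throughput must be positive")
    return 1.0 / sypd


def mean_power_mw(energy_mwh_per_sim_year: float, sypd: float) -> float:
    """Average power draw implied by an energy-per-simulated-year figure."""
    hours = wallclock_days_per_simulated_year(sypd) * 24.0
    return energy_mwh_per_sim_year / hours


if __name__ == "__main__":
    for label, sypd in (("930 m", SYPD_930M), ("1.9 km", SYPD_1900M)):
        days = wallclock_days_per_simulated_year(sypd)
        print(f"{label}: {days:.1f} wall-clock days per simulated year")
    power = mean_power_mw(ENERGY_MWH_PER_SIM_YEAR, SYPD_930M)
    print(f"Implied mean power at 930 m (under the assumption above): {power:.2f} MW")
```

At the quoted 0.043 simulated years per day, one simulated year takes a little over 23 wall-clock days; the implied mean power is meaningful only if the energy figure really belongs to that configuration.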