Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0

The best hope for reducing long-standing global climate model biases is by increasing resolution to the kilometer scale. Here we present results from an ultrahigh-resolution non-hydrostatic climate model for a near-global setup running on the full Piz Daint supercomputer on 4888 GPUs (graphics p...


Bibliographic Details
Main Authors: O. Fuhrer, T. Chadha, T. Hoefler, G. Kwasniewski, X. Lapillonne, D. Leutwyler, D. Lüthi, C. Osuna, C. Schär, T. C. Schulthess, H. Vogt
Format: Article
Language: English
Published: Copernicus Publications 2018-05-01
Series: Geoscientific Model Development
Online Access: https://www.geosci-model-dev.net/11/1665/2018/gmd-11-1665-2018.pdf
author O. Fuhrer
T. Chadha
T. Hoefler
G. Kwasniewski
X. Lapillonne
D. Leutwyler
D. Lüthi
C. Osuna
C. Schär
T. C. Schulthess
H. Vogt
author_sort O. Fuhrer
collection DOAJ
description The best hope for reducing long-standing global climate model biases is by increasing resolution to the kilometer scale. Here we present results from an ultrahigh-resolution non-hydrostatic climate model for a near-global setup running on the full Piz Daint supercomputer on 4888 GPUs (graphics processing units). The dynamical core of the model has been completely rewritten using a domain-specific language (DSL) for performance portability across different hardware architectures. Physical parameterizations and diagnostics have been ported using compiler directives. To our knowledge this represents the first complete atmospheric model being run entirely on accelerators on this scale. At a grid spacing of 930 m (1.9 km), we achieve a simulation throughput of 0.043 (0.23) simulated years per day and an energy consumption of 596 MWh per simulated year. Furthermore, we propose a new memory usage efficiency (MUE) metric that considers how efficiently the memory bandwidth – the dominant bottleneck of climate codes – is being used.
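The throughput and energy figures in the description can be turned into wall-clock time and average power with simple arithmetic. The sketch below is illustrative only and is not part of the record: the function names are hypothetical, and it assumes that the 596 MWh per simulated year refers to the 930 m configuration, which the abstract does not state explicitly.

```python
"""Back-of-the-envelope figures derived from the abstract's headline numbers."""

# Values quoted in the abstract.
SYPD_930M = 0.043                 # simulated years per wall-clock day at 930 m
SYPD_1900M = 0.23                 # simulated years per wall-clock day at 1.9 km
ENERGY_MWH_PER_SIM_YEAR = 596.0   # energy per simulated year
NUM_GPUS = 4888                   # GPUs used on Piz Daint

HOURS_PER_DAY = 24.0


def wallclock_days_per_simulated_year(sypd: float) -> float:
    """Wall-clock days needed to simulate one year at a given throughput."""
    if sypd <= 0:
        raise ValueError("throughput must be positive")
    return 1.0 / sypd


def mean_power_mw(energy_mwh: float, sypd: float) -> float:
    """Average power draw in MW implied by energy per simulated year.

    ASSUMPTION: the energy figure and the throughput describe the same run.
    """
    hours = wallclock_days_per_simulated_year(sypd) * HOURS_PER_DAY
    return energy_mwh / hours


if __name__ == "__main__":
    days_fine = wallclock_days_per_simulated_year(SYPD_930M)
    days_coarse = wallclock_days_per_simulated_year(SYPD_1900M)
    power_mw = mean_power_mw(ENERGY_MWH_PER_SIM_YEAR, SYPD_930M)
    watts_per_gpu = power_mw * 1e6 / NUM_GPUS

    print(f"930 m:  {days_fine:.1f} wall-clock days per simulated year")
    print(f"1.9 km: {days_coarse:.1f} wall-clock days per simulated year")
    print(f"Implied mean power: {power_mw:.2f} MW")
    print(f"Implied mean power per GPU: {watts_per_gpu:.0f} W")
```

Under these assumptions one simulated year at 930 m takes roughly 23 wall-clock days, and the implied average draw is a little over 1 MW. The per-GPU figure divides that draw evenly across the 4888 GPUs, so it also absorbs the power used by everything else in the system and should not be read as a GPU-only power measurement.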
first_indexed 2024-12-24T04:43:25Z
format Article
id doaj.art-1ec5a114e15548aa9600fac3da5334a8
institution Directory Open Access Journal
issn 1991-959X
1991-9603
language English
last_indexed 2024-12-24T04:43:25Z
publishDate 2018-05-01
publisher Copernicus Publications
record_format Article
series Geoscientific Model Development
spelling doaj.art-1ec5a114e15548aa9600fac3da5334a8
2022-12-21T17:14:45Z
eng
Copernicus Publications
Geoscientific Model Development
1991-959X
1991-9603
2018-05-01
Volume 11, pages 1665-1681
DOI 10.5194/gmd-11-1665-2018
Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0
O. Fuhrer: Federal Institute of Meteorology and Climatology, MeteoSwiss, Zurich, Switzerland
T. Chadha: ITS Research Informatics, ETH Zurich, Switzerland
T. Hoefler: Scalable Parallel Computing Lab, ETH Zurich, Switzerland
G. Kwasniewski: Scalable Parallel Computing Lab, ETH Zurich, Switzerland
X. Lapillonne: Federal Institute of Meteorology and Climatology, MeteoSwiss, Zurich, Switzerland
D. Leutwyler: Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland
D. Lüthi: Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland
C. Osuna: Federal Institute of Meteorology and Climatology, MeteoSwiss, Zurich, Switzerland
C. Schär: Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland
T. C. Schulthess: Institute for Theoretical Physics, ETH Zurich, Switzerland
T. C. Schulthess: Swiss National Supercomputing Centre, CSCS, Lugano, Switzerland
H. Vogt: Swiss National Supercomputing Centre, CSCS, Lugano, Switzerland
https://www.geosci-model-dev.net/11/1665/2018/gmd-11-1665-2018.pdf
title Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0
url https://www.geosci-model-dev.net/11/1665/2018/gmd-11-1665-2018.pdf