An approach to enhance pnetCDF performance in environmental modeling applications
Data-intensive simulations are often limited by their I/O (input/output) performance, and "novel" techniques need to be developed to overcome this limitation. The software package pnetCDF (parallel network Common Data Form), which works with parallel file systems, was developed t...
Main Authors: | D. C. Wong, C. E. Yang, J. S. Fu, K. Wong, Y. Gao |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications, 2015-04-01 |
Series: | Geoscientific Model Development |
Online Access: | http://www.geosci-model-dev.net/8/1033/2015/gmd-8-1033-2015.pdf |
_version_ | 1829484799860932608 |
---|---|
author | D. C. Wong, C. E. Yang, J. S. Fu, K. Wong, Y. Gao |
author_facet | D. C. Wong, C. E. Yang, J. S. Fu, K. Wong, Y. Gao |
author_sort | D. C. Wong |
collection | DOAJ |
description | Data-intensive simulations are often limited by their I/O (input/output) performance, and "novel" techniques need to be developed to overcome this limitation. The software package pnetCDF (parallel network Common Data Form), which works with parallel file systems, was developed to address this issue by providing parallel I/O capability. This study examines the performance of an application-level data aggregation approach that aggregates data along either the row or the column dimension of MPI (Message Passing Interface) processes on a spatially decomposed domain and then applies the pnetCDF parallel I/O paradigm. The test was done with three domain sizes, representing small, moderately large, and large data domains, using a small-scale Community Multiscale Air Quality model (CMAQ) mock-up code. The examination compares I/O performance among a traditional serial I/O technique, straight application of pnetCDF, and data aggregation along the row and column dimensions before applying pnetCDF. After the comparison, "optimal" I/O configurations of this application-level data aggregation approach were quantified. Data aggregation along the row dimension (pnetCDFcr) works better than along the column dimension (pnetCDFcc), although it may perform slightly worse than the straight pnetCDF method with a small number of processors. When the number of processors becomes larger, pnetCDFcr outperforms pnetCDF significantly. If the number of processors keeps increasing, pnetCDF reaches a point where its performance is even worse than that of the serial I/O technique. This new technique has also been tested in a real application, where it performs two times better than the straight pnetCDF paradigm. |
first_indexed | 2024-12-14T22:34:19Z |
format | Article |
id | doaj.art-dd87c5936aa149dba73a6c5f9ffba1d9 |
institution | Directory of Open Access Journals |
issn | 1991-959X 1991-9603 |
language | English |
last_indexed | 2024-12-14T22:34:19Z |
publishDate | 2015-04-01 |
publisher | Copernicus Publications |
record_format | Article |
series | Geoscientific Model Development |
spelling | doaj.art-dd87c5936aa149dba73a6c5f9ffba1d9; 2022-12-21T22:45:12Z; eng; Copernicus Publications; Geoscientific Model Development; ISSN 1991-959X, 1991-9603; 2015-04-01; vol. 8, no. 4, pp. 1033-1046; doi:10.5194/gmd-8-1033-2015; An approach to enhance pnetCDF performance in environmental modeling applications; D. C. Wong (U.S. Environmental Protection Agency, Research Triangle Park, NC, USA); C. E. Yang, J. S. Fu, K. Wong, Y. Gao (University of Tennessee, Knoxville, TN, USA); http://www.geosci-model-dev.net/8/1033/2015/gmd-8-1033-2015.pdf |
title | An approach to enhance pnetCDF performance in environmental modeling applications |
title_sort | approach to enhance pnetcdf performance in environmental modeling applications |
url | http://www.geosci-model-dev.net/8/1033/2015/gmd-8-1033-2015.pdf |
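
The aggregation step described in this record's abstract can be illustrated with a small simulation. This is only a sketch and not the authors' code: it makes no MPI or pnetCDF calls, and it assumes that aggregating along the row dimension means the subdomains held by one row of the process grid are merged into a single strip owned by one writer. The function names (`decompose`, `aggregate_along_rows`, `aggregate_along_columns`, `write_strips`) are hypothetical, chosen for illustration.

```python
# Illustrative simulation only: plain Python lists stand in for the
# per-process subdomains; no MPI or pnetCDF calls are made.


def decompose(field, nprow, npcol):
    """Split a 2-D field (list of rows) into nprow x npcol tiles.

    Returns tiles[pr][pc], each tile being a list of rows. Assumes the
    field dimensions divide evenly by the process grid (a simplification).
    """
    nrows, ncols = len(field), len(field[0])
    tr, tc = nrows // nprow, ncols // npcol
    return [
        [
            [row[pc * tc:(pc + 1) * tc] for row in field[pr * tr:(pr + 1) * tr]]
            for pc in range(npcol)
        ]
        for pr in range(nprow)
    ]


def aggregate_along_rows(tiles):
    """Merge the tiles of each process row into one full-width strip.

    Afterwards only nprow writers hold data instead of nprow * npcol.
    """
    strips = []
    for row_of_tiles in tiles:
        height = len(row_of_tiles[0])
        strips.append(
            [[v for tile in row_of_tiles for v in tile[r]] for r in range(height)]
        )
    return strips


def aggregate_along_columns(tiles):
    """Merge the tiles of each process column into one full-height strip.

    Afterwards only npcol writers hold data.
    """
    npcol = len(tiles[0])
    return [
        [row for row_of_tiles in tiles for row in row_of_tiles[pc]]
        for pc in range(npcol)
    ]


def write_strips(row_strips):
    """Stand-in for the parallel write of row strips.

    Each strip is a block of complete rows, so stacking the strips in
    order reproduces the global field.
    """
    out = []
    for strip in row_strips:
        out.extend(strip)
    return out


if __name__ == "__main__":
    field = [[r * 6 + c for c in range(6)] for r in range(4)]
    tiles = decompose(field, nprow=2, npcol=3)
    strips = aggregate_along_rows(tiles)
    print(len(strips), "writers instead of", 2 * 3)
    print(write_strips(strips) == field)
```

In a real model the merge would be done with MPI communication within each row or column of processes, and the final step would be a pnetCDF write issued by the reduced set of writers. The sketch only shows how aggregation lowers the number of I/O participants and changes the shape of the block each writer holds.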