An events based algorithm for distributing concurrent tasks on multi-core architectures
In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time stepping or iteration algorithms where sync...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | en_US |
Published: |
Elsevier ScienceDirect
2011
|
Online Access: | http://hdl.handle.net/1721.1/66185 https://orcid.org/0000-0002-2970-9158 https://orcid.org/0000-0002-3826-2204 |
_version_ | 1826206737533763584 |
---|---|
author | Holmes, David W. Williams, John R. Tilke, Peter |
author2 | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering |
author_facet | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering Holmes, David W. Williams, John R. Tilke, Peter |
author_sort | Holmes, David W. |
collection | MIT |
description | In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time stepping or iteration algorithms where synchronization of multiple threads of execution is required. It is shown that traditional approaches to parallelism including message passing and scatter-gather can be improved upon in terms of speed-up and memory management. Using spatial decomposition to create orthogonal computational tasks, a new task management algorithm called H-Dispatch is developed. This algorithm makes efficient use of memory resources by limiting the need for garbage collection and takes optimal advantage of multiple cores by employing a “hungry” pull strategy. The technique is demonstrated on a simple finite difference solver and results are compared to traditional MPI and scatter-gather approaches. The H-Dispatch approach achieves near linear speed-up with results for efficiency of 85% on a 24-core machine. It is noted that the H-Dispatch algorithm is quite general and can be applied to a wide class of computational tasks on heterogeneous architectures involving multi-core and GPGPU hardware. |
first_indexed | 2024-09-23T13:37:47Z |
format | Article |
id | mit-1721.1/66185 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:37:47Z |
publishDate | 2011 |
publisher | Elsevier ScienceDirect |
record_format | dspace |
spelling | mit-1721.1/661852022-09-28T15:08:51Z An events based algorithm for distributing concurrent tasks on multi-core architectures Holmes, David W. Williams, John R. Tilke, Peter Massachusetts Institute of Technology. Department of Civil and Environmental Engineering Williams, John R. Holmes, David W. Williams, John R. In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time stepping or iteration algorithms where synchronization of multiple threads of execution is required. It is shown that traditional approaches to parallelism including message passing and scatter-gather can be improved upon in terms of speed-up and memory management. Using spatial decomposition to create orthogonal computational tasks, a new task management algorithm called H-Dispatch is developed. This algorithm makes efficient use of memory resources by limiting the need for garbage collection and takes optimal advantage of multiple cores by employing a “hungry” pull strategy. The technique is demonstrated on a simple finite difference solver and results are compared to traditional MPI and scatter-gather approaches. The H-Dispatch approach achieves near linear speed-up with results for efficiency of 85% on a 24-core machine. It is noted that the H-Dispatch algorithm is quite general and can be applied to a wide class of computational tasks on heterogeneous architectures involving multi-core and GPGPU hardware. Schlumberger-Doll Research Center Saudi Aramco 2011-10-05T15:49:18Z 2011-10-05T15:49:18Z 2009-10 2009-08 Article http://purl.org/eprint/type/JournalArticle 0010-4655 http://hdl.handle.net/1721.1/66185 Holmes, David W., John R. Williams, and Peter Tilke. “An events based algorithm for distributing concurrent tasks on multi-core architectures.” Computer Physics Communications 181 (2) (2010): 341-354. Copyright © 2010, Elsevier https://orcid.org/0000-0002-2970-9158 https://orcid.org/0000-0002-3826-2204 en_US http://dx.doi.org/10.1016/j.cpc.2009.10.009 Computer Physics Communications Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/ application/pdf Elsevier ScienceDirect MIT web domain |
spellingShingle | Holmes, David W. Williams, John R. Tilke, Peter An events based algorithm for distributing concurrent tasks on multi-core architectures |
title | An events based algorithm for distributing concurrent tasks on multi-core architectures |
title_full | An events based algorithm for distributing concurrent tasks on multi-core architectures |
title_fullStr | An events based algorithm for distributing concurrent tasks on multi-core architectures |
title_full_unstemmed | An events based algorithm for distributing concurrent tasks on multi-core architectures |
title_short | An events based algorithm for distributing concurrent tasks on multi-core architectures |
title_sort | events based algorithm for distributing concurrent tasks on multi core architectures |
url | http://hdl.handle.net/1721.1/66185 https://orcid.org/0000-0002-2970-9158 https://orcid.org/0000-0002-3826-2204 |
work_keys_str_mv | AT holmesdavidw aneventsbasedalgorithmfordistributingconcurrenttasksonmulticorearchitectures AT williamsjohnr aneventsbasedalgorithmfordistributingconcurrenttasksonmulticorearchitectures AT tilkepeter aneventsbasedalgorithmfordistributingconcurrenttasksonmulticorearchitectures AT holmesdavidw eventsbasedalgorithmfordistributingconcurrenttasksonmulticorearchitectures AT williamsjohnr eventsbasedalgorithmfordistributingconcurrenttasksonmulticorearchitectures AT tilkepeter eventsbasedalgorithmfordistributingconcurrenttasksonmulticorearchitectures |