Software prefetching for unstructured mesh applications
This article demonstrates the utility and implementation of software prefetching in an unstructured finite volume computational fluid dynamics code of representative size and complexity to an industrial application and across a number of modern processors. We present the benefits of auto-tuning for...
Main authors: | Hadade, I; Jones, T; Wang, F; Di Mare, L |
---|---|
Format: | Journal article |
Language: | English |
Published: | Association for Computing Machinery 2020 |
_version_ | 1826297444499980288 |
---|---|
author | Hadade, I; Jones, T; Wang, F; Di Mare, L |
author_facet | Hadade, I; Jones, T; Wang, F; Di Mare, L |
author_sort | Hadade, I |
collection | OXFORD |
description | This article demonstrates the utility and implementation of software prefetching in an unstructured finite volume computational fluid dynamics code of representative size and complexity to an industrial application and across a number of modern processors. We present the benefits of auto-tuning for finding the optimal prefetch distance values across different computational kernels and architectures and demonstrate the importance of choosing the right prefetch destination across the available cache levels for best performance. We discuss the impact of the data layout on the number of prefetch instructions required in kernels with indirect addressing patterns and show how to best implement them in an existing large-scale computational fluid dynamics application. Through this, we show significant full application speed-ups on a range of processors and realistic test cases in both single core/tile and full socket configurations, such as 1.14× on the Intel Xeon Sandy Bridge, 1.09× on the Intel Xeon Broadwell, 1.29× on the Intel Xeon Skylake, 1.99× on the in-order Intel Xeon Phi Knights Corner coprocessor, and 1.51× on the out-of-order Intel Xeon Phi Knights Landing many-core processor. |
first_indexed | 2024-03-07T04:31:41Z |
format | Journal article |
id | oxford-uuid:ce8ca1ff-83ee-4eb6-b941-bbe9d1fdfd4a |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T04:31:41Z |
publishDate | 2020 |
publisher | Association for Computing Machinery |
record_format | dspace |
title | Software prefetching for unstructured mesh applications |
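
The description above refers to prefetch distance, prefetch destination (cache level), and kernels with indirect addressing. This record contains no code, so the following is only a minimal sketch of what such a kernel might look like, not the authors' implementation. It assumes a GCC- or Clang-compatible compiler (for `__builtin_prefetch`); the names `edge_loop_prefetch`, `PF_DIST`, `edge`, `q`, and `res`, the edge-based loop structure, and the default distance of 8 iterations are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in loop iterations. The article describes
   auto-tuning this per kernel and architecture; 8 is only a placeholder. */
#ifndef PF_DIST
#define PF_DIST 8
#endif

/* Edge-based loop with indirect addressing (illustrative kernel only).
   edge[2*e] and edge[2*e+1] hold the two node indices of edge e.
   A simple difference of the nodal values q is scattered into res. */
void edge_loop_prefetch(size_t n_edges, const int *edge,
                        const double *q, double *res)
{
    assert(n_edges == 0 || (edge != NULL && q != NULL && res != NULL));

    for (size_t e = 0; e < n_edges; ++e) {
        /* Look ahead PF_DIST edges: read the future node indices and
           request the nodal data they point to before it is needed. */
        if (e + PF_DIST < n_edges) {
            const int pa = edge[2 * (e + PF_DIST)];
            const int pb = edge[2 * (e + PF_DIST) + 1];
            /* Second argument: 0 = read, 1 = write.
               Third argument: temporal-locality hint (0..3); 3 asks the
               compiler to keep the line in all cache levels. */
            __builtin_prefetch(&q[pa], 0, 3);
            __builtin_prefetch(&q[pb], 0, 3);
            __builtin_prefetch(&res[pa], 1, 3);
            __builtin_prefetch(&res[pb], 1, 3);
        }

        /* The actual work for the current edge. */
        const int a = edge[2 * e];
        const int b = edge[2 * e + 1];
        const double flux = 0.5 * (q[b] - q[a]);
        res[a] += flux;
        res[b] -= flux;
    }
}
```

In this sketch each node has a single variable, so one prefetch per array per node is enough. With several variables per node, the number of prefetches needed would depend on whether they share a cache line, which is presumably the data-layout effect the description mentions. Compiling with, for example, `-DPF_DIST=16` changes the look-ahead distance without touching the source.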