Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance d...

Full description

Bibliographic Details
Main Authors:	Barnes, Connelly, Adams, Andrew, Paris, Sylvain, Ragan-Kelley, Jonathan Millar, Durand, Fredo, Amarasinghe, Saman P.
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	en_US
Published:	Association for Computing Machinery (ACM) 2014
Online Access:	http://hdl.handle.net/1721.1/85943 https://orcid.org/0000-0001-9919-069X https://orcid.org/0000-0002-7231-7643

_version_	1826207188797882368
author	Barnes, Connelly Adams, Andrew Paris, Sylvain Ragan-Kelley, Jonathan Millar Durand, Fredo Amarasinghe, Saman P.
author2	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Barnes, Connelly Adams, Andrew Paris, Sylvain Ragan-Kelley, Jonathan Millar Durand, Fredo Amarasinghe, Saman P.
author_sort	Barnes, Connelly
collection	MIT
description	Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of magnitude. Efficient implementations require optimization of both parallelism and locality, but due to the nature of stencils, there is a fundamental tension between parallelism, locality, and introducing redundant recomputation of shared values. We present a systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing language that synthesizes high performance implementations from a Halide algorithm and a schedule. Combining this compiler with stochastic search over the space of schedules enables terse, composable programs to achieve state-of-the-art performance on a wide range of real image processing pipelines, and across different hardware architectures, including multicores with SIMD, and heterogeneous CPU+GPU execution. From simple Halide programs written in a few hours, we demonstrate performance up to 5x faster than hand-tuned C, intrinsics, and CUDA implementations optimized by experts over weeks or months, for image processing applications beyond the reach of past automatic compilers.
first_indexed	2024-09-23T13:45:26Z
format	Article
id	mit-1721.1/85943
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T13:45:26Z
publishDate	2014
publisher	Association for Computing Machinery (ACM)
record_format	dspace
spelling	mit-1721.1/859432022-09-28T15:55:32Z Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines Barnes, Connelly Adams, Andrew Paris, Sylvain Ragan-Kelley, Jonathan Millar Durand, Fredo Amarasinghe, Saman P. Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ragan-Kelley, Jonathan Millar Adams, Andrew Durand, Fredo Amarasinghe, Saman P. Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of magnitude. Efficient implementations require optimization of both parallelism and locality, but due to the nature of stencils, there is a fundamental tension between parallelism, locality, and introducing redundant recomputation of shared values. We present a systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing language that synthesizes high performance implementations from a Halide algorithm and a schedule. Combining this compiler with stochastic search over the space of schedules enables terse, composable programs to achieve state-of-the-art performance on a wide range of real image processing pipelines, and across different hardware architectures, including multicores with SIMD, and heterogeneous CPU+GPU execution. From simple Halide programs written in a few hours, we demonstrate performance up to 5x faster than hand-tuned C, intrinsics, and CUDA implementations optimized by experts over weeks or months, for image processing applications beyond the reach of past automatic compilers. United States. Dept. of Energy (Award DE-SC0005288) National Science Foundation (U.S.) (Grant 0964004) Intel Corporation Cognex Corporation Adobe Systems 2014-03-28T13:55:24Z 2014-03-28T13:55:24Z 2013-06 Article http://purl.org/eprint/type/ConferencePaper 03621340 http://hdl.handle.net/1721.1/85943 Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Fredo Durand, and Saman Amarasinghe. 2013. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. SIGPLAN Not. 48, no. 6 (June 23, 2013): 519. https://orcid.org/0000-0001-9919-069X https://orcid.org/0000-0002-7231-7643 en_US http://dx.doi.org/10.1145/2499370.2462176 ACM SIGPLAN Notices Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery (ACM) MIT web domain
spellingShingle	Barnes, Connelly Adams, Andrew Paris, Sylvain Ragan-Kelley, Jonathan Millar Durand, Fredo Amarasinghe, Saman P. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
title	Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
title_full	Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
title_fullStr	Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
title_full_unstemmed	Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
title_short	Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
title_sort	halide a language and compiler for optimizing parallelism locality and recomputation in image processing pipelines
url	http://hdl.handle.net/1721.1/85943 https://orcid.org/0000-0001-9919-069X https://orcid.org/0000-0002-7231-7643
work_keys_str_mv	AT barnesconnelly halidealanguageandcompilerforoptimizingparallelismlocalityandrecomputationinimageprocessingpipelines AT adamsandrew halidealanguageandcompilerforoptimizingparallelismlocalityandrecomputationinimageprocessingpipelines AT parissylvain halidealanguageandcompilerforoptimizingparallelismlocalityandrecomputationinimageprocessingpipelines AT ragankelleyjonathanmillar halidealanguageandcompilerforoptimizingparallelismlocalityandrecomputationinimageprocessingpipelines AT durandfredo halidealanguageandcompilerforoptimizingparallelismlocalityandrecomputationinimageprocessingpipelines AT amarasinghesamanp halidealanguageandcompilerforoptimizingparallelismlocalityandrecomputationinimageprocessingpipelines

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

Similar Items