Fast Instruction Selection for Fast Digital Signal Processing
Modern vector processors support a wide variety of instructions for fixed-point digital signal processing. These instructions support a proliferation of rounding, saturating, and type conversion modes, and are often fused combinations of more primitive operations. While these are common idioms in fi...
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
ACM
2024
|
Online Access: | https://hdl.handle.net/1721.1/153625 |
_version_ | 1826200960907608064 |
---|---|
author | Root, Alexander J Ahmad, Maaz Bin Safeer Sharlet, Dillon Adams, Andrew Kamil, Shoaib Ragan-Kelley, Jonathan |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Root, Alexander J Ahmad, Maaz Bin Safeer Sharlet, Dillon Adams, Andrew Kamil, Shoaib Ragan-Kelley, Jonathan |
author_sort | Root, Alexander J |
collection | MIT |
description | Modern vector processors support a wide variety of instructions for fixed-point digital signal processing. These instructions support a proliferation of rounding, saturating, and type conversion modes, and are often fused combinations of more primitive operations. While these are common idioms in fixed-point signal processing, it is difficult to use these operations in portable code. It is challenging for programmers to write down portable integer arithmetic in a C-like language that corresponds exactly to one of these instructions, and even more challenging for compilers to recognize when these instructions can be used. Our system, Pitchfork, defines a portable fixed-point intermediate representation, FPIR, that captures common idioms in fixed-point code. FPIR can be used directly by programmers experienced with fixed-point, or Pitchfork can automatically lift from integer operations into FPIR using a term-rewriting system (TRS) composed of verified manual and automatically-synthesized rules. Pitchfork then lowers from FPIR into target-specific fixed-point instructions using a set of target-specific TRSs. We show that this approach improves runtime performance of portably-written fixed-point signal processing code in Halide, across a range of benchmarks, by geomean 1.31x on x86 with AVX2, 1.82x on ARM Neon, and 2.44x on Hexagon HVX compared to a standard LLVM-based compiler flow, while maintaining or improving existing compile times. |
first_indexed | 2024-09-23T11:44:25Z |
format | Article |
id | mit-1721.1/153625 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2025-02-19T04:20:48Z |
publishDate | 2024 |
publisher | ACM |
record_format | dspace |
spelling | mit-1721.1/1536252024-11-20T20:18:08Z Fast Instruction Selection for Fast Digital Signal Processing Root, Alexander J Ahmad, Maaz Bin Safeer Sharlet, Dillon Adams, Andrew Kamil, Shoaib Ragan-Kelley, Jonathan Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Modern vector processors support a wide variety of instructions for fixed-point digital signal processing. These instructions support a proliferation of rounding, saturating, and type conversion modes, and are often fused combinations of more primitive operations. While these are common idioms in fixed-point signal processing, it is difficult to use these operations in portable code. It is challenging for programmers to write down portable integer arithmetic in a C-like language that corresponds exactly to one of these instructions, and even more challenging for compilers to recognize when these instructions can be used. Our system, Pitchfork, defines a portable fixed-point intermediate representation, FPIR, that captures common idioms in fixed-point code. FPIR can be used directly by programmers experienced with fixed-point, or Pitchfork can automatically lift from integer operations into FPIR using a term-rewriting system (TRS) composed of verified manual and automatically-synthesized rules. Pitchfork then lowers from FPIR into target-specific fixed-point instructions using a set of target-specific TRSs. We show that this approach improves runtime performance of portably-written fixed-point signal processing code in Halide, across a range of benchmarks, by geomean 1.31x on x86 with AVX2, 1.82x on ARM Neon, and 2.44x on Hexagon HVX compared to a standard LLVM-based compiler flow, while maintaining or improving existing compile times. 2024-03-01T15:48:06Z 2024-03-01T15:48:06Z 2023-03-25 2024-03-01T08:45:40Z Article http://purl.org/eprint/type/ConferencePaper 979-8-4007-0394-2 https://hdl.handle.net/1721.1/153625 Root, Alexander J, Ahmad, Maaz Bin Safeer, Sharlet, Dillon, Adams, Andrew, Kamil, Shoaib et al. 2023. "Fast Instruction Selection for Fast Digital Signal Processing." PUBLISHER_CC en 10.1145/3623278.3624768 Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ The author(s) application/pdf ACM Association for Computing Machinery |
spellingShingle | Root, Alexander J Ahmad, Maaz Bin Safeer Sharlet, Dillon Adams, Andrew Kamil, Shoaib Ragan-Kelley, Jonathan Fast Instruction Selection for Fast Digital Signal Processing |
title | Fast Instruction Selection for Fast Digital Signal Processing |
title_full | Fast Instruction Selection for Fast Digital Signal Processing |
title_fullStr | Fast Instruction Selection for Fast Digital Signal Processing |
title_full_unstemmed | Fast Instruction Selection for Fast Digital Signal Processing |
title_short | Fast Instruction Selection for Fast Digital Signal Processing |
title_sort | fast instruction selection for fast digital signal processing |
url | https://hdl.handle.net/1721.1/153625 |
work_keys_str_mv | AT rootalexanderj fastinstructionselectionforfastdigitalsignalprocessing AT ahmadmaazbinsafeer fastinstructionselectionforfastdigitalsignalprocessing AT sharletdillon fastinstructionselectionforfastdigitalsignalprocessing AT adamsandrew fastinstructionselectionforfastdigitalsignalprocessing AT kamilshoaib fastinstructionselectionforfastdigitalsignalprocessing AT ragankelleyjonathan fastinstructionselectionforfastdigitalsignalprocessing |