Compile-time Techniques for Processor Allocation in Macro Dataflow Graphs for Multiprocessors

When compiling a progam consisting of multiple nested loops for execution on a multiprocessor, processor allocation is the problem of determining the number of processors over which to partition each nested loop. This paper presents processor allocation techniques for compiling such programs for mul...

Full description

Bibliographic Details
Main Authors: Prasanna, G.N. Srinivasa, Agarwal, Anant
Published: 2023
Online Access:https://hdl.handle.net/1721.1/149194
Description
Summary:When compiling a progam consisting of multiple nested loops for execution on a multiprocessor, processor allocation is the problem of determining the number of processors over which to partition each nested loop. This paper presents processor allocation techniques for compiling such programs for multiprocessors with local memory. Programs consisting of multiple loops, where the precedence constraints between the loops is known, can be viewed as macro dataflow graphs. Macro dataflow graphs comprise several macro nodes (or macro operations) that must be executed subject to prespecified precedence constraints. Optimal processor allocation specifies the number of processors computing each macro node and their sequencing to optimize run time. This paper presents computing each macro node and their sequencing to optimize run time. This paper presents computationally efficient techniques for determining the optimal processor allocation using estimated speedup functions of the macro nodes. These ideas have been implemented in a structure-driven compiler, SDC, for expressions of matrix operations. The paper presents the performance of the compiler for several matrix expressions on a simulator of the Alewife multiprocessor.