Optimizing Parallel Performance with Work and Span in the OpenCilk Compiler

OpenCilk is the modern iteration of Cilk, a multithreaded programming environment designed for high-performance multicore computing. OpenCilk consists of a LLVM fork called Tapir and a runtime scheduler, which, together, allow for OpenCilk’s high performance in practice. However, there are many oppo...

Full description

Bibliographic Details
Main Author: Reddy, Nikhil
Other Authors: Schardl, Tao B.
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access:https://hdl.handle.net/1721.1/147495
Description
Summary:OpenCilk is the modern iteration of Cilk, a multithreaded programming environment designed for high-performance multicore computing. OpenCilk consists of a LLVM fork called Tapir and a runtime scheduler, which, together, allow for OpenCilk’s high performance in practice. However, there are many opportunities to improve on the implementation of OpenCilk. In particular, current grainsize calculations for OpenCilk rely on a notion of work that is limited in a few ways. In this thesis, I propose an alternate implementation that creates a first-class notion of work and span within the OpenCilk compiler that is then used to inform optimizations within OpenCilk. I then analyze the current formulas for grainsizes. I identify key scenarios where the compiler is unable to determine a suitable grainsize and use this to suggest improvements. Finally, I construct a benchmark of one of these scenarios using a Twitter follower dataset and empirically analyze optimal grainsizes for it.