Parallel and Distributed Just-in-Time Shell Script Compilation

In the past several years, the shell has received renewed interest from the research community. This thesis describes the work I did to advance the performance and capabilities of the current state-of-the-art shell-script parallelization systems. In the first half of this thesis, I focus on my contr...


Bibliographic Details
Main Author: Mustafa, Tammam
Other Authors: Vasilakis, Nikos
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/147902
_version_ 1826214969607192576
author Mustafa, Tammam
author2 Vasilakis, Nikos
author_facet Vasilakis, Nikos
Mustafa, Tammam
author_sort Mustafa, Tammam
collection MIT
description In the past several years, the shell has received renewed interest from the research community. This thesis describes the work I did to advance the performance and capabilities of the current state-of-the-art shell-script parallelization systems. In the first half of this thesis, I focus on my contributions to PaSh-JIT, a JIT compiler for parallelizing POSIX shell scripts. In the second half, I explore the design and implementation of Distributed-PaSh, a shell that can utilize distributed computing resources and easily interface with distributed storage systems to efficiently execute data-processing pipelines. Distributed-PaSh analyzes the dataflow graph of a given script to create highly parallel data pipelines and execute those pipelines in a distributed cluster while giving special attention to data locality and movement. Distributed-PaSh achieves higher performance than single-machine sequential and parallel shells.
first_indexed 2024-09-23T16:14:05Z
format Thesis
id mit-1721.1/147902
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T16:14:05Z
publishDate 2023
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/147902 2023-02-07T03:03:13Z Parallel and Distributed Just-in-Time Shell Script Compilation Mustafa, Tammam Vasilakis, Nikos Rinard, Martin C. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science In the past several years, the shell has received renewed interest from the research community. This thesis describes the work I did to advance the performance and capabilities of the current state-of-the-art shell-script parallelization systems. In the first half of this thesis, I focus on my contributions to PaSh-JIT, a JIT compiler for parallelizing POSIX shell scripts. In the second half, I explore the design and implementation of Distributed-PaSh, a shell that can utilize distributed computing resources and easily interface with distributed storage systems to efficiently execute data-processing pipelines. Distributed-PaSh analyzes the dataflow graph of a given script to create highly parallel data pipelines and execute those pipelines in a distributed cluster while giving special attention to data locality and movement. Distributed-PaSh achieves higher performance than single-machine sequential and parallel shells. M.Eng. 2023-02-06T18:30:46Z 2023-02-06T18:30:46Z 2022-05 2022-05-27T16:18:52.588Z Thesis https://hdl.handle.net/1721.1/147902 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Mustafa, Tammam
Parallel and Distributed Just-in-Time Shell Script Compilation
title Parallel and Distributed Just-in-Time Shell Script Compilation
title_sort parallel and distributed just in time shell script compilation
url https://hdl.handle.net/1721.1/147902
work_keys_str_mv AT mustafatammam parallelanddistributedjustintimeshellscriptcompilation