Optimization and scheduling of applications in a heterogeneous CPU-GPU environment
With the emergence of General Purpose computation on GPU (GPGPU) and corresponding programming frameworks (OpenCL, CUDA), more applications are being ported to use GPUs as a co-processor to achieve performance that could not be accomplished using just the traditional processors. However, programmin...
Main Author: | Karan Rajendra Shetti |
---|---|
Other Authors: | Suhaib A. Fahmy |
Format: | Thesis |
Language: | English |
Published: | 2014 |
Subjects: | DRNTU::Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/61727 |
_version_ | 1811697506841526272 |
---|---|
author | Karan Rajendra Shetti |
author2 | Suhaib A. Fahmy |
author_facet | Suhaib A. Fahmy Karan Rajendra Shetti |
author_sort | Karan Rajendra Shetti |
collection | NTU |
description | With the emergence of General Purpose computation on GPU (GPGPU) and corresponding programming frameworks (OpenCL, CUDA), more applications are being ported to use GPUs as a co-processor to achieve performance that could not be accomplished using just the traditional processors. However, programming the GPUs is not a trivial task and depends on the experience and knowledge of the individual programmer. The main problem is identifying which task or job should be allocated to a particular device. The problem is further complicated by the dissimilar computational power of the CPU and the GPU. Therefore, there is a genuine need to optimize the workload balance. This thesis presents the work done toward the author's postgraduate study and describes the optimization of the Heterogeneous Earliest Finish Time (HEFT) algorithm in the CPU-GPU heterogeneous environment. In the initial chapters, the available scheduling principles are described and an in-depth analysis of three state-of-the-art algorithms for the chosen heterogeneous environment is presented. Fine-grained and coarse-grained scheduling paradigms are also compared. Using the state-of-the-art StarPU scheduling framework and exhaustive benchmarks, it is shown that the fine-grained approach is much more efficient for the CPU-GPU environment. A novel optimization of the HEFT algorithm that takes advantage of dissimilar execution times of the processors is proposed. By balancing the locally optimal result with the globally optimal result, it is shown that performance can be improved significantly without any change in the complexity of the algorithm (as compared to HEFT). The resulting algorithm, HEFT-NC (No-Cross), is compared with HEFT both in terms of speedup and schedule length. It is shown that HEFT-NC outperforms the HEFT algorithm and is consistent across different graph shapes and task sizes. |
first_indexed | 2024-10-01T07:56:21Z |
format | Thesis |
id | ntu-10356/61727 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T07:56:21Z |
publishDate | 2014 |
record_format | dspace |
spelling | ntu-10356/617272023-03-04T00:37:56Z Optimization and scheduling of applications in a heterogeneous CPU-GPU environment Karan Rajendra Shetti Suhaib A. Fahmy School of Computer Engineering DRNTU::Engineering::Computer science and engineering With the emergence of General Purpose computation on GPU (GPGPU) and corresponding programming frameworks (OpenCL, CUDA), more applications are being ported to use GPUs as a co-processor to achieve performance that could not be accomplished using just the traditional processors. However, programming the GPUs is not a trivial task and depends on the experience and knowledge of the individual programmer. The main problem is identifying which task or job should be allocated to a particular device. The problem is further complicated by the dissimilar computational power of the CPU and the GPU. Therefore, there is a genuine need to optimize the workload balance. This thesis presents the work done toward the author's postgraduate study and describes the optimization of the Heterogeneous Earliest Finish Time (HEFT) algorithm in the CPU-GPU heterogeneous environment. In the initial chapters, the available scheduling principles are described and an in-depth analysis of three state-of-the-art algorithms for the chosen heterogeneous environment is presented. Fine-grained and coarse-grained scheduling paradigms are also compared. Using the state-of-the-art StarPU scheduling framework and exhaustive benchmarks, it is shown that the fine-grained approach is much more efficient for the CPU-GPU environment. A novel optimization of the HEFT algorithm that takes advantage of dissimilar execution times of the processors is proposed. By balancing the locally optimal result with the globally optimal result, it is shown that performance can be improved significantly without any change in the complexity of the algorithm (as compared to HEFT). The resulting algorithm, HEFT-NC (No-Cross), is compared with HEFT both in terms of speedup and schedule length. It is shown that HEFT-NC outperforms the HEFT algorithm and is consistent across different graph shapes and task sizes. MASTER OF ENGINEERING (SCE) 2014-08-28T07:37:41Z 2014-08-28T07:37:41Z 2014 2014 Thesis https://hdl.handle.net/10356/61727 10.32657/10356/61727 en 97 p. application/pdf |
spellingShingle | DRNTU::Engineering::Computer science and engineering Karan Rajendra Shetti Optimization and scheduling of applications in a heterogeneous CPU-GPU environment |
title | Optimization and scheduling of applications in a heterogeneous CPU-GPU environment |
title_full | Optimization and scheduling of applications in a heterogeneous CPU-GPU environment |
title_fullStr | Optimization and scheduling of applications in a heterogeneous CPU-GPU environment |
title_full_unstemmed | Optimization and scheduling of applications in a heterogeneous CPU-GPU environment |
title_short | Optimization and scheduling of applications in a heterogeneous CPU-GPU environment |
title_sort | optimization and scheduling of applications in a heterogeneous cpu gpu environment |
topic | DRNTU::Engineering::Computer science and engineering |
url | https://hdl.handle.net/10356/61727 |
work_keys_str_mv | AT karanrajendrashetti optimizationandschedulingofapplicationsinaheterogeneouscpugpuenvironment |
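
The description above names the Heterogeneous Earliest Finish Time (HEFT) algorithm as the baseline being optimized. As a rough illustration only, the following is a minimal sketch of the classic HEFT list-scheduling idea as it is commonly described: rank tasks by upward rank, then place each task on the processor that gives the earliest finish time. It is not the thesis's HEFT-NC variant, and it omits HEFT's insertion-based search for idle gaps. The function name `heft_schedule`, the input layout, and the four-task example graph with its costs are all hypothetical, and execution times are assumed to be positive.

```python
from collections import defaultdict


def heft_schedule(cost, succ, comm):
    """Simplified HEFT sketch (no insertion into idle gaps).

    cost[t]      -> list of execution times of task t, one per processor
    succ[t]      -> list of successor tasks of t in the task graph (DAG)
    comm[(a, b)] -> transfer time from a to b when they run on different processors
    Returns {task: (processor, start, finish)}.
    """
    tasks = list(cost)
    n_proc = len(next(iter(cost.values())))

    # Build predecessor lists from the successor lists.
    pred = defaultdict(list)
    for t in tasks:
        for s in succ.get(t, []):
            pred[s].append(t)

    # Upward rank: average cost of t plus the longest remaining path to an exit task.
    rank = {}

    def upward_rank(t):
        if t not in rank:
            avg = sum(cost[t]) / n_proc
            rank[t] = avg + max(
                (comm.get((t, s), 0) + upward_rank(s) for s in succ.get(t, [])),
                default=0,
            )
        return rank[t]

    # With positive costs, decreasing upward rank is a valid topological order.
    order = sorted(tasks, key=upward_rank, reverse=True)

    proc_free = [0.0] * n_proc  # time at which each processor becomes idle
    placed = {}
    for t in order:
        best = None
        for p in range(n_proc):
            # Inputs are ready once every predecessor has finished and,
            # if it ran on another processor, its data has been transferred.
            ready = max(
                (
                    placed[q][2] + (0 if placed[q][0] == p else comm.get((q, t), 0))
                    for q in pred[t]
                ),
                default=0.0,
            )
            start = max(ready, proc_free[p])
            finish = start + cost[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[t] = best
        proc_free[best[0]] = best[2]
    return placed


# Hypothetical example: processor 0 stands for a CPU, processor 1 for a GPU.
cost = {"A": [4, 2], "B": [6, 1], "C": [2, 6], "D": [5, 2]}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
comm = {("A", "B"): 2, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 1}

schedule = heft_schedule(cost, succ, comm)
makespan = max(finish for _, _, finish in schedule.values())
for task, (proc, start, finish) in schedule.items():
    print(task, "-> processor", proc, "from", start, "to", finish)
print("schedule length:", makespan)
```

In this made-up example the heuristic keeps tasks A, B and D on processor 1 and moves only task C to processor 0, where C runs faster even after paying the transfer cost. The schedule length it reports is the metric that the description says is used, alongside speedup, to compare HEFT-NC with HEFT.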