Learned scheduling for database management systems

Parallel database management systems need efficient job scheduling. Current systems use simple heuristics that ignore the characteristics of database workloads. Therefore, we created an effective scheduler that uses machine learning techniques, such as reinforcement learning and neural networks, and...

Bibliographic Details
Main Author: Ukyab, Tenzin Samten
Other Authors: Kraska, Tim
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/139086
collection MIT
description Parallel database management systems need efficient job scheduling. Current systems use simple heuristics that ignore the characteristics of database workloads. Therefore, we created an effective scheduler that uses machine learning techniques, such as reinforcement learning and neural networks, and does not require human intervention beyond specifying an objective, such as reducing average job completion time. We use existing training techniques for job schedulers with dependency constraints. However, the model is specialized for database workloads through features specific to database queries, such as node operator type. In addition, we represent pipelining scheduling opportunities between operator tasks. With further training time, our learned scheduler will be able to improve the average job completion time in comparison to heuristic schedulers, such as FIFO and fair scheduling.
spelling mit-1721.1/139086; 2022-01-15T03:40:55Z; Learned scheduling for database management systems; Ukyab, Tenzin Samten; Kraska, Tim; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; M.Eng.; 2022-01-14T14:49:07Z; 2021-06; 2021-06-17T20:14:37.074Z; Thesis; https://hdl.handle.net/1721.1/139086; In Copyright - Educational Use Permitted; Copyright MIT; http://rightsstatements.org/page/InC-EDU/1.0/; application/pdf; Massachusetts Institute of Technology