Machine learning regression to boost scheduling performance in hyper-scale cloud-computing data centres

Data centres increase their size and complexity due to the increasing amount of heterogeneous workloads and patterns to be served. Such a mix of various purpose workloads makes the optimisation of resource management systems according to temporal or application-level patterns difficult. Data-centre...


Bibliographic Details
Main Authors:	Damián Fernández-Cerero, José A. Troyano, Agnieszka Jakóbik, Alejandro Fernández-Montes
Format:	Article
Language:	English
Published:	Elsevier 2022-06-01
Series:	Journal of King Saud University: Computer and Information Sciences
Subjects:	Data centre; Cloud computing; Scheduling optimisation; Machine learning; Gradient boosting
Online Access:	http://www.sciencedirect.com/science/article/pii/S1319157822001367

author	Damián Fernández-Cerero; José A. Troyano; Agnieszka Jakóbik; Alejandro Fernández-Montes
collection	DOAJ
description	Data centres increase their size and complexity due to the increasing amount of heterogeneous workloads and patterns to be served. Such a mix of various purpose workloads makes the optimisation of resource management systems according to temporal or application-level patterns difficult. Data-centre operators have developed multiple resource-management models to improve scheduling performance in controlled scenarios. However, the constant evolution of the workloads makes the utilisation of only one resource-management model sub-optimal in some scenarios. In this work, we propose: (a) a machine learning regression model based on gradient boosting to predict the time a resource manager needs to schedule incoming jobs for a given period; and (b) a resource management model, Boost, that takes advantage of this regression model to predict the scheduling time of a catalogue of resource managers so that the most performant can be used for a time span. The benefits of the proposed resource-management model are analysed by comparing its scheduling performance KPIs to those provided by the two most popular resource-management models: two-level, used by Apache Mesos, and shared-state, employed by Google Borg. Such gains are empirically evaluated by simulating a hyper-scale data centre that executes a realistic synthetically generated workload that follows real-world trace patterns.
first_indexed	2024-04-12T12:11:56Z
format	Article
id	doaj.art-f9cce10f9a264a0baca9a89dbd9e070b
institution	Directory of Open Access Journals
issn	1319-1578
language	English
last_indexed	2024-04-12T12:11:56Z
publishDate	2022-06-01
publisher	Elsevier
record_format	Article
series	Journal of King Saud University: Computer and Information Sciences
spelling	doaj.art-f9cce10f9a264a0baca9a89dbd9e070b | 2022-12-22T03:33:33Z | eng | Elsevier | Journal of King Saud University: Computer and Information Sciences | 1319-1578 | 2022-06-01 | vol. 34, no. 6, pp. 3191-3203 | Machine learning regression to boost scheduling performance in hyper-scale cloud-computing data centres | Damián Fernández-Cerero (Department of Computer Languages and Systems, University of Seville, Avda. Reina Mercedes s/n., 41012 Seville, Spain; corresponding author) | José A. Troyano (Department of Computer Languages and Systems, University of Seville, Avda. Reina Mercedes s/n., 41012 Seville, Spain) | Agnieszka Jakóbik (Department of Computer Science, Cracow University of Technology, Cracow, Poland) | Alejandro Fernández-Montes (Department of Computer Languages and Systems, University of Seville, Avda. Reina Mercedes s/n., 41012 Seville, Spain) | http://www.sciencedirect.com/science/article/pii/S1319157822001367 | Data centre; Cloud computing; Scheduling optimisation; Machine learning; Gradient boosting
title	Machine learning regression to boost scheduling performance in hyper-scale cloud-computing data centres
topic	Data centre; Cloud computing; Scheduling optimisation; Machine learning; Gradient boosting
url	http://www.sciencedirect.com/science/article/pii/S1319157822001367
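Illustrative sketch (not part of the catalogue record): the description field outlines two ideas, a gradient-boosting regressor that predicts how long a resource manager needs to schedule the jobs of a period, and a selector that picks the manager with the lowest predicted time. The Python below is a minimal, standard-library stand-in for that idea, assuming squared-error boosting over decision stumps. The feature pair (jobs per window, mean tasks per job), the class and function names, the hyperparameters, and all numbers are hypothetical and invented for illustration; the record does not state the features, library, or settings the authors used, and the toy timings are not results from the article.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Row = Sequence[float]  # hypothetical features: (jobs in window, mean tasks per job)


@dataclass
class Stump:
    feature: int
    threshold: float
    left: float   # value returned when x[feature] <= threshold
    right: float  # value returned otherwise

    def predict(self, x: Row) -> float:
        return self.left if x[self.feature] <= self.threshold else self.right


def fit_stump(X: List[Row], r: List[float]) -> Stump:
    """Return the one-split stump that minimises squared error on residuals r."""
    mean_r = sum(r) / len(r)
    best = Stump(0, float("inf"), mean_r, mean_r)  # fallback: constant prediction
    best_err = sum((v - mean_r) ** 2 for v in r)
    for f in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted({x[f] for x in X})[:-1]:
            lo = [v for x, v in zip(X, r) if x[f] <= t]
            hi = [v for x, v in zip(X, r) if x[f] > t]
            ml, mh = sum(lo) / len(lo), sum(hi) / len(hi)
            err = sum((v - ml) ** 2 for v in lo) + sum((v - mh) ** 2 for v in hi)
            if err < best_err:
                best, best_err = Stump(f, t, ml, mh), err
    return best


class BoostedRegressor:
    """Gradient boosting for squared loss: each round fits a stump to residuals."""

    def __init__(self, n_rounds: int = 100, learning_rate: float = 0.3):
        self.n_rounds = n_rounds
        self.learning_rate = learning_rate
        self.base = 0.0
        self.stumps: List[Stump] = []

    def fit(self, X: List[Row], y: List[float]) -> "BoostedRegressor":
        self.base = sum(y) / len(y)
        self.stumps = []
        pred = [self.base] * len(y)
        for _ in range(self.n_rounds):
            residuals = [yi - pi for yi, pi in zip(y, pred)]
            stump = fit_stump(X, residuals)
            self.stumps.append(stump)
            pred = [pi + self.learning_rate * stump.predict(x)
                    for x, pi in zip(X, pred)]
        return self

    def predict(self, x: Row) -> float:
        return self.base + self.learning_rate * sum(s.predict(x) for s in self.stumps)


def pick_manager(models: Dict[str, BoostedRegressor], window: Row) -> Tuple[str, float]:
    """Return the manager with the lowest predicted scheduling time for a window."""
    predictions = {name: model.predict(window) for name, model in models.items()}
    best = min(predictions, key=predictions.get)
    return best, predictions[best]


# Invented toy history: one row per past time window, one timing list per manager.
X = [(10, 5), (20, 5), (40, 10), (80, 10), (160, 20)]
two_level_times = [1.0, 2.0, 4.0, 8.0, 16.0]
shared_state_times = [3.0, 3.5, 4.0, 5.0, 6.0]

models = {
    "two-level": BoostedRegressor().fit(X, two_level_times),
    "shared-state": BoostedRegressor().fit(X, shared_state_times),
}

light_choice, _ = pick_manager(models, (10, 5))
heavy_choice, _ = pick_manager(models, (160, 20))
print(light_choice, heavy_choice)
```

Training one regressor per manager keeps the selection step a simple comparison of predicted times; a production setup would more likely use an established gradient-boosting library and features drawn from real workload traces.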

Machine learning regression to boost scheduling performance in hyper-scale cloud-computing data centres

Similar Items