Predicting Model Training Time to Optimize Distributed Machine Learning Applications

Despite major advances in recent years, the field of Machine Learning continues to face research and technical challenges. Mostly, these stem from big data and streaming data, which require models to be frequently updated or re-trained, at the expense of significant computational resources. One solu...

Bibliographic Details
Main Authors:	Miguel Guimarães, Davide Carneiro, Guilherme Palumbo, Filipe Oliveira, Óscar Oliveira, Victor Alves, Paulo Novais
Format:	Article
Language:	English
Published:	MDPI AG 2023-02-01
Series:	Electronics
Subjects:	meta-learning; machine learning; distributed learning; training time; optimization
Online Access:	https://www.mdpi.com/2079-9292/12/4/871

_version_	1797621324575670272
author	Miguel Guimarães; Davide Carneiro; Guilherme Palumbo; Filipe Oliveira; Óscar Oliveira; Victor Alves; Paulo Novais
author_facet	Miguel Guimarães; Davide Carneiro; Guilherme Palumbo; Filipe Oliveira; Óscar Oliveira; Victor Alves; Paulo Novais
author_sort	Miguel Guimarães
collection	DOAJ
description	Despite major advances in recent years, the field of Machine Learning continues to face research and technical challenges. Mostly, these stem from big data and streaming data, which require models to be frequently updated or re-trained, at the expense of significant computational resources. One solution is the use of distributed learning algorithms, which can learn in a distributed manner, from distributed datasets. In this paper, we describe CEDEs—a distributed learning system in which models are heterogeneous distributed Ensembles, i.e., complex models constituted by different base models, trained with different and distributed subsets of data. Specifically, we address the issue of predicting the training time of a given model, given its characteristics and the characteristics of the data. Given that the creation of an Ensemble may imply the training of hundreds of base models, information about the predicted duration of each of these individual tasks is paramount for an efficient management of the cluster’s computational resources and for minimizing makespan, i.e., the time it takes to train the whole Ensemble. Results show that the proposed approach is able to predict the training time of Decision Trees with an average error of 0.103 s, and the training time of Neural Networks with an average error of 21.263 s. We also show how results depend significantly on the hyperparameters of the model and on the characteristics of the input data.
first_indexed	2024-03-11T08:55:19Z
format	Article
id	doaj.art-d1e77c2301eb426bb7f77fa22bf8e077
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-11T08:55:19Z
publishDate	2023-02-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-d1e77c2301eb426bb7f77fa22bf8e077 | 2023-11-16T20:11:17Z | eng | MDPI AG | Electronics | 2079-9292 | 2023-02-01 | 12 | 4 | 871 | 10.3390/electronics12040871 | Predicting Model Training Time to Optimize Distributed Machine Learning Applications | Miguel Guimarães [0]; Davide Carneiro [1]; Guilherme Palumbo [2]; Filipe Oliveira [3]; Óscar Oliveira [4]; Victor Alves [5]; Paulo Novais [6] | [0] CIICESI, ESTG, Politécnico do Porto, 4610-156 Felgueiras, Portugal; [1] CIICESI, ESTG, Politécnico do Porto, 4610-156 Felgueiras, Portugal; [2] CIICESI, ESTG, Politécnico do Porto, 4610-156 Felgueiras, Portugal; [3] CIICESI, ESTG, Politécnico do Porto, 4610-156 Felgueiras, Portugal; [4] CIICESI, ESTG, Politécnico do Porto, 4610-156 Felgueiras, Portugal; [5] ALGORITMI Research Centre/LASI, University of Minho, 4710-057 Braga, Portugal; [6] ALGORITMI Research Centre/LASI, University of Minho, 4710-057 Braga, Portugal | Despite major advances in recent years, the field of Machine Learning continues to face research and technical challenges. Mostly, these stem from big data and streaming data, which require models to be frequently updated or re-trained, at the expense of significant computational resources. One solution is the use of distributed learning algorithms, which can learn in a distributed manner, from distributed datasets. In this paper, we describe CEDEs, a distributed learning system in which models are heterogeneous distributed Ensembles, i.e., complex models constituted by different base models, trained with different and distributed subsets of data. Specifically, we address the issue of predicting the training time of a given model, given its characteristics and the characteristics of the data. Given that the creation of an Ensemble may imply the training of hundreds of base models, information about the predicted duration of each of these individual tasks is paramount for an efficient management of the cluster’s computational resources and for minimizing makespan, i.e., the time it takes to train the whole Ensemble. Results show that the proposed approach is able to predict the training time of Decision Trees with an average error of 0.103 s, and the training time of Neural Networks with an average error of 21.263 s. We also show how results depend significantly on the hyperparameters of the model and on the characteristics of the input data. | https://www.mdpi.com/2079-9292/12/4/871 | meta-learning; machine learning; distributed learning; training time; optimization
title	Predicting Model Training Time to Optimize Distributed Machine Learning Applications
title_sort	predicting model training time to optimize distributed machine learning applications
topic	meta-learning; machine learning; distributed learning; training time; optimization
url	https://www.mdpi.com/2079-9292/12/4/871

Similar Items