Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud

Because scientific workflows running in dynamic environments such as cloud computing must serve several purposes simultaneously, scheduling them has become a multi-objective problem. Among these purposes, Cost and Makespan are probably the two most basic objectives. Another critical factor in a large-scale scient...

Bibliographic Details
Main Authors: Peerasak Wangsom, Kittichai Lavangnananda, Pascal Bouvry
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
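Subjects: Cloud computing; cost; data movement; directed acyclic graph (DAG); extreme nondominated sorting genetic algorithm-III (E-NSGA-III); makespan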
Online Access: https://ieeexplore.ieee.org/document/8924641/
_version_ 1818444281577209856
author Peerasak Wangsom
Kittichai Lavangnananda
Pascal Bouvry
author_facet Peerasak Wangsom
Kittichai Lavangnananda
Pascal Bouvry
author_sort Peerasak Wangsom
collection DOAJ
description Because scientific workflows running in dynamic environments such as cloud computing must serve several purposes simultaneously, scheduling them has become a multi-objective problem. Among these purposes, Cost and Makespan are probably the two most basic objectives. Another critical factor in a large-scale scientific workflow is the tremendous amount of data handled during execution. Therefore, this work also includes Data Movement as an additional objective, as it has a major impact on network utilization and on the energy consumption of network equipment in cloud data centers. Considering these three objectives, this work proposes a scheduling framework that combines a new node clustering technique for the Directed Acyclic Graph (DAG) model, known as Multilevel Dependent Node Clustering (MDNC), with a multi-objective optimization algorithm, Extreme Nondominated Sorting Genetic Algorithm-III (E-NSGA-III). E-NSGA-III is a recent extension of Nondominated Sorting Genetic Algorithm-III (NSGA-III). Five well-known scientific workflows, CyberShake, Epigenomics, LIGO, Montage, and SIPHT, are selected as testbeds, while the commonly used Hypervolume is chosen as the performance metric. In this work, MDNC is also experimented with both NSGA-III. Three approaches are compared: E-NSGA-III alone, E-NSGA-III with Peer-to-Peer clustering, and E-NSGA-III with MDNC. The superiority of the proposed framework among them and its limitation are discussed.
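The description treats Cost, Makespan, and Data Movement as three objectives to be minimized together. As a minimal sketch of what that means, the Python below filters a set of candidate schedules down to its nondominated (Pareto) set. It is an illustration only: the function names and the sample values are hypothetical, and it is not the article's E-NSGA-III or MDNC procedure, only the dominance test that nondominated sorting algorithms build on.

```python
from typing import List, Tuple

# Hypothetical representation: (cost, makespan, data_movement), all minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b everywhere and better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q != p)
    ]


# Made-up objective values for four candidate schedules.
schedules = [
    (10.0, 50.0, 4.0),
    (12.0, 40.0, 5.0),
    (11.0, 55.0, 4.5),  # worse than the first schedule on all three objectives
    (9.0, 70.0, 3.0),
]
front = nondominated(schedules)
```

Here the third schedule is dropped because the first is better on every objective, while the other three remain as trade-offs between cost, makespan, and data movement.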
first_indexed 2024-12-14T19:13:27Z
format Article
id doaj.art-6fb3f097408041aeab14f5da4f04a006
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-14T19:13:27Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-6fb3f097408041aeab14f5da4f04a006
2022-12-21T22:50:40Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
7
177063
177081
10.1109/ACCESS.2019.2957998
8924641
Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
Peerasak Wangsom 0 https://orcid.org/0000-0002-3916-8973
Kittichai Lavangnananda 1 https://orcid.org/0000-0002-9227-4839
Pascal Bouvry 2 https://orcid.org/0000-0003-4473-8659
Data Science and Engineering Laboratory, School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Data Science and Engineering Laboratory, School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
FSTC-CSC, SnT, University of Luxembourg, Luxembourg, Luxembourg
https://ieeexplore.ieee.org/document/8924641/
Cloud computing
cost
data movement
directed acyclic graph (DAG)
extreme nondominated sorting genetic algorithm-III (E-NSGA-III)
makespan
spellingShingle Peerasak Wangsom
Kittichai Lavangnananda
Pascal Bouvry
Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
IEEE Access
Cloud computing
cost
data movement
directed acyclic graph (DAG)
extreme nondominated sorting genetic algorithm-III (E-NSGA-III)
makespan
title Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
title_full Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
title_fullStr Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
title_full_unstemmed Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
title_short Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud
title_sort multi objective scientific workflow scheduling with data movement awareness in cloud
topic Cloud computing
cost
data movement
directed acyclic graph (DAG)
extreme nondominated sorting genetic algorithm-III (E-NSGA-III)
makespan
url https://ieeexplore.ieee.org/document/8924641/
work_keys_str_mv AT peerasakwangsom multiobjectivescientificworkflowschedulingwithdatamovementawarenessincloud
AT kittichailavangnananda multiobjectivescientificworkflowschedulingwithdatamovementawarenessincloud
AT pascalbouvry multiobjectivescientificworkflowschedulingwithdatamovementawarenessincloud