A Novel Data Management Scheme in Cloud for Micromachines

In cyber-physical systems (CPS), micromachines are typically deployed across a wide range of applications, including smart industry, smart healthcare, and smart cities. Providing on-premises resources for the storage and processing of huge data collected by such CPS applications is crucial. The clou...

Bibliographic Details
Main Authors: Gurwinder Singh, Rathinaraja Jeyaraj, Anil Sharma, Anand Paul
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Electronics
Subjects: cyber-physical system; data block placement; data locality; MapReduce scheduling
Online Access: https://www.mdpi.com/2079-9292/12/18/3807
_version_ 1797580455537541120
author Gurwinder Singh
Rathinaraja Jeyaraj
Anil Sharma
Anand Paul
collection DOAJ
description In cyber-physical systems (CPS), micromachines are typically deployed across a wide range of applications, including smart industry, smart healthcare, and smart cities. Providing on-premises resources for storing and processing the large volumes of data collected by such CPS applications is crucial. The cloud provides scalable storage and computation resources, typically through a cluster of virtual machines (VMs) running big data tools such as Hadoop MapReduce. In such a distributed environment, job latency and makespan are strongly affected by excessive non-local executions caused by heterogeneity at several levels (hardware, VM, performance, and workload). Existing approaches handle one or more of these heterogeneities; however, they do not account for the varying performance of storage disks. In this paper, we propose a prediction-based method for placing data blocks in virtual clusters to minimize the number of non-local executions. The method applies a linear regression algorithm to determine the performance of disk storage on each physical machine hosting a virtual cluster, so that data blocks can be placed accordingly and map tasks executed where the data blocks are located. Furthermore, map tasks are scheduled based on VM performance to reduce job latency and makespan. We simulated our ideas and compared them with the existing schedulers in the Hadoop framework. The results show that, by minimizing non-local executions, the proposed method improves MapReduce performance in terms of job latency and makespan compared with the other methods evaluated.
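The placement idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which features the regression uses or how predictions are turned into a placement, so the sketch assumes that each physical machine has a history of (block size in MB, observed write time in seconds) samples, fits a least-squares line per machine, and splits blocks in proportion to predicted disk speed. The names `fit_line`, `predict_block_time`, `place_blocks`, and the machine labels are hypothetical.

```python
from typing import Dict, List, Tuple

Samples = List[Tuple[float, float]]  # (block size in MB, observed write time in s)


def fit_line(samples: Samples) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_block_time(samples: Samples, block_mb: float) -> float:
    """Predicted time to write one block of block_mb megabytes."""
    intercept, slope = fit_line(samples)
    return intercept + slope * block_mb


def place_blocks(history: Dict[str, Samples], num_blocks: int,
                 block_mb: float = 128.0) -> Dict[str, int]:
    """Split num_blocks across machines in proportion to predicted disk speed.

    Assumption for illustration only: speed is the inverse of the predicted
    per-block write time.
    """
    speed = {pm: 1.0 / predict_block_time(s, block_mb)
             for pm, s in history.items()}
    total = sum(speed.values())
    shares = {pm: int(num_blocks * v / total) for pm, v in speed.items()}
    # Blocks lost to rounding down are handed out one each, fastest disk first.
    leftover = num_blocks - sum(shares.values())
    for pm in sorted(speed, key=speed.get, reverse=True)[:leftover]:
        shares[pm] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical measurements for two physical machines.
    history = {
        "pm_fast": [(64, 0.5), (128, 1.0), (256, 2.0)],
        "pm_slow": [(64, 1.5), (128, 3.0), (256, 6.0)],
    }
    print(place_blocks(history, num_blocks=12))
```

Under these assumptions the machine with the faster predicted disk receives the larger share of blocks, so more map tasks can run where their input already resides. A real scheduler would also have to respect replication and capacity limits, which this sketch ignores.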
first_indexed 2024-03-10T22:51:10Z
format Article
id doaj.art-63e8e37b9b164ecead95f1c13fffaf8d
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-10T22:51:10Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-63e8e37b9b164ecead95f1c13fffaf8d
2023-11-19T10:21:36Z
eng
MDPI AG
Electronics
2079-9292
2023-09-01
Volume 12, Issue 18, Article 3807
doi:10.3390/electronics12183807
A Novel Data Management Scheme in Cloud for Micromachines
Gurwinder Singh (Department of Computer Science and Applications, Sikh National College, Banga 144505, India)
Rathinaraja Jeyaraj (Department of Computer and Information Sciences, University of Houston-Victoria, Victoria, TX 77901, USA)
Anil Sharma (School of Computer Applications, Lovely Professional University, Punjab 144001, India)
Anand Paul (School of Computer Science and Engineering, Kyungpook National University, Daegu 41566, Republic of Korea)
https://www.mdpi.com/2079-9292/12/18/3807
cyber-physical system
data block placement
data locality
MapReduce scheduling
title A Novel Data Management Scheme in Cloud for Micromachines
topic cyber-physical system
data block placement
data locality
MapReduce scheduling
url https://www.mdpi.com/2079-9292/12/18/3807