Optimized Common Parameter Set Extraction Framework by Multiple Benchmarking Applications on a Big Data Platform

This research proposes a methodology for extracting a common configuration parameter set by running multiple benchmarking applications, including TeraSort, TestDFSIO, and MrBench, on the Hadoop distributed file system. The parameter search space, conceptually denoted Ω(x), holds the status of all param...


Bibliographic Details
Main Authors: Jongyeop Kim, Abhilash Kancharla, Jongho Seol, Indy Park, Nohpill Park
Format: Article
Language: English
Published: Springer 2018-09-01
Series: International Journal of Networked and Distributed Computing (IJNDC)
Subjects: Big data, Hadoop, configuration, performance tuning
Online Access: https://www.atlantis-press.com/article/125905550/view
author Jongyeop Kim
Abhilash Kancharla
Jongho Seol
Indy Park
Nohpill Park
collection DOAJ
description This research proposes a methodology for extracting a common configuration parameter set by running multiple benchmarking applications, including TeraSort, TestDFSIO, and MrBench, on the Hadoop distributed file system. The parameter search space, conceptually denoted Ω(x), holds the status of all parameter values and their evaluation results at every stage, which ultimately reduces benchmarking cost. At each stage, one parameter and its associated value are selected according to their effect on overall execution time, as measured across multiple applications on a Hadoop cluster. The experimental results demonstrate that the proposed extended greedy approach provides a feasible benchmark model for multiple MapReduce tasks. The model identified several candidate parameter value sets that can reduce overall execution time by 27% relative to Hadoop default settings. Moreover, we propose an e-heuristic greedy model with alternative parameter selection, which evaluates the second candidate parameter value and steers the search toward a global optimum by returning to the previous stage when the current stage yields no local minimum compared with the previous ones.
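The stage-wise greedy selection summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical `run_benchmarks` callable that returns the summed execution time of all benchmark applications for one configuration, and it uses a dictionary named `omega` as a stand-in for Ω(x), so that no configuration is benchmarked twice. The two property names are real Hadoop settings chosen only for illustration, the timing formula is synthetic, and the backtracking step of the e-heuristic variant is left out.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]
Key = Tuple[Tuple[str, int], ...]


def greedy_common_parameters(
    defaults: Config,
    candidates: Dict[str, List[int]],
    run_benchmarks: Callable[[Config], float],
) -> Tuple[Config, float, Dict[Key, float]]:
    """Fix one parameter per stage, choosing the value with the lowest total time."""
    omega: Dict[Key, float] = {}  # stand-in for the search space record Ω(x)

    def evaluate(config: Config) -> float:
        # Reuse a stored result when this exact configuration was already run.
        key = tuple(sorted(config.items()))
        if key not in omega:
            omega[key] = run_benchmarks(config)
        return omega[key]

    current = dict(defaults)
    best_time = evaluate(current)
    remaining = set(candidates)

    while remaining:
        stage_best = None
        # Try every candidate value of every parameter that is not yet fixed.
        for name in sorted(remaining):
            for value in candidates[name]:
                elapsed = evaluate({**current, name: value})
                if stage_best is None or elapsed < stage_best[0]:
                    stage_best = (elapsed, name, value)
        # Stop when no candidate improves on the previous stage.
        if stage_best is None or stage_best[0] >= best_time:
            break
        best_time, name, value = stage_best
        current[name] = value
        remaining.discard(name)

    return current, best_time, omega


def synthetic_total_time(config: Config) -> float:
    """Synthetic stand-in for running the benchmarks; not measured data."""
    sort_mb = config["mapreduce.task.io.sort.mb"]
    replication = config["dfs.replication"]
    return 50.0 + abs(sort_mb - 200) / 10 + 4 * replication


defaults = {"mapreduce.task.io.sort.mb": 100, "dfs.replication": 3}
candidates = {
    "mapreduce.task.io.sort.mb": [100, 200, 400],
    "dfs.replication": [1, 2, 3],
}
best, best_time, omega = greedy_common_parameters(
    defaults, candidates, synthetic_total_time
)
print(best, best_time, len(omega))
```

In this toy run the search makes ten evaluations but only seven distinct benchmark runs, because repeated configurations are served from `omega`. On a real cluster, `run_benchmarks` would launch the benchmark jobs and time them, which is where the stored results save cost.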
first_indexed 2024-04-10T18:36:18Z
format Article
id doaj.art-646938abffdb4ef196de34c8f2c2ff46
institution Directory Open Access Journal
issn 2211-7946
language English
last_indexed 2024-04-10T18:36:18Z
publishDate 2018-09-01
publisher Springer
record_format Article
series International Journal of Networked and Distributed Computing (IJNDC)
spelling doaj.art-646938abffdb4ef196de34c8f2c2ff46; 2023-02-02T01:10:48Z; eng; Springer; International Journal of Networked and Distributed Computing (IJNDC); ISSN 2211-7946; 2018-09-01; volume 6, issue 4; doi 10.2991/ijndc.2018.6.4.1
title Optimized Common Parameter Set Extraction Framework by Multiple Benchmarking Applications on a Big Data Platform
topic Big data
Hadoop
configuration
performance tuning
url https://www.atlantis-press.com/article/125905550/view