Optimized Common Parameter Set Extraction Framework by Multiple Benchmarking Applications on a Big Data Platform
This research proposes a methodology for extracting a common configuration parameter set by applying multiple benchmarking applications, including TeraSort, TestDFSIO, and MrBench, on the Hadoop distributed file system. A conceptual parameter search space, named Ω(x), holds the status of all param...
Main Authors: | Jongyeop Kim, Abhilash Kancharla, Jongho Seol, Indy Park, Nohpill Park |
---|---|
Format: | Article |
Language: | English |
Published: | Springer 2018-09-01 |
Series: | International Journal of Networked and Distributed Computing (IJNDC) |
Subjects: | Big data, Hadoop, configuration, performance tuning |
Online Access: | https://www.atlantis-press.com/article/125905550/view |
_version_ | 1797936893052059648 |
---|---|
author | Jongyeop Kim, Abhilash Kancharla, Jongho Seol, Indy Park, Nohpill Park |
author_facet | Jongyeop Kim, Abhilash Kancharla, Jongho Seol, Indy Park, Nohpill Park |
author_sort | Jongyeop Kim |
collection | DOAJ |
description | This research proposes a methodology for extracting a common configuration parameter set by applying multiple benchmarking applications, including TeraSort, TestDFSIO, and MrBench, on the Hadoop distributed file system. A conceptual parameter search space, named Ω(x), holds the status of all parameter values and their evaluation results at every stage in order to reduce benchmarking cost. To determine the parameter set at each stage, one parameter and its associated value are selected according to the difference in overall execution time measured by multiple applications on a Hadoop cluster. The experimental results demonstrate that the proposed extended greedy approach provides a feasible benchmark model for multiple MapReduce tasks. The model identified several candidate parameter value sets that can reduce the overall execution time by 27% compared with the Hadoop default settings. Moreover, we propose an e-heuristic greedy model with alternative parameter selection, which evaluates the second candidate parameter value and aims at the global optimum by returning to the previous stage if no local minimum is found at the current stage compared with the previous ones. |
first_indexed | 2024-04-10T18:36:18Z |
format | Article |
id | doaj.art-646938abffdb4ef196de34c8f2c2ff46 |
institution | Directory Open Access Journal |
issn | 2211-7946 |
language | English |
last_indexed | 2024-04-10T18:36:18Z |
publishDate | 2018-09-01 |
publisher | Springer |
record_format | Article |
series | International Journal of Networked and Distributed Computing (IJNDC) |
spelling | doaj.art-646938abffdb4ef196de34c8f2c2ff46 2023-02-02T01:10:48Z eng Springer International Journal of Networked and Distributed Computing (IJNDC) 2211-7946 2018-09-01 6 4 10.2991/ijndc.2018.6.4.1 |
title | Optimized Common Parameter Set Extraction Framework by Multiple Benchmarking Applications on a Big Data Platform |
topic | Big data, Hadoop, configuration, performance tuning |
url | https://www.atlantis-press.com/article/125905550/view |
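
The staged greedy selection summarized in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the parameter names (`sort_buffer_mb`, `block_size_mb`, `num_reducers`), the candidate values, and the three synthetic cost functions standing in for TeraSort, TestDFSIO, and MrBench are all hypothetical, chosen only to make the example runnable without a Hadoop cluster. The `cache` dictionary plays a role loosely analogous to the search space Ω(x), recording every evaluated configuration so that no configuration is benchmarked twice. The e-heuristic variant with backtracking to a second candidate is not shown.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]
Benchmark = Callable[[Config], float]


def total_time(config: Config, benchmarks: List[Benchmark],
               cache: Dict[Tuple, float]) -> float:
    """Sum the execution time over all benchmarks, reusing cached results."""
    key = tuple(sorted(config.items()))
    if key not in cache:
        cache[key] = sum(bench(config) for bench in benchmarks)
    return cache[key]


def greedy_common_set(defaults: Config,
                      candidates: Dict[str, List[int]],
                      benchmarks: List[Benchmark]) -> Tuple[Config, float, int]:
    """At each stage, fix the single (parameter, value) pair that lowers the
    combined execution time the most; stop when no pair improves on it."""
    config = dict(defaults)
    cache: Dict[Tuple, float] = {}
    best = total_time(config, benchmarks, cache)
    remaining = set(candidates)
    while remaining:
        stage_best = None
        for name in sorted(remaining):
            for value in candidates[name]:
                trial = {**config, name: value}
                t = total_time(trial, benchmarks, cache)
                if stage_best is None or t < stage_best[0]:
                    stage_best = (t, name, value)
        if stage_best is None or stage_best[0] >= best:
            break  # no improvement at this stage: stop at a local minimum
        best, name, value = stage_best
        config[name] = value
        remaining.discard(name)
    return config, best, len(cache)


# Hypothetical stand-ins for real benchmark runs (seconds, synthetic).
def fake_sort(c: Config) -> float:
    return 100 + abs(c["sort_buffer_mb"] - 200) / 10 + 40 / c["num_reducers"]


def fake_io(c: Config) -> float:
    return 80 + abs(c["block_size_mb"] - 256) / 16


def fake_small_jobs(c: Config) -> float:
    return 10 + 0.5 * c["num_reducers"]


DEFAULTS = {"sort_buffer_mb": 100, "block_size_mb": 128, "num_reducers": 1}
CANDIDATES = {
    "sort_buffer_mb": [100, 200, 400],
    "block_size_mb": [64, 128, 256],
    "num_reducers": [1, 4, 8],
}
BENCHMARKS = [fake_sort, fake_io, fake_small_jobs]

if __name__ == "__main__":
    tuned, seconds, evaluations = greedy_common_set(DEFAULTS, CANDIDATES, BENCHMARKS)
    print(tuned, seconds, evaluations)
```

Summing the times of all benchmarks is one simple way to score a configuration as "common" to several applications; a real study would replace the synthetic functions with measured runs and might weight the benchmarks differently.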