Top-K Query for Large Dataset of Restaurant Review Based-on Hadoop MapReduce Framework
In order to develop post-COVID-19 culinary tourism, high-quality facilities and services are essential. Information technology can contribute through a top-k query-based decision-making system. This study implements top-k queries on a distributed Hadoop MapReduce system to evaluate its capability in...
Main Authors: | Nyoman Putri Utami Ni, Wijayanto Heri, Gede Putu Wirarama I. |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2023-01-01 |
Series: | E3S Web of Conferences |
Online Access: | https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02033.pdf |
_version_ | 1827373167791833088 |
---|---|
author | Nyoman Putri Utami Ni Wijayanto Heri Gede Putu Wirarama I. |
collection | DOAJ |
description | In order to develop post-COVID-19 culinary tourism, high-quality facilities and services are essential. Information technology can contribute through a top-k query-based decision-making system. This study implements top-k queries on a distributed Hadoop MapReduce system to evaluate its capability in managing data and selecting culinary tourism potential. The findings indicate that for the 5-dimensional European restaurant data from TripAdvisor, single-node and multi-node (3 nodes) executions have comparable execution times across various data quantities. For the 14-dimensional European restaurant data from TripAdvisor, by contrast, multi-node execution (3 nodes) tends to take longer than single-node execution at larger data quantities. For 5-dimensional synthetic data, multi-node execution (3 nodes) becomes more efficient as the data quantity increases, with a significant difference in execution time compared to single-node execution. Across different dimensions, the multi-node (3 nodes) approach is generally faster than the single-node approach. Regarding node variations, processing 20 million data points with 5 dimensions using 6 nodes gives the shortest execution time. By leveraging information technology and a top-k query-based decision-making system, culinary tourism potential can be developed more efficiently and effectively. The performance of MapReduce in processing culinary tourism potential data can be optimized by employing multi-node execution for large datasets and single-node execution for relatively small datasets. |
first_indexed | 2024-03-08T11:11:52Z |
format | Article |
id | doaj.art-e75471484a8848688c3c76a377b6e580 |
institution | Directory Open Access Journal |
issn | 2267-1242 |
language | English |
last_indexed | 2024-03-08T11:11:52Z |
publishDate | 2023-01-01 |
publisher | EDP Sciences |
record_format | Article |
series | E3S Web of Conferences |
spelling | doaj.art-e75471484a8848688c3c76a377b6e580; 2024-01-26T10:42:38Z; eng; EDP Sciences; E3S Web of Conferences; ISSN 2267-1242; 2023-01-01; vol. 465, art. 02033; DOI 10.1051/e3sconf/202346502033; e3sconf_icimece2023_02033; Top-K Query for Large Dataset of Restaurant Review Based-on Hadoop MapReduce Framework; Nyoman Putri Utami Ni, Wijayanto Heri, Gede Putu Wirarama I.; affiliation (all three authors): dept. Informatics Engineering University of Mataram Jl |
title | Top-K Query for Large Dataset of Restaurant Review Based-on Hadoop MapReduce Framework |
url | https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02033.pdf |