Top-K Query for Large Dataset of Restaurant Review Based-on Hadoop MapReduce Framework

In order to develop post-COVID-19 culinary tourism, high-quality facilities and services are essential. Information technology can contribute through a top-k query-based decision-making system. This study implements top-k queries on a distributed Hadoop MapReduce system to evaluate its capability in...


Bibliographic Details
Main Authors: Nyoman Putri Utami Ni, Wijayanto Heri, Gede Putu Wirarama I.
Material Type: Article
Language: English
Publication Info: EDP Sciences 2023-01-01
Series: E3S Web of Conferences
Online Access: https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02033.pdf
_version_ 1827373167791833088
author Nyoman Putri Utami Ni
Wijayanto Heri
Gede Putu Wirarama I.
collection DOAJ
description In order to develop post-COVID-19 culinary tourism, high-quality facilities and services are essential. Information technology can contribute through a top-k query-based decision-making system. This study implements top-k queries on a distributed Hadoop MapReduce system to evaluate its capability in managing data and selecting culinary tourism potential. Research findings indicate that for the European restaurant data from TripAdvisor with 5 dimensions, both single-node and multi-node (3 nodes) executions exhibit comparable execution times across various data quantities. Conversely, for the European restaurant data from TripAdvisor with 14 dimensions, the use of multi-node (3 nodes) tends to result in longer execution times compared to the single-node approach for larger data quantities. Furthermore, the utilization of multi-node (3 nodes) proves to be more efficient in processing synthetic data with 5 dimensions as the data quantity increases, demonstrating a significant difference in execution times compared to the single-node approach. The study also reveals that across different dimensions, the multi-node (3 nodes) approach generally outperforms the single-node approach in terms of speed. Regarding node variations, processing 20 million data points with 5 dimensions using 6 nodes yields the optimal method with the shortest execution time. By leveraging information technology and a top-k query-based decision-making system, the development of culinary tourism potential can be conducted more efficiently and effectively. The performance of MapReduce in processing culinary tourism potential data can be optimized by employing multi-node execution for large datasets and single-node execution for relatively smaller datasets.
first_indexed 2024-03-08T11:11:52Z
format Article
id doaj.art-e75471484a8848688c3c76a377b6e580
institution Directory Open Access Journal
issn 2267-1242
language English
last_indexed 2024-03-08T11:11:52Z
publishDate 2023-01-01
publisher EDP Sciences
record_format Article
series E3S Web of Conferences
spelling doaj.art-e75471484a8848688c3c76a377b6e580
2024-01-26T10:42:38Z
eng
EDP Sciences
E3S Web of Conferences
2267-1242
2023-01-01
465
02033
10.1051/e3sconf/202346502033
e3sconf_icimece2023_02033
dept. Informatics Engineering University of Mataram Jl
title Top-K Query for Large Dataset of Restaurant Review Based-on Hadoop MapReduce Framework
url https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02033.pdf