Top-K Query for Large Dataset of Restaurant Review Based-on Hadoop MapReduce Framework

In order to develop post-COVID-19 culinary tourism, high-quality facilities and services are essential. Information technology can contribute through a top-k query-based decision-making system. This study implements top-k queries on a distributed Hadoop MapReduce system to evaluate its capability in...
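The top-k query on MapReduce mentioned in the abstract can be illustrated with a small local sketch. It is not the authors' implementation: the abstract does not state the scoring function, so the weighted sum below, the record layout, and all function names are assumptions made for illustration, and the map and reduce phases are simulated in plain Python rather than run on Hadoop.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # (restaurant id, attribute values); layout is assumed
Scored = Tuple[float, str]            # (score, restaurant id)


def score(values: Sequence[float], weights: Sequence[float]) -> float:
    """Hypothetical scoring function: weighted sum of the attribute values."""
    return sum(w * v for w, v in zip(weights, values))


def map_top_k(split: Iterable[Record], weights: Sequence[float], k: int) -> List[Scored]:
    """Map phase: score every record in one input split, keep only the local top k."""
    return heapq.nlargest(k, ((score(vals, weights), rid) for rid, vals in split))


def reduce_top_k(partials: Iterable[List[Scored]], k: int) -> List[Scored]:
    """Reduce phase: merge the local candidate lists into the global top k."""
    return heapq.nlargest(k, (item for part in partials for item in part))


def top_k(splits: Iterable[Iterable[Record]], weights: Sequence[float], k: int) -> List[Scored]:
    """Run the simulated map phase over each split, then a single reduce."""
    return reduce_top_k((map_top_k(s, weights, k) for s in splits), k)
```

Because every record in the global top k must also be in the top k of its own split, each mapper only needs to emit k candidates, which keeps the data sent to the reducer small regardless of input size.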

Bibliographic Details
Main Authors:	Nyoman Putri Utami Ni, Wijayanto Heri, Gede Putu Wirarama I.
Format:	Article
Language:	English
Published:	EDP Sciences 2023-01-01
Series:	E3S Web of Conferences
Online Access:	https://www.e3s-conferences.org/articles/e3sconf/pdf/2023/102/e3sconf_icimece2023_02033.pdf

collection	DOAJ
description	In order to develop post-COVID-19 culinary tourism, high-quality facilities and services are essential. Information technology can contribute through a top-k query-based decision-making system. This study implements top-k queries on a distributed Hadoop MapReduce system to evaluate its capability in managing data and selecting culinary tourism potential. Research findings indicate that for the European restaurant data from TripAdvisor with 5 dimensions, both single-node and multi-node (3 nodes) executions exhibit comparable execution times across various data quantities. Conversely, for the European restaurant data from TripAdvisor with 14 dimensions, the use of multi-node (3 nodes) tends to result in longer execution times compared to the single-node approach for larger data quantities. Furthermore, the utilization of multi-node (3 nodes) proves to be more efficient in processing synthetic data with 5 dimensions as the data quantity increases, demonstrating a significant difference in execution times compared to the single-node approach. The study also reveals that across different dimensions, the multi-node (3 nodes) approach generally outperforms the single-node approach in terms of speed. Regarding node variations, processing 20 million data points with 5 dimensions using 6 nodes yields the optimal method with the shortest execution time. By leveraging information technology and a top-k query-based decision-making system, the development of culinary tourism potential can be conducted more efficiently and effectively. The performance of MapReduce in processing culinary tourism potential data can be optimized by employing multi-node execution for large datasets and single-node execution for relatively smaller datasets.
first_indexed	2024-03-08T11:11:52Z
format	Article
id	doaj.art-e75471484a8848688c3c76a377b6e580
institution	Directory Open Access Journal
issn	2267-1242
language	English
last_indexed	2024-03-08T11:11:52Z
publishDate	2023-01-01
publisher	EDP Sciences
record_format	Article
series	E3S Web of Conferences
volume	465
article_number	02033
doi	10.1051/e3sconf/202346502033
affiliation	dept. Informatics Engineering University of Mataram Jl
