Time-aware forecasting of search volume categories and actual purchase

The new e-commerce field has attracted businesses of all sizes, retailers, and individuals. Consequently, there is an ongoing necessity for applications that can offer predictions on trending products and optimal selling time. This research suggests aiding businesses in forecasting demand for variou...


Bibliographic Details
Main Authors: Shahed Abdullhadi, Dana A. Al-Qudah, Bilal Abu-Salih
Format: Article
Language: English
Published: Elsevier 2024-02-01
Series: Heliyon
Subjects: Time series forecasting; Volume purchase; Actual purchase; Online market; E-commerce
Online Access: http://www.sciencedirect.com/science/article/pii/S240584402401065X
author Shahed Abdullhadi
Dana A. Al-Qudah
Bilal Abu-Salih
collection DOAJ
description The e-commerce field has attracted businesses of all sizes, retailers, and individuals. Consequently, there is an ongoing need for applications that can predict trending products and the optimal time to sell them. This research aims to help businesses forecast demand for various product categories by applying data mining algorithms to multivariate time series data. To ensure the information was current, real-time data was gathered through APIs as the first building block of the research. Search volume was derived from the Keywords Everywhere tool, whereas Amazon's search volume was derived from the Helium 10 tool, along with external features about actual purchase data. The harvested raw datasets went through multiple processing steps to generate the final dataset, which was then validated. Five models, XGBoost, Linear Regression, Random Forest, Long Short-Term Memory (LSTM), and K-Nearest Neighbor (KNN), were employed to predict the trends, and their performance was assessed using four evaluation metrics: Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R2). Overall, Linear Regression performed best, especially at a correlation coefficient of 0.9, with R2 = 90.688, MAE = 0.038, MSE = 0.003, and RMSE = 0.057. KNN performed best at a correlation coefficient of 0.7, with R2 = 85.129, MAE = 0.045, MSE = 0.005, and RMSE = 0.068. XGBoost produced its best results at a correlation coefficient of 0.9, yielding R2 = 85.89, MAE = 0.042, MSE = 0.004, and RMSE = 0.062. Random Forest achieved its peak metrics at a correlation coefficient of 0.6, with R2 = 84.854, MAE = 0.041, MSE = 0.004, and RMSE = 0.066.
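The four evaluation metrics named in the description can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the sketch returns R2 as a fraction, whereas the record's R2 figures (for example 90.688) appear to be expressed as percentages, which is an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean of squared errors, and its square root
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Mean of absolute errors
    mae = sum(abs(e) for e in errors) / n

    # R2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical normalized values for illustration only; not data from the study
actual = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
metrics = regression_metrics(actual, predicted)
```

Because RMSE is the square root of MSE, the two can be cross-checked against each other; the reported pairs (for instance MSE = 0.003 and RMSE = 0.057) agree to within the rounding of the published figures.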
first_indexed 2024-03-08T00:11:28Z
format Article
id doaj.art-c4eb254898be422aba14eaffb15d653d
institution Directory Open Access Journal
issn 2405-8440
language English
last_indexed 2024-03-08T00:11:28Z
publishDate 2024-02-01
publisher Elsevier
record_format Article
series Heliyon
spelling doaj.art-c4eb254898be422aba14eaffb15d653d; 2024-02-17T06:39:23Z; eng; Elsevier; Heliyon; ISSN 2405-8440; 2024-02-01; vol. 10, no. 3, e25034
Authors: Shahed Abdullhadi; Dana A. Al-Qudah; Bilal Abu-Salih (corresponding author)
Affiliation (all authors): King Abdullah II School of Information Technology, The University of Jordan, Jordan
title Time-aware forecasting of search volume categories and actual purchase
topic Time series forecasting
Volume purchase
Actual purchase
Online market
E-commerce
url http://www.sciencedirect.com/science/article/pii/S240584402401065X