Forecasting East and West Coast Gasoline Prices with Tree-Based Machine Learning Algorithms
This study aims to forecast New York and Los Angeles gasoline spot prices on a daily frequency. The dataset includes gasoline prices and a large set of 128 other relevant variables spanning the period from 17 February 2004 to 26 March 2022. These variables were fed to three tree-based machine learning...
Main Authors: | Emmanouil Sofianos, Emmanouil Zaganidis, Theophilos Papadimitriou, Periklis Gogas |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-03-01 |
Series: | Energies |
Subjects: | gasoline; decision tree; random forest; XGBoost; machine learning; forecasting |
Online Access: | https://www.mdpi.com/1996-1073/17/6/1296 |
_version_ | 1797241319928627200 |
---|---|
author | Emmanouil Sofianos; Emmanouil Zaganidis; Theophilos Papadimitriou; Periklis Gogas |
author_sort | Emmanouil Sofianos |
collection | DOAJ |
description | This study aims to forecast New York and Los Angeles gasoline spot prices on a daily frequency. The dataset includes gasoline prices and a large set of 128 other relevant variables spanning the period from 17 February 2004 to 26 March 2022. These variables were fed to three tree-based machine learning algorithms: decision trees, random forest, and XGBoost. Furthermore, a variable importance measure (VIM) technique was applied to identify and rank the most important explanatory variables. The optimal model, a trained random forest, achieves an out-of-sample mean absolute percentage error (MAPE) of 3.23% for New York and 3.78% for Los Angeles gasoline spot prices. The first lag, AR(1), of gasoline is the most important variable in both markets; the top five variables are all energy-related. This paper can strengthen the understanding of price determinants and has the potential to inform strategic decisions and policy directions within the energy sector, making it a valuable asset for both industry practitioners and policymakers. |
first_indexed | 2024-04-24T18:21:27Z |
format | Article |
id | doaj.art-8e3232a73151443bba6c4208402590e5 |
institution | Directory of Open Access Journals |
issn | 1996-1073 |
language | English |
last_indexed | 2024-04-24T18:21:27Z |
publishDate | 2024-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Energies |
spelling | doaj.art-8e3232a73151443bba6c4208402590e5; 2024-03-27T13:35:22Z; eng; MDPI AG; Energies; ISSN 1996-1073; 2024-03-01; vol. 17, no. 6, art. 1296; DOI 10.3390/en17061296. Authors and affiliations: Emmanouil Sofianos (Bureau d’Economie Théorique et Appliquée (BETA), University of Strasbourg, 67085 Strasbourg, France); Emmanouil Zaganidis, Theophilos Papadimitriou, and Periklis Gogas (Department of Economics, Democritus University of Thrace, 69100 Komotini, Greece). Keywords: gasoline; decision tree; random forest; XGBoost; machine learning; forecasting. |
title | Forecasting East and West Coast Gasoline Prices with Tree-Based Machine Learning Algorithms |
topic | gasoline; decision tree; random forest; XGBoost; machine learning; forecasting |
url | https://www.mdpi.com/1996-1073/17/6/1296 |
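
The description in this record reports model accuracy as an out-of-sample mean absolute percentage error (MAPE). As a reading aid, the following is a minimal sketch of how MAPE is conventionally computed. It is not the authors' code; the function name `mape` and the sample prices are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Return the mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # MAPE divides by the actual value, so it is undefined at zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical spot prices (USD per gallon), not data from the study.
    actual = [2.00, 2.50, 4.00]
    forecast = [2.10, 2.40, 4.00]
    print(f"{mape(actual, forecast):.2f}%")  # prints "3.00%"
```

Under this definition, the reported 3.23% for New York means that the random forest's daily forecasts deviated from the realized spot price by 3.23% on average over the out-of-sample period.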